+1 to Daniels's suggestion. On Tue, Nov 28, 2023 at 5:25 AM Amogh Desai <amoghdesai....@gmail.com> wrote:
> Agree with @Daniel Standish <daniel.stand...@astronomer.io> on this. > We should rename get_uri to reduce confusion. > > I am fine with renaming to to_uri_repr and then deprecating get_uri > > > Thanks & Regards, > Amogh Desai > > On Mon, Nov 27, 2023 at 8:56 PM Andrey Anshin <andrey.ans...@taragol.is> > wrote: > > > > Recently, a method was added `to_json_dict` which gives the json repr. > > > > This method is internal method which use in > > 1. Connection Serialisation, and it is a replacement *to_dict* method > which > > was used before https://github.com/apache/airflow/pull/35723 > > 2. Use in public method *as_json* > > > > Personally I do not have any personal preference for the naming but if it > > help to reduce confusion than: "why not?". > > > > *as_json* doesn't include in any released version of Airflow, so we could > > easily change it without any breaking changes: > > - get_uri -> to_uri_repr, and deprecate get_uri > > - as_json -> to_json_repr, we don't need to deprecate anything if it > added > > before 2.8.0 > > > > > > > > > > On Mon, 27 Nov 2023 at 18:47, Daniel Standish > > <daniel.stand...@astronomer.io.invalid> wrote: > > > > > Just some history ... and a renaming suggestion... > > > > > > When get_uri was added, generating the connection URI was a real pain, > > and > > > Airflow provided no way to generate one, and JSON repr was not yet > > > implemented. By adding get_uri, a user could define the connection > > object > > > we needed in python, and airflow would give us the URI we needed to set > > in > > > our env vars. This was the only purpose. > > > > > > I don't think at the time I saw the potential for confusion with the > > > sqlalchemy URI. In hindsight, get_uri may not be the best name. > > Recently, > > > a method was added `to_json_dict` which gives the json repr. Maybe we > > > should rename get_uri to be consistent and more indicative of the > > > intention. > > > > > > We could rename it `to_uri_str`. Or, maybe better we could make the > > names > > > consistent and do `to_json_repr` and `to_uri_repr`. Or something like > > > that? @andrey WDYT? Adding the _repr suffix may help indicate that > this > > > is the URI (or json) representation of the airflow connection object. > > > > > > > > > > > > > > > On Sun, Nov 26, 2023 at 10:52 PM Tzu-ping Chung > <t...@astronomer.io.invalid > > > > > > wrote: > > > > > > > Some context: I’ve been looking into putting the Connection URI to > more > > > > use, most importantly to consolidate it with OpenLineage’s URI > format. > > > > > > > > Skipping the details, one of the conclusions I had is get_uri on > > > > Connection is not actually that useful, due to how different > libraries > > > > (SQLAlchemy, Airflow Object Store, OpenLineage, etc.) have > drastically > > > > different URI formats, and it is not practical to implement get_uri > in > > > > Connection to be compatible with all of them. The correct layer to > > > generate > > > > a connection URI is in hooks instead, and the format does not need to > > be > > > > visible to the user, but internal between the hook and whatever > library > > > it > > > > talks to, since the URI semantics are too specific to a library to be > > > > generally useful as a serde format. > > > > > > > > Therefore I think deprecating the URI Connection representation would > > not > > > > be harmful, except maybe the URI can be more useful in simplistic > > cases. > > > > > > > > > > > > > On 27 Nov 2023, at 12:21, Amogh Desai <amoghdesai....@gmail.com> > > > wrote: > > > > > > > > > > A little late to the party here but convinced by what Daniel had to > > > say. > > > > > > > > > > I am ok with the idea of adding this into the deprecation bucket in > > the > > > > > coming time. > > > > > > > > > > Implementing the get_uri in the hook seems quite reasonable to me > > > > > on the grounds that the Connection class' get_uri is meant to be > > > general, > > > > > supporting the use case of serializing the Airflow connection on > > basis > > > of > > > > > host, port, schema, password, type etc. > > > > > > > > > > Thanks & Regards, > > > > > Amogh Desai > > > > > > > > > > On Mon, Nov 20, 2023 at 9:56 PM Daniel Standish > > > > > <daniel.stand...@astronomer.io.invalid> wrote: > > > > > > > > > >>> > > > > >>> I still think different people have different expectations of the > > > > Airflow > > > > >>> URI format. My expectation is that this is only a SerDe mechanism > > > > around > > > > >>> Airflow Connection. If we have a look at the codebase of Airflow > we > > > > found > > > > >>> that this is not always true and we try to use it as SA URI. > > > > >> > > > > >> > > > > >> Yes, this is in my view exactly the problem. I believe we run > into > > > > trouble > > > > >> only when we falsely think that we should be able to define an > > Airflow > > > > >> connection using, say, a jdbc connection string. It's not true in > > > > general. > > > > >> > > > > >> Observe that DbApiHook has a get_uri > > > > >> < > > > > >> > > > > > > > > > > https://github.com/apache/airflow/blob/064fc2b7751a44e37ccce97609cff7c496098e56/airflow/providers/common/sql/hooks/sql.py#L188-L196 > > > > >>> > > > > >> method. > > > > >> > > > > >> This method is used to produce a connection string valid for the > > > service > > > > >> you're connecting to, and importantly it may be different from the > > > > >> Connection.get_uri. > > > > >> > > > > >> The primary responsibility of Airflow connection URI is to > serialize > > > the > > > > >> airflow connection object which has type, login, pass, host, > schema > > > > port, > > > > >> extra and that's it. For simple cases it can serve both purposes > > but > > > > it's > > > > >> not true in general and folks need to understand that. > > > > >> > > > > >> I think part of the problem may simply be that `uri` is an > argument > > of > > > > the > > > > >> Connection object. It means not the connection string but the > > airflow > > > > conn > > > > >> URI serialization. Would probably be a good candidate for > > deprecation > > > > >> on the basis that it is confusing. > > > > >> > > > > >> Fundamentally, as a hook designer, if you need a conn string that > > is a > > > > URI, > > > > >> you must implement get_uri on the hook in the right way to be able > > to > > > > >> use the attributes available to you in Connection. > > Connection.get_uri > > > > will > > > > >> handle creating the Airflow Conn URI serialization and that's a > > > separate > > > > >> thing. > > > > >> > > > > >> The connection object has no clue how to generate a valid > connection > > > > string > > > > >> for your service, but the hook can have this understanding. > > > > >> > > > > >> There are many examples in the codebase where users implement > > get_uri > > > on > > > > >> the hook and it works just fine. A hook author could write it so > > you > > > > can > > > > >> optionally dump the full connection string in the extra object: > > > > {"extra": > > > > >> {"uri": "blah"}}. > > > > >> > > > > > > > > > > > > --------------------------------------------------------------------- > > > > To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org > > > > For additional commands, e-mail: dev-h...@airflow.apache.org > > > > > > > > > > > > > >