Just some history ... and a renaming suggestion... When get_uri was added, generating the connection URI was a real pain, and Airflow provided no way to generate one, and JSON repr was not yet implemented. By adding get_uri, a user could define the connection object we needed in python, and airflow would give us the URI we needed to set in our env vars. This was the only purpose.
I don't think at the time I saw the potential for confusion with the sqlalchemy URI. In hindsight, get_uri may not be the best name. Recently, a method was added `to_json_dict` which gives the json repr. Maybe we should rename get_uri to be consistent and more indicative of the intention. We could rename it `to_uri_str`. Or, maybe better we could make the names consistent and do `to_json_repr` and `to_uri_repr`. Or something like that? @andrey WDYT? Adding the _repr suffix may help indicate that this is the URI (or json) representation of the airflow connection object. On Sun, Nov 26, 2023 at 10:52 PM Tzu-ping Chung <t...@astronomer.io.invalid> wrote: > Some context: I’ve been looking into putting the Connection URI to more > use, most importantly to consolidate it with OpenLineage’s URI format. > > Skipping the details, one of the conclusions I had is get_uri on > Connection is not actually that useful, due to how different libraries > (SQLAlchemy, Airflow Object Store, OpenLineage, etc.) have drastically > different URI formats, and it is not practical to implement get_uri in > Connection to be compatible with all of them. The correct layer to generate > a connection URI is in hooks instead, and the format does not need to be > visible to the user, but internal between the hook and whatever library it > talks to, since the URI semantics are too specific to a library to be > generally useful as a serde format. > > Therefore I think deprecating the URI Connection representation would not > be harmful, except maybe the URI can be more useful in simplistic cases. > > > > On 27 Nov 2023, at 12:21, Amogh Desai <amoghdesai....@gmail.com> wrote: > > > > A little late to the party here but convinced by what Daniel had to say. > > > > I am ok with the idea of adding this into the deprecation bucket in the > > coming time. > > > > Implementing the get_uri in the hook seems quite reasonable to me > > on the grounds that the Connection class' get_uri is meant to be general, > > supporting the use case of serializing the Airflow connection on basis of > > host, port, schema, password, type etc. > > > > Thanks & Regards, > > Amogh Desai > > > > On Mon, Nov 20, 2023 at 9:56 PM Daniel Standish > > <daniel.stand...@astronomer.io.invalid> wrote: > > > >>> > >>> I still think different people have different expectations of the > Airflow > >>> URI format. My expectation is that this is only a SerDe mechanism > around > >>> Airflow Connection. If we have a look at the codebase of Airflow we > found > >>> that this is not always true and we try to use it as SA URI. > >> > >> > >> Yes, this is in my view exactly the problem. I believe we run into > trouble > >> only when we falsely think that we should be able to define an Airflow > >> connection using, say, a jdbc connection string. It's not true in > general. > >> > >> Observe that DbApiHook has a get_uri > >> < > >> > https://github.com/apache/airflow/blob/064fc2b7751a44e37ccce97609cff7c496098e56/airflow/providers/common/sql/hooks/sql.py#L188-L196 > >>> > >> method. > >> > >> This method is used to produce a connection string valid for the service > >> you're connecting to, and importantly it may be different from the > >> Connection.get_uri. > >> > >> The primary responsibility of Airflow connection URI is to serialize the > >> airflow connection object which has type, login, pass, host, schema > port, > >> extra and that's it. For simple cases it can serve both purposes but > it's > >> not true in general and folks need to understand that. > >> > >> I think part of the problem may simply be that `uri` is an argument of > the > >> Connection object. It means not the connection string but the airflow > conn > >> URI serialization. Would probably be a good candidate for deprecation > >> on the basis that it is confusing. > >> > >> Fundamentally, as a hook designer, if you need a conn string that is a > URI, > >> you must implement get_uri on the hook in the right way to be able to > >> use the attributes available to you in Connection. Connection.get_uri > will > >> handle creating the Airflow Conn URI serialization and that's a separate > >> thing. > >> > >> The connection object has no clue how to generate a valid connection > string > >> for your service, but the hook can have this understanding. > >> > >> There are many examples in the codebase where users implement get_uri on > >> the hook and it works just fine. A hook author could write it so you > can > >> optionally dump the full connection string in the extra object: > {"extra": > >> {"uri": "blah"}}. > >> > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org > For additional commands, e-mail: dev-h...@airflow.apache.org > >