Pyasma commented on code in PR #57153:
URL: https://github.com/apache/airflow/pull/57153#discussion_r2462561933
##########
providers/apache/impala/src/airflow/providers/apache/impala/hooks/impala.py:
##########
@@ -45,3 +46,30 @@ def get_conn(self) -> Connection:
             database=connection.schema,
             **connection.extra_dejson,
         )
+
+    @property
+    def sqlalchemy_url(self) -> URL:
+        """Return a `sqlalchemy.engine.URL` object constructed from the connection."""
+        conn = self.get_connection(self.get_conn_id())
+        extra = conn.extra_dejson or {}
+
+        required_attrs = ["host", "login"]
+        for attr in required_attrs:
+            if getattr(conn, attr) is None:
+                raise ValueError(f"Impala Connection Error: '{attr}' is missing in the connection")
+
+        query = {k: str(v) for k, v in extra.items() if v is not None and k not in ["__extra__"]}
Review Comment:
Thanks for the review, @Lee-W.
I was going through the Airflow docs and came across the section on
*Handling of arbitrary dict in extra*.
From what I understand, `__extra__` is created internally when
`Connection.get_uri()` encodes nested JSON from the `extra` field.
Filtering it out should keep that internal key from ending up in the query
parameters of the SQLAlchemy URL.
https://airflow.apache.org/docs/apache-airflow/stable/howto/connection.html#handling-of-arbitrary-dict-in-extra
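
To make the intent concrete, here is a minimal standalone sketch of the filtering from the diff. It uses a plain dict in place of `conn.extra_dejson` and only the standard library, so it does not exercise Airflow itself. `build_query` and `INTERNAL_KEYS` are hypothetical names for illustration, and packing the nested extra into a single `__extra__` query parameter is my approximation of what the docs section describes, not Airflow's actual code.

```python
import json
from urllib.parse import parse_qsl, urlencode

# Assumption: the only internal key to skip is the one named in the docs section.
INTERNAL_KEYS = {"__extra__"}


def build_query(extra: dict) -> dict[str, str]:
    """Stringify extras for a URL query, skipping None values and internal keys."""
    return {
        k: str(v)
        for k, v in extra.items()
        if v is not None and k not in INTERNAL_KEYS
    }


# Approximation of a nested extra being packed into one `__extra__` parameter.
nested = {"options": {"timeout": 30}}
packed_query = urlencode({"__extra__": json.dumps(nested)})

# Parsing the query string back yields a single `__extra__` key.
parsed = dict(parse_qsl(packed_query))

# Regular extras pass through as strings; None values and `__extra__` are dropped.
query = build_query({**parsed, "auth_mechanism": "PLAIN", "timeout": 30, "unused": None})
```

If `__extra__` never shows up in `extra_dejson` in practice, the filter is a harmless no-op, so it may come down to whether the extra guard is worth keeping.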
Let me know if I’m interpreting that correctly or if there’s a better way to
handle it.