Joffreybvn commented on code in PR #36205:
URL: https://github.com/apache/airflow/pull/36205#discussion_r1425397577
##########
airflow/providers/databricks/hooks/databricks_sql.py:
##########
@@ -243,11 +244,15 @@ def run(
@staticmethod
def _make_serializable(result):
- """Transform the databricks Row objects into JSON-serializable lists."""
+ """Transform the databricks Row objects into JSON-serializable namedtuple."""
+ columns: list[str] | None = None
if isinstance(result, list):
- return [list(row) for row in result]
+ columns = result[0].__fields__
+ row_object = namedtuple("Row", columns)
Review Comment:
Indeed, this PR is not about preventing the breaking change. It is a proposal
to break fewer workflows - _the simple tuple/list is too aggressive_ - and reach
the goal of [this
ADR](https://github.com/apache/airflow/blob/main/airflow/providers/common/sql/doc/adr/0002-return-common-data-structure-from-dbapihook-derived-hooks.md).
The databricks.sql.Row object is a subclass of tuple, which implements:
- a dict-like interface -> breaking change - not available in a namedtuple
- an 'asDict()' method -> breaking change - but namedtuple implements
['_asdict()'](https://docs.python.org/3/library/collections.html#collections.somenamedtuple._asdict)
- fields accessible by name -> supported by namedtuple
Thus, users of the dict interface will have to rewrite their code. With a major
version bump of the provider, would you agree to this change?
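
To make the three points above concrete, here is a minimal sketch that uses a plain `namedtuple` as a stand-in for the converted row. The column names and values are made up for illustration, and the real `databricks.sql.Row` is not imported here, so this only shows the namedtuple side of the comparison:

```python
from collections import namedtuple

# Hypothetical column names; in the PR they are read from the first row's __fields__.
columns = ["id", "name"]
Row = namedtuple("Row", columns)
row = Row(1, "alice")

by_name = row.name        # fields accessible by name: still supported
by_index = row[0]         # plain tuple indexing: still supported
as_dict = row._asdict()   # the namedtuple counterpart of Row.asDict()

# The dict-like interface is not available: a tuple rejects string keys.
try:
    row["name"]
    dict_access_ok = True
except TypeError:
    dict_access_ok = False
```

Code that currently does `row["name"]` or `row.asDict()` would need to switch to `row.name` or `row._asdict()`.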