dbarrundiag commented on PR #22692: URL: https://github.com/apache/airflow/pull/22692#issuecomment-1086974829
Just out of curiosity, shouldn't we try to leverage the official [Python Connector](https://github.com/delta-io/delta-sharing/tree/main/python) that `delta-sharing` has already implemented? I feel like I see here a bunch of duplicate or unnecessary code and might be a better design if we follow the "official" protocol so we don't have to keep up with API changes here? For example instead of re-writing all this code code and implementing a `_extract_delta_sharing_version` method as we do here: https://github.com/apache/airflow/blob/9547f8c718719f24d5b25b856c4b6839bf5d2524/airflow/providers/delta/sharing/hooks/delta_sharing.py#L223 couldn't we just call `query_table_version` which has already been implemented in the "official" REST client: https://github.com/delta-io/delta-sharing/blob/main/python/delta_sharing/rest_client.py#L229 or why introduce a `DeltaSharingQueryResult` in the Airflow provider if we already have the "official" `QueryTableMetadataResponse` or `QueryTableVersionResponse` here https://github.com/delta-io/delta-sharing/blob/main/python/delta_sharing/rest_client.py#L66 What do you all think? The only "con" I see is that it introduces a requirement to install the python module, but i feel like that's totally fair no? CC: @alexott @mik-laj @potiuk -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
