JDarDagran opened a new issue, #35552: URL: https://github.com/apache/airflow/issues/35552
### Apache Airflow version

main (development)

### What happened

SQL-based operators rely on the `airflow.providers.openlineage.utils.sql` module, which is used by the `SQLParser` interface class. In short, it retrieves table schemas for the input and output datasets parsed from a SQL query. When the query refers to a table by name only, the connection's default database/schema is not taken into account, so same-named tables from other schemas are also included.

### What you think should happen instead

The integration should take into account the database/schema from the connection setup when it is detected in the information schema query result. If a matching table is found there, it should stop adding other tables.

### How to reproduce

The corner case is as follows:

1. Use a database connection with a default database and/or schema set.
2. Refer to the table by name only in the SQL query (e.g. `SELECT * FROM my_table` instead of `SELECT * FROM my_schema.my_table`).
3. If a table with the same name exists in another database/schema (or database+schema combination, depending on the database), the OL integration produces two datasets for the tables.

For instance, with Postgres and the search path set to the `public` schema, `SELECT * FROM my_table` reads from `public.my_table` even if another table with the same name exists in a different schema. The OL integration will report both `my_schema.my_table` and `public.my_table`.

### Operating System

macOS

### Versions of Apache Airflow Providers

apache-airflow-providers-openlineage==1.2.0

### Deployment

Other Docker-based deployment

### Deployment details

_No response_

### Anything else

_No response_

### Are you willing to submit PR?

- [X] Yes I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
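
The behaviour requested under "What you think should happen instead" could be sketched roughly as follows. This is only an illustration of the idea, not the provider's actual code: `TableRow` and `prefer_connection_defaults` are hypothetical names, and the rows stand in for whatever the information schema query returns. For each table name, rows matching the connection's default database/schema win; if none match, all candidates are kept.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class TableRow:
    """Hypothetical stand-in for one row of an information schema result."""
    database: Optional[str]
    schema: str
    name: str


def prefer_connection_defaults(
    rows: List[TableRow],
    default_schema: Optional[str] = None,
    default_database: Optional[str] = None,
) -> List[TableRow]:
    """Drop same-named tables outside the connection defaults, when possible."""
    # Group candidate rows by bare table name, preserving input order.
    by_name: Dict[str, List[TableRow]] = {}
    for row in rows:
        by_name.setdefault(row.name, []).append(row)

    result: List[TableRow] = []
    for candidates in by_name.values():
        # Rows that match every default the connection actually defines.
        preferred = [
            r for r in candidates
            if (default_schema is None or r.schema == default_schema)
            and (default_database is None or r.database == default_database)
        ]
        # Fall back to all candidates when nothing matches the defaults.
        result.extend(preferred or candidates)
    return result


rows = [
    TableRow(None, "my_schema", "my_table"),
    TableRow(None, "public", "my_table"),
]
filtered = prefer_connection_defaults(rows, default_schema="public")
print(filtered)
```

With the Postgres example from the report, only the `public.my_table` row would remain, so a single dataset would be produced instead of two.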
