luke-hoffman1 opened a new issue, #47587: URL: https://github.com/apache/airflow/issues/47587
### Apache Airflow Provider(s)

google, openlineage

### Versions of Apache Airflow Providers

apache-airflow-providers-google==10.26.0
apache-airflow-providers-openlineage==2.1.0

### Apache Airflow version

2.10.5

### Operating System

macOS Sequoia Version 15.3.1 (24D70)

### Deployment

Astronomer

### Deployment details

FROM quay.io/astronomer/astro-runtime:12.7.1

### What happened

When I execute a CTAS statement that references a view, OpenLineage reports the underlying tables as inputs instead of the view itself. I suspect this is because the BigQuery Job API exposes the underlying tables more readily than the view name. The only place I can find the view name is in the `configuration.query.query` property, whereas the input tables appear to be retrieved from the `statistics.query.referencedTables` property. I believe this is the relevant [code](https://github.com/apache/airflow/blob/eb18f87f091116a9b7db5ae30fdb40f6e0a6377f/providers/google/src/airflow/providers/google/cloud/openlineage/mixins.py#L231)

### What you think should happen instead

It would be beneficial to receive the view name as the OpenLineage input instead of the underlying table names, so that the complete lineage is captured.

### How to reproduce

DAG:
```
from airflow import DAG
from airflow.providers.google.cloud.operators.bigquery import (
    BigQueryInsertJobOperator
)
from datetime import datetime

dag = DAG(
    dag_id="dag_execute_bq_ctas",
    schedule_interval=None,
    start_date=datetime(2025, 3, 4),  # Start date
)

task1 = BigQueryInsertJobOperator(
    task_id="task1",
    gcp_conn_id="bq_conn",
    configuration={
        "query": {
            "query": "CREATE OR REPLACE TABLE <bq-dataset>.table1 AS SELECT * FROM <bq-dataset>.<view-name>;",
            "useLegacySql": False,
            "priority": "BATCH",
        }
    },
    dag=dag,
)

task1
```

### Anything else

_No response_

### Are you willing to submit PR?

- [ ] Yes I am willing to submit a PR!
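To make the discrepancy described under "What happened" concrete, here is a minimal sketch that needs neither Airflow nor a BigQuery connection. The `job` dictionary is a hand-written stand-in for a BigQuery job resource; the project, dataset, view, and table names in it are hypothetical, and the helper functions `referenced_tables` and `names_in_query_text` are illustrations only, not the provider's code. The regex is a naive assumption that covers just this simple `FROM` clause and is not a SQL parser.

```python
import re

# Hand-written stand-in for a BigQuery job resource (all names are hypothetical).
job = {
    "configuration": {
        "query": {
            "query": (
                "CREATE OR REPLACE TABLE my_dataset.table1 AS "
                "SELECT * FROM my_dataset.my_view;"
            ),
        }
    },
    "statistics": {
        "query": {
            # The view has already been resolved to the tables it reads from.
            "referencedTables": [
                {"projectId": "my-project", "datasetId": "my_dataset", "tableId": "base_table_a"},
                {"projectId": "my-project", "datasetId": "my_dataset", "tableId": "base_table_b"},
            ]
        }
    },
}


def referenced_tables(job: dict) -> list[str]:
    """Return fully qualified names from statistics.query.referencedTables."""
    refs = job.get("statistics", {}).get("query", {}).get("referencedTables", [])
    return [f"{t['projectId']}.{t['datasetId']}.{t['tableId']}" for t in refs]


def names_in_query_text(job: dict) -> list[str]:
    """Naively pull identifiers that follow FROM/JOIN out of the SQL text."""
    sql = job.get("configuration", {}).get("query", {}).get("query", "")
    return re.findall(r"\b(?:FROM|JOIN)\s+`?([\w.-]+)`?", sql, flags=re.IGNORECASE)


print(referenced_tables(job))    # underlying tables only, no view
print(names_in_query_text(job))  # the view name as written in the query
```

In this sketch the view name survives only in the query text, which matches what I observe in the job properties of the real job.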
### Code of Conduct

- [x] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
