nailo2c opened a new pull request, #61188: URL: https://github.com/apache/airflow/pull/61188
closes: #44228

# Why

The older `ADLSListOperator` uses `AzureDataLakeHook`, which relies on the [Gen 1 SDK](https://github.com/Azure/azure-data-lake-store-python). Gen1 is already [retired](https://learn.microsoft.com/en-us/lifecycle/products/azure-data-lake-storage-gen1).

https://github.com/apache/airflow/blob/44d36789ec0fa866d253e15691582893d622ed2b/providers/microsoft/azure/src/airflow/providers/microsoft/azure/hooks/data_lake.py#L133

# How

Replace it with `AzureDataLakeStorageV2Hook`, which uses the Gen 2 SDK. Given that Gen1 is retired, the impact should be limited, but this is a breaking change.

# What

I created an object (blob) in an Azure Storage account.

<img width="1914" height="389" alt="issue-44228-azure" src="https://github.com/user-attachments/assets/bb883f5c-a6a0-45f3-8407-d5555ba0d9c8" />

I then used this DAG to test whether I could fetch it.

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.microsoft.azure.operators.adls import ADLSListOperator
from airflow.providers.standard.operators.python import PythonOperator

with DAG(
    dag_id="test_adls_issue_44228_fixed",
    start_date=datetime(2026, 1, 1),
    schedule=None,
    catchup=False,
) as dag:
    list_files = ADLSListOperator(
        task_id="list_adls_files",
        file_system_name="testcontainer",
        path="",
        azure_data_lake_conn_id="adls_gen2_default",
    )

    def print_files(ti):
        files = ti.xcom_pull(task_ids="list_adls_files")
        print("=" * 50)
        print(f"Files found: {files}")
        print(f"Total count: {len(files) if files else 0}")
        print("=" * 50)

    print_result = PythonOperator(
        task_id="print_result",
        python_callable=print_files,
    )

    list_files >> print_result
```

The DAG ran successfully and returned the file.

<img width="1913" height="510" alt="issue-44228-airflow-dag" src="https://github.com/user-attachments/assets/206556b7-6ae0-419f-b7d1-c9fd1d272bef" />

# Discussion

It seems `AzureDataLakeHook` uses the Gen 1 SDK. Perhaps we need to add `@deprecated(...)` to it?
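To make the `@deprecated(...)` question concrete, here is a minimal, stdlib-only sketch of what a class-level deprecation warning could look like. It is not the decorator Airflow uses: `deprecated_class`, `ProviderDeprecationWarning`, and `LegacyDataLakeHook` are hypothetical names invented for illustration, and a real change would presumably use whatever decorator and warning category the provider already relies on.

```python
import functools
import warnings


class ProviderDeprecationWarning(DeprecationWarning):
    """Hypothetical stand-in for a provider-specific deprecation warning category."""


def deprecated_class(reason: str):
    """Return a class decorator that emits a warning each time the class is instantiated."""

    def decorator(cls):
        original_init = cls.__init__

        @functools.wraps(original_init)
        def wrapped_init(self, *args, **kwargs):
            # stacklevel=2 attributes the warning to the caller that created the instance.
            warnings.warn(
                f"{cls.__name__} is deprecated: {reason}",
                category=ProviderDeprecationWarning,
                stacklevel=2,
            )
            original_init(self, *args, **kwargs)

        cls.__init__ = wrapped_init
        return cls

    return decorator


@deprecated_class("ADLS Gen1 is retired; use AzureDataLakeStorageV2Hook instead.")
class LegacyDataLakeHook:
    """Hypothetical placeholder for the Gen 1 hook, not the real class."""

    def __init__(self, conn_id: str = "example_conn"):
        self.conn_id = conn_id
```

With something along these lines, existing DAGs that still instantiate the Gen 1 hook would keep working but surface a warning pointing to the Gen 2 hook.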
https://github.com/apache/airflow/blob/44d36789ec0fa866d253e15691582893d622ed2b/providers/microsoft/azure/src/airflow/providers/microsoft/azure/hooks/data_lake.py#L46

---

##### Was generative AI tooling used to co-author this PR?

- [x] Yes (please specify the tool below)

Claude Opus 4.5
