nailo2c opened a new pull request, #61188:
URL: https://github.com/apache/airflow/pull/61188

   closes: #44228
   
   # Why
   
   The older `ADLSListOperator` uses `AzureDataLakeHook`, which depends on the 
[Gen 1 SDK](https://github.com/Azure/azure-data-lake-store-python). Gen1 is already 
[retired](https://learn.microsoft.com/en-us/lifecycle/products/azure-data-lake-storage-gen1).
   
   
https://github.com/apache/airflow/blob/44d36789ec0fa866d253e15691582893d622ed2b/providers/microsoft/azure/src/airflow/providers/microsoft/azure/hooks/data_lake.py#L133
   
   # How
   
   Replace it with `AzureDataLakeStorageV2Hook`, which uses the Gen 2 SDK.
   
   Given Gen1 is retired, the impact should be limited, but this is a breaking 
change.
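
As a rough sketch of what the swap amounts to (this is not the actual diff in this PR): the operator can delegate listing to any hook object that exposes a `list_files_directory(file_system_name, directory_name)` method, which I believe `AzureDataLakeStorageV2Hook` provides. `SupportsListFiles`, `list_paths`, and `StubGen2Hook` are hypothetical names, introduced only so the example runs without Airflow or Azure credentials.

```python
from typing import Protocol


class SupportsListFiles(Protocol):
    """Minimal interface this sketch needs from a Gen 2 hook (assumption)."""

    def list_files_directory(self, file_system_name: str, directory_name: str) -> list[str]: ...


def list_paths(hook: SupportsListFiles, file_system_name: str, path: str = "") -> list[str]:
    """Return the paths under `path` in the given file system (container)."""
    return list(hook.list_files_directory(file_system_name, path))


class StubGen2Hook:
    """Hypothetical in-memory stand-in for AzureDataLakeStorageV2Hook."""

    def __init__(self, contents: dict[str, list[str]]) -> None:
        self._contents = contents

    def list_files_directory(self, file_system_name: str, directory_name: str) -> list[str]:
        # Keep only the paths that sit under the requested directory prefix.
        return [p for p in self._contents.get(file_system_name, []) if p.startswith(directory_name)]


stub = StubGen2Hook({"testcontainer": ["hello.txt", "raw/data.csv"]})
print(list_paths(stub, "testcontainer"))
```

Passing the hook in as a parameter keeps the listing logic testable with a stub; a real operator would presumably construct the hook from its connection ID instead.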
   
   # What
   
   I created an object (blob) in an Azure Storage account.
   
   <img width="1914" height="389" alt="issue-44228-azure" src="https://github.com/user-attachments/assets/bb883f5c-a6a0-45f3-8407-d5555ba0d9c8" />
   
   I then used this DAG to check that the operator could list it.
   ```python
   from datetime import datetime
   
   from airflow import DAG
   from airflow.providers.microsoft.azure.operators.adls import ADLSListOperator
   from airflow.providers.standard.operators.python import PythonOperator
   
   with DAG(
       dag_id="test_adls_issue_44228_fixed",
       start_date=datetime(2026, 1, 1),
       schedule=None,
       catchup=False,
   ) as dag:
   
       list_files = ADLSListOperator(
           task_id="list_adls_files",
           file_system_name="testcontainer",
           path="",
           azure_data_lake_conn_id="adls_gen2_default",
       )
   
       def print_files(ti):
           files = ti.xcom_pull(task_ids="list_adls_files")
           print("=" * 50)
           print(f"Files found: {files}")
           print(f"Total count: {len(files) if files else 0}")
           print("=" * 50)
   
       print_result = PythonOperator(
           task_id="print_result",
           python_callable=print_files,
       )
   
       list_files >> print_result
   ```
   
   It works as expected.
   <img width="1913" height="510" alt="issue-44228-airflow-dag" src="https://github.com/user-attachments/assets/206556b7-6ae0-419f-b7d1-c9fd1d272bef" />
   
   # Discussion
   
   It seems `AzureDataLakeHook` uses the Gen 1 SDK. Perhaps we need to add 
`@deprecated(...)` to it?
   
   
https://github.com/apache/airflow/blob/44d36789ec0fa866d253e15691582893d622ed2b/providers/microsoft/azure/src/airflow/providers/microsoft/azure/hooks/data_lake.py#L46
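
To make the `@deprecated(...)` idea concrete, here is a minimal stdlib-only sketch. It is an assumption about the shape of such a decorator, not the one Airflow providers use, and `LegacyGen1Hook` is a hypothetical placeholder rather than the real `AzureDataLakeHook`.

```python
import functools
import warnings


def deprecated(reason: str):
    """Hypothetical stand-in for a `@deprecated(...)` class decorator."""

    def decorator(cls):
        original_init = cls.__init__

        @functools.wraps(original_init)
        def __init__(self, *args, **kwargs):
            # Warn at instantiation time, pointing at the caller's line.
            warnings.warn(reason, DeprecationWarning, stacklevel=2)
            original_init(self, *args, **kwargs)

        cls.__init__ = __init__
        return cls

    return decorator


@deprecated("Gen1 is retired; use AzureDataLakeStorageV2Hook instead.")
class LegacyGen1Hook:
    """Placeholder for a Gen 1 based hook; not the real AzureDataLakeHook."""

    def __init__(self, conn_id: str = "azure_data_lake_default") -> None:
        self.conn_id = conn_id
```

Warning on instantiation rather than on import means existing DAGs keep parsing, and users only see the warning when they actually construct the hook.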
   
   ---
   
   ##### Was generative AI tooling used to co-author this PR?
   
   
   - [x] Yes (please specify the tool below)
   Claude Opus 4.5
   
   

