gfelot opened a new issue #16529:
URL: https://github.com/apache/airflow/issues/16529


   
   
   **Apache Airflow version**: 2.0.1
   
   **Environment**:
   
   - **Cloud provider or hardware configuration**: AWS
   - **OS** (e.g. from /etc/os-release): Ubuntu 20.04
   
   
   **What happened**:
   
   I trigger my DAG via the REST API from a Lambda function that fires on a file upload. I get the file path from the Lambda context, e.g.:
   `ingestion.archive.dev/yolo/PMS_2_DXBTD_RTBD_2021032800000020210328000000SD_20210329052822.XML`
   
   I pass this value in the API call's `conf` payload and read it back inside the DAG as `"{{ dag_run.conf['file_path'] }}"`.
   
   Inside the DAG, I need to extract information from this string by splitting it on `/`, so that I can use the `S3CopyObjectOperator`.
   
   Here is my first approach:
   
   ```python
   from datetime import datetime
   
   from airflow import DAG
    from airflow.providers.amazon.aws.operators.s3_copy_object import S3CopyObjectOperator
    from airflow.operators.python import PythonOperator
   
   
   default_args = {
       'owner': 'me',
   }
   
   s3_final_destination = {
       "bucket_name": "ingestion.archive.dev",
       "verification_failed": "validation_failed",
       "processing_failed": "processing_failed",
       "processing_success": "processing_success"
   }
   
   
   def print_var(file_path,
                 file_split,
                 source_bucket,
                 source_path,
                 file_name):
       data = {
           "file_path": file_path,
           "file_split": file_split,
           "source_bucket": source_bucket,
           "source_path": source_path,
           "file_name": file_name
       }
   
       print(data)
   
   
    with DAG(
            "test_s3_transfer",
            default_args=default_args,
            description='Test',
            schedule_interval=None,
            start_date=datetime(2021, 4, 24),
            tags=['ingestion', "test", "context"],
    ) as dag:
        # {"file_path": "ingestion.archive.dev/yolo/PMS_2_DXBTD_RTBD_2021032800000020210328000000SD_20210329052822.XML"}
       file_path = "{{ dag_run.conf['file_path'] }}"
       file_split = file_path.split('/')
       source_bucket = file_split[0]
       source_path = "/".join(file_split[1:])
       file_name = file_split[-1]
   
       test_var = PythonOperator(
           task_id="test_var",
           python_callable=print_var,
           op_kwargs={
               "file_path": file_path,
               "file_split": file_split,
               "source_bucket": source_bucket,
               "source_path": source_path,
               "file_name": file_name
           }
       )
   
        file_verification_fail_to_s3 = S3CopyObjectOperator(
            task_id="file_verification_fail_to_s3",
            # *_bucket_name takes the bucket, *_bucket_key the object path
            source_bucket_name=source_bucket,
            source_bucket_key=source_path,
            dest_bucket_name=s3_final_destination["bucket_name"],
            dest_bucket_key=f'{s3_final_destination["verification_failed"]}/{file_name}'
        )
   
       test_var >> file_verification_fail_to_s3
   
   ```
   
   I use the `PythonOperator` to print these values for debugging.
   `file_path` renders correctly, but `file_split` comes back as
   `['ingestion.archive.dev/yolo/PMS_2_DXBTD_RTBD_2021032800000020210328000000SD_20210329052822.XML']`,
   i.e. the whole string in a single-element list, instead of each part split out like
   `["ingestion.archive.dev", "yolo", "PMS_2_DXBTD_RTBD_2021032800000020210328000000SD_20210329052822.XML"]`.
   
   So what's wrong here?
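What the DAG file actually does at parse time can be reproduced in plain Python (this is my reading of the behaviour, not something confirmed in the docs):

```python
# At DAG-parse time the Jinja template has not been rendered yet, so
# .split() runs on the literal template text, which contains no '/'.
file_path = "{{ dag_run.conf['file_path'] }}"
file_split = file_path.split('/')
print(file_split)  # -> ["{{ dag_run.conf['file_path'] }}"]
```

Since `op_kwargs` is a templated field, each element of that one-element list is then rendered at runtime, which would explain why the fully rendered path shows up inside a single-element list.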
   So I started reading more about Jinja templating and found this in the Airflow docs:
   https://airflow.apache.org/docs/apache-airflow/stable/concepts/operators.html#rendering-fields-as-native-python-objects
   
   I tried `render_template_as_native_obj=True` to solve my issue, but when the scheduler picked up my DAG I got an error saying this argument does not exist on the `DAG` object. Indeed, you cannot find it in the reference documentation either:
   https://airflow.apache.org/docs/apache-airflow/stable/_api/airflow/models/dag/index.html?highlight=dag#module-airflow.models.dag
   
   I also tried passing it via the `jinja_environment_kwargs` argument, but it is not accepted there either. So either there is a regression or an error in the documentation (or the "stable" docs describe a newer version than 2.0.1).
   
   But my real question is: how do I split my Jinja-templated string?
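For what it's worth, one workaround I can think of (an assumption on my part, not taken from the linked docs) is to move the split into the Jinja expression itself, so it is evaluated at render time, when `dag_run.conf` is actually available:

```python
# Hypothetical sketch: each value stays a plain templated string, and the
# split happens inside Jinja at render time, not at DAG-parse time.
source_bucket = "{{ dag_run.conf['file_path'].split('/')[0] }}"
source_path = "{{ '/'.join(dag_run.conf['file_path'].split('/')[1:]) }}"
file_name = "{{ dag_run.conf['file_path'].split('/')[-1] }}"
```

These strings could then be passed straight into the templated fields of `S3CopyObjectOperator`.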
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]
