galak75 edited a comment on issue #4743: [AIRFLOW-3871] render Operators template fields recursively

URL: https://github.com/apache/airflow/pull/4743#issuecomment-473318089

> I'm leaning towards a -1 on this - it seems that there are going to be many more cases where this won't Just Work, leading to confusion on behalf of users.
>
> Can you give a concrete example of when you'd need this? In the examples you've shown above, why couldn't you create the `MyAwesomeDataFileTransformer` _inside_ the callable instead?

Sure @ashb. Here is the real use case:

We have a lot of different data sources to be imported, transformed, and then exported to several destinations, so we tried to decouple our business logic (importing data, transforming it, then exporting it) from the Airflow callable functions.

First, we defined a template method to be used as a `PythonOperator` callable:

```python
def process_data(data_importer, data_transformer, data_exporter):
    data = data_importer.import_data()
    data = data_transformer.transform_data(data)
    data_exporter.export_data(data)
```

Then we just have to "inject" the proper `data_importer`, `data_transformer`, and `data_exporter` into our DAG task, without having to write a new callable function whenever the data source changes or the transformation needs more parameters. Let's look at some simple examples:

```python
task1 = PythonOperator(
    task_id='task_1',
    python_callable=process_data,
    op_args=[
        SomeFileDataImporter('/tmp/{{ ds }}/input_data'),
        SomeDataTransformer(some_value, 'path/to/other/file/{{ ds }}/file'),
        SomeFileDataExporter('/data/output/{{ dag.dag_id }}/output_file')
    ],
    dag=dag
)

task2 = PythonOperator(
    task_id='task_2',
    python_callable=process_data,
    op_args=[
        SomeJoiningCsvFilesDataImporter(
            '/tmp/{{ ds }}/input_data_1',
            '/tmp/{{ ds }}/input_data_2',
            join_on='id'
        ),
        SomeOtherDataTransformer(some_value),
        SomeOtherDataExporter(execution_ts='{{ ts }}')
    ],
    dag=dag
)
```

This approach has several benefits:

- **Separation of concerns:** the callable function does not need to know how to import data (it could come from one file, several files, a database, an API, etc.), how to transform it, or how to export it.
- **Single responsibility principle:** responsibilities are split into simple classes: a pickle file importer, a JSON file importer, a SQL importer, etc.
- **DAG declaration as an IoC tool:** we just inject the proper implementation for each use case. When another transformation is required, we inject another `data_transformer` implementation without writing another callable function. We do not want the callable function to be responsible for instantiating the right implementation.

To make this approach work properly, we would like operators to be able to render nested template fields. The `template_fields` solution may have some limitations (as [pointed out](https://github.com/apache/airflow/pull/4743#issuecomment-472735428) by @bjoernpollex-sc), but it would cover some real use cases. I also suggested [adding another `render_template` hook function](https://github.com/apache/airflow/pull/4743#issuecomment-472996201) that people could implement to solve custom template rendering issues.

Would this answer your concerns about this pull request?

Thank you for taking the time to read this. Cheers!
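To make the idea concrete, here is a minimal sketch of what recursive rendering could look like: if an `op_args` object declares a `template_fields` attribute (mirroring the convention operators already use), a renderer walks those attributes, renders any string in place, and recurses into nested objects. The `render_nested` helper, the example class, and the direct use of `jinja2` below are all hypothetical illustrations, not the actual code in this PR:

```python
from jinja2 import Template


class SomeFileDataImporter:
    # Hypothetical: the helper declares which of its attributes hold
    # Jinja templates, mirroring the operators' template_fields convention.
    template_fields = ('path',)

    def __init__(self, path):
        self.path = path


def render_nested(obj, context):
    """Render every declared template field on obj, recursing into
    attributes that themselves declare template_fields."""
    for field in getattr(obj, 'template_fields', ()):
        value = getattr(obj, field)
        if isinstance(value, str):
            # Stand-in for Airflow's own Jinja rendering of a field.
            setattr(obj, field, Template(value).render(**context))
        else:
            # Non-string values may be nested objects carrying their
            # own template_fields; plain values are simply skipped.
            render_nested(value, context)


# Usage: after rendering, the importer's path has ds substituted in.
importer = SomeFileDataImporter('/tmp/{{ ds }}/input_data')
render_nested(importer, {'ds': '2019-03-15'})
assert importer.path == '/tmp/2019-03-15/input_data'
```

The alternative `render_template` hook suggested above would amount to the same traversal, except each class implements the rendering of its own fields instead of declaring them for a generic walker.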
