EssKayz commented on issue #26619:
URL: https://github.com/apache/airflow/issues/26619#issuecomment-1255994705

   We tried TaskFlow as well, but with TaskFlow we ran into the problem that we cannot dynamically assign a task_id to the generated tasks. We are using Airflow for an ETL migration process, so we have automagic task generation for data exports, transformations, filtering, and imports.
   
   Concrete example pseudocode:
   ```
   for each table in database:
       # include table.table_name in every task_id; 50+ tasks with the
       # same name would get extremely confusing
       export task: dump the table as a list of JSON records and split
           it into 1..n S3 bucket files
       mapped task (one per split file): run a python_function on that
           file's objects
       collect task: gather the converted objects from all the
           dynamically generated tasks and combine them into one S3
           bucket file
   ```
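   The fan-out/fan-in shape above can be sketched in plain Python (no Airflow, just the data flow; the names `export_table`, `transform_chunk`, and `combine` are hypothetical, and in a real DAG each step would be a task reading from and writing to S3):
   
   ```python
   import json
   
   
   def export_table(rows, chunk_size):
       """Split a table's rows into 1..n JSON "files" (strings here)."""
       return [
           json.dumps(rows[i:i + chunk_size])
           for i in range(0, len(rows), chunk_size)
       ]
   
   
   def transform_chunk(chunk, python_function):
       """The dynamically generated per-file task: apply python_function."""
       return [python_function(obj) for obj in json.loads(chunk)]
   
   
   def combine(transformed_chunks):
       """Collect the converted objects and merge them into one file."""
       merged = [obj for chunk in transformed_chunks for obj in chunk]
       return json.dumps(merged)
   
   
   rows = [{"id": i} for i in range(5)]
   chunks = export_table(rows, chunk_size=2)   # fan out: 3 "files"
   converted = [transform_chunk(c, lambda o: {**o, "ok": True})
                for c in chunks]               # one mapped task per file
   result = combine(converted)                 # fan in: one combined file
   ```
   
   For the task_id naming itself, recent Airflow versions let a TaskFlow-decorated function be called as `my_task.override(task_id=f"export_{table.table_name}")(...)` inside the loop, which may cover the per-table naming need, though whether it fits this setup is worth verifying against the docs.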
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
