[GitHub] [airflow] MatrixManAtYrService opened a new issue, #23020: Names for expanded tasks

GitBox Thu, 14 Apr 2022 09:25:44 -0700


MatrixManAtYrService opened a new issue, #23020:
URL: https://github.com/apache/airflow/issues/23020


   ### Description
   
   Airflow currently exposes `map_index` to the user as the way to distinguish
   between tasks in an expansion. The index is unlikely to be meaningful to the
   user, who probably already has their own label for each item. I'm requesting
   that we let users supply that label.
   
   To see the problem, consider a DAG that sends email to a list of users that
   is generated at runtime:
   
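   ```python3
   from textwrap import dedent

   from airflow import DAG
   from airflow.operators.bash import BashOperator

   with DAG(...) as dag: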
   
       @dag.task
       def get_account_status():
           return [
               {
                   "NAME": "Wintermute",
                   "EMAIL": "[email protected]",
                   "STATUS": "active",
               },
               {
                   "NAME": "Hojo",
                   "EMAIL": "[email protected]",
                   "STATUS": "delinquent",
               },
           ]
   
       BashOperator.partial(
           task_id="send_email",
           bash_command=dedent(
               """
               cat <<- EOF | tee | mailx -s "your account" $EMAIL
               Dear $NAME,
                   Your account status is $STATUS.
               EOF
               """
           ),
       ).expand(env=get_account_status())
   ```
   
   Notice that in the grid view, it's not obvious which task goes with which 
user:
   
   <img width="1106" alt="Screen Shot 2022-04-14 at 8 56 09 AM" src="https://user-images.githubusercontent.com/5834582/163418431-3180a29d-b9c0-4bbc-9a80-05ad5e4f34e7.png">
   
   
   
   ### Use case/motivation
   
   I'd like to be able to explicitly assign a name to each expanded task so
   that I can later find the right one. I would like this name to be used
   (when available) anywhere that the user interacts with the expanded task.
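
As a rough illustration of the request, the sketch below pairs each `map_index` with a label taken from the mapped input and falls back to the index when no label is present. The helper `label_expanded_tasks` and the choice of `NAME` as the label key are assumptions made for this example; this is not an existing Airflow API.

```python
from typing import Dict, List


def label_expanded_tasks(
    mapped_envs: List[Dict[str, str]], label_key: str = "NAME"
) -> Dict[int, str]:
    """Hypothetical helper: map each map_index to a human-readable label."""
    labels = {}
    for map_index, env in enumerate(mapped_envs):
        # Fall back to the bare index when the input carries no label.
        labels[map_index] = env.get(label_key, str(map_index))
    return labels


labels = label_expanded_tasks(
    [
        {"NAME": "Wintermute", "STATUS": "active"},
        {"NAME": "Hojo", "STATUS": "delinquent"},
        {"STATUS": "unknown"},
    ]
)
print(labels)
```

The fallback keeps every expanded task addressable even when the user supplies no name, which matches the "when available" wording above.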
   
   I'm not sure whether we should replace `map_index` with `mapped_task_key` or
   just add a name as a separate field. The replacement sounds more invasive,
   but it would let us generate better names when more than one kwarg is
   mapped.
   
   For instance, this expansion generates four instances.
   
   ```python3
   BashOperator.partial(task_id="greet").expand(
       bash_command=["echo hello $USER", "echo goodbye $USER"],
       env=[{"USER": "foo"}, {"USER": "bar"}],
   )
   ```
   
   If the user doesn't supply names (via whatever API we come up with), do we
   want to call them `1`, `2`, `3`, and `4`? Or should they be more descriptive,
   like:
   - `bash_command_1_env_1`
   - `bash_command_1_env_2`
   - `bash_command_2_env_1`
   - `bash_command_2_env_2`
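
One possible way to produce the descriptive names listed above is to take the cross product of the positions within each mapped kwarg. This is only a sketch: `default_expansion_names` is a hypothetical helper, the positions are 1-based to match the list, and the ordering is simply that of `itertools.product`, which may or may not match how Airflow assigns `map_index`.

```python
from itertools import product
from typing import Dict, List, Sequence


def default_expansion_names(mapped_kwargs: Dict[str, Sequence]) -> List[str]:
    """Hypothetical helper: one descriptive name per expanded task instance."""
    # One 1-based position range per mapped kwarg, in the order given.
    positions = [range(1, len(values) + 1) for values in mapped_kwargs.values()]
    names = []
    for combo in product(*positions):
        # Pair each kwarg name with its position in this combination.
        parts = [f"{kwarg}_{pos}" for kwarg, pos in zip(mapped_kwargs, combo)]
        names.append("_".join(parts))
    return names


names = default_expansion_names(
    {
        "bash_command": ["echo hello $USER", "echo goodbye $USER"],
        "env": [{"USER": "foo"}, {"USER": "bar"}],
    }
)
print(names)
```

Names built this way grow with the number of mapped kwargs, which is one argument for letting users override them.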
   
   I don't know.  I'm creating this issue so we have a place to discuss it.
   
   ### Related issues
   
   _No response_
   
   ### Are you willing to submit a PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

