Bjorn Olsen created AIRFLOW-6824:
------------------------------------

             Summary: EMRAddStepsOperator does not work well with multi-step XCom
                 Key: AIRFLOW-6824
                 URL: https://issues.apache.org/jira/browse/AIRFLOW-6824
             Project: Apache Airflow
          Issue Type: Bug
          Components: aws
    Affects Versions: 1.10.9
            Reporter: Bjorn Olsen
            Assignee: Bjorn Olsen


EmrAddStepsOperator lets you add several steps to an EMR cluster for 
processing. The steps must be supplied as a list.
This works well when an actual Python list is passed as the 'steps' value, but 
we want to generate the list of steps in a previous task and pass it to the 
operator through an XCom.

For templating to resolve the XCom, the operator must be used as follows, with 
the 'steps' value given as a template string:

 
{code:java}
add_steps_task = EmrAddStepsOperator(
    task_id='add_steps',
    job_flow_id=job_flow_id,
    aws_conn_id='aws_default',
    provide_context=True,
    steps="{{ task_instance.xcom_pull(task_ids='generate_steps') }}"
)
{code}
 

The XCom value pushed by the 'generate_steps' task looks like this (simplified):
{code:java}
[{'Name':'Step1'}, {'Name':'Step2'}]
{code}

However, after templating the operator holds this value as a string rather 
than a list. The string cannot be passed on to the underlying boto3 library, 
which expects a list object.
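
To illustrate the type mismatch, here is a minimal sketch that does not use Airflow at all. It assumes that rendering the template yields the text form of the XCom value, and uses str() as a stand-in for that rendering; the variable names are hypothetical.

```python
# Value pushed to XCom by the upstream task (simplified, as in the report).
xcom_value = [{'Name': 'Step1'}, {'Name': 'Step2'}]

# Assumption: str() stands in for what the rendered template field contains.
rendered_steps = str(xcom_value)

# The operator now holds text, not a list of step dictionaries.
is_string = isinstance(rendered_steps, str)
is_list = isinstance(rendered_steps, list)

# Indexing the string yields single characters instead of step dicts.
first_item = rendered_steps[0]

print(is_string, is_list, repr(first_item))
```

Any code that expects each element of 'steps' to be a dictionary would receive one character at a time instead.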

The following won't work either:
{code:java}
add_steps_task = EmrAddStepsOperator(
    task_id='add_steps',
    job_flow_id=job_flow_id,
    aws_conn_id='aws_default',
    provide_context=True,
    steps={{task_instance.xcom_pull(task_ids='generate_steps')}}
)
{code}
This fails because an unquoted template expression is not valid Python.

We therefore have to pass the steps to the operator as a string, and convert 
that string into a list after render_template_fields has run (which happens 
immediately before execute). The only place to do this string-to-list 
conversion is in the operator's execute method.
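
One way the conversion could look is sketched below. This is an illustration under assumptions, not the operator's actual code: coerce_steps is a hypothetical helper name, and ast.literal_eval from the standard library is one possible way to parse the rendered string without evaluating arbitrary code.

```python
import ast


def coerce_steps(steps):
    """Return steps as a list, parsing it first if it arrived as a string.

    Hypothetical helper: something like this could be called at the start
    of the operator's execute method, after templating has been applied.
    """
    if isinstance(steps, str):
        # literal_eval only accepts Python literals (lists, dicts, strings,
        # numbers, ...), so it does not run arbitrary code from the string.
        steps = ast.literal_eval(steps)
    if not isinstance(steps, list):
        raise TypeError("steps must be a list, got %s" % type(steps).__name__)
    return steps


# A rendered template value, as in the simplified example from the report.
rendered = "[{'Name':'Step1'}, {'Name':'Step2'}]"
parsed = coerce_steps(rendered)

# A real Python list passes through unchanged.
unchanged = coerce_steps([{'Name': 'Step1'}])

print(parsed)
print(unchanged)
```

Keeping the isinstance check means existing DAGs that already pass a Python list would be unaffected by such a change.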

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
