swapz-z commented on issue #27979:
URL: https://github.com/apache/airflow/issues/27979#issuecomment-1342193953
@dacort
Use case:
It's the same as what we have in the AWS UI.
In an Airflow pipeline DAG, if an EMR operator step is stuck in
Pending/Running for an unexpectedly long time, we want to stop the current
execution and mark it for retry.
In the current behaviour, if a step is running on EMR and we want to cancel
it, there is no provision for that. The Clear button in the UI doesn't send
anything to EMR; instead it adds another identical step, which stays PENDING
until the previous step has either COMPLETED or FAILED (step concurrency of 1).
> > Just add a new step - no cancel
>
> I'd be hesitant to do this as who knows how long that step may run for and
> consume cluster resources.
Exactly, it eats up resources unnecessarily. This is actually the current
behaviour when a clear is called on `EmrAddStepsOperator`: it just adds
another identical step.
> > If there is an existing named step
>
> Would we be able to keep track of the Step ID and use that to cancel it?
Yes, based on the name of the step, we can get the StepId to be cancelled.
After a clear, this command will be triggered:
- AddNewStep
`[{'Name': 'import_jars', 'ActionOnFailure': 'CONTINUE', 'HadoopJarStep':
{'Jar': 'command-runner.jar', 'Args': ['bash', '-c', 'sudo cp
/usr/share/aws/redshift/jdbc/*.jar /usr/lib/spark/jars/']}}]`
We'll make a `list_steps()` call to EMR on that cluster (via boto3) and
check whether a step with the same `Name` is in the PENDING/RUNNING state; if
so, the corresponding StepId will be marked for cancellation.
- Steps to be cancelled: `[('import_jars', 's-XV0CPI8KO65T')]`
- Then the new step will be added
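
A rough sketch of that flow, not the actual operator change. The helper names `find_active_steps_by_name` and `cancel_and_resubmit` are made up for illustration; the client is assumed to be a boto3 EMR client, using only its `list_steps`, `cancel_steps` and `add_job_flow_steps` calls. Whether a RUNNING step can actually be cancelled depends on the EMR release.

```python
from typing import Any, Dict, List, Tuple

# States in which an earlier step with the same name would still block or
# duplicate the newly added step.
ACTIVE_STATES = ["PENDING", "RUNNING"]


def find_active_steps_by_name(
    emr_client: Any, cluster_id: str, names: List[str]
) -> List[Tuple[str, str]]:
    """Return (Name, StepId) pairs for PENDING/RUNNING steps matching `names`.

    Hypothetical helper; `emr_client` is assumed to be boto3.client("emr").
    """
    wanted = set(names)
    matches: List[Tuple[str, str]] = []
    paginator = emr_client.get_paginator("list_steps")
    for page in paginator.paginate(ClusterId=cluster_id, StepStates=ACTIVE_STATES):
        for step in page.get("Steps", []):
            if step["Name"] in wanted:
                matches.append((step["Name"], step["Id"]))
    return matches


def cancel_and_resubmit(
    emr_client: Any, cluster_id: str, steps: List[Dict[str, Any]]
) -> List[str]:
    """Cancel same-named active steps, then add `steps`; return the new StepIds."""
    to_cancel = find_active_steps_by_name(
        emr_client, cluster_id, [step["Name"] for step in steps]
    )
    if to_cancel:
        emr_client.cancel_steps(
            ClusterId=cluster_id,
            StepIds=[step_id for _, step_id in to_cancel],
        )
    response = emr_client.add_job_flow_steps(JobFlowId=cluster_id, Steps=steps)
    return response["StepIds"]
```

With a real client this would be called as `cancel_and_resubmit(boto3.client("emr"), cluster_id, steps)`, where `steps` is the same list passed to `EmrAddStepsOperator`. Matching on `Name` assumes step names are unique per task, otherwise unrelated steps could be cancelled.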
After changes done in operator => demo behaviour

