I did the same at a previous job (in fact we combined launch cluster + add 
step + sensor, but that's another story).

https://github.com/apache/airflow/pull/6090/files (PR to add 
LivyOperator+Hook+Sensor) adds a `polling_interval` that will make the operator 
poll/wait for completion. Adding the same feature (with the same parameter 
name?) to EmrAddStepOperator would be good I think.
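
To make the idea concrete, here is a minimal sketch of the poll/wait loop such a parameter would drive. It is not the code from the PR above and not an existing Airflow API: `wait_for_step`, `get_state` and the injectable `sleep` are hypothetical names used for illustration only. The state names are the EMR step states; in a real operator I would expect `get_state` to wrap boto3's `describe_step`, which is an assumption about how it would be wired up.

```python
import time

# EMR step states after which no further change is expected.
TERMINAL_STATES = {"COMPLETED", "CANCELLED", "FAILED", "INTERRUPTED"}


def wait_for_step(get_state, polling_interval=30, sleep=time.sleep):
    """Poll get_state() every polling_interval seconds until the step ends.

    get_state is a hypothetical zero-argument callable returning the current
    step state as a string. With boto3 it could look like (assumption):
        lambda: emr.describe_step(ClusterId=cluster_id, StepId=step_id)[
            "Step"]["Status"]["State"]
    sleep is injectable so the loop can be exercised without real waiting.
    """
    while True:
        state = get_state()
        if state in TERMINAL_STATES:
            break
        # Still PENDING / RUNNING / CANCEL_PENDING: wait, then ask again.
        sleep(polling_interval)

    if state != "COMPLETED":
        # Raising is what lets the task fail and be retried as one unit.
        raise RuntimeError(f"EMR step ended in state {state}")
    return state
```

The operator's `execute` would add the step and then call something like this when `polling_interval` is set, and return straight away when it is not, so today's fire-and-forget behaviour stays the default.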

I don't know off the top of my head whether anywhere else in the code base 
already does something similar, in which case we should reuse its parameter 
name instead. Anyone?

-ash

> On 14 Oct 2019, at 20:42, Daniel Mateus Pires <dmate...@gmail.com> wrote:
> 
> Hi there!
> 
> Would it make sense to add an operator that is both the EmrAddStep operator
> and the step sensor?
> 
> In a past role we were using Airflow heavily for all things EMR, and I
> found myself writing an Operator that combined the emr_add_step operator
> and the sensor. It made the pipelines simpler (fewer operator instances per
> DAG) and retries were easy.
> 
> There is still value in keeping those 2 other classes around when we don't
> care about the result of an EMR step or we are polling for the completion
> of an EMR step we did not start from Airflow, but for most tasks wouldn't a
> "merged operator" make sense?
> 
> Thanks!
> Daniel
