Would this change to EmrAddStepsOperator make sense? If so, I can go ahead and create a ticket and a PR.
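
For reference, here is a minimal sketch of the name-to-id lookup discussed in the thread below. It assumes an EMR client object that exposes `list_clusters` the way the boto3 EMR client does (`ClusterStates` and `Marker` arguments, `Clusters` and `Marker` in the response). The function name `find_live_cluster_id` and the set of states treated as "live" are assumptions for illustration only; neither is an existing EmrHook method or setting.

```python
from typing import Any, Dict, Optional

# States treated as "live" in this sketch (an assumption; adjust as needed).
LIVE_STATES = ["STARTING", "BOOTSTRAPPING", "RUNNING", "WAITING"]


def find_live_cluster_id(emr_client: Any, cluster_name: str) -> Optional[str]:
    """Return the id of the first live cluster with a matching name, else None.

    `emr_client` is anything with a boto3-style `list_clusters(**kwargs)` call.
    """
    kwargs: Dict[str, Any] = {"ClusterStates": LIVE_STATES}
    while True:
        response = emr_client.list_clusters(**kwargs)
        for cluster in response.get("Clusters", []):
            if cluster["Name"] == cluster_name:
                return cluster["Id"]
        # Follow the pagination marker until the listing is exhausted.
        marker = response.get("Marker")
        if not marker:
            return None
        kwargs["Marker"] = marker
```

With boto3 installed and credentials configured, this could be called as `find_live_cluster_id(boto3.client("emr"), "my-cluster")`, and the result passed as `job_flow_id` to EmrAddStepsOperator. Returning the first match relies on there being only one live cluster per name, as in our use case.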

On Wed, Nov 13, 2019 at 8:11 PM Aviem Zur <[email protected]> wrote:

> Sure.
>
> While most EMR clusters are ephemeral, some of our use cases required
> persistent EMR clusters since the apps they run are short and run on a
> short interval so the overhead of creating a new EMR cluster is too high.
>
> In these cases I want to make sure that if the cluster dies and is
> replaced by another one nothing needs to change in the DAG.
>
> So if I search by cluster name (In our use case we only have 1 cluster
> alive for any given name) I can always find the correct cluster ID.
>
> Perhaps instead of a whole operator it can be added to EmrHook as you
> suggested, then an option to pass either cluster name or id
> to EmrAddStepsOperator (which today only accepts cluster id [param
> job_flow_id]).
>
> On Wed, Nov 13, 2019 at 5:59 PM Ash Berlin-Taylor <[email protected]> wrote:
>
>> My initial thought is that doesn't quite sound like a whole operator, but
>> a useful function to add to the EmrHook.
>>
>> Could you describe in a little bit more detail how you use it?
>>
>> -a
>>
>> > On 13 Nov 2019, at 15:40, Aviem Zur <[email protected]> wrote:
>> >
>> > Hi,
>> >
>> > I've created a new operator and want to check viability to contribute
>> > it to airflow/contrib
>> >
>> > The operator is called: emr_cluster_name_to_id
>> >
>> > Given an EMR cluster name will return id of the first live cluster found
>> > with a matching name.
>> > This is useful for users with persistent EMR clusters they wish to add
>> > steps to via airflow.
>> > If the cluster dies and is replaced by a new cluster with the same name
>> > no code or configuration needs to be changed since the operator will
>> > pick up the correct id when the DAG is run.
>> >
>> > Is this a viable operator for airflow/contrib?
>> > If so I'll create a JIRA task and a PR on GitHub.
>> >
>> > Thanks,
>> > Aviem
