Thanks, that makes sense. For future reference: https://github.com/apache/incubator-airflow/blob/ff45d8f2218a8da9328161aa66d004c3db3b367e/airflow/operators/sensors.py#L71
On Tue, Sep 5, 2017 at 11:35 AM, Ash Berlin-Taylor <[email protected]> wrote:

> The primary difference between those cases and the other Sensors is that
> the sensors I've seen (EMR Job Flow, S3 Key) don't do anything _other_ than
> the sensing task, whereas the tasks you linked to also perform some other
> action; it's just that they wait until that operation is complete before
> returning.
>
> Additionally, my understanding is that Sensors are just an API/Python
> class-level convention that doesn't make any difference to the scheduler,
> i.e. this is what the BaseSensor class does:
>
>     def execute(self, context):
>         started_at = datetime.now()
>         while not self.poke(context):
>             if (datetime.now() - started_at).total_seconds() > self.timeout:
>                 if self.soft_fail:
>                     raise AirflowSkipException('Snap. Time is OUT.')
>                 else:
>                     raise AirflowSensorTimeout('Snap. Time is OUT.')
>             sleep(self.poke_interval)
>         logging.info("Success criteria met. Exiting.")
>
> i.e. there's not much difference in effect between an operator that loops
> and sleeps itself and one which is a Sensor.
>
> -ash
>
> > On 5 Sep 2017, at 16:14, Richard Baron Penman <[email protected]> wrote:
> >
> > Hello,
> >
> > I noticed some operators in contrib (ECS, Databricks, Dataproc) submit
> > their task and then poll until complete:
> > https://github.com/apache/incubator-airflow/blob/master/airflow/contrib/operators/ecs_operator.py
> > https://github.com/apache/incubator-airflow/blob/master/airflow/contrib/operators/databricks_operator.py
> > https://github.com/apache/incubator-airflow/blob/master/airflow/contrib/operators/dataproc_operator.py
> >
> > Would they be better designed as Sensors?
> >
> > I ask because I wrote a Sensor for an API and am wondering whether there
> > was an advantage to the Operator polling approach.
> >
> > Richard
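For anyone reading along later: the poke/execute loop Ash quoted can be mimicked without Airflow installed. The following is a minimal standalone sketch of that convention; the class names (PollingSensor, CountdownSensor, SensorTimeout) are hypothetical stand-ins for illustration, not the real Airflow API.

```python
# Standalone sketch of the BaseSensor polling loop quoted above.
# No Airflow dependency: SensorTimeout stands in for AirflowSensorTimeout,
# and PollingSensor stands in for BaseSensorOperator.
from datetime import datetime
from time import sleep


class SensorTimeout(Exception):
    """Stand-in for AirflowSensorTimeout."""


class PollingSensor:
    def __init__(self, poke_interval=1.0, timeout=10.0):
        self.poke_interval = poke_interval
        self.timeout = timeout

    def poke(self, context):
        """Return True when the condition is met; subclasses override this."""
        raise NotImplementedError

    def execute(self, context):
        # Same shape as the quoted BaseSensor.execute: poll poke() until it
        # returns True, sleeping between attempts, and fail after `timeout`.
        started_at = datetime.now()
        while not self.poke(context):
            if (datetime.now() - started_at).total_seconds() > self.timeout:
                raise SensorTimeout('Snap. Time is OUT.')
            sleep(self.poke_interval)


class CountdownSensor(PollingSensor):
    """Toy example: the condition becomes true after N pokes."""

    def __init__(self, pokes_needed, **kwargs):
        super().__init__(**kwargs)
        self.pokes_needed = pokes_needed

    def poke(self, context):
        self.pokes_needed -= 1
        return self.pokes_needed <= 0


sensor = CountdownSensor(pokes_needed=3, poke_interval=0.01)
sensor.execute(context={})  # returns once poke() reports success
```

As Ash notes, an Operator whose execute() submits work and then loops/sleeps until completion behaves essentially the same way at runtime; the Sensor base class just packages the loop, timeout, and soft-fail handling for you.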
