*"However a DAG with such a complicated name isn't referenced in the
examplecode (just "Pipeline" + i). My guess is that the DAG id is being
generatedin a non-deterministic or time-based way, and therefore the run
commandcan't find it once the generation criteria change. But hard to say
withoutmore detail."*
[response]  I am not sure what you mean by non-deterministic way.
We are dynamically creating DAGs (1 Dag per customer) . So if there are 100
customers with itds 1, 2,3 .... 100, there will be 100 pipelines/dags with
names:

*Pipeline_1, Pipeline_2, Pipeline_3 ...... Pipeline_100*

The customer names I get from our database table which stores this
information.

so the flow is:

customer_id_list ->  getCustomerIds( //run some sql )
for each customerId in  customer_id_list:
  dag = DAG("Pipeline_"+ customerId, default_args=default_args,
*schedule_interval= **datetime.timedelta(minutes=60)*)


Let me know if that helps(or confuses :) ) you in understanding the flow.
If there is some thing wrong in what I am doing, I would love to know what
is it. This seems to be a serious issue (if it is) esp while running
backfill.



On Fri, Apr 29, 2016 at 12:19 PM, Jeremiah Lowin <[email protected]> wrote:

> That error message usually means that an error took place inside Airflow
> before the task ran -- maybe something with setting up the task? The task's
> state is NONE, meaning it never even started, but the executor is reporting
> that it successfully sent the command to start the task (SUCCESS)... the
> culprit is some failure in between.
>
> The error message seems to say that the DAG itself couldn't be loaded from
> the .py file:
>
> airflow.utils.AirflowException: DAG
> [Pipeline_DEVTEST_CDB_DEVTEST_00_B10C8DBE1CFA89C1F274B]
> could not be found in /usr/local/airflow/dags/pipeline.py
>
> However a DAG with such a complicated name isn't referenced in the example
> code (just "Pipeline" + i). My guess is that the DAG id is being generated
> in a non-deterministic or time-based way, and therefore the run command
> can't find it once the generation criteria change. But hard to say without
> more detail.
>
>
>
> On Fri, Apr 29, 2016 at 3:11 PM Bolke de Bruin <[email protected]> wrote:
>
> > I would really like to know what the use case is for a depends_on_past on
> > the *task* level.  What past are you trying to depend on?
> >
> > What I am currently assuming from just reading the example and replying
> on
> > my phone is that the depends_on_past prevents execution. Have t4 and t5
> > ever run?
> >
> > Bolke
> >
> > Sent from my iPhone
> >
> > > On 29 apr. 2016, at 21:01, Chris Riccomini <[email protected]>
> > wrote:
> > >
> > > @Bolke/@Jeremiah, do you guys think this is related? Full thread is
> here:
> > >
> https://groups.google.com/forum/?pli=1#!topic/airbnb_airflow/y7wt3I24Rmw
> > >
> > >> On Fri, Apr 29, 2016 at 11:57 AM, Chris Riccomini <[email protected]>
> > wrote:
> > >>
> > >> Please subscribe to the dev@ mailing list. Sorry to make you jump
> > through
> > >> hoops--I know it's annoying--but it's for a good cause. ;)
> > >>
> > >> This looks like a bug. I'm wondering if it's related to
> > >> https://issues.apache.org/jira/browse/AIRFLOW-20. Perhaps the
> backfill
> > is
> > >> causing a mis-alignment between the dag runs, and depends_on_past
> logic
> > >> isn't seeing the prior execution?
> > >>
> >
>

Reply via email to