Re: Reply: How to know the DAG is starting to run

2018-05-11 Thread Chris Palmer
It's not even clear to me what it means for a DAG to start running. The creation of a DagRun for a specific execution date is completely independent of the scheduling of any TaskInstances for that DagRun. There could be a significant delay between those two events, either deliberately encoded into

Re: KubernetesPodOperator: Invalid arguments were passed to BaseOperator

2018-05-30 Thread Chris Palmer
In the example it imports BaseOperator as KubernetesPodOperator, when the kubernetes modules can't be found. On Wed, May 30, 2018 at 3:34 PM, Craig Rodrigues wrote: > For this

Re: Scheduler won't schedule past minimum end_date of tasks

2018-02-22 Thread Chris Palmer
om/apache/incubator-airflow/tree/master/ > airflow/ti_deps/deps> to ensure the exec date is less than the task end > date? > > -ash > > > On 21 Feb 2018, at 20:58, Chris Palmer <ch...@crpalmer.com> wrote: > > > > I was very surprised to find that if you set an

Re: Airflow at SREcon?

2018-02-23 Thread Chris Palmer
Not directly on topic to your email, but Fitbit has started using Airflow for some things. In particular the Data Engineering team, which I'm a member of and which is based in Boston, is starting to use it for much of our ETL processes. Chris On Fri, Feb 23, 2018 at 3:23 PM, James Meickle

Scheduler won't schedule past minimum end_date of tasks

2018-02-21 Thread Chris Palmer
I was very surprised to find that if you set an end_date on any of the tasks in a DAG, the scheduler won't create DagRuns after the minimum end_date of those tasks. The code that does this is the 6 or so lines starting here -
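The behaviour described in this message can be sketched in plain Python. This is a simplified stand-in written for illustration, not the scheduler's actual code: `Task` and `should_create_dag_run` are hypothetical names, and the only assumption is that the earliest task end_date acts as a cut-off for the whole DAG.

```python
from datetime import datetime
from typing import List, NamedTuple, Optional


class Task(NamedTuple):
    """Hypothetical stand-in for an Airflow task; only end_date matters here."""
    task_id: str
    end_date: Optional[datetime] = None


def should_create_dag_run(tasks: List[Task], next_run_date: datetime) -> bool:
    """Return False once next_run_date is past the earliest task end_date."""
    end_dates = [t.end_date for t in tasks if t.end_date is not None]
    if not end_dates:
        # No task carries an end_date, so nothing limits scheduling.
        return True
    return next_run_date <= min(end_dates)


tasks = [
    Task("extract", end_date=datetime(2018, 2, 1)),
    Task("load"),  # no end_date, yet it stops being scheduled as well
]
```

Under this sketch a single task with an end_date is enough to stop new runs for every task in the DAG, which is the surprise the message reports.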

Re: Close SqlSensor Connection

2018-01-16 Thread Chris Palmer
I'm not sure this is the right solution. I haven't explored all the code but it seems to me that the actual database connections are managed in the hooks. Different databases will want to handle this differently, and modifying SqlSensor seems too heavy-handed to me. If you look at the

Re: rename DAG, and keep the history

2018-03-07 Thread Chris Palmer
You'd have to connect to the database storing all the Airflow metadata, find all the tables with a 'dag_id' column and update all the rows for the old name to match the new name. Chris On Wed, Mar 7, 2018 at 2:30 PM, Michael Gong wrote: > hi, all, > due to some
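A minimal sketch of the approach described in this reply, using an in-memory SQLite database so it runs on its own. The helper `rename_dag` is hypothetical, the tables are toy examples rather than the real Airflow schema, and a production metadata database (often Postgres or MySQL) would need its own catalog query, a backup, and a stopped scheduler before anything like this is attempted.

```python
import sqlite3


def rename_dag(conn: sqlite3.Connection, old_id: str, new_id: str) -> int:
    """Update dag_id in every table that has such a column; return rows changed."""
    changed = 0
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")]
    for table in tables:
        # PRAGMA table_info yields one row per column; index 1 is the column name.
        columns = [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')]
        if "dag_id" in columns:
            cur = conn.execute(
                f'UPDATE "{table}" SET dag_id = ? WHERE dag_id = ?',
                (new_id, old_id))
            changed += cur.rowcount
    conn.commit()
    return changed


# Toy schema for demonstration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dag_run (id INTEGER PRIMARY KEY, dag_id TEXT)")
conn.execute("CREATE TABLE task_instance (task_id TEXT, dag_id TEXT)")
conn.execute("CREATE TABLE connection (conn_id TEXT)")  # no dag_id column
conn.executemany("INSERT INTO dag_run (dag_id) VALUES (?)",
                 [("old_name",), ("other",)])
conn.execute("INSERT INTO task_instance VALUES ('t1', 'old_name')")

renamed = rename_dag(conn, "old_name", "new_name")
```

Tables without a dag_id column are skipped, and rows belonging to other DAGs are left untouched.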

Re: How to add hooks for strong deployment consistency?

2018-02-28 Thread Chris Palmer
I'll preface this with the fact that I'm relatively new to Airflow, and haven't played around with a lot of the internals. I find the idea of a DagFetcher interesting but would we worry about slowing down the scheduler significantly? If the scheduler is having to "fetch" multiple different DAG

Re: execution_date - can we stop the confusion?

2018-09-27 Thread Chris Palmer
While taking a step back makes some sense, we also need to identify what the issue is. Simply saying 'execution_date behavior is confusing to new users' isn't good enough. What is confusing about it? Is it what it represents, or just the name itself? There are a number of different timestamps

Re: Can a DAG be conditionally hidden from the UI?

2018-10-08 Thread Chris Palmer
I like James's solution better, but the initial thought I had was to deploy airflowignore files to the environments to filter out files that should not be processed when filling the DagBag. Chris On Mon, Oct 8, 2018 at 10:22 AM James Meickle wrote: > As long as the Airflow process can't find the
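The filtering idea in this reply can be illustrated with a small sketch. It assumes the ignore file holds one regular expression per line and that a file is skipped when any pattern matches its path; the function name, patterns, and paths below are all hypothetical, and this is not Airflow's own file-discovery code.

```python
import re
from typing import Iterable, List


def filter_dag_files(paths: Iterable[str], ignore_patterns: Iterable[str]) -> List[str]:
    """Keep only the paths that match none of the ignore patterns."""
    compiled = [re.compile(p) for p in ignore_patterns if p.strip()]
    return [p for p in paths if not any(rx.search(p) for rx in compiled)]


# Hypothetical per-environment ignore list, e.g. the lines of a deployed ignore file.
prod_ignore = ["experimental_", r"staging/"]
files = [
    "dags/etl_daily.py",
    "dags/experimental_model.py",
    "dags/staging/test_dag.py",
]
kept = filter_dag_files(files, prod_ignore)
```

Deploying a different pattern list to each environment would then hide a different set of DAG files without changing the files themselves.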

Re: Solved: suppress PendingDeprecationWarning messages in airflow logs

2018-09-28 Thread Chris Palmer
Doesn't this warning come from the BaseOperator class - https://github.com/apache/incubator-airflow/blob/master/airflow/models.py#L2511 ? Are you passing unused arguments to the QuboleOperator, or do you not control the instantiation of those operators? Chris On Fri, Sep 28, 2018 at 7:39 PM
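To illustrate the question being asked here, the sketch below shows how a base class that accepts arbitrary extra arguments might emit a PendingDeprecationWarning when it receives ones it does not use. `SketchOperator` and `count_warnings` are hypothetical; this is not the BaseOperator source, only a stand-in for the pattern.

```python
import warnings


class SketchOperator:
    """Hypothetical stand-in for a base operator that accepts extra arguments."""

    def __init__(self, task_id, *args, **kwargs):
        self.task_id = task_id
        if args or kwargs:
            # Leftover arguments were not consumed by any subclass.
            warnings.warn(
                "Invalid arguments were passed: *args: {}, **kwargs: {}".format(
                    args, kwargs),
                category=PendingDeprecationWarning,
                stacklevel=2,
            )


def count_warnings(**extra):
    """Instantiate the operator and count the PendingDeprecationWarnings raised."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        SketchOperator("demo", **extra)
    return sum(issubclass(w.category, PendingDeprecationWarning) for w in caught)
```

In this sketch the warning disappears as soon as the caller stops passing the unused keyword, which is why it matters whether the operator instantiation is under the user's control.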

Re: Creating dynamic pool from task

2018-09-21 Thread Chris Palmer
What would cause multiple computation tasks to run on the cluster at the same time? Are you worried about concurrent DagRuns? Does setting dag concurrency and/or task concurrency appropriately solve your problem? Chris On Thu, Sep 20, 2018 at 8:31 PM David Szakallas wrote: > What I am doing is

Re: Creating dynamic pool from task

2018-09-25 Thread Chris Palmer
attribute gets used when the scheduler sends the task to the executor. Chris On Fri, Sep 21, 2018 at 6:12 PM Chris Palmer wrote: > I see, so for a given DagRun you want to limit the compute tasks that are > running. But I'm guessing you want multiple DagRuns to be able to run > con

Re: [External] Re: Dynamic tasks in a dag?

2018-09-14 Thread Chris Palmer
The relative paths might work from wherever you are invoking 'airflow list_tasks', but that doesn't mean they work from wherever the webserver is parsing the dags from. Does running 'airflow list_tasks' from some other running directory work? On Fri, Sep 14, 2018 at 12:35 PM Frank Maritato
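One common way to avoid the working-directory dependence raised in this reply is to resolve paths against the location of the DAG file instead of the current directory. The sketch below is an assumption about how that could look; `resolve_from`, the DAG path, and the config file name are hypothetical.

```python
import os


def resolve_from(anchor_file: str, relative_path: str) -> str:
    """Resolve relative_path against the directory containing anchor_file."""
    base_dir = os.path.dirname(os.path.abspath(anchor_file))
    return os.path.normpath(os.path.join(base_dir, relative_path))


# Inside a DAG file one would typically pass __file__ as the anchor;
# a literal path is used here so the example runs on its own.
config_path = resolve_from("/opt/airflow/dags/my_dag.py", "config/tasks.yaml")
```

Because the anchor is an absolute file location, the result is the same whether the code is run from a shell in the DAG folder or parsed by a webserver started elsewhere.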