Although the race condition doesn't explain why “num_runs = None” resolved the 
issue for you earlier, it does give a clue now: the PR that introduced 
“num_runs = -1” was meant to make it possible to work with empty dag dirs; 
maybe that case wasn’t fully covered yet.

Bolke

> On 12 Feb 2017, at 12:26, Bolke de Bruin <[email protected]> wrote:
> 
> Ok great! Thanks! That sounds like a race condition: module not available yet 
> at time of reading. I would expect that it resolves itself after a while. 
> 
> After talking to some people at the Warsaw BigData conf I have some ideas 
> around syncing dags. Spoiler: no dependency on git.
> 
> - Bolke
> 
>> On 12 Feb 2017, at 11:17, Alex Van Boxel <[email protected]> wrote:
>> 
>> Running ok, in staging... @bolke I'm running patch-less. I've switched my
>> Kubernetes from:
>> 
>> - each container (webserver/scheduler/worker) had a git-sync'er (getting
>> the dags from git)
>>> this meant that the scheduler had 0 dags at startup, and should have
>> picked them up later
>> 
>> to
>> 
>> - single NFS share that shares airflow_home over each container
>>> the git sync'er is now a separate container running before the other
>> containers
>> 
>> This resolved my mystery DAG crashes.
>> 
>> I'll be updating production to a patchless RC3 today, you get my vote after
>> that.
>> 
>> 
>> 
>> 
>> On Sun, Feb 12, 2017 at 4:59 AM Boris Tyukin <[email protected]> wrote:
>> 
>>> awesome! thanks Jeremiah
>>> 
>>> On Sat, Feb 11, 2017 at 12:53 PM, Jeremiah Lowin <[email protected]>
>>> wrote:
>>> 
>>>> Boris, I submitted a PR to address your second point --
>>>> https://github.com/apache/incubator-airflow/pull/2068. Thanks!
>>>> 
>>>> On Sat, Feb 11, 2017 at 10:42 AM Boris Tyukin <[email protected]>
>>>> wrote:
>>>> 
>>>>> I am running LocalExecutor and not doing crazy things but use DAG
>>>>> generation heavily - everything runs fine as before. As I mentioned in
>>>>> other threads only had a few issues:
>>>>> 
>>>>> 1) had to upgrade MySQL which was a PAIN. Cloudera CDH is running old
>>>>> version of MySQL which was compatible with 1.7.1 but not compatible now
>>>>> with 1.8 because of fractional seconds support PR.
>>>>> 
>>>>> 2) when you install airflow, there are two new example DAGs
>>>>> (last_task_only) which are going back very far in the past and scheduled to
>>>>> run every hour - a bunch of dags triggered on the first start of scheduler
>>>>> and hosed my CPU
>>>>> 
>>>>> Everything else was fine and I LOVE lots of small UI changes, which reduced
>>>>> a lot my use of cli.
>>>>> 
>>>>> Thanks again for the amazing work and an awesome project!
>>>>> 
>>>>> 
>>>>> On Sat, Feb 11, 2017 at 9:17 AM, Jeremiah Lowin <[email protected]>
>>>> wrote:
>>>>> 
>>>>>> I was able to deploy successfully. +1 (binding)
>>>>>> 
>>>>>> On Fri, Feb 10, 2017 at 7:37 PM Maxime Beauchemin <
>>>>>> [email protected]> wrote:
>>>>>> 
>>>>>>> +1 (binding)
>>>>>>> 
>>>>>>> On Fri, Feb 10, 2017 at 3:44 PM, Arthur Wiedmer <
>>>>>> [email protected]>
>>>>>>> wrote:
>>>>>>> 
>>>>>>>> +1 (binding)
>>>>>>>> 
>>>>>>>> On Feb 10, 2017 3:13 PM, "Dan Davydov" <[email protected].
>>>>>> invalid>
>>>>>>>> wrote:
>>>>>>>> 
>>>>>>>>> Our staging looks good, all the DAGs there pass.
>>>>>>>>> +1 (binding)
>>>>>>>>> 
>>>>>>>>> On Fri, Feb 10, 2017 at 10:21 AM, Chris Riccomini <
>>>>>>> [email protected]
>>>>>>>>> 
>>>>>>>>> wrote:
>>>>>>>>> 
>>>>>>>>>> Running in all environments. Will vote after the weekend to make sure
>>>>>>>>>> things are working properly, but so far so good.
>>>>>>>>>> 
>>>>>>>>>> On Fri, Feb 10, 2017 at 6:05 AM, Bolke de Bruin <
>>>>> [email protected]
>>>>>>> 
>>>>>>>>> wrote:
>>>>>>>>>> 
>>>>>>>>>>> Dear All,
>>>>>>>>>>> 
>>>>>>>>>>> Let’s try again!
>>>>>>>>>>> 
>>>>>>>>>>> I have made the THIRD RELEASE CANDIDATE of Airflow 1.8.0 available at:
>>>>>>>>>>> https://dist.apache.org/repos/dist/dev/incubator/airflow/ , public keys
>>>>>>>>>>> are available at
>>>>>>>>>>> https://dist.apache.org/repos/dist/release/incubator/airflow/ .
>>>>>>>>>>> It is tagged with a local version “apache.incubating” so it allows
>>>>>>>>>>> upgrading from earlier releases.
>>>>>>>>>>> 
>>>>>>>>>>> Two issues have been fixed since release candidate 2:
>>>>>>>>>>> 
>>>>>>>>>>> * trigger_dag could create dags with fractional seconds, not supported by
>>>>>>>>>>> logging and UI at the moment
>>>>>>>>>>> * local api client trigger_dag had hardcoded execution of None
>>>>>>>>>>> 
>>>>>>>>>>> Known issue:
>>>>>>>>>>> * Airflow on kubernetes and num_runs -1 (default) can expose import issues.
>>>>>>>>>>> 
>>>>>>>>>>> I have extensively discussed this with Alex (reporter) and we consider
>>>>>>>>>>> this a known issue with a workaround available as we are unable to
>>>>>>>>>>> replicate this in a different environment. UPDATING.md has been updated
>>>>>>>>>>> with the work around.
>>>>>>>>>>> 
>>>>>>>>>>> As these issues are confined to a very specific area and full unit tests
>>>>>>>>>>> were added I would also like to raise a VOTE for releasing 1.8.0 based on
>>>>>>>>>>> release candidate 3, i.e. just renaming release candidate 3 to 1.8.0
>>>>>>>>>>> release.
>>>>>>>>>>> 
>>>>>>>>>>> Please respond to this email by:
>>>>>>>>>>> 
>>>>>>>>>>> +1,0,-1 with *binding* if you are a PMC member or *non-binding* if you
>>>>>>>>>>> are not.
>>>>>>>>>>> 
>>>>>>>>>>> Thanks!
>>>>>>>>>>> Bolke
>>>>>>>>>>> 
>>>>>>>>>>> My VOTE: +1 (binding)
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>>> 
>> -- 
>> _/
>> _/ Alex Van Boxel
> 
