Interesting -- I also run on Kubernetes with a git-sync sidecar, but the containers wait for the synced repo to appear before starting, since it contains some dependencies. I assume that's why I didn't experience the same issue.
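
A wait like that can be sketched in a few lines of Python. This is only an illustration of the idea, assuming the synced repo lands at a known path on a volume the container can see; the function name `wait_for_path`, the timeout, and the poll interval are all made up for the example and are not part of Airflow or git-sync.

```python
import os
import time


def wait_for_path(path, timeout=60.0, poll_interval=0.5):
    """Block until `path` exists.

    Returns True as soon as the path is present, or False if it has not
    appeared once `timeout` seconds have elapsed. (Hypothetical helper,
    not an Airflow or git-sync API.)
    """
    deadline = time.monotonic() + timeout
    while True:
        if os.path.exists(path):
            return True
        if time.monotonic() >= deadline:
            return False
        # Sleep briefly so the loop does not spin while the sync runs.
        time.sleep(poll_interval)
```

A container entrypoint could call this on the synced directory and exit non-zero when it returns False, so the scheduler never starts against an empty dag dir.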

On Sun, Feb 12, 2017 at 6:29 AM Bolke de Bruin <[email protected]> wrote:

> Although the race condition doesn't explain why “num_runs = None” resolved
> the issue for you earlier, but it does give a clue now: the PR that
> introduced “num_runs = -1” was there to be able to work with empty dag
> dirs, maybe it wasn’t fully covered yet.
>
> Bolke
>
> > On 12 Feb 2017, at 12:26, Bolke de Bruin <[email protected]> wrote:
> >
> > Ok great! Thanks! That sounds like a race condition: module not
> > available yet at time of reading. I would expect that it resolves itself
> > after a while.
> >
> > After talking to some people at the Warsaw BigData conf I have some
> > ideas around syncing dags, Spoiler: no dependency on git.
> >
> > - Bolke
> >
> >> On 12 Feb 2017, at 11:17, Alex Van Boxel <[email protected]> wrote:
> >>
> >> Running ok, in staging... @bolke I'm running patch-less. I've switched my
> >> Kubernetes from:
> >>
> >> - each container (webserver/scheduler/worker) had a git-sync'er (getting
> >>   the dags from git)
> >>   > this meant that the scheduler had 0 dags at startup, and should have
> >>     picked them up later
> >>
> >> to
> >>
> >> - single NFS share that shares airflow_home over each container
> >>   > the git sync'er is now a separate container running before the other
> >>     containers
> >>
> >> This resolved my mystery DAG crashes.
> >>
> >> I'll be updating production to a patchless RC3 today, you get my vote
> >> after that.
> >>
> >> On Sun, Feb 12, 2017 at 4:59 AM Boris Tyukin <[email protected]> wrote:
> >>
> >>> awesome! thanks Jeremiah
> >>>
> >>> On Sat, Feb 11, 2017 at 12:53 PM, Jeremiah Lowin <[email protected]>
> >>> wrote:
> >>>
> >>>> Boris, I submitted a PR to address your second point --
> >>>> https://github.com/apache/incubator-airflow/pull/2068. Thanks!
> >>>>
> >>>> On Sat, Feb 11, 2017 at 10:42 AM Boris Tyukin <[email protected]>
> >>>> wrote:
> >>>>
> >>>>> I am running LocalExecutor and not doing crazy things but use DAG
> >>>>> generation heavily - everything runs fine as before. As I mentioned in
> >>>>> other threads only had a few issues:
> >>>>>
> >>>>> 1) had to upgrade MySQL which was a PAIN. Cloudera CDH is running old
> >>>>> version of MySQL which was compatible with 1.7.1 but not compatible now
> >>>>> with 1.8 because of fractional seconds support PR.
> >>>>>
> >>>>> 2) when you install airflow, there are two new example DAGs
> >>>>> (last_task_only) which are going back very far in the past and
> >>>>> scheduled to run every hour - a bunch of dags triggered on the first
> >>>>> start of scheduler and hosed my CPU
> >>>>>
> >>>>> Everything else was fine and I LOVE lots of small UI changes, which
> >>>>> reduced a lot my use of cli.
> >>>>>
> >>>>> Thanks again for the amazing work and an awesome project!
> >>>>>
> >>>>> On Sat, Feb 11, 2017 at 9:17 AM, Jeremiah Lowin <[email protected]>
> >>>>> wrote:
> >>>>>
> >>>>>> I was able to deploy successfully. +1 (binding)
> >>>>>>
> >>>>>> On Fri, Feb 10, 2017 at 7:37 PM Maxime Beauchemin
> >>>>>> <[email protected]> wrote:
> >>>>>>
> >>>>>>> +1 (binding)
> >>>>>>>
> >>>>>>> On Fri, Feb 10, 2017 at 3:44 PM, Arthur Wiedmer
> >>>>>>> <[email protected]> wrote:
> >>>>>>>
> >>>>>>>> +1 (binding)
> >>>>>>>>
> >>>>>>>> On Feb 10, 2017 3:13 PM, "Dan Davydov" <[email protected]>
> >>>>>>>> wrote:
> >>>>>>>>
> >>>>>>>>> Our staging looks good, all the DAGs there pass.
> >>>>>>>>> +1 (binding)
> >>>>>>>>>
> >>>>>>>>> On Fri, Feb 10, 2017 at 10:21 AM, Chris Riccomini
> >>>>>>>>> <[email protected]> wrote:
> >>>>>>>>>
> >>>>>>>>>> Running in all environments. Will vote after the weekend to make
> >>>>>>>>>> sure things are working properly, but so far so good.
> >>>>>>>>>>
> >>>>>>>>>> On Fri, Feb 10, 2017 at 6:05 AM, Bolke de Bruin
> >>>>>>>>>> <[email protected]> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> Dear All,
> >>>>>>>>>>>
> >>>>>>>>>>> Let’s try again!
> >>>>>>>>>>>
> >>>>>>>>>>> I have made the THIRD RELEASE CANDIDATE of Airflow 1.8.0
> >>>>>>>>>>> available at:
> >>>>>>>>>>> https://dist.apache.org/repos/dist/dev/incubator/airflow/ ,
> >>>>>>>>>>> public keys are available at
> >>>>>>>>>>> https://dist.apache.org/repos/dist/release/incubator/airflow/ .
> >>>>>>>>>>> It is tagged with a local version “apache.incubating” so it
> >>>>>>>>>>> allows upgrading from earlier releases.
> >>>>>>>>>>>
> >>>>>>>>>>> Two issues have been fixed since release candidate 2:
> >>>>>>>>>>>
> >>>>>>>>>>> * trigger_dag could create dags with fractional seconds, not
> >>>>>>>>>>>   supported by logging and UI at the moment
> >>>>>>>>>>> * local api client trigger_dag had hardcoded execution of None
> >>>>>>>>>>>
> >>>>>>>>>>> Known issue:
> >>>>>>>>>>> * Airflow on kubernetes and num_runs -1 (default) can expose
> >>>>>>>>>>>   import issues.
> >>>>>>>>>>>
> >>>>>>>>>>> I have extensively discussed this with Alex (reporter) and we
> >>>>>>>>>>> consider this a known issue with a workaround available as we are
> >>>>>>>>>>> unable to replicate this in a different environment. UPDATING.md
> >>>>>>>>>>> has been updated with the workaround.
> >>>>>>>>>>>
> >>>>>>>>>>> As these issues are confined to a very specific area and full
> >>>>>>>>>>> unit tests were added I would also like to raise a VOTE for
> >>>>>>>>>>> releasing 1.8.0 based on release candidate 3, i.e. just renaming
> >>>>>>>>>>> release candidate 3 to 1.8.0 release.
> >>>>>>>>>>>
> >>>>>>>>>>> Please respond to this email by:
> >>>>>>>>>>>
> >>>>>>>>>>> +1,0,-1 with *binding* if you are a PMC member or *non-binding*
> >>>>>>>>>>> if you are not.
> >>>>>>>>>>>
> >>>>>>>>>>> Thanks!
> >>>>>>>>>>> Bolke
> >>>>>>>>>>>
> >>>>>>>>>>> My VOTE: +1 (binding)
> >>
> >> --
> >> _/
> >> _/ Alex Van Boxel
