I think what I observed doesn't happen every time; otherwise we would see it on every run. I'll see if it happens again tonight.
On Sat, Feb 25, 2017 at 1:24 PM Jeremiah Lowin <[email protected]> wrote:

> Interesting if this is related to what I was seeing -- but to be clear the
> error I observed is non-deterministic and doesn't happen every time
> (obviously, because otherwise there would be no passing Travis runs). Is
> that the case for what you're describing, Dan/Alex?
>
> On Sat, Feb 25, 2017 at 4:13 AM Alex Van Boxel <[email protected]> wrote:
>
> About: Skipped tasks potentially cause a dagrun to be marked
> failure/success prematurely. Isn't that related to the discussion I had
> with Max about the ONE_SUCCESS trigger? When skipping tasks for now you
> need to put ONE_SUCCESS. I had kind of a fix but it was rejected because it
> changed behaviour.
>
> On Sat, Feb 25, 2017 at 9:19 AM Bolke de Bruin <[email protected]> wrote:
>
> > Not trying to muddy the waters, but the observation of Jeremiah (non
> > deterministic outcomes) might have to do something with #3. I didn’t dive
> > in deeper, yet.
> >
> > ======================================================================
> > ERROR: test_backfill_examples (tests.BackfillJobTest)
> > ----------------------------------------------------------------------
> > Traceback (most recent call last):
> >   File "/home/travis/build/apache/incubator-airflow/tests/jobs.py", line 164, in test_backfill_examples
> >     job.run()
> >   File "/home/travis/build/apache/incubator-airflow/airflow/jobs.py", line 200, in run
> >     self._execute()
> >   File "/home/travis/build/apache/incubator-airflow/airflow/jobs.py", line 1999, in _execute
> >     raise AirflowException(err)
> > AirflowException: ---------------------------------------------------
> > Some task instances failed:
> > set([('example_short_circuit_operator', 'condition_is_True', datetime.datetime(2016, 1, 1, 0, 0))])
> >
> > https://s3.amazonaws.com/archive.travis-ci.org/jobs/204780706/log.txt
> >
> > Bolke
> >
> > > On 25 Feb 2017, at 09:07, Bolke de Bruin <[email protected]> wrote:
> > >
> > > Hi Dan,
> > >
> > > - Backfill indeed runs only one dagrun at the time, see line 1755 of
> > >   jobs.py. I’ll think about how to fix this over the weekend (I think
> > >   it was my change that introduced this). Suggestions always welcome.
> > >   Depending the impact it is a blocker or not. We don’t often use
> > >   backfills and definitely not at your size, so that is why it didn’t
> > >   pop up with us. I’m assuming blocker for now, btw.
> > > - Speculation on the High DB Load. I’m not sure what your benchmark is
> > >   here (1.7.1 + multi processor dags?), but as you mentioned in the
> > >   code dependencies are checked a couple of times for one run and even
> > >   task instance. Dependency checking requires aggregation on the DB,
> > >   which is a performance killer. Annoying but not a blocker.
> > > - Skipped tasks potentially cause a dagrun to be marked failure/success
> > >   prematurely. BranchOperators are widely used if it affects these
> > >   operators, then it is a blocker.
> > >
> > > - Bolke
> > >
> > >> On 25 Feb 2017, at 02:04, Dan Davydov <[email protected]> wrote:
> > >>
> > >> Update on old pending issues:
> > >> - Black Squares in UI: Fix merged
> > >> - Double Trigger Issue That Alex G Mentioned: Alex has a PR in flight
> > >>
> > >> New Issues:
> > >> - Backfill seems to be having issues (only running one dagrun at a
> > >>   time), we are still investigating - might be a blocker
> > >> - High DB Load (~8x more than 1.7) - We are still investigating but
> > >>   it's probably not a blocker for the release
> > >> - Skipped tasks potentially cause a dagrun to be marked as
> > >>   failure/success prematurely - not sure whether or not to classify
> > >>   this as a blocker (only really an issue for users who use the
> > >>   BranchingPythonOperator, which AirBnB does)
> > >>
> > >> On Thu, Feb 23, 2017 at 5:59 PM, siddharth anand <[email protected]> wrote:
> > >>
> > >>> IMHO, a DAG run without a start date is non-sensical but is not
> > >>> enforced
> > >>> That said, our UI allows for the manual creation of DAG Runs without
> > >>> a start date as shown in the images below:
> > >>>
> > >>> - https://www.dropbox.com/s/3sxcqh04eztpl7p/Screenshot%202017-02-22%2016.00.40.png?dl=0
> > >>> - https://www.dropbox.com/s/4q6rr9dwghag1yy/Screenshot%202017-02-22%2016.02.22.png?dl=0
> > >>>
> > >>> On Wed, Feb 22, 2017 at 2:26 PM, Maxime Beauchemin <[email protected]> wrote:
> > >>>
> > >>>> Our database may have edge cases that could be associated with
> > >>>> running any previous version that may or may not have been part of
> > >>>> an official release.
> > >>>>
> > >>>> Let's see if anyone else reports the issue. If no one does, one
> > >>>> option is to release 1.8.0 as is with a comment in the release
> > >>>> notes, and have a future official minor apache release 1.8.1 that
> > >>>> would fix these minor issues that are not deal breaker.
> > >>>>
> > >>>> @bolke, I'm curious, how long does it take you to go through one
> > >>>> release cycle? Oh, and do you have a documented step by step process
> > >>>> for releasing? I'd like to add the Pypi part to this doc and add
> > >>>> committers that are interested to have rights on the project on
> > >>>> Pypi.
> > >>>>
> > >>>> Max
> > >>>>
> > >>>> On Wed, Feb 22, 2017 at 2:00 PM, Bolke de Bruin <[email protected]> wrote:
> > >>>>
> > >>>>> So it is a database integrity issue? Afaik a start_date should
> > >>>>> always be set for a DagRun (create_dagrun) does so I didn't check
> > >>>>> the code though.
> > >>>>>
> > >>>>> Sent from my iPhone
> > >>>>>
> > >>>>>> On 22 Feb 2017, at 22:19, Dan Davydov <[email protected]> wrote:
> > >>>>>>
> > >>>>>> Should clarify this occurs when a dagrun does not have a start
> > >>>>>> date, not a dag (which makes it even less likely to happen). I
> > >>>>>> don't think this is a blocker for releasing.
> > >>>>>>
> > >>>>>>> On Wed, Feb 22, 2017 at 1:15 PM, Dan Davydov <[email protected]> wrote:
> > >>>>>>>
> > >>>>>>> I rolled this out in our prod and the webservers failed to load
> > >>>>>>> due to this commit:
> > >>>>>>>
> > >>>>>>> [AIRFLOW-510] Filter Paused Dags, show Last Run & Trigger Dag
> > >>>>>>> 7c94d81c390881643f94d5e3d7d6fb351a445b72
> > >>>>>>>
> > >>>>>>> This fixed it:
> > >>>>>>> - </a> <span id="statuses_info" class="glyphicon glyphicon-info-sign" aria-hidden="true" title="Start Date: {{last_run.start_date.strftime('%Y-%m-%d %H:%M')}}"></span>
> > >>>>>>> + </a> <span id="statuses_info" class="glyphicon glyphicon-info-sign" aria-hidden="true"></span>
> > >>>>>>>
> > >>>>>>> This is caused by assuming that all DAGs have start dates set, so
> > >>>>>>> a broken DAG will take down the whole UI. Not sure if we want to
> > >>>>>>> make this a blocker for the release or not, I'm guessing for most
> > >>>>>>> deployments this would occur pretty rarely. I'll submit a PR to
> > >>>>>>> fix it soon.
> > >>>>>>>
> > >>>>>>> On Tue, Feb 21, 2017 at 9:49 AM, Chris Riccomini <[email protected]> wrote:
> > >>>>>>>
> > >>>>>>>> Ack that the vote has already passed, but belated +1 (binding)
> > >>>>>>>>
> > >>>>>>>> On Tue, Feb 21, 2017 at 7:42 AM, Bolke de Bruin <[email protected]> wrote:
> > >>>>>>>>
> > >>>>>>>>> IPMC Voting can be found here:
> > >>>>>>>>>
> > >>>>>>>>> http://mail-archives.apache.org/mod_mbox/incubator-general/201702.mbox/%[email protected]%3e
> > >>>>>>>>> <http://mail-archives.apache.org/mod_mbox/incubator-general/201702.mbox/%[email protected]%3E>
> > >>>>>>>>>
> > >>>>>>>>> Kind regards,
> > >>>>>>>>> Bolke
> > >>>>>>>>>
> > >>>>>>>>>> On 21 Feb 2017, at 08:20, Bolke de Bruin <[email protected]> wrote:
> > >>>>>>>>>>
> > >>>>>>>>>> Hello,
> > >>>>>>>>>>
> > >>>>>>>>>> Apache Airflow (incubating) 1.8.0 (based on RC4) has been
> > >>>>>>>>>> accepted.
> > >>>>>>>>>>
> > >>>>>>>>>> 9 “+1” votes received:
> > >>>>>>>>>>
> > >>>>>>>>>> - Maxime Beauchemin (binding)
> > >>>>>>>>>> - Arthur Wiedmer (binding)
> > >>>>>>>>>> - Dan Davydov (binding)
> > >>>>>>>>>> - Jeremiah Lowin (binding)
> > >>>>>>>>>> - Siddharth Anand (binding)
> > >>>>>>>>>> - Alex van Boxel (binding)
> > >>>>>>>>>> - Bolke de Bruin (binding)
> > >>>>>>>>>>
> > >>>>>>>>>> - Jayesh Senjaliya (non-binding)
> > >>>>>>>>>> - Yi (non-binding)
> > >>>>>>>>>>
> > >>>>>>>>>> Vote thread (start):
> > >>>>>>>>>> http://mail-archives.apache.org/mod_mbox/incubator-airflow-dev/201702.mbox/%3cD360D9BE-C358-42A1-9188-[email protected]%3e
> > >>>>>>>>>> <http://mail-archives.apache.org/mod_mbox/incubator-airflow-dev/201702.mbox/%3C7EB7B6D6-092E-48D2-AA0F-[email protected]%3E>
> > >>>>>>>>>>
> > >>>>>>>>>> Next steps:
> > >>>>>>>>>> 1) will start the voting process at the IPMC mailinglist. I do
> > >>>>>>>>>>    expect some changes to be required mostly in documentation
> > >>>>>>>>>>    maybe a license here and there. So, we might end up with
> > >>>>>>>>>>    changes to stable. As long as these are not (significant)
> > >>>>>>>>>>    code changes I will not re-raise the vote.
> > >>>>>>>>>> 2) Only after the positive voting on the IPMC and finalisation
> > >>>>>>>>>>    I will rebrand the RC to Release.
> > >>>>>>>>>> 3) I will upload it to the incubator release page, then the
> > >>>>>>>>>>    tar ball needs to propagate to the mirrors.
> > >>>>>>>>>> 4) Update the website (can someone volunteer please?)
> > >>>>>>>>>> 5) Finally, I will ask Maxime to upload it to pypi. It seems
> > >>>>>>>>>>    we can keep the apache branding as lib cloud is doing this
> > >>>>>>>>>>    as well
> > >>>>>>>>>>    (https://libcloud.apache.org/downloads.html#pypi-package).
> > >>>>>>>>>>
> > >>>>>>>>>> Jippie!
> > >>>>>>>>>>
> > >>>>>>>>>> Bolke
>
> --
> _/
> _/ Alex Van Boxel

--
_/
_/ Alex Van Boxel
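
For reference, the webserver failure quoted above came from the template calling strftime on a DagRun start_date that was not set. A guard along the following lines would avoid that. This is only a sketch in plain Python, not the actual Airflow fix: the helper name format_start_date and the "n/a" placeholder are made up for illustration.

```python
from datetime import datetime
from typing import Optional


def format_start_date(start_date: Optional[datetime],
                      fmt: str = "%Y-%m-%d %H:%M") -> str:
    """Format a run's start date, tolerating a missing value.

    Hypothetical helper (not part of Airflow): returns a placeholder
    instead of raising AttributeError when start_date is None.
    """
    if start_date is None:
        return "n/a"
    return start_date.strftime(fmt)


if __name__ == "__main__":
    # A run with a start date is formatted as in the original template.
    print(format_start_date(datetime(2017, 2, 22, 13, 15)))
    # A run without a start date no longer breaks the rendering.
    print(format_start_date(None))
```

Dan's quick fix in the thread simply removed the title attribute; a guard like this would keep the tooltip for runs that do have a start date.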
