Moved discussion to https://issues.apache.org/jira/browse/AIRFLOW-52 and updated the status of the task there.
On Tue, May 3, 2016 at 2:32 AM, Dan Davydov <[email protected]> wrote:

> It's per DAG unfortunately (we have some pretty funky DAGs here).
>
> On May 2, 2016 10:26 PM, "Bolke de Bruin" <[email protected]> wrote:
>
>> Hi Dan,
>>
>> Is that per DAG or per DagBag? Multiprocessing should parallelize DAG
>> parsing, so I am very curious. Let me know if I can help out.
>>
>> Bolke
>>
>> Sent from my iPhone
>>
>> > On May 3, 2016, at 01:47, Dan Davydov <[email protected]> wrote:
>> >
>> > So a quick update: unfortunately we saw some DAGBag parsing time
>> > increases (~10x for some DAGs) on the webservers with 1.7.1rc3.
>> > Because of this I will be working on a staging cluster that has a
>> > copy of our production DAGBag and is a copy of our production
>> > airflow infrastructure, just without the workers. This will let us
>> > debug the release outside of production.
>> >
>> > On Thu, Apr 28, 2016 at 10:20 AM, Dan Davydov <[email protected]>
>> > wrote:
>> >
>> >> Definitely, here are the issues we hit:
>> >> - airbnb/airflow#1365 occurred
>> >> - Webservers/scheduler were timing out and stuck in restart cycles
>> >> due to increased time spent on parsing DAGs due to
>> >> airbnb/airflow#1213/files
>> >> - Failed tasks that ran after the upgrade and after the revert of
>> >> the upgrade could not be cleared (but running the tasks through
>> >> the UI worked without clearing them)
>> >> - The way log files are stored on S3 was changed (airflow now
>> >> requires a connection to be set up), which broke log storage
>> >> - Some DAGs were broken (unable to be parsed) because import paths
>> >> changed in the open-source package reorganization (the utils
>> >> refactor commit)
>> >>
>> >> On Thu, Apr 28, 2016 at 12:17 AM, Bolke de Bruin <[email protected]>
>> >> wrote:
>> >>
>> >>> Dan,
>> >>>
>> >>> Are you able to share some of the bugs you have been hitting and
>> >>> the connected commits?
>> >>>
>> >>> We could at the very least learn from them and maybe even improve
>> >>> testing.
>> >>>
>> >>> Bolke
>> >>>
>> >>>> On Apr 28, 2016, at 06:51, Dan Davydov <[email protected]>
>> >>>> wrote:
>> >>>>
>> >>>> All of the blockers were fixed as of yesterday (there was some
>> >>>> issue that Jeremiah was looking at with the last release
>> >>>> candidate, which I think is fixed but I'm not sure). I started
>> >>>> staging the airbnb_1.7.1rc3 tag earlier today, so as long as
>> >>>> metrics look OK and the 1.7.1rc2 issues seem resolved tomorrow, I
>> >>>> will release internally either tomorrow or Monday (we try to
>> >>>> avoid releases on Friday). If there aren't any issues we can push
>> >>>> the 1.7.1 tag on Monday/Tuesday.
>> >>>>
>> >>>> @Sid
>> >>>> I think we were originally aiming to deploy internally once every
>> >>>> two weeks but we decided to do it once a month in the end. I'm
>> >>>> not too sure about that, so Max can comment there.
>> >>>>
>> >>>> We have been running 1.7.0 in production for about a month now
>> >>>> and it is stable.
>> >>>>
>> >>>> I think what really slowed down this release cycle is some
>> >>>> commits that caused severe bugs that we decided to roll forward
>> >>>> with instead of rolling back. We can potentially try reverting
>> >>>> these commits next time while the fixes are applied for the next
>> >>>> version, although this is not always trivial to do.
>> >>>>
>> >>>> On Wed, Apr 27, 2016 at 9:31 PM, Siddharth Anand
>> >>>> <[email protected]> wrote:
>> >>>>
>> >>>>> Btw, are any of the committers running 1.7.0 or later in any
>> >>>>> staging or production env? I have to say, the fact that 1.6.2
>> >>>>> was the most stable release and is 4 or more months old does
>> >>>>> not say much for our release cadence or process. What's our
>> >>>>> plan for 1.7.1?
>> >>>>>
>> >>>>> Sent from Sid's iPhone
>> >>>>>
>> >>>>>> On Apr 27, 2016, at 9:05 PM, Chris Riccomini
>> >>>>>> <[email protected]> wrote:
>> >>>>>>
>> >>>>>> Hey all,
>> >>>>>>
>> >>>>>> I just wanted to check in on the 1.7.1 release status. I know
>> >>>>>> there have been some major-ish bugs, as well as several people
>> >>>>>> doing tests. Should we create a 1.7.1 release JIRA, and track
>> >>>>>> outstanding issues there?
>> >>>>>>
>> >>>>>> Cheers,
>> >>>>>> Chris
