Hi Dan,

Is that per DAG or per DagBag? Multiprocessing should parallelize DAG parsing, so I am very curious. Let me know if I can help out.

Bolke
Sent from my iPhone

> On 3 May 2016, at 01:47, Dan Davydov <[email protected]> wrote:
>
> So a quick update: unfortunately we saw some DagBag parsing time increases
> (~10x for some DAGs) on the webservers with 1.7.1rc3. Because of this I
> will be working on a staging cluster that has a copy of our production
> DagBag and is a copy of our production Airflow infrastructure, just
> without the workers. This will let us debug the release outside of
> production.
>
> On Thu, Apr 28, 2016 at 10:20 AM, Dan Davydov <[email protected]> wrote:
>
>> Definitely, here were the issues we hit:
>> - airbnb/airflow#1365 occurred
>> - Webservers/scheduler were timing out and stuck in restart cycles due to
>>   increased time spent on parsing DAGs (airbnb/airflow#1213)
>> - Failed tasks that ran after the upgrade and the revert (after we
>>   reverted the upgrade) could not be cleared (but running the tasks
>>   through the UI worked without clearing them)
>> - The way log files were stored on S3 was changed (Airflow now requires a
>>   connection to be set up), which broke log storage
>> - Some DAGs were broken (unable to be parsed) due to package
>>   reorganization in open source (the import paths were changed in the
>>   utils refactor commit)
>>
>> On Thu, Apr 28, 2016 at 12:17 AM, Bolke de Bruin <[email protected]> wrote:
>>
>>> Dan,
>>>
>>> Are you able to share some of the bugs you have been hitting and the
>>> connected commits?
>>>
>>> We could at the very least learn from them and maybe even improve
>>> testing.
>>>
>>> Bolke
>>>
>>>> On 28 Apr 2016, at 06:51, Dan Davydov <[email protected]> wrote:
>>>>
>>>> All of the blockers were fixed as of yesterday (there was some issue
>>>> that Jeremiah was looking at with the last release candidate, which I
>>>> think is fixed, but I'm not sure). I started staging the
>>>> airbnb_1.7.1rc3 tag earlier today, so as long as metrics look OK and
>>>> the 1.7.1rc2 issues seem resolved tomorrow, I will release internally
>>>> either tomorrow or Monday (we try to avoid releases on Fridays). If
>>>> there aren't any issues, we can push the 1.7.1 tag on Monday/Tuesday.
>>>>
>>>> @Sid
>>>> I think we were originally aiming to deploy internally once every two
>>>> weeks, but we decided to do it once a month in the end. I'm not too
>>>> sure about that, so Max can comment there.
>>>>
>>>> We have been running 1.7.0 in production for about a month now and it
>>>> is stable.
>>>>
>>>> I think what really slowed down this release cycle is some commits
>>>> that caused severe bugs that we decided to roll forward with instead
>>>> of rolling back. We can potentially try reverting such commits next
>>>> time while the fixes are applied for the next version, although this
>>>> is not always trivial to do.
>>>>
>>>> On Wed, Apr 27, 2016 at 9:31 PM, Siddharth Anand
>>>> <[email protected]> wrote:
>>>>
>>>>> Btw, is any one of the committers running 1.7.0 or later in any
>>>>> staging or production env? I have to say that the fact that 1.6.2 was
>>>>> the most stable release and is four or more months old does not say
>>>>> much for our release cadence or process. What's our plan for 1.7.1?
>>>>>
>>>>> Sent from Sid's iPhone
>>>>>
>>>>>> On Apr 27, 2016, at 9:05 PM, Chris Riccomini <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>> Hey all,
>>>>>>
>>>>>> I just wanted to check in on the 1.7.1 release status. I know there
>>>>>> have been some major-ish bugs, as well as several people doing tests.
>>>>>> Should we create a 1.7.1 release JIRA and track outstanding issues
>>>>>> there?
>>>>>>
>>>>>> Cheers,
>>>>>> Chris
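[Editor's illustration] Bolke's question above concerns whether DagBag fill is parallelized per DAG file. The following is a minimal, self-contained sketch of that idea — it is not Airflow's actual code; the file names, `DAG_FILES` mapping, and helper functions are all hypothetical stand-ins. It shows the general pattern of parsing each DAG file in a worker process so one slow or broken file does not serialize (or crash) the whole scan:

```python
import multiprocessing

# Hypothetical stand-in "DAG files": path -> source that defines `dag_id`.
# In real Airflow these would be .py files on disk that get imported.
DAG_FILES = {
    "dags/etl.py": "dag_id = 'etl_daily'",
    "dags/reports.py": "dag_id = 'reports_hourly'",
    "dags/broken.py": "raise ImportError('missing dependency')",
}

def parse_one(path):
    """Parse a single file, returning (path, dag_id or an error string)."""
    namespace = {}
    try:
        exec(DAG_FILES[path], namespace)  # stand-in for importing the module
        return path, namespace["dag_id"]
    except Exception as exc:  # a broken file must not kill the whole scan
        return path, "import error: %s" % exc

def fill_dagbag(paths, processes=2):
    """Parse files in a worker pool; the parent only collects results."""
    with multiprocessing.Pool(processes) as pool:
        return dict(pool.map(parse_one, paths))

if __name__ == "__main__":
    bag = fill_dagbag(sorted(DAG_FILES))
    for path, result in sorted(bag.items()):
        print(path, "->", result)
```

Note that with this layout parsing is parallel per file, not per DAG object, which is why a single file defining many slow-to-construct DAGs can still dominate DagBag fill time.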

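[Editor's note] Regarding the S3 log-storage breakage Dan lists above ("Airflow now requires a connection to be set up"): for readers hitting the same issue, the 1.7-era remote logging settings lived in the `[core]` section of `airflow.cfg`. The bucket path and connection ID below are made-up example values:

```ini
[core]
# Example values only; point these at your own bucket and Airflow
# connection. The connection (here "s3_log_conn") must be created
# separately, e.g. via the Admin > Connections UI.
remote_base_log_folder = s3://my-airflow-logs/prod
remote_log_conn_id = s3_log_conn
```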