Ignore that; it must be something with Splunk, since stdout doesn't have a date field. The same process writing to a file is printing that out, and the "Filling" entry comes before that line...

On Mon, Mar 19, 2018, 5:35 PM David Capwell <[email protected]> wrote:

> This is weird, and I hope it's not a bad UTC conversion tricking me....
>
> So the Splunk logs for the worker show the process logs were created at 9am
> ("Logging into: ...."), and the first entry of the log was at 14:00 ("Filling
> up the DagBag"). If I go to the DB and calculate queue time, this specific
> dag was delayed 5 hours, which matches the logs...
>
> On Mon, Mar 19, 2018, 9:10 AM David Capwell <[email protected]> wrote:
>
>> The major reason we have been waiting was mostly because 1.8.2 and 1.9
>> are backwards incompatible (don't remember off the top of my head but one
>> operator broke important so everything failed for us), so we neglected doing
>> the work to support both versions (need to support both since different
>> teams move at different rates).
>>
>> We need to do this anyways (frozen in time is very bad).
>>
>> On Mon, Mar 19, 2018, 1:47 AM Driesprong, Fokko <[email protected]>
>> wrote:
>>
>>> Hi David,
>>>
>>> First I would update to Apache Airflow 1.9.0; there have been a lot of
>>> fixes between 1.8.2 and 1.9.0. Just to see if the bug is still in there.
>>>
>>> Cheers, Fokko
>>>
>>> 2018-03-18 19:41 GMT+01:00 David Capwell <[email protected]>:
>>>
>>> > Thanks for the reply
>>> >
>>> > Our script doesn't set it, so it should be off; the process does not
>>> > normally restart (monitoring has a counter for number of restarts
>>> > since deploy, currently at 0)
>>> >
>>> > At that point in time the UI showed the upstream tasks as green
>>> > (success); we manually ran tasks so they are no longer in the same
>>> > state, so I can't check the UI right now
>>> >
>>> > On Sun, Mar 18, 2018, 11:34 AM Bolke de Bruin <[email protected]>
>>> > wrote:
>>> >
>>> > > Are you running with num_runs? If so, disable it. We have seen this
>>> > > behavior with num_runs. Also, you can find out by clicking on the
>>> > > task if there is a dependency issue.
>>> > >
>>> > > B.
>>> > >
>>> > > Sent from my iPad
>>> > >
>>> > > > On 18 Mar 2018, at 19:08, David Capwell <[email protected]>
>>> > > > wrote:
>>> > > >
>>> > > > We just started seeing this a few days ago after turning on SLA
>>> > > > for our tasks (not saying SLA did this; it may have been happening
>>> > > > before without us noticing), but we have a dag that runs once an
>>> > > > hour and we see that 4-5 dag runs are marked running but tasks are
>>> > > > not getting scheduled. When we get the SLA alert, the action we are
>>> > > > taking right now is going to the UI and clicking run on tasks
>>> > > > manually; this is only needed for the oldest dag run and the rest
>>> > > > recover after that. In the past 3 days this has happened twice to
>>> > > > us.
>>> > > >
>>> > > > We are running 1.8.2; are there any known JIRA issues about this?
>>> > > > I don't know the scheduler well; what could I do to see why these
>>> > > > tasks are getting skipped without manual intervention?
>>> > > >
>>> > > > Thanks for your time.
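
For anyone wanting to rule out the UTC-conversion worry raised above, here is a minimal Python sketch of the queue-delay arithmetic. It is not taken from Airflow or Splunk: the helper name `queue_delay` and the UTC-5 offset are assumptions chosen for illustration, and only the 09:00 and 14:00 readings come from the thread. It shows how the same two wall-clock values give either a real 5-hour delay or no delay at all, depending on which timezone each reading is in.

```python
from datetime import datetime, timedelta, timezone


def queue_delay(queued_at: datetime, started_at: datetime) -> timedelta:
    """Return started_at - queued_at after normalizing both to UTC.

    Hypothetical helper: naive timestamps are rejected so that a missing
    timezone cannot silently be treated as UTC.
    """
    for ts in (queued_at, started_at):
        if ts.tzinfo is None:
            raise ValueError("naive timestamp: attach an explicit timezone first")
    return started_at.astimezone(timezone.utc) - queued_at.astimezone(timezone.utc)


utc = timezone.utc
local = timezone(timedelta(hours=-5))  # assumed offset, for illustration only

# Case 1: both readings are UTC, so the delay is real.
real_delay = queue_delay(
    datetime(2018, 3, 19, 9, 0, tzinfo=utc),
    datetime(2018, 3, 19, 14, 0, tzinfo=utc),
)

# Case 2: 09:00 is local time (UTC-5) and 14:00 is UTC, so there is no delay
# and the apparent gap is only a display artifact.
artifact = queue_delay(
    datetime(2018, 3, 19, 9, 0, tzinfo=local),
    datetime(2018, 3, 19, 14, 0, tzinfo=utc),
)

print(real_delay)  # 5:00:00
print(artifact)    # 0:00:00
```

If the values stored in the DB and the values shown in the log viewer are both known to be in the same timezone, only the first case applies and the gap is a genuine queue delay.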
