Re: Apache Airflow / Cloud Composer workshops Amsterdam

2018-10-12 Thread Ben Gregory
Hey Fokko! Sounds like a great event! Will any of the talks/workshops be streamed/livecast/recorded for those of us who can't make it to Amsterdam? - Ben On Fri, Oct 12, 2018 at 12:40 PM Driesprong, Fokko wrote: > Hi all, > > From October 15-19, 2018, GoDataFest takes place in Amsterdam, The

Apache Airflow / Cloud Composer workshops Amsterdam

2018-10-12 Thread Driesprong, Fokko
Hi all, From October 15-19, 2018, GoDataFest takes place in Amsterdam, The Netherlands. This week is dedicated to data technology and features free talks, training sessions and workshops. Leading tech companies, like AWS (Monday, October 15), Dataiku (Tuesday, October 16), Databricks

Re: Ingest daily data, but delivery is always delayed by two days

2018-10-12 Thread James Meickle
For something to add to Airflow itself: I would love a more flexible mapping between data time and processing time. The default is "n-1" (day over day, you're aiming to process yesterday's data) but people post other use cases on this mailing list quite frequently. On Fri, Oct 12, 2018 at 7:46 AM
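
As a rough illustration of the "data time" versus "processing time" mapping discussed here, the sketch below treats the lag as a configurable number of days. It is plain Python, not an Airflow API; the function names and the `lag_days` parameter are hypothetical and exist only for this example.

```python
from datetime import date, timedelta


def processing_date(data_date: date, lag_days: int = 1) -> date:
    """Day on which data for `data_date` is processed (hypothetical helper).

    lag_days=1 corresponds to the "n-1" default: yesterday's data today.
    """
    return data_date + timedelta(days=lag_days)


def data_date_for(run_date: date, lag_days: int = 1) -> date:
    """Inverse mapping: which day's data a run on `run_date` should pick up."""
    return run_date - timedelta(days=lag_days)


if __name__ == "__main__":
    # "n-1": a run on 2018-10-12 processes data for 2018-10-11.
    print(data_date_for(date(2018, 10, 12)))
    # A supplier that delivers two days late: data for 2018-10-10
    # is processed on 2018-10-12.
    print(processing_date(date(2018, 10, 10), lag_days=2))
```

With `lag_days=2` this matches the delayed-delivery case from this thread, where data for 2018-10-10 only became available on 2018-10-12.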

Re: Ingest daily data, but delivery is always delayed by two days

2018-10-12 Thread Faouz El Fassi
What about an exponential backoff on the poke interval? On Fri, 12 Oct 2018, 13:01 Ash Berlin-Taylor, wrote: > That would work for some of our other use cases (and has been an idea in > our backlog for months) but not this case as we're reading from someone > else's bucket so can't set up
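
A minimal sketch of what an exponential backoff on a poke interval could look like, written as plain Python rather than against any Airflow sensor class. The function name, the default values and the one-hour cap are assumptions made for illustration only.

```python
from itertools import islice
from typing import Iterator


def backoff_intervals(
    base_seconds: float = 60.0,
    factor: float = 2.0,
    max_seconds: float = 3600.0,
) -> Iterator[float]:
    """Yield successive wait times, multiplying by `factor` up to a cap.

    Hypothetical helper: all parameter names and defaults are illustrative.
    """
    interval = base_seconds
    while True:
        yield min(interval, max_seconds)
        # Grow the interval, but never beyond the cap.
        interval = min(interval * factor, max_seconds)


if __name__ == "__main__":
    # First six waits with a 60 s base, doubling, capped at 600 s.
    print(list(islice(backoff_intervals(60, 2, 600), 6)))
```

The cap keeps the sensor from waiting arbitrarily long between checks once the data is already many hours late.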

Re: Ingest daily data, but delivery is always delayed by two days

2018-10-12 Thread Ash Berlin-Taylor
That would work for some of our other use cases (and has been an idea in our backlog for months) but not this case as we're reading from someone else's bucket so can't set up notifications etc. :( -ash > On 12 Oct 2018, at 11:57, Bolke de Bruin wrote: > > S3 Bucket notification that

Ingest daily data, but delivery is always delayed by two days

2018-10-12 Thread Ash Berlin-Taylor
A lot of our DAGs are ingesting data (usually daily or weekly) from suppliers, and they are universally late. In the case I'm setting up now the delivery lag is about 30 hours - data for 2018-10-10 turned up at 2018-10-12 05:43. I was going to just set this up with an S3KeySensor and a daily

Re: "setup.py test" is being naughty

2018-10-12 Thread Driesprong, Fokko
We're working hard to get rid of the tight Travis integration and moving to a Docker-based setup. I think it should be very easy to get a Docker container up and running which is packed with the required dependencies. Unfortunately we're not there yet. Also the tox layer feels a bit redundant to me, since