Hi Fokko,

I doubt we'll do a Behind The Scenes for data so soon after this one. We do
hold engineering sessions from time to time; if you're interested in
visiting our radically cool office, hit me up personally.


The questions from the audience were mostly about stability. Apparently
people are having problems getting Airflow to run consistently
and stably.

One specifically mentioned the message queue as a potential issue. I
responded that there may be a relationship
between the number of workers and the configured parallelism (which means a
larger number of messages end up in the queue).

The other was related to Kubernetes. I pointed out the work being done by
Daniel together with two Google engineers. Later on I realized
this could potentially be related to violating the resource constraints for
a container, primarily because some processes could pick
up a lot of data.

Rgds,

Gerard


On Sun, Oct 29, 2017 at 1:20 PM, Driesprong, Fokko <[email protected]>
wrote:

> Hi Gerard,
>
> Thanks for sharing the presentation. Unfortunately I could not make it to
> the presentation. Will there be a follow-up? For example, on
> things that your team encountered when migrating from Azkaban to Airflow?
>
> Kind regards,
> Fokko Driesprong
>
> 2017-10-29 11:29 GMT+01:00 Gerard Toonstra <[email protected]>:
>
> > Hi all,
> >
> > On Thursday 26/10 my employer Coolblue organized a "Behind the Scenes"
> > event. It is an opportunity for engineers to talk about stuff they work
> > on, and usually they provide two presentations.
> >
> > This event was about BigData and Processing. As (now) team lead of Data
> > Platform, I decided to talk about Apache Airflow, which we are now in the
> > process of migrating to (from Azkaban).
> >
> > Here are the slides:
> >
> > https://www.linkedin.com/feed/update/urn:li:activity:6330346647347875840
> >
> > It is a technical presentation, aimed at informing people who are new to
> > Airflow what the underlying architecture is and also presenting why
> > you'd want to use it in the first place. I based the architectural
> > diagrams on AWS on the PoC we did some time ago.
> >
> > Important takeaway:
> >
> > Airflow is built around some great design principles, which are the
> > result of important insights into data processing. These principles
> > result in a tool that, when used correctly according to them, reduces
> > ETL effort and maintenance and frees up time for higher-level
> > intelligent work like Machine Learning, Deep Learning and analysis of
> > your data.
> >
> > It is very similar to the talk I gave at BigData Week London 2017:
> >
> > https://youtu.be/Ch2AQhOhefw
> >
> > Rgds,
> >
> > Gerard
> >
>
