Good morning

2016-10-18 Thread Alex
Good morning, I have a project I want to bring to you and I want you to help me discuss it. Please reply to me as soon as possible for more details. Thanks & Regards

Re: [RESULT] [VOTE] Release Airflow 1.8.0 based on Airflow 1.8.0rc4

2017-02-22 Thread Alex Guziel
a > > >>> blocker > > >>>>> for the release or not, I'm guessing for most deployments this > would > > >>> occur > > >>>>> pretty rarely. I'll submit a PR to fix it soon. > > >>>>> > > >>

Re: [RESULT] [VOTE] Release Airflow 1.8.0 based on Airflow 1.8.0rc4

2017-02-24 Thread Alex Guziel
> > > > >>> On Tue, Feb 21, 2017 at 7:42 AM, Bolke de Bruin > > >>> wrote: > > >>> > > >>>> IPMC Voting can be found here: > > >>>> > > >>>> http://mail-archives.apache.org/mod_mbox/incubator-general/

Re: Continuous Dag

2017-03-13 Thread Alex Guziel
FWIW, for our streaming jobs, we run a 5 minute schedule interval with max_active_runs=1 On Mon, Mar 13, 2017 at 2:00 PM, Maxime Beauchemin < maximebeauche...@gmail.com> wrote: > Airflow isn't designed to work well with short schedule intervals. The > guarantees that we give in terms of schedulin
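The setup described in this reply might look roughly like the sketch below. It only builds the keyword arguments, so it runs without Airflow installed; `schedule_interval` and `max_active_runs` are the DAG parameters named in the message, while the `dag_id` value is a hypothetical placeholder.

```python
from datetime import timedelta

# Keyword arguments for an Airflow 1.x DAG that runs every 5 minutes and
# never has more than one active run at a time.
streaming_dag_kwargs = {
    "dag_id": "streaming_job_example",  # hypothetical name, for illustration only
    "schedule_interval": timedelta(minutes=5),
    "max_active_runs": 1,
}

# With Airflow installed, these would be passed to the constructor, e.g.
# DAG(start_date=..., **streaming_dag_kwargs)
print(streaming_dag_kwargs["schedule_interval"])
```

Capping active runs at one means a slow run delays the next one instead of overlapping with it.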

Re: Airflow Committers: Landscape checks doing more harm than good?

2017-03-16 Thread Alex Guziel
+1 also We have code review already and the amount of false positives makes this useless. On Thu, Mar 16, 2017 at 5:02 PM, Maxime Beauchemin < maximebeauche...@gmail.com> wrote: > +1 as well > > I'm disappointed because the service is inches away from getting everything > right. As Bolke said, b

Re: SQLOperator?

2017-03-17 Thread Alex Guziel
I'm not sure if that one went away but there are different SQL operators, like MySqlOperator, MsSqlOperator, etc. that I see. Best, Alex On Fri, Mar 17, 2017 at 7:56 PM, Ruslan Dautkhanov wrote: > I can't find references to SQLOperator either in the source code or in > t

Re: Podling Report Reminder - April 2017

2017-04-03 Thread Alex Guziel
Did we do this? On Mon, Apr 3, 2017 at 5:53 PM, wrote: > Dear podling, > > This email was sent by an automated system on behalf of the Apache > Incubator PMC. It is an initial reminder to give you plenty of time to > prepare your quarterly board report. > > The board meeting is scheduled for Wed

Memory Issues with Airflow Subdag

2017-04-05 Thread Alex Keating
Hey Everyone, We recently added a subdag with over 200 tasks on it and any time airflow is running the cpu and memory usage spikes taking down the server. I am putting the subdag in a factory outside of the dags folder and importing it into a dag. We are using the Celery Executor, and running the

Re: Memory Issues with Airflow Subdag

2017-04-05 Thread Alex Guziel
Which container is using the memory? On Wed, Apr 5, 2017 at 2:23 PM, Alex Keating wrote: > Hey Everyone, > > We recently added a subdag with over 200 tasks on it and any time airflow > is running the cpu and memory usage spikes taking down the server. I am > putting the subd

Re: [VOTE] Release Airflow 1.8.1 based on Airflow 1.8.1 RC0

2017-04-17 Thread Alex Guziel
I would say to include [1074] ( https://github.com/apache/incubator-airflow/pull/2221) so we don't have a regression in the release after. I would also say https://github.com/apache/incubator-airflow/pull/2241 is semi important but less so. On Mon, Apr 17, 2017 at 11:24 AM, Chris Riccomini wrote:

Re: [VOTE] Release Airflow 1.8.1 based on Airflow 1.8.1 RC0

2017-04-17 Thread Alex Guziel
wrote: > :(:(:( Why was this not included in 1.8.1 JIRA? I've been emailing the list > all last week > > On Mon, Apr 17, 2017 at 11:28 AM, Alex Guziel < > alex.guz...@airbnb.com.invalid> wrote: > > > I would say to include [1074] ( > > https://github.com/apache/inc

Re: dag file processing times

2017-04-24 Thread Alex Guziel
You can also use reflection in Python to read the modules all the way down. On Mon, Apr 24, 2017 at 3:05 PM, Dan Davydov wrote: > Was talking with Alex about the DB case offline, for those we could support > a force refresh arg with an interval param. > > Manifests would need to b

Re: dag file processing times

2017-04-24 Thread Alex Guziel
better than pickle). > > B. > > Sent from my iPhone > > > On 25 Apr 2017, at 00:07, Alex Guziel > wrote: > > > > You can also use reflection in Python to read the modules all the way > down. > > > > On Mon, Apr 24, 2017 at 3:05 PM, Dan Davydov i

Re: Discussion on Airflow 1.8.1 RC2

2017-05-04 Thread Alex Guziel
I don't think any of the fixes I did were regressions. On Thu, May 4, 2017 at 8:11 AM, Bolke de Bruin wrote: > I know of one that Alex wanted to get in, but wasn’t targeted for 1.8.1 in > Jira and thus didn’t make the cut at RC time. There is another one out > that seems to

Re: Tasks Queued but never run

2017-06-01 Thread Alex Guziel
We've noticed this with celery, relating to this https://github.com/celery/celery/issues/3765 We also use `-n 5` option on the scheduler so it restarts every 5 runs, which will reset all queued tasks. Best, Alex On Thu, Jun 1, 2017 at 2:18 PM, Josef Samanek wrote: > Hi! > > We

Re: [VOTE] Release Airflow 1.8.2 based on Airflow 1.8.2 RC2

2017-06-26 Thread Alex Guziel
I'm not so sure this is a new issue. I think we've seen it on our production for quite a while. On Mon, Jun 26, 2017 at 2:31 PM, Chris Riccomini wrote: > I am seeing a strange UI behavior on 1.8.2.RC2. I've opened a JIRA here: > > https://issues.apache.org/jira/browse/AIRFLOW-1348 > > Has anyone

Re: [VOTE] Release Airflow 1.8.2 based on Airflow 1.8.2 RC2

2017-06-26 Thread Alex Guziel
s > pretty ugly. > > On Mon, Jun 26, 2017 at 2:34 PM, Alex Guziel invalid > > wrote: > > > I'm not so sure this is a new issue. I think we've seen it on our > > production for quite a while. > > > > On Mon, Jun 26, 2017 at 2:31 PM, Chris Ricco

Re: [VOTE] Release Airflow 1.8.2 based on Airflow 1.8.2 RC2

2017-06-26 Thread Alex Guziel
2:44 PM, Alex Guziel wrote: > There's no pagination in 1.8.1? Are you sure? > > On Mon, Jun 26, 2017 at 2:37 PM, Chris Riccomini > wrote: > >> It's not happening on 1.8.1 (since there's no pagination in that version), >> so I'd count this as a reg

Re: [VOTE] Release Airflow 1.8.2 based on Airflow 1.8.2 RC2

2017-06-26 Thread Alex Guziel
s, I did the 1.8.1 release. > > On Mon, Jun 26, 2017 at 2:44 PM, Alex Guziel invalid > > wrote: > > > There's no pagination in 1.8.1? Are you sure? > > > > On Mon, Jun 26, 2017 at 2:37 PM, Chris Riccomini > > wrote: > > > > > It's not happen

Re: [VOTE] Release Airflow 1.8.2 based on Airflow 1.8.2 RC2

2017-06-26 Thread Alex Guziel
1.8.2, the pagination was > broken in the sense that it defaulted to the whole list. We have 479 DAGs > in one env, and it shows them all. It looks like someone fixed the entry to > default to 25 now, which exposed the problem for our environments. > > On Mon, Jun 26, 2017 at 2:47 PM,

Re: Airflow profiling

2017-06-27 Thread Alex Guziel
Yeah, actually we have set up New Relic for Airflow too at Airbnb, which gives decent insights into webserver perf. In terms of SQL queries, adding `echo=True` to the SQLAlchemy engine creation is pretty good for seeing which SQL queries get created. I tried some Python profilers before but they were

Issues when worker loses database connectivity

2017-06-29 Thread Alex Wenckus
s though the worker becomes deadlocked and no longer processes work. Are there any settings we can update which will help mitigate this issue? As a workaround we have a script that detects when the queue has been larger than 5 and active has been 0 for 10 minutes or longer, but this presents its own issues. Thanks! Alex
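The workaround described here (queue larger than 5 and zero active tasks for 10 minutes or longer) could be sketched as below. This is not the poster's script: `StallDetector` and its field names are hypothetical, and how the queued and active counts are obtained (for example from Celery inspection) is left out.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class StallDetector:
    """Flags a worker as stalled when work is queued but nothing is active."""

    queue_threshold: int = 5       # "queue has been larger than 5"
    stall_seconds: float = 600.0   # "for 10 minutes or longer"
    _stalled_since: Optional[float] = field(default=None, init=False, repr=False)

    def observe(self, queued: int, active: int, now: float) -> bool:
        """Record one sample; return True once the stall has lasted long enough."""
        looks_stalled = queued > self.queue_threshold and active == 0
        if not looks_stalled:
            # Any healthy sample resets the timer.
            self._stalled_since = None
            return False
        if self._stalled_since is None:
            self._stalled_since = now
        return now - self._stalled_since >= self.stall_seconds


detector = StallDetector()
print(detector.observe(queued=10, active=0, now=0.0))
```

Resetting the timer on every healthy sample avoids restarting a worker that is merely slow, which is one of the "own issues" such a script has to weigh.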

Airflow bug when losing connectivity to Celery

2017-07-11 Thread Alex Wenckus
og messages simultaneously. Regardless, it seems like a situation Airflow should be able to handle gracefully. We are on Airflow 1.8.1. Thanks! Alex

Re: celereyd processes in the worker nodes

2017-07-12 Thread Alex Guziel
The celeryd processes exist even if they are idling On Wed, Jul 12, 2017 at 11:40 PM Niranda Perera wrote: > Hi, > > I am using the celery executor and in my worker node, I only have a single > task running currently (there were number of tasks completed already). But > when I check the processe

Re: AIRFLOW-1258

2017-07-16 Thread Alex Guziel
I think this may be related to a celery bug. I'll follow up with more details later. On Sun, Jul 16, 2017 at 12:56 AM Jawahar Panchal wrote: > Hi! > > I am currently running a couple of long-running tasks on a > database/dataset at school for a project that results in behavior/log > output simil

Re: Role Based Access Control for Airflow UI

2017-07-25 Thread Alex Guziel
Yeah, I could call in but I probably won't be able to come down that day. On Tue, Jul 25, 2017 at 1:36 PM, Maxime Beauchemin < maximebeauche...@gmail.com> wrote: > Works for me! Dan said he might confcall in. Alex? > > Max > > On Mon, Jul 24, 2017 at 11:25 AM

[AIRFLOW-xxx] in commit messages

2017-07-25 Thread Alex Guziel
What is our actual ruling on this? I see quite a few commits without this tag. Also, what is our policy on changes to README (like the company list) and JIRA tickets? It seems like we apply this inconsistently, but we should probably make a firm standard and abide by it. Best, Alex

Re: Sensor slots utilization

2017-07-28 Thread Alex Guziel
I'm concerned that we would be making the logic more complex, unless the new sensor 'pokeonce' case is just a high number of retries. And the other overhead of course. Running the poke method inline wouldn't be great for perf either since it's a blocking I/O and would need to be handled async in or

Re: Email on last failed try

2017-07-28 Thread Alex Guziel
Sounds like unintended behavior. That should be what email_on_retry does. If you can repro, file a ticket. On Fri, Jul 28, 2017 at 11:44 AM, Andrew Maguire wrote: > Yeah - i have: > > 'email_on_failure': True > 'retries': 4 > > So i get emails on every try: e.g. Try 1 out of 5 > > Really what i'
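For reference, the intended split between the two flags could be expressed with `default_args` along these lines. `email_on_failure`, `email_on_retry`, and `retries` are the settings discussed in the thread; the address and `retry_delay` value are hypothetical, and whether a given Airflow version actually honours this split is exactly what the thread questions.

```python
from datetime import timedelta

# default_args as they might be handed to an Airflow DAG.
default_args = {
    "email": ["alerts@example.com"],      # hypothetical address
    "email_on_failure": True,             # intended: mail once the task finally fails
    "email_on_retry": False,              # intended: no mail for intermediate retries
    "retries": 4,
    "retry_delay": timedelta(minutes=5),  # hypothetical value
}

# 4 retries plus the first attempt gives the "Try 1 out of 5" in the quoted mail.
total_tries = default_args["retries"] + 1
print(total_tries)
```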

Re: Stuck Tasks that don't report status

2017-08-09 Thread Alex Guziel
I know that with a scheduler restart, tasks may still report as running even though they are not. On Wed, Aug 9, 2017 at 6:07 PM, David Klosowski wrote: > Hi Gerard, > > The interesting thing is that we didn't see this issue in 1.7.1.3 but we > did when upgrading to 1.8.0. > > We aren't seeing a

Landscape broken - anyone knows what's up?

2017-08-10 Thread Alex Guziel
It seems Landscape is throwing 500s. When I look into it, https://landscape.io/github/apache/incubator-airflow/ "The repository *apache/incubator-airflow* is not being checked by Landscape." Does anyone have the perms to fix this?

Re: Bad Request CSRF

2017-08-17 Thread Alex Guziel
Curious, how did you fix this? We see this from time-to-time and we also have a single sign-on system. On Wed, Aug 16, 2017 at 10:29 AM, George Leslie-Waksman < geo...@cloverhealth.com.invalid> wrote: > I have further tracked the issue to our new single-sign-on system. Airflow > is fine. Please d

Re: As history grows UI gets slower

2017-08-28 Thread Alex Guziel
Here at Airbnb we delete old "completed" task instances. On Mon, Aug 28, 2017 at 3:01 PM, David Capwell wrote: > We are on 1.8.0 and have a monitor DAG that monitors the health of Airflow > and Celery every minute. This has been running for awhile now and at 26k > dag runs. We see that the UI f

Re: 1.9.0 test branch has been cut

2017-09-13 Thread Alex Guziel
Shouldn't we include everything on master? On Wed, Sep 13, 2017 at 12:45 PM, Chris Riccomini wrote: > Hey all, > > I've cut a 1.9.0 test branch. > > https://github.com/apache/incubator-airflow/tree/v1-9-test > > Here are the tickets that are being tracked on 1.9.0. > > ISSUE ID |DESCRIPTION

Re: 1.9.0 test branch has been cut

2017-09-13 Thread Alex Guziel
Nevermind, I misunderstood what you meant. (I thought you meant you were only including things with a fix version of 1.9.0, when you meant master cut + 1.9.0 fix versions) On Wed, Sep 13, 2017 at 1:19 PM, Alex Guziel wrote: > Shouldn't we include everything on master? > > On Wed

Re: Terminate task process through UI

2017-09-13 Thread Alex Guziel
Right now, there are a few layers of processes. Here's an example in the celery worker case. CeleryMainProcess -> CeleryPoolWorker -> Airflow run --local -> Airflow run --raw -> Bash command In the past, airflow run --raw would handle almost all logic, and --local would just handle heartbeating,

Re: Proposal: Set Celery 4.0 as a minimum as Celery 4 is unsupported

2017-09-19 Thread Alex Guziel
That's probably fine but I'd like to note two things. 1) The celery 3 config options are forwards compatible as far as I know 2) Still doesn't fix the bug where tasks get reserved even though they shouldn't. But I think it makes sense to upgrade the version in setup.py regardless. On Tue, Sep 19,

Re: Airflow 1.9.0 status

2017-09-20 Thread Alex Guziel
Can we get this in? https://issues.apache.org/jira/browse/AIRFLOW-1519 https://issues.apache.org/jira/browse/AIRFLOW-1621 https://github.com/apache/incubator-airflow/commit/b6d2e0a46978e93e16576604624f57d1388814f2 https://github.com/apache/incubator-airflow/commit/656d045e90bf67ca484a3778b2a07a41

Re: Runbook to upgrade Airflow

2017-10-03 Thread Alex Guziel
You won't be able to if there's a schema change. On Tue, Oct 3, 2017 at 12:33 PM, Thoralf Gutierrez < thoralfgutier...@gmail.com> wrote: > Hey everybody! > > Does anybody have some kind of runbook to upgrade airflow (with a Celery > backend) without having any downtime (i.e. tasks keep on running

Re: Runbook to upgrade Airflow

2017-10-04 Thread Alex Guziel
arted, scheduler can be handled manually, and webserver we do a rolling restart. On Wed, Oct 4, 2017 at 7:54 AM Thoralf Gutierrez wrote: > Thanks for your answer Alex. > > I guess you mean it won't work if there is a _breaking_ schema change > right? But for new patch and minor

Re: 1.9.0alpha1 published

2017-10-13 Thread Alex Guziel
AIRFLOW-976 should be marked resolved. It is fixed by https://github.com/apache/incubator-airflow/commit/b2e1753f5b74ad1b6e0889f7b784ce69623c95ce (pardon my commit message), which is in v1.9. On Fri, Oct 13, 2017 at 11:52 AM, Chris Riccomini wrote: > Hey all, > > I have cut a 1.9.0alpha1 release

Re: Ignore Processing DAG Definition Python Files for Paused DAGs

2017-11-27 Thread Alex Guziel
Hmm, this may not apply to your implementation, but it sounds like for this it would not handle cases like: 1) a.py has dag A1 and A2, A1 is paused, A2 is not 2) b.py has dag B1, which is paused. Later B2 is added to b.py but does not get picked up since B1 is paused. On Mon, Nov 27, 2017 at 3:29

Re: Introducing a "LAUNCHED" state into airflow

2017-11-29 Thread Alex Guziel
It might be good enough to have RUNNING set immediately on the process run and not being dependent on the dag file being parsed. It is annoying here too when dags parse on the scheduler but not the worker, since queued tasks that don't heartbeat will not get retried, while running tasks will. On W

Re: Introducing a "LAUNCHED" state into airflow

2017-11-30 Thread Alex Guziel
I think the more sensible thing here is just to set the state to RUNNING immediately in the airflow run process. I don't think the distinction between launched and running adds much value. On Thu, Nov 30, 2017 at 10:36 AM, Daniel Imberman wrote: > @Alex > > That could potentia

Re: Introducing a "LAUNCHED" state into airflow

2017-11-30 Thread Alex Guziel
Right now the scheduler re-launches all QUEUED tasks on restart (there are safeguards for duplicates). On Thu, Nov 30, 2017 at 11:13 AM, Grant Nicholas < grantnicholas2...@u.northwestern.edu> wrote: > @Alex > I agree setting the RUNNING state immediately when `airflow run` starts u

Re: Introducing a "LAUNCHED" state into airflow

2017-11-30 Thread Alex Guziel
See reset_state_for_orphaned_tasks in jobs.py On Thu, Nov 30, 2017 at 11:17 AM, Alex Guziel wrote: > Right now the scheduler re-launches all QUEUED tasks on restart (there are > safeguards for duplicates). > > On Thu, Nov 30, 2017 at 11:13 AM, Grant Nicholas northwestern.edu> wr

Re: Introducing a "LAUNCHED" state into airflow

2017-12-01 Thread Alex Guziel
gt; > Thanks, I see why that should work, I just know that from testing > this > > myself that I had to manually clear out old QUEUED task instances to > get > > them to reschedule. I'll do some more testing to confirm, it's > totally > > poss

Re: PSA: Make sure your Airflow instance isn't public and isn't Google indexed

2018-06-05 Thread Alex Guziel
I suggest reading the section on password complexity here https://pages.nist.gov/800-63-3/sp800-63b.html which recommends just a minimum length and a check against a list of the most common passwords. On Tue, Jun 5, 2018 at 3:14 PM, Maxime Beauchemin < maximebeauche...@gmail.com> wrote: > Agreed,
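A check in the spirit of that recommendation (a minimum length plus a lookup against commonly used passwords, with no composition rules) might look like the sketch below. The function name and the tiny blocklist are illustrative assumptions; a real deployment would load a much larger list.

```python
MIN_LENGTH = 8  # minimum length SP 800-63B gives for user-chosen passwords

# Tiny illustrative blocklist; a real one would hold many thousands of entries.
COMMON_PASSWORDS = frozenset({"password", "12345678", "qwertyuiop", "letmein123"})


def is_acceptable_password(candidate: str) -> bool:
    """Accept a password if it is long enough and not on the common list."""
    if len(candidate) < MIN_LENGTH:
        return False
    # Compare case-insensitively so "Password" is rejected like "password".
    return candidate.lower() not in COMMON_PASSWORDS


print(is_acceptable_password("correct horse battery staple"))
```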

Re: Broken DAG message won't go away in webserver

2018-08-09 Thread Alex Guziel
IIRC the scheduler sets these messages in the error table in the db. On Thu, Aug 9, 2018 at 2:13 PM, Ben Laird wrote: > The messages persist even after restarting the webserver. I've verified > with other airflow users in the office that they'd have to manually delete > records from the 'import_

Re: Fundamental change - Separate DAG name and id.

2018-09-24 Thread Alex Guziel
I think decoupling dag_id and display name could be confusing and cumbersome. As for readme, DAG already has a field called description which I think is close to what Alex is describing (I believe it is displayed by the UI). On Mon, Sep 24, 2018 at 3:12 PM Alex Tronchin-James 949-412-7220

Re: Pinning dependencies for Apache Airflow

2018-10-04 Thread Alex Guziel
You should run `pip check` to ensure no conflicts. Pip does not do this on its own. On Thu, Oct 4, 2018 at 9:20 AM Jarek Potiuk wrote: > Great that this discussion already happened :). Lots of useful things in > it. And yes - it means pinning in requirement.txt - this is how pip-tools > work. >

Re: Pinning dependencies for Apache Airflow

2018-10-04 Thread Alex Guziel
*is* coming... they *will* be fixed" > > > > > > > > needs to be > > > > > > > > "We'd like to propose a change... We would like to make them fixed." > > > > > > > > The first says that this decision has been

Re: programmatically creating and airflow quirks

2018-11-22 Thread Alex Guziel
I think this is what is going on. The DAGs are picked up by variable. I.e., if you do `dag = DAG(...)` and then `dag = DAG(...)` again, only the second DAG will be picked up. On Thu, Nov 22, 2018 at 2:04 AM Soma S Dhavala wrote: > Hey AirFlow Devs: > In our organization, we build a Machine Learning WorkBench wi
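The effect can be reproduced without Airflow. The sketch below assumes, as the reply suggests, that discovery walks the variables left in a module's namespace after the file is executed; `FakeDAG` and `collect_dags` are hypothetical stand-ins, not Airflow APIs.

```python
class FakeDAG:
    """Stand-in for airflow.DAG so the example runs without Airflow."""

    def __init__(self, dag_id: str) -> None:
        self.dag_id = dag_id


def collect_dags(source: str) -> list:
    """Execute a DAG file's source and return the dag_ids still bound to a name."""
    namespace = {"DAG": FakeDAG}
    exec(source, namespace)
    return sorted(v.dag_id for v in namespace.values() if isinstance(v, FakeDAG))


# Same variable name twice: the first object is no longer reachable afterwards.
REBOUND = "dag = DAG('first')\ndag = DAG('second')\n"
# Distinct names: both objects survive in the namespace.
DISTINCT = "dag_a = DAG('first')\ndag_b = DAG('second')\n"

print(collect_dags(REBOUND))
print(collect_dags(DISTINCT))
```

When DAGs are generated in a loop, a common approach is to bind each one to a distinct module-level name, for example via `globals()[dag_id] = dag`.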

Re: programmatically creating and airflow quirks

2018-11-22 Thread Alex Guziel
It’s because of this “When searching for DAGs, Airflow will only consider files where the string “airflow” and “DAG” both appear in the contents of the .py file.” On Thu, Nov 22, 2018 at 2:27 AM soma dhavala wrote: > > > On Nov 22, 2018, at 3:37 PM, Alex Guziel wrote: > > I thi
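The quoted rule amounts to a cheap text check that runs before a file is imported. The sketch below is a simplified assumption of that check, not Airflow's code; `might_contain_dag` is a hypothetical name, and the real check may differ in case sensitivity and in how it reads the file.

```python
def might_contain_dag(file_contents: str) -> bool:
    """Return True only if both marker strings appear somewhere in the file."""
    return "airflow" in file_contents and "DAG" in file_contents


# A file that builds DAGs through a helper module mentions neither string,
# so a check like this would skip it even though it does define DAGs.
factory_file = "import my_factory\nworkflow = my_factory.build()\n"
plain_file = "from airflow import DAG\ndag = DAG('example')\n"

print(might_contain_dag(factory_file))
print(might_contain_dag(plain_file))
```

Under this assumption, a comment containing both words is enough for a generated file to pass the check.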

Re: programmatically creating and airflow quirks

2018-11-22 Thread Alex Guziel
Yup. On Thu, Nov 22, 2018 at 3:16 PM soma dhavala wrote: > > > On Nov 23, 2018, at 3:28 AM, Alex Guziel wrote: > > It’s because of this > > “When searching for DAGs, Airflow will only consider files where the > string “airflow” and “DAG” both appear in the contents of t

Re: Airflow Developers Meeting - 08/03 Notes

2016-08-08 Thread Alex Van Boxel
Sorry, Bolke. What needs to be done for Google Cloud operators/hooks? If you bring me up-to-speed I can do this. (I'm currently working on a CI setup for the Google stuff). On Sat, Aug 6, 2016 at 3:56 PM Bolke de Bruin wrote: > I have indeed a branch with cherry picked commits. This branch is >

Re: Airflow Developers Meeting - 08/03 Notes

2016-08-23 Thread Alex Van Boxel
e response, I was on holiday) > > I think the G* operators just need to be cherry picked. This will make us > deviate slightly from the > previous release, but makes sure we don’t have to ‘fix’ history afterwards. > > Anyone against this? > > - B. > > > Op 8

Re: Airflow Developers Meeting - 08/03 Notes

2016-08-28 Thread Alex Van Boxel
I can add them to `branch-1.7.2-apache` > and > > start moving towards the release. > > > > Max > > > > On Tue, Aug 23, 2016 at 11:13 AM, Alex Van Boxel > wrote: > > > >> All stuff is in master (I'm running pre-production on master), if it >

Re: Airflow Releases

2016-09-30 Thread Alex Van Boxel
I'll do the same. Nice to have the 1.8 on the horizon. On Fri, Sep 30, 2016 at 5:51 PM Chris Riccomini wrote: > > I'm not sure how other projects do this, but I propose that we let the > RC settle for a week. > > +1 > > > Who's on board!? > > Me. :) > > On Fri, Sep 30, 2016 at 7:56 AM, Maxime Be

Re: Next Airflow meet-up

2016-10-01 Thread Alex Van Boxel
Hey guys, about the date. There is a good chance I'm in SF for a summit (14-15 November); I could try to extend it a day (so including November 16th). So if you're looking for a date, think about November 16th (I'll volunteer for a talk then). Would be great. Thanks. On Sat, Oct 1, 2016 at 8:51 AM si

Re: Next Airflow meet-up

2016-10-07 Thread Alex Van Boxel
Chris Riccomini wrote: > I think WePay can do November 16. Alex, can you make it then? > > On Thu, Oct 6, 2016 at 8:38 PM, siddharth anand wrote: > > I think a sane process for deciding meet-ups is to go in order off this > > list : https://cwiki.apache.org/confluence/display

Re: Next Airflow meet-up

2016-10-12 Thread Alex Van Boxel
@bolke: it's the "Google Expert summit". It's a closed conference I'm afraid. Will work on my abstract this evening. On Wed, Oct 12, 2016 at 2:56 AM siddharth anand wrote: > Wonderful! > > We have 3 speakers already. Paul, Alex, can you send a short talk sum

Cloud Provider grouping into Plugins

2016-10-14 Thread Alex Van Boxel
Hi all, I'm starting to write some very exotic Operators that are a bit strange to add to contrib. Examples of this are: + See if a Compute snapshot of a disk is created + See if a string appears on the serial port of a Compute instance. But they would be a nice addition if we had a Google Compute

Re: Cloud Provider grouping into Plugins

2016-10-14 Thread Alex Van Boxel
compatibility. We have a bunch of DAGs which import all kinds of GCP hooks and operators. Wouldn't want those to move. On Fri, Oct 14, 2016 at 7:54 AM, Alex Van Boxel wrote: > Hi all, > > I'm starting to write some very exotic Operators that are a bit strange > adding to contri

Re: Cloud Provider grouping into Plugins

2016-10-14 Thread Alex Van Boxel
> :) I do notice a lot of AWS/GCP code (e.g. the S3 Redshift operator). > > > > On Fri, Oct 14, 2016 at 8:16 AM, Alex Van Boxel > wrote: > >> Well, I wouldn't touch the ones that exist (maybe we could mark them > >> deprecated, but that's all). But I woul

Re: Airflow Logging

2016-10-24 Thread Alex Van Boxel
My requirement would indeed be that I would be able to add my own logging handler (I did the same in my Luigi days), I included a python log handler that logged to Google Cloud Logging. But I also like the current logging to Cloud storage. So my ideal logging setup would be: All of Airflow (sched

Re: Next Release?

2016-10-27 Thread Alex Van Boxel
I thought that the 15 November deadline for PR was in preparation for the 1.8 release. Do you need help with the release? I'm dedicating each week some time on Airflow anyway (although it's more writing operators :-). On Thu, Oct 27, 2016 at 6:22 PM siddharth anand wrote: > I believe the release

Crazy Airflow DAG ideas

2016-11-04 Thread Alex Van Boxel
I think that I just made quite a crazy DAG. I'm wondering if people can top this one. But it's not about complexity, it's about the strangeness. *So, who can top this? (I'm quite interested in what sort of crazy DAGs are out in the wild).* First, context: we have an on-premises SQL Server (and I want

Re: Merging the experimental API Framework

2016-11-29 Thread Alex Van Boxel
Although I haven't had the time to dive deep into API (sorry Bolke) I do want to be part of the discussion. I hope to have a look at it soon. On Tue, Nov 29, 2016 at 10:43 AM Bolke de Bruin wrote: > Flask App Builder looks great at a first glance and experience counts > obviously, though I have

Airflow-GCP for Google Container Engine

2016-12-02 Thread Alex Van Boxel
Hi all, I think I pulled it off to have kind of "one-execute" install of Airflow on Google Container Engine. Personally I find this my preferred setup (because I can have a production environment and staging environment). It's what I use for our production/staging setup. I think someone else could

Re: Airflow-GCP for Google Container Engine

2016-12-05 Thread Alex Van Boxel
rces on the pods, > or you let the workers consume whatever's available on the nodes? > > > On Fri, Dec 2, 2016 at 10:14 PM, Chris Riccomini > wrote: > > > Nice, thanks! :) > > > > On Fri, Dec 2, 2016 at 4:55 AM, Alex Van Boxel wrote: > > > > >

Re: Integration test env

2016-12-19 Thread Alex Van Boxel
robably want to have ultimate > > > power on access to this environment - it’s my company’s money on the > line > > > after all. Major downside to this is that it is dependent on and > limited > > by > > > the budget I can make available. Upside is that it is not company > > property. > > > Also I personally have less exposure to public cloud environments due > to > > > company restrictions. > > > > > > Are there any other options? Any thoughts? > > > > > > Bolke > > > > > > > > > > > > > > > > > > > > > -- _/ _/ Alex Van Boxel

Re: Scheduler error - cannot allocate memory

2016-12-27 Thread Alex Van Boxel
> >self.pid = > > > os.fork() > > > > > > OSError: [Errno 12] Cannot allocate > > > memory > > > > > > Traceback (most recent call last): > > > File "/usr/local/bin/airflow", line 15, in > > > > > > > > > The dags which failed didn't show any log (they weren't stored on > > airflow > > > instance and there are no remote logs). So we don't have any idea of > what > > > happened (only that there was not enough memory to fork) > > > It is well known that it is recommended to restart the scheduler > > periodically > > > (according to this > > > <https://medium.com/handy-tech/airflow-tips-tricks-and- > > pitfalls-9ba53fba14eb#.80c6g1n1s>), > > > but... do you have any idea why this can happen? Is there something we > > can > > > do (or some bug we can fix)? > > > > > > > > > Thanks in advance! > > > -- _/ _/ Alex Van Boxel

Re: Airflow 1.8.0 Alpha 1

2017-01-02 Thread Alex Van Boxel
your feedback is required as it is entrenched in new processing code that you are running in production afaik - so I wonder what happens in your fork. Happy New Year! Bolke -- _/ _/ Alex Van Boxel

Re: Airflow 1.8.0 Alpha 1

2017-01-03 Thread Alex Van Boxel
If they should make the first alpha, maybe they should be rebased so they can be merged in. On Tue, Jan 3, 2017 at 2:39 PM Dan Davydov wrote: > I have also started on this effort, recently Alex Guziel and I have been > pushing Airbnb's custom cherries onto master to get Airbnb back

Re: Airflow 1.8.0 Alpha 1

2017-01-04 Thread Alex Van Boxel
> > > New features: > > > * Schedule all pending DAGs in a single loop > > > * Add support for backfill true/false > > > * Impersonation > > > * CGroups > > > * Add Cloud Storage updated sensor > > > > > > Alpha2 I will package

Re: Wrong DAG state after failure inside a branch

2017-01-04 Thread Alex Van Boxel
.set_upstream(t2) > > process2 = PythonOperator( > task_id='process2', > python_callable=lambda: True, > dag=dag) > process2.set_upstream(process1) > > process3 = PythonOperator( > task_id='process3', > python_callable=lambda: True, > dag=dag) > process3.set_upstream(process2) > > > At moment, I want my privacy to be protected. > https://mytemp.email/ > -- _/ _/ Alex Van Boxel

Re: Airflow 1.8.0 alpha 2

2017-01-04 Thread Alex Van Boxel
s not working properly), confirmed regression > > * celery instability Alex > > * Wrong DAG state after failure inside branch > > > > So Alpha 2 is definitely not ready for production, but please do put in > > your canary dags and let them run. I am still quite

Trigger behaviour with skipped upstream tasks (request for opinions)

2017-01-08 Thread Alex Van Boxel
r should expect the skipped state to propagate. Thinking about it, the trigger rule all_success in a joining task is not logical. We could almost raise when/if detecting it. *Reply from Alex:* My understanding was that skipped == success, because when a skipped reaches the end the DAG is marked a

Re: Airflow Release Planning and Supported Release Lifetime

2017-01-08 Thread Alex Van Boxel
This looks good, except do we need a release manager that applies patches? On Sun, Jan 8, 2017, 14:36 Bolke de Bruin wrote: > Hi All, > > As part of the release process I have created "Airflow Release Planning > and Supported Release Lifetime” ( > https://cwiki.apache.org/confluence/display/AIRF

Re: Airflow Release Planning and Supported Release Lifetime

2017-01-08 Thread Alex Van Boxel
to me makes sense, but maybe another option is > better? > > Bolke > > > Sent from my iPhone > > > On 8 Jan 2017, at 18:27, Alex Van Boxel wrote: > > > > This looks good, except do we need a release manager that applies > patches? > > > >> On Sun

Re: Refactoring Connection

2017-01-09 Thread Alex Van Boxel
ut > requires more work every time you create a new ConnectionType. > > Hope this proposal is clear enough, and I'm waiting for feedback and > possible improvements. > > Regards > Gael Magnan de Bornier > -- _/ _/ Alex Van Boxel

Re: Refactoring Connection

2017-01-09 Thread Alex Van Boxel
coded connection type, but I thought this might be > something that comes back regularly and having a simple way to plug in new > types of connection would make it easier for anyone to contribute a new > connection type. > > Hope this clarifies my proposal. > > On Mon, 9 Jan 2

Re: Refactoring Connection

2017-01-09 Thread Alex Van Boxel
e cases. > > On Mon, 9 Jan 2017 at 13:36, Alex Van Boxel wrote: > > > Thanks a lot, yes it clarifies a lot and I do agree you really need to > hack > > inside Airflow to add a Connection type. While you're working at this > could > > you have a look at t

Re: Airflow 1.8.0 alpha 4

2017-01-11 Thread Alex Van Boxel
entation still required, > integration tests seem to pass flawlessly > > * Cgroups + impersonation: clean up of patches on going, more tests and > more elaborate documentation required. Integration tests not executed yet > > * Schedule all pending DAG runs in a single scheduler loop: no progress > (**) > > > > Cheers! > > Bolke > > -- _/ _/ Alex Van Boxel

Re: Airflow 1.8.0 alpha 5

2017-01-14 Thread Alex Van Boxel
t log location for Childs correctly > * fix systems scripts > * daemonizing of webserver > > Almost there! > Bolke -- _/ _/ Alex Van Boxel

Re: Airflow 1.8.0 BETA 1

2017-01-16 Thread Alex Van Boxel
* Better support for sub second scheduling > * Rolling restart of web workers > * nvd3.js instead of highcharts > * New dependency engine making debugging why my task is running easier > * Many UI updates > * Many new operators > * Many, many, many bugfixes > > RELEASE PLANNING > > Beta 2: 20 Jan > Beta 3: 25 Jan > RC1: 2 Feb > > Cheers > Bolke > > > > -- _/ _/ Alex Van Boxel

Re: Airflow 1.8.0 BETA 1

2017-01-18 Thread Alex Van Boxel
Hey Max, As I'm missing the 1.7.2 labels I compared to the 172 branch. Can you have a look at PR 2000? It's also sanitised, removing some of the commits that don't bring value to the users. On Wed, Jan 18, 2017, 02:51 Maxime Beauchemin wrote: > Alex, for the CHANGELOG.md, I

Re: Experiences with 1.8.0

2017-01-20 Thread Alex Van Boxel
>> - Checking was logged to INFO -> requires a fsync for every log message > >> making it very slow > >> - Checking would happen at every restart, but dag_runs’ states were not > >> being updated > >> - These dag_runs would never be marked anything else than running for > >> some reason > >> -> Applied work around to update all dag_run in sql before a certain > date > >> to -> finished > >> -> need to investigate why dag_runs did not get marked “finished/failed” > >> > >> 5. Our umask is set to 027 > >> > >> > > -- _/ _/ Alex Van Boxel

Re: Airflow 1.8.0 BETA 2

2017-01-20 Thread Alex Van Boxel
t?). > > >> > > >> I would like to encourage everyone to try it out, to report back any > > >> issues so we get to a rock solid release of 1.8.0. When reporting > > issues a > > >> test case or even a fix is highly appreciated. > > >> > > >> Cgroups+impersonation are now in. This means we are feature complete > for > > >> 1.8.0! > > >> > > >> Cheers > > >> Bolke > > > -- _/ _/ Alex Van Boxel

Medium series: Airflow for Google Cloud

2017-01-20 Thread Alex Van Boxel
e will be about DataProc. -- _/ _/ Alex Van Boxel

Re: Airflow 1.8.0 BETA 3

2017-01-25 Thread Alex Van Boxel
to the driver > * Poison pill taken while task has exited > * Keep cgroups optional > * Funcsigs pinned to 1.0.0 > > Issue(s) remaining (blocker for RC): > * Cgroups not py3 compatible > > If all goes well we should have a Release Candidate on Feb 2. Thanks for > reporting issues and keep on testing please :). Moving towards RC I tend to > like small bug fixes only. When we mark RC (do we need to vote on this?) > the procedure becomes even more strict. Please remember that the FINAL > release is dependent on a vote on the IPMC mailinglist. > > Cheers > Bolke -- _/ _/ Alex Van Boxel

Re: Airflow 1.8.0 BETA 3

2017-01-25 Thread Alex Van Boxel
4 AM Bolke de Bruin wrote: > Mmm that is due to the reverting of one changes to the db. Need to look > into that how to fix it. > > Sent from my iPhone > > > On 26 Jan 2017, at 00:51, Alex Van Boxel wrote: > > > > I do seem to have a problem upgrading the MySQL da

Re: Airflow 1.8.0 BETA 3

2017-01-25 Thread Alex Van Boxel
The downgrade is ok for testing, but we can't release with this change (can't expect people to install a beta first). On Thu, Jan 26, 2017 at 8:36 AM Alex Van Boxel wrote: > Another thing that I noticed (but observed it in beta 2 as well). Is the > following: > > - The fo

Re: Airflow 1.8.0 BETA 3

2017-01-26 Thread Alex Van Boxel
> > On 26 Jan 2017, at 08:36, Alex Van Boxel wrote: > > > > Another thing that I noticed (but observed it in beta 2 as well). Is the > > following: > > > > - The following trigger should not fire. > > --- Trigger rule is ONE_SUCCESS > > --- upstream: U

Re: Airflow 1.8.0 BETA 3

2017-01-26 Thread Alex Van Boxel
About the database error: starting from scratch also gives the same error. Fresh install, delete the airflow.db SQLite db, then run airflow initdb: same error as above. On Thu, Jan 26, 2017 at 10:12 AM Alex Van Boxel wrote: > Not directly one I can share. I'll spend some time looking at

Re: Airflow 1.8.0 BETA 3

2017-01-26 Thread Alex Van Boxel
a5a9e6bf2b5' consistent as my cluster setup (that one is built in the docker container) and this one on my Mac (both give the same error) On Thu, Jan 26, 2017 at 11:40 AM Bolke de Bruin wrote: > Hi Alex > > Are you sure you are in a fully clean environment? Sometimes things remain >

Re: Airflow 1.8.0 BETA 4

2017-01-27 Thread Alex Van Boxel
I'm also abandoning the following problem: * DAG marked success, with half of the Tasks never scheduled (Alex) I *can't simulate it locally*... I'll keep an eye on the production runs though. On Fri, Jan 27, 2017 at 8:45 PM Bolke de Bruin wrote: > 5.6.4 is the minimum. Y

Re: Airflow 1.8.0 BETA 5

2017-01-30 Thread Alex Van Boxel
> >> * Scheduler would terminate immediately if no dag files present > >> > >> ** As this touches the scheduler logic I thought it warranted another > beta. > >> > >> This should be the last beta in my opinion and we can prepare changelog, > >> upgrade notes and release notes for the RC (Feb 2). > >> > >> Cheers > >> Bolke > -- _/ _/ Alex Van Boxel

Re: Airflow 1.8.0 BETA 5

2017-01-30 Thread Alex Van Boxel
ine 264, in process_file m = imp.load_source(mod_name, filepath) File "/home/airflow/dags/marketing_segmentation.py", line 17, in import bqschema ImportError: No module named bqschema *I don't think this is incorrect?!* On Mon, Jan 30, 2017 at 11:46 PM Dan Davydov wrote:
