Re: Catchup By default = False vs LatestOnlyOperator

2018-07-21 Thread Ben Tallman
As the author of catch-up: the idea is that in many cases your data doesn't 
"window" nicely, and you instead just want the DAG to run as if it were a 
brilliant Cron...

Ben

Sent from my iPhone

> On Jul 20, 2018, at 11:39 PM, Shah Altaf wrote:
> 
> Hi, my understanding is: if you use the LatestOnlyOperator, then when you run
> the DAG for the first time you'll see a whole bunch of DAG runs queued up,
> and in each run the LatestOnlyOperator will cause the rest of the DAG run
> to be skipped. Only the latest DAG run will run in 'full'.
> 
> With catchup = False, you should get just the latest DAG run.
> 
> 
> On Fri, Jul 20, 2018 at 10:58 PM Shubham Gupta wrote:
> 
>> ---------- Forwarded message ---------
>> From: Shubham Gupta 
>> Date: Fri, Jul 20, 2018 at 2:38 PM
>> Subject: Catchup By default = False vs LatestOnlyOperator
>> To: 
>> 
>> 
>> Hi!
>> 
>> Can someone please explain the difference between catchup_by_default = False
>> and LatestOnlyOperator?
>> 
>> Regards,
>> Shubham Gupta
>> 
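
The difference discussed in this thread can be sketched with a small model.
This is a simplified illustration in plain Python, not Airflow's scheduler
code: the function names `scheduled_runs` and `is_latest` are hypothetical,
and the model assumes a fixed schedule interval and that a run is created
only once its interval has ended.

```python
from datetime import datetime, timedelta


def scheduled_runs(start, interval, now, catchup):
    """Return the execution dates a simplified scheduler would create.

    Hypothetical model: with catchup=True every completed interval since
    `start` gets a run; with catchup=False only the most recent one does.
    """
    runs = []
    execution_date = start
    # A run covers [execution_date, execution_date + interval) and is
    # created only after that interval has ended.
    while execution_date + interval <= now:
        runs.append(execution_date)
        execution_date += interval
    return runs if catchup else runs[-1:]


def is_latest(execution_date, interval, now):
    """Model of a latest-only check: is `now` inside the window that
    immediately follows this run's interval?"""
    window_start = execution_date + interval
    window_end = window_start + interval
    return window_start < now <= window_end


if __name__ == "__main__":
    start = datetime(2018, 7, 16)
    interval = timedelta(days=1)
    now = datetime(2018, 7, 21, 12, 0)

    # catchup=True plus a latest-only check: every run exists, but only the
    # latest one lets its downstream tasks execute.
    for run in scheduled_runs(start, interval, now, catchup=True):
        status = "full run" if is_latest(run, interval, now) else "downstream skipped"
        print(run.date(), status)

    # catchup=False: the older runs are never created at all.
    print(scheduled_runs(start, interval, now, catchup=False))
```

In this model both approaches execute the same work. The difference is that
the latest-only approach still creates (and then mostly skips) the historical
DAG runs, while catchup = False never creates them.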


Re: [VOTE] Release Airflow 1.10.0

2018-07-21 Thread Bolke de Bruin
Hi Justin,

Thank you for the thorough review! I have created AIRFLOW-2779 to track most of 
the issues you have raised. 

On the GPL dependency you mentioned: we are not distributing GPL sources, 
neither in source nor in binary form. This has never been the case. At the 
third degree (a dependency of our dependencies) there potentially was a GPL 
issue at runtime. The author of the package in question (unidecode), when 
asked, mentioned several times that he considered the usage equal to an API 
(i.e. like the Linux kernel exposing a set of generic calls) and that the API 
could be implemented by an alternative. This was discussed in LEGAL-362, which 
you took part in.

We managed to convince the upstream package maintainers (python-slugify and 
python-nvd3) to accept a patch that allows switching to a different API 
implementation by setting an environment variable while installing their 
packages, and to release new versions. However, it is not the default for them. 
This means at least that the situation we are now in is an improvement over the 
previous releases (1.8.0 -> 1.8.1 -> 1.8.2 -> 1.9.0), as there was no way to 
switch and avoid the package before.

As to our solution (for now): Python packages are often installed site-wide and 
can be part of the dependencies of other packages. While we could perhaps 
enforce the installation of the non-GPL API, 1) it could interfere with other 
packages on the same system that do not set this environment variable 
explicitly, and 2) if any of the other packages upgrades without setting this 
variable, it would pull in the GPL API. So we decided that it would be better to 
educate the user and make it part of the install instructions.
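
To make the opt-out behaviour and issue 2) concrete, here is a minimal sketch 
in plain Python. It is a model only, not the upstream packages' setup code: 
the variable name `SLUGIFY_USES_TEXT_UNIDECODE`, the package names, and both 
function names are assumptions made for illustration and are not quoted from 
this thread.

```python
# Assumed name of the install-time switch; not stated in this thread.
OPT_OUT_VARIABLE = "SLUGIFY_USES_TEXT_UNIDECODE"


def transliteration_dependency(environ):
    """Pick the package a simplified upstream installer would require.

    Opt-out model: the GPL implementation is the default, and the
    alternative is chosen only when the variable is set.
    """
    if OPT_OUT_VARIABLE in environ:
        return "text-unidecode"  # assumed name of the non-GPL alternative
    return "Unidecode"  # GPL-licensed default


def installed_site_wide(environ_by_installer):
    """Return every transliteration package present after several
    independent installs into the same site-wide environment."""
    return {
        transliteration_dependency(environ)
        for environ in environ_by_installer.values()
    }


if __name__ == "__main__":
    installs = {
        # The Airflow user followed the install instructions.
        "airflow": {OPT_OUT_VARIABLE: "yes"},
        # Another package on the same system was upgraded without the variable.
        "some-other-package": {},
    }
    print(sorted(installed_site_wide(installs)))
```

In this model the GPL package still ends up on the system as soon as any one 
installer omits the variable, which is the limitation described in 2).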

We can reconsider, but we cannot solve #1 and #2, and enforcing it would, in 
my opinion, make the behavior more opaque to the users. 

Given that the current situation is at least an improvement over the old 
situation, can you reconsider your -1 for this release and preferably agree 
with our approach (or maybe suggest an improvement over it)?

Cheers
Bolke



> On 21 Jul 2018, at 03:03, Justin Mclean wrote:
> 
> Hi,
> 
> -1 (binding) because of GPL dependency
> 
> I checked the source release:
> - incubating in name
> - signatures and hash good but please remove md5 hashes and don’t publish them
> - DISCLAIMER exists
> - Year in NOTICE is not correct: “2016 and onwards” isn’t valid as copyright 
> has an expiry date
> - NOTICE and LICENSE have a couple of minor issues (see below)
> - Several files look to have incorrect headers with copyright lines 
> [8][9][10]. Are these actually 3rd party files?
> - No unexpected binary files
> - Failed to install, probably my set up. Would be nice to note python version 
> required and supported OS’s in INSTALL.
> 
> LICENSE is:
> - missing jQuery clock [3] and typeahead [4], as they are ALv2 it’s not 
> required to list them but it’s a good idea to do so.
> - missing the license for this [5]
> - this file [7] oddly has © 2016 GitHub, Inc. at the bottom of it
> 
> These files [1][2] seem to be 3rd party ALv2 licensed files that refer to a 
> NOTICE file; the information in that NOTICE file (at the very least the 
> copyright info) should be in your NOTICE file. This should also be noted in 
> LICENSE.
> 
> I also find it very odd that the GPL dependency unidecode is opt out, rather 
> than opt in (i.e. the user has to do something to not get it), and that makes 
> it non optional IMO [6].  Can you explain why it was done this way and I’ll 
> consider changing my vote.
> 
> Thanks,
> Justin
> 
> 1. /airflow/security/utils.py
> 2. ./airflow/security/kerberos.py
> 3. ./airflow/www_rbac/static/jqClock.min.js
> 4. ./airflow/www/static/bootstrap3-typeahead.min.js
> 5. ./apache-airflow-1.10.0rc2+incubating/scripts/ci/flake8_diff.sh
> 6. https://www.apache.org/legal/resolved.html#optional
> 7. ./docs/license.rst
> 8. airflow/contrib/auth/backends/google_auth.py
> 9. /airflow/contrib/auth/backends/github_enterprise_auth.py
> 10. /airflow/contrib/hooks/ssh_hook.py
> 11. /airflow/minihivecluster.py
> 



Re: Sep Airflow Bay Area Meetup @ Google

2018-07-21 Thread Feng Lu
Sounds great, thank you Ben.
When you get a chance, could you please send me your talk
title/abstract/session type (regular or lightning)?

On Fri, Jul 20, 2018 at 2:10 PM Ben Gregory wrote:

> Hey Feng!
>
> Awesome to hear that you're hosting the next meetup! We'd love to give a
> talk (and potentially a lightning session if available) -- we have a number
> of topics we could speak on but off the top of our heads we're thinking
> "Running Cloud Native Airflow", tying in some of our work on the Kubernetes
> Executor. How does that sound?
>
> Also, if there ends up being an Airflow hackathon, you can absolutely
> count us in. Let us know how we can help coordinate if the need presents
> itself!
>
> -Ben
>
> On Thu, Jul 19, 2018 at 3:26 PM Feng Lu wrote:
>
>> Hi all,
>>
>> Hope you are enjoying your summer!
>>
>> This is Feng Lu from Google, and we'll host the next Airflow meetup at our
>> Sunnyvale campus. We plan to add a *lightning session* this time for people
>> to share their Airflow ideas, work in progress, pain points, etc.
>> Here's the meetup date and schedule:
>> Here's the meetup date and schedule:
>>
>> -- Sep 24 (Monday)  --
>> 6:00PM meetup starts
>> 6:00 - 8:00PM light dinner /mix-n-mingle
>> 8:00PM - 9:40PM: 5 sessions (20 minutes each)
>> 9:40PM to 10:10PM: 6 lightning sessions (5 minutes each)
>> 10:10PM to 11:00PM: drinks and social hour
>>
>> I've seen a lot of interesting discussions in the dev mailing-list on
>> security, scalability, event interactions, future directions, hosting
>> platform and others. Please feel free to send your talk proposal to us by
>> replying to this email.
>>
>> The Cloud Composer team is also going to share their experience running
>> Apache Airflow as a managed solution and service roadmap.
>>
>> Thank you and looking forward to hearing from y'all soon!
>>
>> p.s., if folks are interested, we can also add a one-day Airflow hackathon
>> prior to the meet-up on the same day, please let us know.
>>
>> Feng
>>
>
>