[jira] [Updated] (AIRFLOW-4401) multiprocessing.Queue.empty() is unreliable

2019-04-28 Thread Jarek Potiuk (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-4401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk updated AIRFLOW-4401:
--
Description: 
After discussing with [~ash] and [~BasPH] potential reasons for flakiness of 
LocalExecutor tests, I took a deeper dive into the problem and what I found 
raised the remaining hair on top of my head. 

We had a number of flaky tests in the local executor that resulted in 
result_queue not being empty where it should have been emptied a moment before. 
More details and discussion can be found in 
[https://github.com/apache/airflow/pull/5159] 

The problem turned out to be the unreliability of the multiprocessing.Queue 
empty() implementation. empty() is not fully synchronized: it might return True 
even if a put() operation has already completed in another process. What's 
more, empty() might return True even if qsize() of the queue returns > 0 (!) 

It's a bit mind-boggling, but it is "as intended", as documented in 
[https://bugs.python.org/issue23582] (resolved as "not a bug"), and it is 
described in 
[https://docs.python.org/3/library/multiprocessing.html#pipes-and-queues] when 
you go into the details of how data is synchronized between processes.
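
As a rough illustration (this is not code from Airflow), one way to avoid 
depending on empty() is to drain the queue with a blocking get() and a timeout. 
The helper name drain and the 0.5 second default timeout are assumptions made 
for this sketch only:

```python
import multiprocessing
import queue


def drain(q, timeout=0.5):
    """Collect items until no new item arrives within `timeout` seconds."""
    items = []
    while True:
        try:
            # get() waits for data to arrive through the underlying pipe,
            # so it does not depend on the momentary answer of empty().
            items.append(q.get(timeout=timeout))
        except queue.Empty:
            return items
```

The trade-off is that every drain costs at least one timeout period, which may 
or may not be acceptable in a tight scheduler loop.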

A few people have stumbled upon this problem. For example 
[https://github.com/vterron/lemon/commit/9ca6b4b1212228dbd4f69b88aaf88b12952d7d6f]
 and [https://github.com/keras-team/autokeras/issues/368] 

We also seem to have experienced this in Airflow before. In jobs.py, years ago 
(31.07.2016), we can see the comment below (but we used multiprocessing.Queue 
empty() nevertheless):
{code:java}
# Not using multiprocessing.Queue() since it's no longer a separate
# process and due to some unusual behavior. (empty() incorrectly
# returns true?){code}
The solution available in [https://bugs.python.org/issue23582] using qsize() 
worked on Linux but is not really acceptable, because qsize() does not work 
on macOS (it raises NotImplementedError).
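
To make the portability problem concrete, here is a small sketch of what a 
qsize()-based check would have to guard against. The helper name qsize_or_none 
is hypothetical, and returning None is just one possible way to signal "not 
available on this platform":

```python
import multiprocessing


def qsize_or_none(q):
    """Return q.qsize(), or None where the platform does not implement it."""
    try:
        return q.qsize()
    except NotImplementedError:
        # Raised on platforms without sem_getvalue(), such as macOS.
        return None
```

Any caller then needs a second code path for the None case, which is why this 
approach does not really solve the problem.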

 

The proposed solution 1): 

Implement a more reliable queue (SynchronizedQueue) based on 
[https://github.com/vterron/lemon/commit/9ca6b4b1212228dbd4f69b88aaf88b12952d7d6f]
 (but we have to adjust the initialisation to match both 2.7 and 3.5+ syntax, 
since we want to backport to the stable v1.10 release).

We should replace all usages of multiprocessing.Queue where empty() is used 
with the SynchronizedQueue, and make sure we do not use multiprocessing.Queue 
in a similar way in the future.
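
A minimal, Python 3 only sketch of the general idea follows: a Queue subclass 
that tracks its size in a shared counter, so that empty() no longer depends on 
the pipe state. This is neither the code from the linked commit nor the Airflow 
implementation; the class name CountingQueue and the _count attribute are 
assumptions for illustration, and the 2.7 compatibility mentioned above is not 
handled:

```python
import multiprocessing
import multiprocessing.queues


class CountingQueue(multiprocessing.queues.Queue):
    """Hypothetical queue whose empty() is backed by a shared counter."""

    def __init__(self, maxsize=0):
        # On Python 3.4+ the base class requires an explicit context.
        ctx = multiprocessing.get_context()
        super().__init__(maxsize, ctx=ctx)
        self._count = ctx.Value("i", 0)

    def __getstate__(self):
        # Ship the counter along when the queue is pickled for a child process.
        return super().__getstate__() + (self._count,)

    def __setstate__(self, state):
        super().__setstate__(state[:-1])
        self._count = state[-1]

    def put(self, *args, **kwargs):
        # Count the item first so empty() is False as soon as put() starts.
        with self._count.get_lock():
            self._count.value += 1
        try:
            super().put(*args, **kwargs)
        except Exception:
            # Undo the increment if the item was not accepted (e.g. queue.Full).
            with self._count.get_lock():
                self._count.value -= 1
            raise

    def get(self, *args, **kwargs):
        item = super().get(*args, **kwargs)
        with self._count.get_lock():
            self._count.value -= 1
        return item

    def empty(self):
        return self._count.value == 0
```

With this design empty() can report False slightly before the item is 
readable, so a consumer should follow it with a blocking get() rather than 
get_nowait().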

The proposed solution 2):

It seems that this unreliable behaviour of Queue only happens if the Queue is 
instantiated directly, and the small delays between processes are gone when a 
shared Manager is used (because then the Queue is a proxy to a central queue 
running in a separate process, so synchronisation is implemented in this 
single queue): 
[https://docs.python.org/3/library/multiprocessing.html#pipes-and-queues]. 
Switching to a managed queue should solve the problem as well.
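
A small sketch of the managed variant, under the assumption that a single 
consumer reads the queue. The function name roundtrip_through_manager and its 
ctx parameter are hypothetical and exist only to keep the example 
self-contained:

```python
import multiprocessing


def roundtrip_through_manager(items, ctx=None):
    """Put items on a manager-backed queue and read them back."""
    ctx = ctx or multiprocessing.get_context()
    with ctx.Manager() as manager:
        # manager.Queue() returns a proxy; the real queue lives in the
        # manager's server process.
        q = manager.Queue()
        for item in items:
            q.put(item)
        # Each proxy call returns only after the server has handled it,
        # so empty() already reflects the puts above.
        was_empty_after_put = q.empty()
        received = []
        while not q.empty():
            received.append(q.get())
        return was_empty_after_put, received
```

The cost is one extra server process and a round trip per queue operation, and 
the manager has to be shut down explicitly (here via the with block).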

 

 


[GitHub] [airflow] benbenbang commented on issue #5137: [AIRFLOW-4363] Add try-catch for retrieving `status` from cli in docker operator

2019-04-28 Thread GitBox
benbenbang commented on issue #5137: [AIRFLOW-4363] Add try-catch for 
retrieving `status` from cli in docker operator
URL: https://github.com/apache/airflow/pull/5137#issuecomment-487452686
 
 
   All done, thanks. 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] potiuk commented on issue #5201: [Airlfow-XXXX] Add docs for Google Cloud Storage Sensors

2019-04-28 Thread GitBox
potiuk commented on issue #5201: [Airlfow-XXXX] Add docs for Google Cloud 
Storage Sensors
URL: https://github.com/apache/airflow/pull/5201#issuecomment-487449971
 
 
   Just the last commit is needed since the sensor is already merged. Can you 
please rebase just the last one on top of master?




[GitHub] [airflow] jmcarp commented on issue #5196: [AIRFLOW-4423] Improve date handling in mysql to gcs operator.

2019-04-28 Thread GitBox
jmcarp commented on issue #5196: [AIRFLOW-4423] Improve date handling in mysql 
to gcs operator.
URL: https://github.com/apache/airflow/pull/5196#issuecomment-487444578
 
 
   @feluelle: I updated the tests and docs like you suggested.




[GitHub] [airflow] zhongjiajie commented on issue #4890: [AIRFLOW-4048] HttpSensor provide-context to response_check

2019-04-28 Thread GitBox
zhongjiajie commented on issue #4890: [AIRFLOW-4048] HttpSensor provide-context 
to response_check
URL: https://github.com/apache/airflow/pull/4890#issuecomment-487434271
 
 
   @raphaelauv BTW, we have already removed `py2` from our CI, and your failed 
test is in the `py2.7` part. Maybe we should check the CI log after you rebase 
on master.




[GitHub] [airflow] zhongjiajie commented on issue #4890: [AIRFLOW-4048] HttpSensor provide-context to response_check

2019-04-28 Thread GitBox
zhongjiajie commented on issue #4890: [AIRFLOW-4048] HttpSensor provide-context 
to response_check
URL: https://github.com/apache/airflow/pull/4890#issuecomment-487434079
 
 
   @raphaelauv You could use `git push -f` to re-submit the commit, and the 
failed CI will restart.




[GitHub] [airflow] TV4Fun commented on issue #5169: [AIRFLOW-3143] Support Auto-Zone in DataprocClusterCreateOperator

2019-04-28 Thread GitBox
TV4Fun commented on issue #5169: [AIRFLOW-3143] Support Auto-Zone in 
DataprocClusterCreateOperator
URL: https://github.com/apache/airflow/pull/5169#issuecomment-487424425
 
 
   @fenglu-g @potiuk




[GitHub] [airflow] codecov-io edited a comment on issue #5201: [Airlfow-XXXX] Add docs for Google Cloud Storage Sensors

2019-04-28 Thread GitBox
codecov-io edited a comment on issue #5201: [Airlfow-XXXX] Add docs for Google 
Cloud Storage Sensors
URL: https://github.com/apache/airflow/pull/5201#issuecomment-487417345
 
 
   # [Codecov](https://codecov.io/gh/apache/airflow/pull/5201?src=pr=h1) 
Report
   > Merging 
[#5201](https://codecov.io/gh/apache/airflow/pull/5201?src=pr=desc) into 
[master](https://codecov.io/gh/apache/airflow/commit/9daad7ecd14fabfc3f441074670f4b38aa93e8b0?src=pr=desc)
 will **decrease** coverage by `<.01%`.
   > The diff coverage is `93.61%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/airflow/pull/5201/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/airflow/pull/5201?src=pr=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #5201      +/-   ##
   ==========================================
   - Coverage   78.54%   78.54%   -0.01%
   ==========================================
     Files         469      469
     Lines       29896    29896
   ==========================================
   - Hits        23483    23482       -1
   - Misses       6413     6414       +1
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/airflow/pull/5201?src=pr=tree) | Coverage Δ | |
   |---|---|---|
   | [airflow/contrib/sensors/gcs\_sensor.py](https://codecov.io/gh/apache/airflow/pull/5201/diff?src=pr=tree#diff-YWlyZmxvdy9jb250cmliL3NlbnNvcnMvZ2NzX3NlbnNvci5weQ==) | `69.47% <93.61%> (ø)` | :arrow_up: |
   | [airflow/models/taskinstance.py](https://codecov.io/gh/apache/airflow/pull/5201/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMvdGFza2luc3RhbmNlLnB5) | `92.42% <0%> (-0.18%)` | :arrow_down: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/airflow/pull/5201?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/airflow/pull/5201?src=pr=footer). 
Last update 
[9daad7e...84b45b8](https://codecov.io/gh/apache/airflow/pull/5201?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   



[GitHub] [airflow] jaketf commented on issue #5201: [Airlfow-XXXX] Add docs for Google Cloud Storage Sensors

2019-04-28 Thread GitBox
jaketf commented on issue #5201: [Airlfow-XXXX] Add docs for Google Cloud 
Storage Sensors
URL: https://github.com/apache/airflow/pull/5201#issuecomment-487414475
 
 
   @potiuk @OmerJog hopefully this addresses the documentation gap in 
`integrations.rst`




[GitHub] [airflow] jaketf opened a new pull request #5201: [Airlfow-XXXX] Add docs for Google Cloud Storage Sensors

2019-04-28 Thread GitBox
jaketf opened a new pull request #5201: [Airlfow-XXXX] Add docs for Google 
Cloud Storage Sensors
URL: https://github.com/apache/airflow/pull/5201
 
 
   This PR adds documentation for the Google Cloud Storage sensors to 
`integrations.rst`




[GitHub] [airflow] andriisoldatenko commented on a change in pull request #5048: [AIRFLOW-3370] Add stdout output options to Elasticsearch task log handler

2019-04-28 Thread GitBox
andriisoldatenko commented on a change in pull request #5048: [AIRFLOW-3370] 
Add stdout output options to Elasticsearch task log handler
URL: https://github.com/apache/airflow/pull/5048#discussion_r279211175
 
 

 ##
 File path: airflow/utils/log/es_task_handler.py
 ##
 @@ -119,7 +146,10 @@ def _read(self, ti, try_number, metadata=None):
 if offset != next_offset or 'last_log_timestamp' not in metadata:
 metadata['last_log_timestamp'] = str(cur_ts)
 
-message = '\n'.join([log.message for log in logs])
+# If we hit the end of the log, remove the actual end_of_log message
 
 Review comment:
   @KevinYang21 can you please demonstrate this with an example?




[GitHub] [airflow] jaketf commented on issue #5166: [AIRFLOW-4397] Add GCSUploadSessionCompleteSensor

2019-04-28 Thread GitBox
jaketf commented on issue #5166: [AIRFLOW-4397] Add 
GCSUploadSessionCompleteSensor
URL: https://github.com/apache/airflow/pull/5166#issuecomment-487412971
 
 
   @potiuk @OmerJog Sure, I'll add a PR adding the docs for the GCS sensors. 
It looks like I inherited the lack of docs from the other GCS sensors too :)




[GitHub] [airflow] codecov-io edited a comment on issue #5196: [AIRFLOW-4423] Improve date handling in mysql to gcs operator.

2019-04-28 Thread GitBox
codecov-io edited a comment on issue #5196: [AIRFLOW-4423] Improve date 
handling in mysql to gcs operator.
URL: https://github.com/apache/airflow/pull/5196#issuecomment-487343692
 
 
   # [Codecov](https://codecov.io/gh/apache/airflow/pull/5196?src=pr=h1) 
Report
   > Merging 
[#5196](https://codecov.io/gh/apache/airflow/pull/5196?src=pr=desc) into 
[master](https://codecov.io/gh/apache/airflow/commit/9daad7ecd14fabfc3f441074670f4b38aa93e8b0?src=pr=desc)
 will **increase** coverage by `<.01%`.
   > The diff coverage is `95.23%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/airflow/pull/5196/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/airflow/pull/5196?src=pr=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #5196      +/-   ##
   ==========================================
   + Coverage   78.54%   78.55%   +<.01%
   ==========================================
     Files         469      469
     Lines       29896    29898       +2
   ==========================================
   + Hits        23483    23486       +3
   + Misses       6413     6412       -1
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/airflow/pull/5196?src=pr=tree) | Coverage Δ | |
   |---|---|---|
   | [airflow/contrib/operators/mysql\_to\_gcs.py](https://codecov.io/gh/apache/airflow/pull/5196/diff?src=pr=tree#diff-YWlyZmxvdy9jb250cmliL29wZXJhdG9ycy9teXNxbF90b19nY3MucHk=) | `92.59% <95.23%> (+1.61%)` | :arrow_up: |
   | [airflow/models/taskinstance.py](https://codecov.io/gh/apache/airflow/pull/5196/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMvdGFza2luc3RhbmNlLnB5) | `92.42% <0%> (-0.18%)` | :arrow_down: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/airflow/pull/5196?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/airflow/pull/5196?src=pr=footer). 
Last update 
[9daad7e...cadae0a](https://codecov.io/gh/apache/airflow/pull/5196?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   




[jira] [Commented] (AIRFLOW-4401) multiprocessing.Queue.empty() is unreliable

2019-04-28 Thread Jarek Potiuk (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-4401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16828780#comment-16828780
 ] 

Jarek Potiuk commented on AIRFLOW-4401:
---

I also had to add some shutdowns to clean up the extra processes running in 5200.

> multiprocessing.Queue.empty() is unreliable
> ---
>
> Key: AIRFLOW-4401
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4401
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Jarek Potiuk
>Priority: Major
> Fix For: 1.10.4
>
>
> After discussing with [~ash] and [~BasPH] potential reasons for flakiness of 
> LocalExecutor tests, I took a deeper dive into the problem and what I found 
> raised the remaining hair on top of my head. 
> We had a number of flaky tests in the local executor that resulted in 
> result_queue not being empty where it should have been emptied a moment 
> before. More details and discussion can be found in 
> [https://github.com/apache/airflow/pull/5159] 
> The problem turned out to be the unreliability of the multiprocessing.Queue 
> empty() implementation. empty() is not fully synchronized: it might return 
> True even if a put() operation has already completed in another process. 
> What's more, empty() might return True even if qsize() of the queue 
> returns > 0 (!) 
> It's a bit mind-boggling, but it is "as intended", as documented in 
> [https://bugs.python.org/issue23582] (resolved as "not a bug"). 
> A few people have stumbled upon this problem. For example 
> [https://github.com/vterron/lemon/commit/9ca6b4b1212228dbd4f69b88aaf88b12952d7d6f]
>  and [https://github.com/keras-team/autokeras/issues/368] 
> We also seem to have experienced this in Airflow before. In jobs.py, years 
> ago (31.07.2016), we can see the comment below (but we used 
> multiprocessing.Queue empty() nevertheless):
> {code:java}
> # Not using multiprocessing.Queue() since it's no longer a separate
> # process and due to some unusual behavior. (empty() incorrectly
> # returns true?){code}
> The solution available in [https://bugs.python.org/issue23582] using qsize() 
> worked on Linux but is not really acceptable, because qsize() does not work 
> on macOS (it raises NotImplementedError).
> The working solution is to implement a reliable queue (SynchronizedQueue) 
> based on 
> [https://github.com/vterron/lemon/commit/9ca6b4b1212228dbd4f69b88aaf88b12952d7d6f]
> (but with the twist that the __init__ of a class deriving from Queue has to 
> be changed for Python 3.4+, as described in 
> [https://stackoverflow.com/questions/24941359/ctx-parameter-in-multiprocessing-queue]).
> Luckily we are now on Python 3.5+.
> We should replace all usages of multiprocessing.Queue where empty() is used 
> with the SynchronizedQueue, and make sure we do not use multiprocessing.Queue 
> in a similar way in the future.
>  
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] [airflow] potiuk commented on issue #5199: [AIRFLOW-4401] SynchronizedQueue used where empty() is used.

2019-04-28 Thread GitBox
potiuk commented on issue #5199: [AIRFLOW-4401] SynchronizedQueue used where 
empty() is used.
URL: https://github.com/apache/airflow/pull/5199#issuecomment-487394715
 
 
   Seems that https://github.com/apache/airflow/pull/5200/ solves it in a 
simpler way. Let's wait for the build to finish, but it looks good from local 
hosting.




[GitHub] [airflow] potiuk edited a comment on issue #5199: [AIRFLOW-4401] SynchronizedQueue used where empty() is used.

2019-04-28 Thread GitBox
potiuk edited a comment on issue #5199: [AIRFLOW-4401] SynchronizedQueue used 
where empty() is used.
URL: https://github.com/apache/airflow/pull/5199#issuecomment-487394715
 
 
   Seems that https://github.com/apache/airflow/pull/5200/ solves it in a 
simpler way. Let's wait for the build to finish, but it looks good from local 
testing.




[GitHub] [airflow] codecov-io edited a comment on issue #5199: [AIRFLOW-4401] SynchronizedQueue used where empty() is used.

2019-04-28 Thread GitBox
codecov-io edited a comment on issue #5199: [AIRFLOW-4401] SynchronizedQueue 
used where empty() is used.
URL: https://github.com/apache/airflow/pull/5199#issuecomment-487392721
 
 
   # [Codecov](https://codecov.io/gh/apache/airflow/pull/5199?src=pr=h1) 
Report
   > Merging 
[#5199](https://codecov.io/gh/apache/airflow/pull/5199?src=pr=desc) into 
[master](https://codecov.io/gh/apache/airflow/commit/9daad7ecd14fabfc3f441074670f4b38aa93e8b0?src=pr=desc)
 will **increase** coverage by `<.01%`.
   > The diff coverage is `81.48%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/airflow/pull/5199/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/airflow/pull/5199?src=pr=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #5199      +/-   ##
   ==========================================
   + Coverage   78.54%   78.55%   +<.01%
   ==========================================
     Files         469      470       +1
     Lines       29896    29932      +36
   ==========================================
   + Hits        23483    23514      +31
   - Misses       6413     6418       +5
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/airflow/pull/5199?src=pr=tree) | Coverage Δ | |
   |---|---|---|
   | [airflow/executors/base\_executor.py](https://codecov.io/gh/apache/airflow/pull/5199/diff?src=pr=tree#diff-YWlyZmxvdy9leGVjdXRvcnMvYmFzZV9leGVjdXRvci5weQ==) | `95.58% <ø> (-0.07%)` | :arrow_down: |
   | [airflow/utils/dag\_processing.py](https://codecov.io/gh/apache/airflow/pull/5199/diff?src=pr=tree#diff-YWlyZmxvdy91dGlscy9kYWdfcHJvY2Vzc2luZy5weQ==) | `59.22% <100%> (+0.1%)` | :arrow_up: |
   | [airflow/contrib/executors/kubernetes\_executor.py](https://codecov.io/gh/apache/airflow/pull/5199/diff?src=pr=tree#diff-YWlyZmxvdy9jb250cmliL2V4ZWN1dG9ycy9rdWJlcm5ldGVzX2V4ZWN1dG9yLnB5) | `63.27% <100%> (ø)` | :arrow_up: |
   | [airflow/executors/local\_executor.py](https://codecov.io/gh/apache/airflow/pull/5199/diff?src=pr=tree#diff-YWlyZmxvdy9leGVjdXRvcnMvbG9jYWxfZXhlY3V0b3IucHk=) | `80.41% <40%> (-0.65%)` | :arrow_down: |
   | [airflow/jobs.py](https://codecov.io/gh/apache/airflow/pull/5199/diff?src=pr=tree#diff-YWlyZmxvdy9qb2JzLnB5) | `78.53% <50%> (+0.01%)` | :arrow_up: |
   | [airflow/utils/synchronized\_queue.py](https://codecov.io/gh/apache/airflow/pull/5199/diff?src=pr=tree#diff-YWlyZmxvdy91dGlscy9zeW5jaHJvbml6ZWRfcXVldWUucHk=) | `88.57% <88.57%> (ø)` | |
   | [airflow/models/taskinstance.py](https://codecov.io/gh/apache/airflow/pull/5199/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMvdGFza2luc3RhbmNlLnB5) | `92.42% <0%> (-0.18%)` | :arrow_down: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/airflow/pull/5199?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/airflow/pull/5199?src=pr=footer). 
Last update 
[9daad7e...462d7cd](https://codecov.io/gh/apache/airflow/pull/5199?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] codecov-io edited a comment on issue #5199: [AIRFLOW-4401] SynchronizedQueue used where empty() is used.

2019-04-28 Thread GitBox
codecov-io edited a comment on issue #5199: [AIRFLOW-4401] SynchronizedQueue 
used where empty() is used.
URL: https://github.com/apache/airflow/pull/5199#issuecomment-487392721
 
 
   # [Codecov](https://codecov.io/gh/apache/airflow/pull/5199?src=pr=h1) 
Report
   > Merging 
[#5199](https://codecov.io/gh/apache/airflow/pull/5199?src=pr=desc) into 
[master](https://codecov.io/gh/apache/airflow/commit/9daad7ecd14fabfc3f441074670f4b38aa93e8b0?src=pr=desc)
 will **increase** coverage by `<.01%`.
   > The diff coverage is `81.48%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/airflow/pull/5199/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/airflow/pull/5199?src=pr=tree)
   
   ```diff
   @@Coverage Diff @@
   ##   master#5199  +/-   ##
   ==
   + Coverage   78.54%   78.55%   +<.01% 
   ==
 Files 469  470   +1 
 Lines   2989629932  +36 
   ==
   + Hits2348323514  +31 
   - Misses   6413 6418   +5
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/airflow/pull/5199?src=pr=tree) | 
Coverage Δ | |
   |---|---|---|
   | 
[airflow/executors/base\_executor.py](https://codecov.io/gh/apache/airflow/pull/5199/diff?src=pr=tree#diff-YWlyZmxvdy9leGVjdXRvcnMvYmFzZV9leGVjdXRvci5weQ==)
 | `95.58% <ø> (-0.07%)` | :arrow_down: |
   | 
[airflow/utils/dag\_processing.py](https://codecov.io/gh/apache/airflow/pull/5199/diff?src=pr=tree#diff-YWlyZmxvdy91dGlscy9kYWdfcHJvY2Vzc2luZy5weQ==)
 | `59.22% <100%> (+0.1%)` | :arrow_up: |
   | 
[airflow/contrib/executors/kubernetes\_executor.py](https://codecov.io/gh/apache/airflow/pull/5199/diff?src=pr=tree#diff-YWlyZmxvdy9jb250cmliL2V4ZWN1dG9ycy9rdWJlcm5ldGVzX2V4ZWN1dG9yLnB5)
 | `63.27% <100%> (ø)` | :arrow_up: |
   | 
[airflow/executors/local\_executor.py](https://codecov.io/gh/apache/airflow/pull/5199/diff?src=pr=tree#diff-YWlyZmxvdy9leGVjdXRvcnMvbG9jYWxfZXhlY3V0b3IucHk=)
 | `80.41% <40%> (-0.65%)` | :arrow_down: |
   | 
[airflow/jobs.py](https://codecov.io/gh/apache/airflow/pull/5199/diff?src=pr=tree#diff-YWlyZmxvdy9qb2JzLnB5)
 | `78.53% <50%> (+0.01%)` | :arrow_up: |
   | 
[airflow/utils/synchronized\_queue.py](https://codecov.io/gh/apache/airflow/pull/5199/diff?src=pr=tree#diff-YWlyZmxvdy91dGlscy9zeW5jaHJvbml6ZWRfcXVldWUucHk=)
 | `88.57% <88.57%> (ø)` | |
   | 
[airflow/models/taskinstance.py](https://codecov.io/gh/apache/airflow/pull/5199/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMvdGFza2luc3RhbmNlLnB5)
 | `92.42% <0%> (-0.18%)` | :arrow_down: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/airflow/pull/5199?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/airflow/pull/5199?src=pr=footer). 
Last update 
[9daad7e...462d7cd](https://codecov.io/gh/apache/airflow/pull/5199?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] codecov-io commented on issue #5199: [AIRFLOW-4401] SynchronizedQueue used where empty() is used.

2019-04-28 Thread GitBox
codecov-io commented on issue #5199: [AIRFLOW-4401] SynchronizedQueue used 
where empty() is used.
URL: https://github.com/apache/airflow/pull/5199#issuecomment-487392721
 
 
   # [Codecov](https://codecov.io/gh/apache/airflow/pull/5199?src=pr=h1) 
Report
   > Merging 
[#5199](https://codecov.io/gh/apache/airflow/pull/5199?src=pr=desc) into 
[master](https://codecov.io/gh/apache/airflow/commit/9daad7ecd14fabfc3f441074670f4b38aa93e8b0?src=pr=desc)
 will **increase** coverage by `<.01%`.
   > The diff coverage is `81.48%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/airflow/pull/5199/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/airflow/pull/5199?src=pr=tree)
   
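   ```diff
   @@            Coverage Diff             @@
   ##           master    #5199      +/-   ##
   ==========================================
   + Coverage   78.54%   78.55%   +<.01%     
   ==========================================
     Files         469      470       +1     
     Lines       29896    29932      +36     
   ==========================================
   + Hits        23483    23514      +31     
   - Misses       6413     6418       +5     
   ```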
   
   
   | [Impacted Files](https://codecov.io/gh/apache/airflow/pull/5199?src=pr=tree) | Coverage Δ | |
   |---|---|---|
   | [airflow/executors/base\_executor.py](https://codecov.io/gh/apache/airflow/pull/5199/diff?src=pr=tree#diff-YWlyZmxvdy9leGVjdXRvcnMvYmFzZV9leGVjdXRvci5weQ==) | `95.58% <ø> (-0.07%)` | :arrow_down: |
   | [airflow/utils/dag\_processing.py](https://codecov.io/gh/apache/airflow/pull/5199/diff?src=pr=tree#diff-YWlyZmxvdy91dGlscy9kYWdfcHJvY2Vzc2luZy5weQ==) | `59.22% <100%> (+0.1%)` | :arrow_up: |
   | [airflow/contrib/executors/kubernetes\_executor.py](https://codecov.io/gh/apache/airflow/pull/5199/diff?src=pr=tree#diff-YWlyZmxvdy9jb250cmliL2V4ZWN1dG9ycy9rdWJlcm5ldGVzX2V4ZWN1dG9yLnB5) | `63.27% <100%> (ø)` | :arrow_up: |
   | [airflow/executors/local\_executor.py](https://codecov.io/gh/apache/airflow/pull/5199/diff?src=pr=tree#diff-YWlyZmxvdy9leGVjdXRvcnMvbG9jYWxfZXhlY3V0b3IucHk=) | `80.41% <40%> (-0.65%)` | :arrow_down: |
   | [airflow/jobs.py](https://codecov.io/gh/apache/airflow/pull/5199/diff?src=pr=tree#diff-YWlyZmxvdy9qb2JzLnB5) | `78.53% <50%> (+0.01%)` | :arrow_up: |
   | [airflow/utils/synchronized\_queue.py](https://codecov.io/gh/apache/airflow/pull/5199/diff?src=pr=tree#diff-YWlyZmxvdy91dGlscy9zeW5jaHJvbml6ZWRfcXVldWUucHk=) | `88.57% <88.57%> (ø)` | |
   | [airflow/models/taskinstance.py](https://codecov.io/gh/apache/airflow/pull/5199/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMvdGFza2luc3RhbmNlLnB5) | `92.42% <0%> (-0.18%)` | :arrow_down: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/airflow/pull/5199?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/airflow/pull/5199?src=pr=footer). 
Last update 
[9daad7e...462d7cd](https://codecov.io/gh/apache/airflow/pull/5199?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   




[GitHub] [airflow] potiuk opened a new pull request #5200: [aIRFLOW-4401] Use managers for Queue synchronization

2019-04-28 Thread GitBox
potiuk opened a new pull request #5200: [aIRFLOW-4401] Use managers for Queue 
synchronization
URL: https://github.com/apache/airflow/pull/5200
 
 
   
   
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [ ] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-4401
 - In case you are fixing a typo in the documentation you can prepend your 
commit with \[AIRFLOW-XXX\], code changes always need a Jira issue.
 - In case you are proposing a fundamental code change, you need to create 
an Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)).
 - In case you are adding a dependency, check if the license complies with 
the [ASF 3rd Party License 
Policy](https://www.apache.org/legal/resolved.html#category-x).
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI 
changes:
   
   It is a known problem (https://bugs.python.org/issue23582) that the
   multiprocessing.Queue empty() method is not reliable: it can return True
   even after another process has already put something in the queue.
   
   As a result, some tasks were not picked up when sync() methods were
   called (in AirflowKubernetesScheduler, LocalExecutor, DagFileProcessor).
   This was less of a problem when the queue was drained from sync(), as the
   remaining jobs/files could be processed in the next pass, but it was a
   problem in tests and during graceful shutdown (some tasks could still be
   unprocessed when the shutdown occurred).
   
   We switched to queues managed by multiprocessing.Manager() to handle that:
   the queue in this case lives in a separate subprocess, and each process
   using it accesses the shared queue through a proxy.
   
   All the impacted cases now follow the same pattern:
   
       while not queue.empty():
           res = queue.get()
           
   
   This loop always runs in a single (main) process, so it is safe to run it
   this way: there is no risk that some other process will retrieve data from
   the queue between empty() and get().
   
   In all these cases the overhead of inter-process locking is negligible
   compared to the action executed (parsing a DAG, executing a job), so it
   appears safe to merge the change.
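
A minimal sketch of the drain pattern described above, not the code from this PR. The names `worker`, `drain` and `run_demo` are hypothetical, and the sketch assumes a POSIX platform where the `fork` start method is available. It relies on the fact that `put()` on a manager proxy is a synchronous call to the manager process, so once the producers have been joined the main process sees every item.

```python
import multiprocessing


def worker(result_queue, task_id):
    # put() on a manager proxy returns only after the manager process
    # has stored the item in the shared queue.
    result_queue.put((task_id, "success"))


def drain(result_queue):
    """Collect everything currently in the queue (single consumer only)."""
    results = []
    while not result_queue.empty():
        results.append(result_queue.get())
    return results


def run_demo(num_tasks=4):
    # Assumption: the "fork" start method is available (Linux, macOS).
    ctx = multiprocessing.get_context("fork")
    with ctx.Manager() as manager:
        result_queue = manager.Queue()
        procs = [
            ctx.Process(target=worker, args=(result_queue, task_id))
            for task_id in range(num_tasks)
        ]
        for proc in procs:
            proc.start()
        for proc in procs:
            proc.join()
        # All producers have finished, so every item is already stored
        # in the manager-owned queue and empty() can be trusted here.
        return sorted(drain(result_queue))


if __name__ == "__main__":
    print(run_demo())
```

The trade-off is one extra process (the manager) and a round trip per queue operation, which matches the overhead argument made in the description.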
   
   ### Tests
   
   - [x] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   Tests were there (but flaky)
   
   ### Commits
   
   - [x] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [x] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - All the public functions and the classes in the PR contain docstrings 
that explain what it does
 - If you implement backwards incompatible changes, please leave a note in 
the [Updating.md](https://github.com/apache/airflow/blob/master/UPDATING.md) so 
we can assign it to an appropriate release
   
   ### Code Quality
   
   - [x] Passes `flake8`
   




[jira] [Commented] (AIRFLOW-4401) multiprocessing.Queue.empty() is unreliable

2019-04-28 Thread Jarek Potiuk (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-4401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16828045#comment-16828045
 ] 

Jarek Potiuk commented on AIRFLOW-4401:
---

We now also have a competing (simpler) implementation, 
[https://github.com/apache/airflow/pull/5200], following a discussion with the 
mysterious @airflowuser. 

> multiprocessing.Queue.empty() is unreliable
> ---
>
> Key: AIRFLOW-4401
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4401
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Jarek Potiuk
>Priority: Major
> Fix For: 1.10.4
>
>
> After discussing potential reasons for the flakiness of LocalExecutor tests 
> with [~ash] and [~BasPH], I took a deeper dive into the problem, and what I 
> found raised the remaining hair on top of my head. 
> We had a number of flaky tests in the local executor in which result_queue 
> was not empty when it should have been emptied a moment before. More details 
> and discussion can be found in 
> [https://github.com/apache/airflow/pull/5159] 
> The problem turned out to be ... the unreliability of the 
> multiprocessing.Queue.empty() implementation. It is not fully synchronized, 
> and it might return True even if a put() operation has already completed in 
> another process. What's more, empty() might return True even if qsize() of 
> the queue returns > 0 (!) 
> It's a bit mind-boggling, but it is "as intended", as documented in 
> [https://bugs.python.org/issue23582] (resolved as "not a bug"). 
> A few people have stumbled upon this problem. For example 
> [https://github.com/vterron/lemon/commit/9ca6b4b1212228dbd4f69b88aaf88b12952d7d6f]
>  and [https://github.com/keras-team/autokeras/issues/368] 
> It also seems we experienced this in Airflow before. In jobs.py, years ago 
> (31.07.2016), we can see the comment below (but we used 
> multiprocessing.Queue empty() nevertheless):
> {code:java}
> # Not using multiprocessing.Queue() since it's no longer a separate
> # process and due to some unusual behavior. (empty() incorrectly
> # returns true?){code}
> The solution available in [https://bugs.python.org/issue23582] using qsize() 
> worked on Linux but is not really acceptable because qsize() does not 
> work on macOS (it throws NotImplementedError).
> The working solution is to implement a reliable queue (SynchronizedQueue) 
> based on 
> [https://github.com/vterron/lemon/commit/9ca6b4b1212228dbd4f69b88aaf88b12952d7d6f]
>  (but with the twist that __init__ of a class deriving from Queue has to be 
> changed for Python 3.4+, as described in 
> [https://stackoverflow.com/questions/24941359/ctx-parameter-in-multiprocessing-queue]).
> Luckily we are now on Python 3.5+.
> We should replace all usages of multiprocessing.Queue where empty() is used 
> with the SynchronizedQueue, and make sure we do not use multiprocessing.Queue 
> in a similar way in the future.
>  
>  
>  
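
The counter-based idea quoted above can be sketched roughly as follows. This is not the SynchronizedQueue from the PR nor the code from the lemon commit: `CountingQueue` is a hypothetical name, and the sketch only illustrates the two points the description makes, namely that `empty()` is answered from a shared counter instead of the pipe state, and that a `Queue` subclass on Python 3.4+ has to pass `ctx` to the base `__init__`.

```python
import multiprocessing
import multiprocessing.queues


class CountingQueue(multiprocessing.queues.Queue):
    """Hypothetical sketch: a Queue whose empty() reads a shared counter."""

    def __init__(self, maxsize=0):
        ctx = multiprocessing.get_context()
        # Since Python 3.4 the base class requires the keyword-only ctx.
        super().__init__(maxsize, ctx=ctx)
        # Shared integer with its own lock: items put but not yet retrieved.
        self._count = ctx.Value("i", 0)

    def __getstate__(self):
        # Carry the counter along when the queue is pickled for a child.
        return (super().__getstate__(), self._count)

    def __setstate__(self, state):
        base_state, self._count = state
        super().__setstate__(base_state)

    def put(self, obj, block=True, timeout=None):
        # Count the item before handing it to the feeder thread, so that
        # empty() is already False once put() has been entered.
        with self._count.get_lock():
            self._count.value += 1
        try:
            super().put(obj, block, timeout)
        except Exception:
            # Undo the increment if the item was never queued (e.g. Full).
            with self._count.get_lock():
                self._count.value -= 1
            raise

    def get(self, block=True, timeout=None):
        item = super().get(block, timeout)
        with self._count.get_lock():
            self._count.value -= 1
        return item

    def empty(self):
        return self._count.value == 0


if __name__ == "__main__":
    queue = CountingQueue()
    queue.put("result")
    print(queue.empty())         # False: the counter was bumped by put()
    print(queue.get(timeout=5))  # blocks until the item has arrived
    print(queue.empty())         # True: the counter is back to zero
```

With this design a consumer that sees `empty()` return False may still wait briefly inside `get()` until the feeder thread has flushed the item, but it will not skip an item that was already put.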



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-4382) Limited Parallelism test is flaky

2019-04-28 Thread Jarek Potiuk (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-4382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk updated AIRFLOW-4382:
--
Description: 
-The SFTPOperator tests seems to fail many times.-

The limited parallelism test is flaky.

An earlier diagnosis of this test pointed at SFTPOperator, but the SFTPOperator 
errors in the logs are in fact expected. All the builds below fail because of 
the limited parallelism test.

This is the failure message:

 
{code:java}
==
49) FAIL: test_execution_limited_parallelism 
(tests.executors.test_local_executor.LocalExecutorTest)
--
   Traceback (most recent call last):
    tests/executors/test_local_executor.py line 77 in 
test_execution_limited_parallelism
      self.execution_parallelism(parallelism=test_parallelism)
    tests/executors/test_local_executor.py line 61 in execution_parallelism
      self.assertEqual(len(executor.running), 0)
   AssertionError: 1 != 0
 
{code}
 

 Examples for PRs where this test fails:

[https://travis-ci.org/apache/airflow/jobs/522789194]

[https://travis-ci.org/apache/airflow/jobs/522931911]

[https://travis-ci.org/apache/airflow/jobs/523348590]

 

This is NOT an error (it was mistakenly identified as one):
{code:java}
INFO [airflow.task] Dependencies all met for 
INFO [airflow.task] 
INFO [airflow.task] Starting attempt 1 of 1
INFO [airflow.task] 
INFO [airflow.task] Executing  on 2019-04-22 06:02:24.255945+00:00
INFO [paramiko.transport] Connected (version 2.0, client OpenSSH_7.2p2)
INFO [paramiko.transport] Authentication (publickey) successful!
INFO [paramiko.transport.sftp] [chan 0] Opened sftp connection (server version 3)
INFO [airflow.task.operators] Starting to transfer from /tmp/test_remote_file to /tmp/tmp2/test_local_file
ERROR [airflow.task] Error while transferring from /tmp/test_remote_file to /tmp/tmp2/test_local_file, error: [Errno 2] No such file or directory: '/tmp/tmp2/test_local_file'
Traceback (most recent call last):
  File "/app/airflow/contrib/operators/sftp_operator.py", line 138, in execute
    sftp_client.get(self.remote_filepath, self.local_filepath)
  File "/app/.tox/py35-backend_sqlite-env_docker/lib/python3.5/site-packages/paramiko/sftp_client.py", line 801, in get
    with open(localpath, "wb") as fl:
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmp2/test_local_file'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/app/airflow/models/taskinstance.py", line 888, in _run_raw_task
    result = task_copy.execute(context=context)
  File "/app/airflow/contrib/operators/sftp_operator.py", line 155, in execute
    .format(file_msg, str(e)))
airflow.exceptions.AirflowException: Error while transferring from /tmp/test_remote_file to /tmp/tmp2/test_local_file, error: [Errno 2] No such file or directory: '/tmp/tmp2/test_local_file'
{code}
 

  was:
The SFTPOperator tests seems to fail many times.

 Examples for PRs where this test fails:

[https://travis-ci.org/apache/airflow/jobs/522789194]

[https://travis-ci.org/apache/airflow/jobs/522931911]

[https://travis-ci.org/apache/airflow/jobs/523348590]

 

[jira] [Updated] (AIRFLOW-4382) Limited Parallelism test is flaky

2019-04-28 Thread Jarek Potiuk (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-4382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk updated AIRFLOW-4382:
--
Summary: Limited Parallelism test is flaky  (was: SFTPOperator tests are 
flaky)

> Limited Parallelism test is flaky
> -
>
> Key: AIRFLOW-4382
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4382
> Project: Apache Airflow
>  Issue Type: Wish
>  Components: tests
>Reporter: jack
>Priority: Major
> Fix For: 1.10.4
>
>
> The SFTPOperator tests seems to fail many times.
>  Examples for PRs where this test fails:
> [https://travis-ci.org/apache/airflow/jobs/522789194]
> [https://travis-ci.org/apache/airflow/jobs/522931911]
> [https://travis-ci.org/apache/airflow/jobs/523348590]
>  
> {code:java}
> INFO [airflow.task] Dependencies all met for  unit_teststest_schedule_dag_once.test_sftp 2019-04-22 06:02:24.255945+00:00 [None]>
> INFO [airflow.task] 
> INFO [airflow.task] Starting attempt 1 of 1
> INFO [airflow.task] 
> INFO [airflow.task] Executing  on 2019-04-22 06:02:24.255945+00:00
> INFO [paramiko.transport] Connected (version 2.0, client OpenSSH_7.2p2)
> INFO [paramiko.transport] Authentication (publickey) successful!
> INFO [paramiko.transport.sftp] [chan 0] Opened sftp connection (server version 3)
> INFO [airflow.task.operators] Starting to transfer from /tmp/test_remote_file to /tmp/tmp2/test_local_file
> ERROR [airflow.task] Error while transferring from /tmp/test_remote_file to /tmp/tmp2/test_local_file, error: [Errno 2] No such file or directory: '/tmp/tmp2/test_local_file'
> Traceback (most recent call last):
>   File "/app/airflow/contrib/operators/sftp_operator.py", line 138, in execute
>     sftp_client.get(self.remote_filepath, self.local_filepath)
>   File "/app/.tox/py35-backend_sqlite-env_docker/lib/python3.5/site-packages/paramiko/sftp_client.py", line 801, in get
>     with open(localpath, "wb") as fl:
> FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmp2/test_local_file'
>
> During handling of the above exception, another exception occurred:
>
> Traceback (most recent call last):
>   File "/app/airflow/models/taskinstance.py", line 888, in _run_raw_task
>     result = task_copy.execute(context=context)
>   File "/app/airflow/contrib/operators/sftp_operator.py", line 155, in execute
>     .format(file_msg, str(e)))
> airflow.exceptions.AirflowException: Error while transferring from /tmp/test_remote_file to /tmp/tmp2/test_local_file, error: [Errno 2] No such file or directory: '/tmp/tmp2/test_local_file'
> {code}
>  





[jira] [Commented] (AIRFLOW-4401) multiprocessing.Queue.empty() is unreliable

2019-04-28 Thread Jarek Potiuk (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-4401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16828027#comment-16828027
 ] 

Jarek Potiuk commented on AIRFLOW-4401:
---

[~BasPH] [~ash] -> I think the new PR 
[https://github.com/apache/airflow/pull/5199] is likely to solve the issue 
finally :).



[jira] [Updated] (AIRFLOW-4382) Limited Parallelism test is flaky

2019-04-28 Thread Jarek Potiuk (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-4382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk updated AIRFLOW-4382:
--
Description: 
-The SFTPOperator tests seems to fail many times.-

The limited parallelism test is flaky. AIRFLOW-4401 is the root cause.

An earlier diagnosis of this test pointed at SFTPOperator, but the SFTPOperator 
errors in the logs are in fact expected. All the builds below fail because of 
the limited parallelism test.

This is the failure message:

 
{code:java}
==
49) FAIL: test_execution_limited_parallelism 
(tests.executors.test_local_executor.LocalExecutorTest)
--
   Traceback (most recent call last):
    tests/executors/test_local_executor.py line 77 in 
test_execution_limited_parallelism
      self.execution_parallelism(parallelism=test_parallelism)
    tests/executors/test_local_executor.py line 61 in execution_parallelism
      self.assertEqual(len(executor.running), 0)
   AssertionError: 1 != 0
 
{code}
 

 Examples for PRs where this test fails:

[https://travis-ci.org/apache/airflow/jobs/522789194]

[https://travis-ci.org/apache/airflow/jobs/522931911]

[https://travis-ci.org/apache/airflow/jobs/523348590]

 


[jira] [Commented] (AIRFLOW-2337) Broken Import Variables

2019-04-28 Thread jack (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16828018#comment-16828018
 ] 

jack commented on AIRFLOW-2337:
---

Cannot reproduce with 1.10.3; export and import work fine for me.

> Broken Import Variables
> ---
>
> Key: AIRFLOW-2337
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2337
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: db
>Affects Versions: 1.9.0
>Reporter: Daniel Lamblin
>Priority: Major
>
> Importing variables that were encrypted seems to have produced a far too long 
> parameter value.
> Using the UI in v1.8.2 I selected all variables, then used "with selected" 
> to export them.
> This produced a JSON file. It is only 1617 bytes in size in total, with 26 
> variables.
> Using the UI in v1.9.0 I selected the file and clicked to import variables.
> …/admin/airflow/varimport
> gave me the error:
> h2. Ooops.
> {code:bash}
>   / (  ()   )  \___
>  /( (  (  )   _))  )   )\
> Can this be briefer and cheerier?
>(_((__(_(__(( ( ( |  ) ) ) )_))__))_)___)
>((__)\\||lll|l||///  \_))
> (   /(/ (  )  ) )\   )
>   (( ( ( | | ) ) )\   )
>(   /(| / ( )) ) ) )) )
>  ( ( _(|)_) )
>   (  ||\(|(|)|/|| )
> (|(||(||))
>   ( //|/l|||)|\\ \ )
> (/ / //  /|//\\  \ \  \ _)
> ---
> Node: 90f7f5d06c61
> ---
> Traceback (most recent call last):
>   File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1988, in 
> wsgi_app
> response = self.full_dispatch_request()
>   File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1641, in 
> full_dispatch_request
> rv = self.handle_user_exception(e)
>   File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1544, in 
> handle_user_exception
> reraise(exc_type, exc_value, tb)
>   File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1639, in 
> full_dispatch_request
> rv = self.dispatch_request()
>   File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1625, in 
> dispatch_request
> return self.view_functions[rule.endpoint](**req.view_args)
>   File "/usr/local/lib/python2.7/dist-packages/flask_admin/base.py", line 69, 
> in inner
> return self._run_view(f, *args, **kwargs)
>   File "/usr/local/lib/python2.7/dist-packages/flask_admin/base.py", line 
> 368, in _run_view
> return fn(self, *args, **kwargs)
>   File "/usr/local/lib/python2.7/dist-packages/flask_login.py", line 758, in 
> decorated_view
> return func(*args, **kwargs)
>   File "/pang/service/airflow/airflow-src/airflow/www/utils.py", line 262, in 
> wrapper
> return f(*args, **kwargs)
>   File "/pang/service/airflow/airflow-src/airflow/www/views.py", line 1787, 
> in varimport
> models.Variable.set(k, v, serialize_json=isinstance(v, dict))
>   File "/pang/service/airflow/airflow-src/airflow/utils/db.py", line 55, in 
> wrapper
> result = func(*args, **kwargs)
>   File "/pang/service/airflow/airflow-src/airflow/models.py", line 4031, in 
> set
> session.query(cls).filter(cls.key == key).delete()
>   File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/query.py", line 
> 3236, in delete
> delete_op.exec_()
>   File 
> "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/persistence.py", line 
> 1326, in exec_
> self._do_exec()
>   File 
> "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/persistence.py", line 
> 1518, in _do_exec
> self._execute_stmt(delete_stmt)
>   File 
> "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/persistence.py", line 
> 1333, in _execute_stmt
> mapper=self.mapper)
>   File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/session.py", 
> line 1176, in execute
> bind, close_with_result=True).execute(clause, params or {})
>   File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/session.py", 
> line 1040, in _connection_for_bind
> engine, execution_options)
>   File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/session.py", 
> line 388, in _connection_for_bind
> self._assert_active()
>   File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/session.py", 
> line 276, in _assert_active
> % self._rollback_exception
> InvalidRequestError: This Session's transaction has been rolled back due to a 
> previous exception during flush. To begin a new transaction with this 
> 

[jira] [Commented] (AIRFLOW-1832) xcom_push is not reliable

2019-04-28 Thread jack (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-1832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16828022#comment-16828022
 ] 

jack commented on AIRFLOW-1832:
---

There has been significant work on xcom_push, including the unification done 
in https://issues.apache.org/jira/browse/AIRFLOW-3249

Please check against a newer Airflow version and report back.

> xcom_push is not reliable
> -
>
> Key: AIRFLOW-1832
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1832
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Michelle Huang
>Priority: Major
>
> We have a few ETL jobs that hit a third-party API, extract data 
> onto the Airflow server, and then transform and insert that data into 
> Redshift. We're using xcom_push in the extract function to store an 
> identifier appended to the filename, which the transform function 
> references so we can grab the right file to transform. I'm noticing that 
> xcom_push is not successfully pushing the value and xcom_pull is returning 
> "None". This is happening regularly across many of our scheduled jobs and 
> only started on 11/16. Manual triggers of the jobs run 
> successfully to completion without issue. This only affects our scheduled 
> runs and doesn't happen to all of them.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-4401) multiprocessing.Queue.empty() is unreliable

2019-04-28 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-4401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16828017#comment-16828017
 ] 

ASF GitHub Bot commented on AIRFLOW-4401:
-

potiuk commented on pull request #5199: [AIRFLOW-4401] SynchronizedQueue used 
where empty() is used.
URL: https://github.com/apache/airflow/pull/5199
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [x] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-4401
 - In case you are fixing a typo in the documentation you can prepend your 
commit with \[AIRFLOW-XXX\], code changes always need a Jira issue.
 - In case you are proposing a fundamental code change, you need to create 
an Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)).
 - In case you are adding a dependency, check if the license complies with 
the [ASF 3rd Party License 
Policy](https://www.apache.org/legal/resolved.html#category-x).
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI 
changes:
   
   It is a known problem (https://bugs.python.org/issue23582) that the
   multiprocessing.Queue empty() method is not reliable: it might sometimes
   return True even if another process has already put something in the queue.
   
   This resulted in some tasks not being picked up when sync() methods
   were called (in AirflowKubernetesScheduler, LocalExecutor,
   DagFileProcessor). This was less of a problem when the method was called in
   sync(), as the remaining jobs/files could be processed in the next pass, but
   it was a problem in tests and when a graceful shutdown was executed (some
   tasks could still be unprocessed when the shutdown occurred).
   
   All the impacted cases now follow the same pattern:
   
       while not queue.empty():
           res = queue.get()
   
   
   This loop always runs in a single (main) process, so it is safe to run it
   this way: there is no risk that some other process will retrieve the data
   from the queue between empty() and get().
   
   Note that unlike in the standard multiprocessing.Queue, you cannot rely
   on data being immediately available after empty() is False. You should be
   prepared for a subsequent get_nowait() to raise Empty, or (better) use get()
   to retrieve the data.
   
   In all these cases the overhead of inter-process locking is negligible
   compared to the action executed (parsing a DAG, executing a job),
   so it appears it should be safe to merge the change.
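
For code that cannot switch queue implementations, one alternative to the empty()-then-get() loop is to drain with a blocking get() and a timeout. This is only a sketch and not the SynchronizedQueue from this PR; the helper name `drain` and the timeout value are assumptions made for illustration.

```python
import multiprocessing
import queue


def drain(result_queue, timeout=0.2):
    """Collect items until no new item arrives within `timeout` seconds.

    Hypothetical helper, not part of Airflow.
    """
    items = []
    while True:
        try:
            # A blocking get() with a timeout waits for data that is still
            # in flight instead of trusting a possibly stale empty() answer.
            items.append(result_queue.get(timeout=timeout))
        except queue.Empty:
            return items


if __name__ == "__main__":
    q = multiprocessing.Queue()
    for i in range(3):
        q.put(i)
    print(drain(q))
```

The trade-off is that the final call always waits for the full timeout before returning, so this fits shutdown paths and tests better than a hot loop.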
   
   ### Tests
   
   - [x] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   No need. Lots of tests for that already (flaky ones).
   
   
   ### Commits
   
   - [x] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [x] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - All the public functions and the classes in the PR contain docstrings 
that explain what it does
 - If you implement backwards incompatible changes, please leave a note in 
the [Updating.md](https://github.com/apache/airflow/blob/master/UPDATING.md) so 
we can assign it to a appropriate release
   
   ### Code Quality
   
   - [x] Passes `flake8`
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> multiprocessing.Queue.empty() is unreliable
> ---
>
> Key: AIRFLOW-4401
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4401
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Jarek Potiuk
>Priority: Major
> Fix For: 1.10.4
>
>
> After discussing with [~ash] and [~BasPH] potential reasons for flakiness of 
> LocalExecutor tests, I took a deeper dive into the problem and what I found 
> raised the remaining hair on top 

[GitHub] [airflow] potiuk opened a new pull request #5199: [AIRFLOW-4401] SynchronizedQueue used where empty() is used.

2019-04-28 Thread GitBox
potiuk opened a new pull request #5199: [AIRFLOW-4401] SynchronizedQueue used 
where empty() is used.
URL: https://github.com/apache/airflow/pull/5199
 
 


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-2911) Add job cancellation capability to Dataflow hook

2019-04-28 Thread jack (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16827986#comment-16827986
 ] 

jack commented on AIRFLOW-2911:
---

[~pabloem] any progress?

> Add job cancellation capability to Dataflow hook
> 
>
> Key: AIRFLOW-2911
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2911
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: contrib, Dataflow, gcp
>Reporter: Wilson Lian
>Assignee: Pablo Estrada
>Priority: Minor
>
> The hook currently only supports starting and waiting on a job. One might 
> want to cancel a job when, for example, it exceeds a certain timeout.





[jira] [Issue Comment Deleted] (AIRFLOW-3061) Update Outdated (4 years old) Screenshots

2019-04-28 Thread jack (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jack updated AIRFLOW-3061:
--
Comment: was deleted

(was: Wasn't this updated in 1.10.2? )

> Update Outdated (4 years old) Screenshots 
> --
>
> Key: AIRFLOW-3061
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3061
> Project: Apache Airflow
>  Issue Type: New Feature
>Reporter: Siddharth Anand
>Assignee: Siddharth Anand
>Priority: Minor
>
> The following images are still pretty old (almost 4 years old) and need an 
> update
>  * adhoc.png
>  * chart.png
>  * chart_form.png





[jira] [Commented] (AIRFLOW-3061) Update Outdated (4 years old) Screenshots

2019-04-28 Thread jack (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16827977#comment-16827977
 ] 

jack commented on AIRFLOW-3061:
---

Although these files were marked to be updated in 
[https://github.com/apache/airflow/pull/3865], they were removed in 
[https://github.com/apache/airflow/pull/4599].

[~TaoFeng] is this ticket still needed?

> Update Outdated (4 years old) Screenshots 
> --
>
> Key: AIRFLOW-3061
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3061
> Project: Apache Airflow
>  Issue Type: New Feature
>Reporter: Siddharth Anand
>Assignee: Siddharth Anand
>Priority: Minor
>
> The following images are still pretty old (almost 4 years old) and need an 
> update
>  * adhoc.png
>  * chart.png
>  * chart_form.png





[jira] [Assigned] (AIRFLOW-4364) Integrate Pylint

2019-04-28 Thread Bas Harenslak (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-4364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bas Harenslak reassigned AIRFLOW-4364:
--

Assignee: Bas Harenslak

> Integrate Pylint
> 
>
> Key: AIRFLOW-4364
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4364
> Project: Apache Airflow
>  Issue Type: Improvement
>Reporter: Bas Harenslak
>Assignee: Bas Harenslak
>Priority: Major
>
> After a [vote on the mailing 
> list|https://lists.apache.org/thread.html/f4940d36e98ded96a2473bb2ccdfa4cc648faa2c1334b2aa901c0bba@%3Cdev.airflow.apache.org%3E]
>  everybody voted for pylint integration. It involves a big change to the 
> codebase, so let's do it in 2 steps:
>  # Check pylint only on changed code in the CI.
>  # After a while we should have a good pylint config, and the remaining 
> non-checked code should be made compatible with pylint, i.e. enable pylint in 
> the CI on the full project.





[jira] [Assigned] (AIRFLOW-4364) Integrate Pylint

2019-04-28 Thread Bas Harenslak (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-4364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bas Harenslak reassigned AIRFLOW-4364:
--

Assignee: (was: Bas Harenslak)

> Integrate Pylint
> 
>
> Key: AIRFLOW-4364
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4364
> Project: Apache Airflow
>  Issue Type: Improvement
>Reporter: Bas Harenslak
>Priority: Major
>
> After a [vote on the mailing 
> list|https://lists.apache.org/thread.html/f4940d36e98ded96a2473bb2ccdfa4cc648faa2c1334b2aa901c0bba@%3Cdev.airflow.apache.org%3E]
>  everybody voted for pylint integration. It involves a big change to the 
> codebase, so let's do it in 2 steps:
>  # Check pylint only on changed code in the CI.
>  # After a while we should have a good pylint config, and the remaining 
> non-checked code should be made compatible with pylint, i.e. enable pylint in 
> the CI on the full project.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-4094) XCom not pushed into database when size of content is greater than 65535 bytes

2019-04-28 Thread jack (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-4094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16827942#comment-16827942
 ] 

jack commented on AIRFLOW-4094:
---

Actually, Airflow enforces a max size of 49344 bytes.

[https://github.com/apache/airflow/blob/master/airflow/models/xcom.py#L37]

 

> XCom not pushed into database when size of content is greater than 65535 bytes
> --
>
> Key: AIRFLOW-4094
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4094
> Project: Apache Airflow
>  Issue Type: Bug
> Environment: airflow version: 1.10.0
> python: 3.6.5
> postgresql: 9.6.1
>Reporter: Sai Varun Reddy Daram
>Priority: Major
>
> I use Airflow with Postgres as the backend. One of the things I tried 
> recently is the Kubernetes pod operator, which runs and returns some data 
> to the next task via XCom. I observed that when the XCom length is 
> greater than 65535 bytes, it is not pushed into XCom.
>  
>  





[jira] [Updated] (AIRFLOW-3157) Variable.set shouldn't accept types other than string when JSON mode is False

2019-04-28 Thread jack (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jack updated AIRFLOW-3157:
--
Fix Version/s: 1.10.4

> Variable.set shouldn't accept types other than string when JSON mode is False
> -
>
> Key: AIRFLOW-3157
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3157
> Project: Apache Airflow
>  Issue Type: Bug
>Affects Versions: 1.10.0
>Reporter: jack
>Priority: Minor
> Fix For: 1.10.4
>
>
> Assume the following code (my XCom contains 7764615):
> {code:java}
> def set_last_orderid_variable(ds, **kwargs):
>     ti = kwargs['ti']
>     new_orderid = ti.xcom_pull(task_ids='get_max_order_id')
>     str = "Changed variable from {0} to {1}".format(LAST_IMPORTED_ORDER_ID,new_orderid)
>     Variable.set('last_order_id_imported', new_orderid)
>     logging.info(str)
>     return{code}
>  
>  
> Log shows:
> {code:java}
> [2018-10-04 12:49:48,186] {base_task_runner.py:98} INFO - Subtask: 
> [2018-10-04 12:49:38,169] {orders_dag.py:400} INFO - Changed variable from 10 
> to 7764615{code}
>  
> However, when I go to the variable {{last_order_id_imported}}, 
> I see an empty cell in the Value column. *It changed it from 10 to nothing*.
>  
>  
> If I change my code to:
> {code:java}
> def set_last_orderid_variable(ds, **kwargs):
>     ti = kwargs['ti']
>     new_orderid = ti.xcom_pull(task_ids='get_max_order_id')
>     str = "Changed variable from {0} to {1}".format(LAST_IMPORTED_ORDER_ID,new_orderid)
>     x = '{0}'.format(new_orderid)
>     logging.info(type(new_orderid))
>     Variable.set('last_order_id_imported', x)
>     logging.info(str)
>     return{code}
>  
>  
> This code works.
> When I go to the variable {{last_order_id_imported}}, I see 7764615.
> The log shows me:
> {code:java}
> [2018-10-04 12:53:58,002] {base_task_runner.py:98} INFO - Subtask: 
> [2018-10-04 12:53:58,002] {orders_dag.py:399} INFO -  
> [2018-10-04 12:53:58,032] {base_task_runner.py:98} INFO - Subtask: 
> [2018-10-04 12:53:58,031] {orders_dag.py:401} INFO - Changed variable from 10 
> to 7764615{code}
>  
>  
> As you can see, the type of new_orderid is {{long}}. Apparently 
> {{Variable.set}} can't handle it (remember that this value was received 
> from XCom).
>  
>  
> If by design Variable.set accepts only specific types (strings, JSON, etc.), 
> it should raise an exception for others (long, etc.). It shouldn't decide on 
> its own to store an empty string and mark the task as success.
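
The stricter behaviour requested above could look roughly like the following. This is a sketch only: `to_variable_value` is a hypothetical helper that stands in for the validation step and is not Airflow's actual Variable.set implementation. It assumes Python 3, where the reporter's `long` value would be an `int`.

```python
import json


def to_variable_value(value, serialize_json=False):
    """Return the string to store, or raise for unsupported types.

    Hypothetical helper, not part of Airflow.
    """
    if serialize_json:
        # In JSON mode any JSON-serializable value is accepted.
        return json.dumps(value)
    if isinstance(value, str):
        return value
    # Fail loudly instead of silently storing an empty value.
    raise TypeError(
        "expected str when serialize_json is False, got {}".format(
            type(value).__name__
        )
    )
```

With this check, passing the raw integer 7764615 would fail the task with a TypeError, while passing `str(7764615)` would store the expected value.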





[jira] [Commented] (AIRFLOW-4098) gcp_api dependency conflicts

2019-04-28 Thread jack (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-4098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16827941#comment-16827941
 ] 

jack commented on AIRFLOW-4098:
---

[~kaxilnaik] have you encountered this? 

> gcp_api dependency conflicts
> 
>
> Key: AIRFLOW-4098
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4098
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: gcp
>Affects Versions: 1.10.2
> Environment: Ubuntu 16.04.6 LTS
> Python 2.7 / Python 3.5 / Python 3.6
>Reporter: Manuel Rodríguez Guimeráns
>Priority: Major
>
> The {{gcp_api}} extra contains inconsistencies in its subdependencies. When 
> installed with {{pip}} it completes with a warning, but {{pipenv}} and 
> {{poetry}} are not so forgiving and it cannot be installed at all.
> Note that on the master branch as of 419e30 (where {{gcp_api}} is now called 
> just {{gcp}}) there are still dependency errors as well.
> h3. pip with python 2.7
> {code:java}
> $ mkvirtualenv -p /usr/bin/python2.7 airflow_pip
> ...
> $ pip --version
> pip 19.0.3 from 
> /home/manu/.local/share/virtualenvs/airflow_pip/local/lib/python2.7/site-packages/pip
>  (python 2.7)
> $ SLUGIFY_USES_TEXT_UNIDECODE=yes pip install 
> 'apache-airflow[gcp_api]==1.10.2'
> ...
> google-cloud-bigquery 1.10.0 has requirement 
> google-cloud-core<0.30dev,>=0.29.0, but you'll have google-cloud-core 0.28.1 
> which is incompatible.
> google-cloud-spanner 1.8.0 has requirement 
> google-cloud-core<0.30dev,>=0.29.0, but you'll have google-cloud-core 0.28.1 
> which is incompatible.
> ...
> {code}
> h3. pipenv with python 3.6
> {code:java}
> $ mkdir airflow_pipenv
> $ cd airflow_pipenv
> $ pipenv --version
> pipenv, version 2018.11.26
> $ pipenv --python 3.6 
> Creating a virtualenv for this project...
> Pipfile: /tmp/airflow_pipenv/Pipfile
> Using /home/manu/.pyenv/versions/3.6.6/bin/python3.6 (3.6.6) to create 
> virtualenv...
> ⠋ Creating virtual environment...Already using interpreter 
> /home/manu/.pyenv/versions/3.6.6/bin/python3.6
> Using base prefix '/home/manu/.pyenv/versions/3.6.6'
> New python executable in 
> /home/manu/.local/share/virtualenvs/airflow_pipenv-ePWbOumO/bin/python3.6
> Also creating executable in 
> /home/manu/.local/share/virtualenvs/airflow_pipenv-ePWbOumO/bin/python
> Installing setuptools, pip, wheel...done.
> ✔ Successfully created virtual environment! 
> Virtualenv location: 
> /home/manu/.local/share/virtualenvs/airflow_pipenv-ePWbOumO
> $ airflow_pipenv pipenv install 'apache-airflow[gcp_api]==1.10.2'
> Installing apache-airflow[gcp_api]==1.10.2...
> Adding apache-airflow to Pipfile's [packages]...
> ✔ Installation Succeeded 
> Pipfile.lock (e3de4b) out of date, updating to (ca72e7)...
> Locking [dev-packages] dependencies...
> Locking [packages] dependencies...
> ✘ Locking Failed! 
> [pipenv.exceptions.ResolutionFailure]:   File 
> "/home/manu/.pyenv/versions/3.6.6/lib/python3.6/site-packages/pipenv/resolver.py",
>  line 69, in resolve
> [pipenv.exceptions.ResolutionFailure]:   req_dir=requirements_dir
> [pipenv.exceptions.ResolutionFailure]:   File 
> "/home/manu/.pyenv/versions/3.6.6/lib/python3.6/site-packages/pipenv/utils.py",
>  line 726, in resolve_deps
> [pipenv.exceptions.ResolutionFailure]:   req_dir=req_dir,
> [pipenv.exceptions.ResolutionFailure]:   File 
> "/home/manu/.pyenv/versions/3.6.6/lib/python3.6/site-packages/pipenv/utils.py",
>  line 480, in actually_resolve_deps
> [pipenv.exceptions.ResolutionFailure]:   resolved_tree = 
> resolver.resolve()
> [pipenv.exceptions.ResolutionFailure]:   File 
> "/home/manu/.pyenv/versions/3.6.6/lib/python3.6/site-packages/pipenv/utils.py",
>  line 395, in resolve
> [pipenv.exceptions.ResolutionFailure]:   raise 
> ResolutionFailure(message=str(e))
> [pipenv.exceptions.ResolutionFailure]:   
> pipenv.exceptions.ResolutionFailure: ERROR: ERROR: Could not find a version 
> that matches google-cloud-core<0.29dev,<0.30dev,>=0.28.0,>=0.29.0
> [pipenv.exceptions.ResolutionFailure]:   Tried: 0.20.0, 0.20.0, 0.21.0, 
> 0.21.0, 0.22.0, 0.22.0, 0.22.1, 0.22.1, 0.23.0, 0.23.0, 0.23.1, 0.23.1, 
> 0.24.0, 0.24.0, 0.24.1, 0.24.1, 0.25.0, 0.25.0, 0.26.0, 0.26.0, 0.27.0, 
> 0.27.0, 0.27.1, 0.27.1, 0.28.0, 0.28.0, 0.28.1, 0.28.1, 0.29.0, 0.29.0, 
> 0.29.1, 0.29.1
> [pipenv.exceptions.ResolutionFailure]: Warning: Your dependencies could not 
> be resolved. You likely have a mismatch in your sub-dependencies.
>   First try clearing your dependency cache with $ pipenv lock --clear, then 
> try the original command again.
>  Alternatively, you can use $ pipenv install --skip-lock to bypass this 
> mechanism, then run $ pipenv graph to inspect the situation.
>   Hint: try $ pipenv lock --pre if it is a pre-release dependency.
> ERROR: ERROR: Could not 

[jira] [Commented] (AIRFLOW-4403) UI search query in main view should filters by either dag_id or owner

2019-04-28 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-4403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16827931#comment-16827931
 ] 

ASF subversion and git services commented on AIRFLOW-4403:
--

Commit 0fcc83c1a16e94d8f0754f740dbc49e2041176f4 in airflow's branch 
refs/heads/v1-10-test from Jarek Potiuk
[ https://gitbox.apache.org/repos/asf?p=airflow.git;h=0fcc83c ]

fixup! [AIRFLOW-4403] search by `dag_id` or `owners` in UI (#5184)


> UI search query in main view should filters by either dag_id or owner
> -
>
> Key: AIRFLOW-4403
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4403
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: ui
>Affects Versions: 1.10.3
>Reporter: Damien Avrillon
>Assignee: Vladimir Gavrilenko
>Priority: Minor
> Fix For: 1.10.4
>
>
> In Airflow 1.10.2, the UI search in the main DAGs view filtered not only the 
> 'DAG' column but also the 'Owner' column.
> For example, if there are multiple dags / owners:
> dag_id_1    /   owner_id_1
> dag_id_2    /   owner_id_2
> dag_id_3    /   owner_id_2
> Searching for '_2' would return 2 dags: 'dag_id_2' and 'dag_id_3' because 
> 'Owner' matched.
> After upgrading to 1.10.3, the same search yields a single result: 'dag_id_2'.
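
The expected 1.10.2 behaviour from the example above can be written as a small pure-Python filter. This is an illustration of the matching rule only; `search_dags` is a hypothetical name and the real fix operates on the database query in the webserver, which is not shown here.

```python
def search_dags(dags, query):
    """Return dag_ids whose dag_id or owner contains `query`.

    `dags` is an iterable of (dag_id, owner) pairs. Hypothetical helper.
    """
    return [
        dag_id
        for dag_id, owner in dags
        if query in dag_id or query in owner
    ]


# The three rows from the ticket description.
DAGS = [
    ("dag_id_1", "owner_id_1"),
    ("dag_id_2", "owner_id_2"),
    ("dag_id_3", "owner_id_2"),
]
```

Searching `DAGS` for '_2' matches 'dag_id_2' by name and owner, and 'dag_id_3' by owner only.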





[jira] [Commented] (AIRFLOW-3720) GoogleCloudStorageToS3Operator - incorrect folder compare

2019-04-28 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16827917#comment-16827917
 ] 

ASF subversion and git services commented on AIRFLOW-3720:
--

Commit 3d6f1ae7e47644d5fb7823a6fd1a9f861e1c077b in airflow's branch 
refs/heads/v1-10-stable from Rodrigo Chaparro Plata Hernandez
[ https://gitbox.apache.org/repos/asf?p=airflow.git;h=3d6f1ae ]

[AIRFLOW-3720] Fix missmatch while comparing GCS and S3 files (#4766)

 (cherry picked from commit 60b9023ed92b31a75dbdf8b33ce7e9c2bc3637d1)


> GoogleCloudStorageToS3Operator -  incorrect folder compare
> --
>
> Key: AIRFLOW-3720
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3720
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: aws
>Affects Versions: 1.10.0
>Reporter: Chaim
>Assignee: Chaim
>Priority: Major
> Fix For: 1.10.4, 2.0.0
>
>
> The code that compares folders from GCP to S3 is incorrect.
> The code is:
> files = set(files) - set(existing_files)
> but a folder name in the list from GCP ends with "/", for example "myfolder/", 
> while in S3 it has no trailing "/", so the folder is "myfolder".
> As a result, the code tries to recopy the folder name but fails since 
> it already exists.
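
One way to avoid the mismatch described above is to compare names with any trailing "/" stripped. This is a sketch under that assumption, not the change merged in #4766; `files_to_copy` and `_normalize` are hypothetical names.

```python
def _normalize(name):
    """Strip a trailing slash so "myfolder/" and "myfolder" compare equal."""
    return name.rstrip("/")


def files_to_copy(gcs_files, s3_files):
    """Return the GCS names that have no counterpart in S3.

    Hypothetical helper; names are compared without trailing slashes.
    """
    existing = {_normalize(name) for name in s3_files}
    return sorted(name for name in gcs_files if _normalize(name) not in existing)
```

With `gcs_files = ["myfolder/", "myfolder/a.csv"]` and `s3_files = ["myfolder"]`, a plain set difference would still contain "myfolder/", whereas the normalized comparison leaves only "myfolder/a.csv".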





[jira] [Commented] (AIRFLOW-3720) GoogleCloudStorageToS3Operator - incorrect folder compare

2019-04-28 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16827922#comment-16827922
 ] 

ASF subversion and git services commented on AIRFLOW-3720:
--

Commit fcfc8e963c78f31e71369246a5eb67099b2dd4a7 in airflow's branch 
refs/heads/v1-10-test from Rodrigo Chaparro Plata Hernandez
[ https://gitbox.apache.org/repos/asf?p=airflow.git;h=fcfc8e9 ]

[AIRFLOW-3720] Fix missmatch while comparing GCS and S3 files (#4766)

 (cherry picked from commit 60b9023ed92b31a75dbdf8b33ce7e9c2bc3637d1)


> GoogleCloudStorageToS3Operator -  incorrect folder compare
> --
>
> Key: AIRFLOW-3720
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3720
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: aws
>Affects Versions: 1.10.0
>Reporter: Chaim
>Assignee: Chaim
>Priority: Major
> Fix For: 1.10.4, 2.0.0
>
>
> The code that compares folders from GCP to S3 is incorrect.
> The code is:
> files = set(files) - set(existing_files)
> but a folder name in the list from GCP ends with "/", for example "myfolder/", 
> while in S3 it has no trailing "/", so the folder is "myfolder".
> As a result, the code tries to recopy the folder name but fails since 
> it already exists.





[jira] [Commented] (AIRFLOW-4397) Add GCSUploadSessionCompleteSensor

2019-04-28 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-4397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16827921#comment-16827921
 ] 

ASF subversion and git services commented on AIRFLOW-4397:
--

Commit 97c31a1cb45d4ecee672b5257e055ca4f9effd95 in airflow's branch 
refs/heads/v1-10-test from Jacob Ferriero
[ https://gitbox.apache.org/repos/asf?p=airflow.git;h=97c31a1 ]

[AIRFLOW-4397] Add GCSUploadSessionCompleteSensor (#5166)

* [AIRFLOW-4397] Add GCSUploadSessionCompleteSensor

This commit add a GoogleCloudStorageUploadSessionCompleteSensor
to address the use case of accepting files from a third party vendor
who refuses to send a success indicator when providing data files
into a bucket and waiting until an inactivity period has passed to
indicate the end of an upload session.

 (cherry picked from commit 2bb79197cdba49b43a5a821674af1f11c0279d75)


> Add GCSUploadSessionCompleteSensor
> --
>
> Key: AIRFLOW-4397
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4397
> Project: Apache Airflow
>  Issue Type: New Feature
>  Components: contrib
>Reporter: Jacob Ferriero
>Assignee: Jacob Ferriero
>Priority: Minor
>  Labels: beginner, newbie
> Fix For: 1.10.4, 2.0.0
>
>
> I'd like to contribute a Sensor for Google Cloud Storage that can poke a 
> bucket until sufficient time has passed without a new file drop. There are 
> often cases where a third-party vendor drops data into a bucket but 
> doesn't send a success flag when they are done. This sensor would allow you to 
> poke every n minutes to check whether more files have been added since the last 
> poke and, if there have been `inactivity_period` minutes without a new file 
> drop, return `True`. It could allow SLA misses if data did not arrive by an 
> expected time, and could have a configurable deadline past which the sensor would 
> fail. Optionally, the user could specify a minimum number of files for the 
> sensor to succeed. This would be my first time contributing to an OSS 
> project, so please let me know if this is not the appropriate place to start.
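
The poke logic described above can be sketched without any GCS calls. Everything here is an assumption for illustration: the class name `InactivityTracker`, the `min_objects` parameter name, and passing the object count and current time in explicitly. It is not the code of the merged sensor.

```python
class InactivityTracker:
    """Decide whether an upload session looks finished (hypothetical sketch)."""

    def __init__(self, inactivity_period, min_objects=1):
        # inactivity_period uses the same time unit as the `now` values
        # passed to poke().
        self.inactivity_period = inactivity_period
        self.min_objects = min_objects
        self.previous_count = 0
        self.last_change = None

    def poke(self, object_count, now):
        """Return True once the count has been stable for long enough."""
        if self.last_change is None or object_count != self.previous_count:
            # First poke or a change in the bucket: restart the quiet period.
            self.previous_count = object_count
            self.last_change = now
            return False
        quiet_for = now - self.last_change
        return (
            quiet_for >= self.inactivity_period
            and object_count >= self.min_objects
        )
```

Taking the time as an argument keeps the sketch deterministic to test; a real sensor would read the clock and list the bucket itself on each poke.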





[jira] [Commented] (AIRFLOW-4168) Create Google Cloud Video Intelligence Operators

2019-04-28 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-4168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16827923#comment-16827923
 ] 

ASF subversion and git services commented on AIRFLOW-4168:
--

Commit 3b64582156cd9681ad6ce9305218e39d6dce6641 in airflow's branch 
refs/heads/v1-10-test from Antoni Smoliński
[ https://gitbox.apache.org/repos/asf?p=airflow.git;h=3b64582 ]

[AIRFLOW-4168] Create Google Cloud Video Intelligence Operators (#4985)

 (cherry picked from commit df18b02d2abc98df2ef850f25b82f8202106a147)


> Create Google Cloud Video Intelligence Operators
> 
>
> Key: AIRFLOW-4168
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4168
> Project: Apache Airflow
>  Issue Type: New Feature
>Reporter: Antoni Smoliński
>Assignee: Antoni Smoliński
>Priority: Major
> Fix For: 1.10.4, 2.0.0
>
>
> Create Video Intelligence operators:
> CloudVideoIntelligenceDetectVideoLabelsOperator,
> CloudVideoIntelligenceDetectVideoExplicitContentOperator,
> CloudVideoIntelligenceDetectVideoShotsOperator





[jira] [Commented] (AIRFLOW-4397) Add GCSUploadSessionCompleteSensor

2019-04-28 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-4397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16827920#comment-16827920
 ] 

ASF subversion and git services commented on AIRFLOW-4397:
--

Commit 97c31a1cb45d4ecee672b5257e055ca4f9effd95 in airflow's branch 
refs/heads/v1-10-test from Jacob Ferriero
[ https://gitbox.apache.org/repos/asf?p=airflow.git;h=97c31a1 ]

[AIRFLOW-4397] Add GCSUploadSessionCompleteSensor (#5166)

* [AIRFLOW-4397] Add GCSUploadSessionCompleteSensor

This commit adds a GoogleCloudStorageUploadSessionCompleteSensor
for the case where a third-party vendor drops data files into a
bucket without sending a success indicator. The sensor waits until
an inactivity period has passed and treats that as the end of the
upload session.

 (cherry picked from commit 2bb79197cdba49b43a5a821674af1f11c0279d75)




[jira] [Commented] (AIRFLOW-4397) Add GCSUploadSessionCompleteSensor

2019-04-28 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-4397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16827916#comment-16827916
 ] 

ASF subversion and git services commented on AIRFLOW-4397:
--

Commit f5f07de46bdc493dae60a479c90a95c8361b8827 in airflow's branch 
refs/heads/v1-10-stable from Jacob Ferriero
[ https://gitbox.apache.org/repos/asf?p=airflow.git;h=f5f07de ]

[AIRFLOW-4397] Add GCSUploadSessionCompleteSensor (#5166)

* [AIRFLOW-4397] Add GCSUploadSessionCompleteSensor

This commit adds a GoogleCloudStorageUploadSessionCompleteSensor
for the case where a third-party vendor drops data files into a
bucket without sending a success indicator. The sensor waits until
an inactivity period has passed and treats that as the end of the
upload session.

 (cherry picked from commit 2bb79197cdba49b43a5a821674af1f11c0279d75)




[jira] [Commented] (AIRFLOW-4397) Add GCSUploadSessionCompleteSensor

2019-04-28 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-4397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16827915#comment-16827915
 ] 

ASF subversion and git services commented on AIRFLOW-4397:
--

Commit f5f07de46bdc493dae60a479c90a95c8361b8827 in airflow's branch 
refs/heads/v1-10-stable from Jacob Ferriero
[ https://gitbox.apache.org/repos/asf?p=airflow.git;h=f5f07de ]

[AIRFLOW-4397] Add GCSUploadSessionCompleteSensor (#5166)

* [AIRFLOW-4397] Add GCSUploadSessionCompleteSensor

This commit adds a GoogleCloudStorageUploadSessionCompleteSensor
for the case where a third-party vendor drops data files into a
bucket without sending a success indicator. The sensor waits until
an inactivity period has passed and treats that as the end of the
upload session.

 (cherry picked from commit 2bb79197cdba49b43a5a821674af1f11c0279d75)




[jira] [Commented] (AIRFLOW-4168) Create Google Cloud Video Intelligence Operators

2019-04-28 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-4168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16827918#comment-16827918
 ] 

ASF subversion and git services commented on AIRFLOW-4168:
--

Commit b4b48b53b6de778cf957ee68b5ab08d291e8e3ee in airflow's branch 
refs/heads/v1-10-stable from Antoni Smoliński
[ https://gitbox.apache.org/repos/asf?p=airflow.git;h=b4b48b5 ]

[AIRFLOW-4168] Create Google Cloud Video Intelligence Operators (#4985)

 (cherry picked from commit df18b02d2abc98df2ef850f25b82f8202106a147)




[GitHub] [airflow] OmerJog commented on issue #4830: [AIRFLOW-3929] Use anchor tag for dag filter link.

2019-04-28 Thread GitBox
OmerJog commented on issue #4830: [AIRFLOW-3929] Use anchor tag for dag filter 
link.
URL: https://github.com/apache/airflow/pull/4830#issuecomment-487368134
 
 
   @ashb There is a pending pr to fix the zoom in issue: 
https://github.com/apache/airflow/pull/4473


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] potiuk commented on issue #5166: [AIRFLOW-4397] Add GCSUploadSessionCompleteSensor

2019-04-28 Thread GitBox
potiuk commented on issue #5166: [AIRFLOW-4397] Add 
GCSUploadSessionCompleteSensor
URL: https://github.com/apache/airflow/pull/5166#issuecomment-487364370
 
 
   @jaketf - indeed @OmerJog is right. Most of the docs are now auto-generated 
in Reference but integration still needs to be updated. Thanks for noticing 
@OmerJog.
   




[GitHub] [airflow] OmerJog commented on issue #5176: [AIRFLOW-2636] Fix Nonetype error for task failure instance duration

2019-04-28 Thread GitBox
OmerJog commented on issue #5176: [AIRFLOW-2636] Fix Nonetype error for task 
failure instance duration
URL: https://github.com/apache/airflow/pull/5176#issuecomment-487362582
 
 
   There is a pending previous PR https://github.com/apache/airflow/pull/4032  
and @ashb requested unit tests before the PR can be merged. 




[GitHub] [airflow] OmerJog commented on issue #5166: [AIRFLOW-4397] Add GCSUploadSessionCompleteSensor

2019-04-28 Thread GitBox
OmerJog commented on issue #5166: [AIRFLOW-4397] Add 
GCSUploadSessionCompleteSensor
URL: https://github.com/apache/airflow/pull/5166#issuecomment-487362261
 
 
   @jaketf The PR is missing an edit to docs/integration.rst. Would you mind 
adding it in a new PR?




[jira] [Updated] (AIRFLOW-1860) When adding variables in Web GUI they are case-insensitive, but everywhere else they are case-sensitive

2019-04-28 Thread jack (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-1860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jack updated AIRFLOW-1860:
--
Attachment: aIRFLOW11.PNG

> When adding variables in Web GUI they are case-insensitive, but everywhere 
> else they are case-sensitive
> ---
>
> Key: AIRFLOW-1860
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1860
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: webapp
>Affects Versions: 1.8.0
>Reporter: Zelko Nikolic
>Priority: Minor
> Attachments: aIRFLOW11.PNG
>
>
> 1. Open Web GUI -> Variables
> 2. Add a variable named "ABCD" and enter some value for it.
> 3. Now add a variable named "abcd", enter some value and save it. Oh, you 
> can't. It says that variable already exists. As if variable names are 
> case-insensitive. But they are not.
> 4. Because if you try accessing the first variable from the code, using 
> Variable.get("abcd"), it will say that variable doesn't exist.
> So, from the code variables are case-sensitive, but when Web GUI is adding 
> them, it treats them case-insensitive. To make things even weirder, when you 
> sort the variables by name in Web GUI, it sorts uppercase first, then 
> lowercase. Obviously, even Web GUI isn't consistent with itself, because it's 
> treating them sometimes case-sensitive, sometimes case-insensitive.





[GitHub] [airflow] feluelle commented on a change in pull request #5162: [AIRFLOW-4358] Speed up test_jobs by not running tasks

2019-04-28 Thread GitBox
feluelle commented on a change in pull request #5162: [AIRFLOW-4358] Speed up 
test_jobs by not running tasks
URL: https://github.com/apache/airflow/pull/5162#discussion_r279185479
 
 

 ##
 File path: tests/executors/test_executor.py
 ##
 @@ -16,47 +16,73 @@
  # KIND, either express or implied.  See the License for the
  # specific language governing permissions and limitations
  # under the License.
+
+from collections import defaultdict
+
  from airflow.executors.base_executor import BaseExecutor
  from airflow.utils.state import State
-
-from airflow import settings
+from airflow.utils.db import create_session
 
 
  class TestExecutor(BaseExecutor):
      """
      TestExecutor is used for unit testing purposes.
      """
 
-    def __init__(self, do_update=False, *args, **kwargs):
+    def __init__(self, do_update=True, *args, **kwargs):
          self.do_update = do_update
          self._running = []
+
+        # A list of "batches" of tasks
          self.history = []
+        # All the tasks, in a stable sort order
+        self.sorted_tasks = []
+        self.mock_task_results = defaultdict(lambda: State.SUCCESS)
 
          super().__init__(*args, **kwargs)
 
-    def execute_async(self, key, command, queue=None):
-        self.log.debug("{} running task instances".format(len(self.running)))
-        self.log.debug("{} in queue".format(len(self.queued_tasks)))
-
      def heartbeat(self):
-        session = settings.Session()
-        if self.do_update:
+        if not self.do_update:
 
 Review comment:
   Don't you want to _mock_ the `heartbeat` here? 
   If so you could remove that extra statement and just patch the 
`TestExecutor.heartbeat`
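
A minimal sketch of the patching approach suggested in the review comment, using `unittest.mock.patch.object` from the standard library. `FakeExecutor` and `run_once` are hypothetical stand-ins invented for this example; they are not Airflow's `TestExecutor`, and the real test setup may differ.

```python
from unittest import mock


class FakeExecutor:
    """Hypothetical stand-in for the executor under test."""

    def heartbeat(self):
        # Pretend the real method has side effects we want to avoid in tests.
        raise RuntimeError("would touch the database")

    def run_once(self):
        self.heartbeat()
        return "done"


def run_with_mocked_heartbeat():
    executor = FakeExecutor()
    # Replace heartbeat on the class for the duration of the with-block only.
    with mock.patch.object(FakeExecutor, "heartbeat") as fake_heartbeat:
        result = executor.run_once()
    return result, fake_heartbeat.call_count
```

With the patch in place no `do_update` style flag is needed inside the class: each test decides for itself whether `heartbeat` runs for real.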




[GitHub] [airflow] RosterIn commented on issue #5196: [AIRFLOW-4423] Improve date handling in mysql to gcs operator.

2019-04-28 Thread GitBox
RosterIn commented on issue #5196: [AIRFLOW-4423] Improve date handling in 
mysql to gcs operator.
URL: https://github.com/apache/airflow/pull/5196#issuecomment-487360020
 
 
   My previous PR is somewhat related, as it also addresses the TIME field. I 
suggest going through the comments of the MySQL library committer, who kindly 
gave input:
   https://github.com/apache/airflow/pull/4802




[jira] [Commented] (AIRFLOW-1860) When adding variables in Web GUI they are case-insensitive, but everywhere else they are case-sensitive

2019-04-28 Thread jack (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-1860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16827864#comment-16827864
 ] 

jack commented on AIRFLOW-1860:
---

Cannot reproduce on Airflow 1.10.3: !aIRFLOW11.PNG!

 



[jira] [Commented] (AIRFLOW-3791) Multiple jobs in run & check if already running

2019-04-28 Thread Chaim (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16827863#comment-16827863
 ] 

Chaim commented on AIRFLOW-3791:


How can I get my pull request merged?

> Multiple jobs in run & check if already running
> ---
>
> Key: AIRFLOW-3791
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3791
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: Dataflow
>Affects Versions: 1.10.2
>Reporter: Chaim
>Assignee: Chaim
>Priority: Major
> Fix For: 1.10.4
>
>
> # Support to check if job is already running before starting java job
>  # In case dataflow creates more than one job, we need to track all jobs for 
> status





[GitHub] [airflow] tssujt commented on issue #5176: [AIRFLOW-2636] Fix Nonetype error for task failure instance duration

2019-04-28 Thread GitBox
tssujt commented on issue #5176: [AIRFLOW-2636] Fix Nonetype error for task 
failure instance duration
URL: https://github.com/apache/airflow/pull/5176#issuecomment-487359241
 
 
   Rebased @feng-tao 




[GitHub] [airflow] feluelle commented on a change in pull request #5196: [AIRFLOW-4423] Improve date handling in mysql to gcs operator.

2019-04-28 Thread GitBox
feluelle commented on a change in pull request #5196: [AIRFLOW-4423] Improve 
date handling in mysql to gcs operator.
URL: https://github.com/apache/airflow/pull/5196#discussion_r279184388
 
 

 ##
 File path: tests/contrib/operators/test_mysql_to_gcs_operator.py
 ##
 @@ -106,7 +121,7 @@ def _assert_upload(bucket, obj, tmp_filename, mime_type=None):
          op.execute(None)
 
          mysql_hook_mock_class.assert_called_once_with(mysql_conn_id=MYSQL_CONN_ID)
-        mysql_hook_mock.get_conn().cursor().execute.assert_called_once_with(SQL)
+        mysql_hook_mock.get_conn().cursor().execute.assert_called_with(SQL)
 
 Review comment:
   Please test that you executed the `tz_query` before you execute the actual 
`sql` statement. You can test that via 
[assert_has_calls](https://docs.python.org/3.6/library/unittest.mock.html#unittest.mock.Mock.assert_has_calls).
 Also see this https://github.com/apache/airflow/pull/5101#discussion_r275438792
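
The ordering check requested in the review can be sketched with `Mock.assert_has_calls`, which passes only if the given calls appear in that order in the mock's call list. The helper `run_queries` and both query strings are hypothetical placeholders for this example; they are not taken from the operator.

```python
from unittest import mock


def run_queries(cursor, tz_query, sql):
    """Hypothetical code under test: set the time zone, then run the query."""
    cursor.execute(tz_query)
    cursor.execute(sql)


def check_call_order():
    cursor = mock.Mock()
    run_queries(cursor, "SET time_zone = '+00:00'", "SELECT * FROM some_table")
    # Fails with AssertionError if the calls are missing or in a different order.
    cursor.execute.assert_has_calls([
        mock.call("SET time_zone = '+00:00'"),
        mock.call("SELECT * FROM some_table"),
    ])
    return cursor.execute.call_count
```

Unlike `assert_called_with`, which only looks at the most recent call, this also pins down that the time zone statement ran first.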




[GitHub] [airflow] feluelle commented on a change in pull request #5196: [AIRFLOW-4423] Improve date handling in mysql to gcs operator.

2019-04-28 Thread GitBox
feluelle commented on a change in pull request #5196: [AIRFLOW-4423] Improve 
date handling in mysql to gcs operator.
URL: https://github.com/apache/airflow/pull/5196#discussion_r279183742
 
 

 ##
 File path: airflow/contrib/operators/mysql_to_gcs.py
 ##
 @@ -265,27 +288,31 @@ def _upload_to_gcs(self, files_to_upload):
                  tmp_file.get('file_handle').name,
                  mime_type=tmp_file.get('file_mime_type'))
 
-    @staticmethod
-    def _convert_types(schema, col_type_dict, row):
+    @classmethod
+    def _convert_types(cls, schema, col_type_dict, row):
+        return [
+            cls._convert_type(value, col_type_dict.get(name))
+            for name, value in zip(schema, row)
+        ]
+
+    @classmethod
+    def _convert_type(cls, value, schema_type):
 
 Review comment:
   The documentation for this conversion is good, but in my opinion we should 
always document the parameters as well, so that the Sphinx-generated 
documentation is cleaner. :)
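
As a rough illustration of what documenting the parameters could look like, here is a sketch using Sphinx field lists (`:param:`, `:type:`, `:return:`). `RowConverter` is a hypothetical stand-in for the operator, and the Decimal-to-float rule is an assumed example conversion, not the PR's actual logic.

```python
from decimal import Decimal


class RowConverter:
    """Hypothetical stand-in for the operator; the docstring style is the point."""

    @classmethod
    def convert_types(cls, schema, col_type_dict, row):
        """
        Convert every value of a result row for export.

        :param schema: column names, in the same order as ``row``
        :type schema: list[str]
        :param col_type_dict: mapping of column name to target type name
        :type col_type_dict: dict[str, str]
        :param row: one result row
        :type row: tuple
        :return: the converted values
        :rtype: list
        """
        return [
            cls.convert_type(value, col_type_dict.get(name))
            for name, value in zip(schema, row)
        ]

    @classmethod
    def convert_type(cls, value, schema_type):
        """
        Convert a single value.

        :param value: raw value returned by the database driver
        :param schema_type: target type name, or ``None`` if unknown
        :type schema_type: str or None
        :return: the converted value
        """
        # Assumed rule for illustration only: Decimal becomes float for FLOAT columns.
        if isinstance(value, Decimal) and schema_type == "FLOAT":
            return float(value)
        return value
```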




[jira] [Created] (AIRFLOW-4425) Add FacebookAdsHook

2019-04-28 Thread jack (JIRA)
jack created AIRFLOW-4425:
-

 Summary: Add FacebookAdsHook
 Key: AIRFLOW-4425
 URL: https://issues.apache.org/jira/browse/AIRFLOW-4425
 Project: Apache Airflow
  Issue Type: Wish
  Components: hooks
Reporter: jack
 Fix For: 2.0.0


Add hook to interact with FacebookAds





[GitHub] [airflow] OmerJog commented on issue #4546: AIRFLOW-3720

2019-04-28 Thread GitBox
OmerJog commented on issue #4546: AIRFLOW-3720
URL: https://github.com/apache/airflow/pull/4546#issuecomment-487355614
 
 
   This can be closed due to the merge of 
https://github.com/apache/airflow/pull/4766




[GitHub] [airflow] codecov-io commented on issue #5195: [AIRFLOW-4348] Add GCP console link in BigQueryOperator

2019-04-28 Thread GitBox
codecov-io commented on issue #5195: [AIRFLOW-4348] Add GCP console link in 
BigQueryOperator
URL: https://github.com/apache/airflow/pull/5195#issuecomment-487352417
 
 
   # [Codecov](https://codecov.io/gh/apache/airflow/pull/5195?src=pr=h1) 
Report
   > Merging 
[#5195](https://codecov.io/gh/apache/airflow/pull/5195?src=pr=desc) into 
[master](https://codecov.io/gh/apache/airflow/commit/75fca09e6d691a3efec66af37cca855332558203?src=pr=desc)
 will **increase** coverage by `<.01%`.
   > The diff coverage is `100%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/airflow/pull/5195/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/airflow/pull/5195?src=pr=tree)
   
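   ```diff
   @@            Coverage Diff             @@
   ##           master    #5195      +/-   ##
   ==========================================
   + Coverage   78.54%   78.55%   +<.01%
   ==========================================
     Files         469      469
     Lines       29896    29905       +9
   ==========================================
   + Hits        23483    23491       +8
   - Misses       6413     6414       +1
   ```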
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/airflow/pull/5195?src=pr=tree) | 
Coverage Δ | |
   |---|---|---|
   | 
[airflow/contrib/operators/bigquery\_operator.py](https://codecov.io/gh/apache/airflow/pull/5195/diff?src=pr=tree#diff-YWlyZmxvdy9jb250cmliL29wZXJhdG9ycy9iaWdxdWVyeV9vcGVyYXRvci5weQ==)
 | `93.95% <100%> (+0.38%)` | :arrow_up: |
   | 
[airflow/models/taskinstance.py](https://codecov.io/gh/apache/airflow/pull/5195/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMvdGFza2luc3RhbmNlLnB5)
 | `92.42% <0%> (-0.18%)` | :arrow_down: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/airflow/pull/5195?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/airflow/pull/5195?src=pr=footer). 
Last update 
[75fca09...6be12da](https://codecov.io/gh/apache/airflow/pull/5195?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   




[GitHub] [airflow] feng-tao commented on issue #5194: [AIRFLOW-4419] Refine concurrency check in scheduler

2019-04-28 Thread GitBox
feng-tao commented on issue #5194: [AIRFLOW-4419] Refine concurrency check in 
scheduler
URL: https://github.com/apache/airflow/pull/5194#issuecomment-487352281
 
 
   will take a look at the pr tomorrow




[GitHub] [airflow] KevinYang21 edited a comment on issue #5177: [AIRFLOW-4084] Fix bug downloading incomplete logs from ElasticSearch

2019-04-28 Thread GitBox
KevinYang21 edited a comment on issue #5177: [AIRFLOW-4084] Fix bug downloading 
incomplete logs from ElasticSearch
URL: https://github.com/apache/airflow/pull/5177#issuecomment-487350990
 
 
   @cong-zhu Did we have a unit test for this feature before? If so, we'll 
still need to update it to capture the use case this PR is fixing; if not, we 
might want to add one. Also, the CI is failing now. If you're still working on 
it, you can mark the PR as a WIP PR by putting `[WIP]` in the PR title.




[GitHub] [airflow] KevinYang21 commented on issue #5177: [AIRFLOW-4084] Fix bug downloading incomplete logs from ElasticSearch

2019-04-28 Thread GitBox
KevinYang21 commented on issue #5177: [AIRFLOW-4084] Fix bug downloading 
incomplete logs from ElasticSearch
URL: https://github.com/apache/airflow/pull/5177#issuecomment-487350990
 
 
   @cong-zhu Did we have a unit test for this feature before? If so, we'll 
still need to update it to capture the use case this PR is fixing; if not, we 
might want to add one.




[jira] [Resolved] (AIRFLOW-4228) DatabricksRunNowOperator does not show up under airflow docs

2019-04-28 Thread Tao Feng (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-4228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Feng resolved AIRFLOW-4228.
---
   Resolution: Fixed
Fix Version/s: 2.0.0

> DatabricksRunNowOperator does not show up under airflow docs
> 
>
> Key: AIRFLOW-4228
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4228
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Thomas Dziedzic
>Assignee: Thomas Elvey
>Priority: Trivial
> Fix For: 2.0.0
>
>
> [https://airflow.apache.org/_modules/airflow/contrib/operators/databricks_operator.html]
>  contains DatabricksRunNowOperator but when you visit 
> [https://airflow.apache.org/integration.html?highlight=databricks#databricks] 
> there is no mention of the DatabricksRunNowOperator even though it is 
> available and working.





[GitHub] [airflow] feng-tao commented on issue #5177: [AIRFLOW-4084] Fix bug downloading incomplete logs from ElasticSearch

2019-04-28 Thread GitBox
feng-tao commented on issue #5177: [AIRFLOW-4084] Fix bug downloading 
incomplete logs from ElasticSearch
URL: https://github.com/apache/airflow/pull/5177#issuecomment-487350721
 
 
   cc @KevinYang21 




[GitHub] [airflow] feng-tao commented on issue #5195: [AIRFLOW-4348] Add GCP console link in BigQueryOperator

2019-04-28 Thread GitBox
feng-tao commented on issue #5195: [AIRFLOW-4348] Add GCP console link in 
BigQueryOperator
URL: https://github.com/apache/airflow/pull/5195#issuecomment-487350649
 
 
   cc GCP folks @fenglu-g @potiuk 




[GitHub] [airflow] feng-tao merged pull request #5182: [AIRFLOW-4361] Fix flaky test_integration_run_dag_with_scheduler_failure

2019-04-28 Thread GitBox
feng-tao merged pull request #5182:  [AIRFLOW-4361] Fix flaky 
test_integration_run_dag_with_scheduler_failure
URL: https://github.com/apache/airflow/pull/5182
 
 
   




[GitHub] [airflow] feng-tao commented on issue #5182: [AIRFLOW-4361] Fix flaky test_integration_run_dag_with_scheduler_failure

2019-04-28 Thread GitBox
feng-tao commented on issue #5182:  [AIRFLOW-4361] Fix flaky 
test_integration_run_dag_with_scheduler_failure
URL: https://github.com/apache/airflow/pull/5182#issuecomment-487350412
 
 
   lgtm




[GitHub] [airflow] feng-tao merged pull request #5171: [AIRFLOW-4228] DatabricksRunNowOperator missing from documentation

2019-04-28 Thread GitBox
feng-tao merged pull request #5171: [AIRFLOW-4228] DatabricksRunNowOperator 
missing from documentation
URL: https://github.com/apache/airflow/pull/5171
 
 
   




[jira] [Commented] (AIRFLOW-4361) Fix flaky test_integration_run_dag_with_scheduler_failure

2019-04-28 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-4361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16827844#comment-16827844
 ] 

ASF GitHub Bot commented on AIRFLOW-4361:
-

feng-tao commented on pull request #5182: [AIRFLOW-4361] Fix flaky 
test_integration_run_dag_with_scheduler_failure
URL: https://github.com/apache/airflow/pull/5182
 
 
   
 



> Fix flaky test_integration_run_dag_with_scheduler_failure
> -
>
> Key: AIRFLOW-4361
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4361
> Project: Apache Airflow
> Issue Type: New Feature
> Reporter: Chao-Han Tsai
> Assignee: Chao-Han Tsai
> Priority: Major
> Fix For: 2.0.0
>
>
> test_integration_run_dag_with_scheduler_failure often fails with
> {code}
> ConnectionError: HTTPConnectionPool(host='10.20.3.19', port=30809): Max 
> retries exceeded with url: 
> /api/experimental/dags/example_kubernetes_executor_config/paused/false 
> (Caused by NewConnectionError(' 0x7fb80986f208>: Failed to establish a new 
> connection: [Errno 111] Connection refused',))
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-4361) Fix flaky test_integration_run_dag_with_scheduler_failure

2019-04-28 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-4361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16827845#comment-16827845
 ] 

ASF subversion and git services commented on AIRFLOW-4361:
--

Commit 9daad7ecd14fabfc3f441074670f4b38aa93e8b0 in airflow's branch 
refs/heads/master from Chao-Han Tsai
[ https://gitbox.apache.org/repos/asf?p=airflow.git;h=9daad7e ]

[AIRFLOW-4361] Fix flaky test_integration_run_dag_with_scheduler_failure (#5182)








[jira] [Commented] (AIRFLOW-4228) DatabricksRunNowOperator does not show up under airflow docs

2019-04-28 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-4228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16827842#comment-16827842
 ] 

ASF subversion and git services commented on AIRFLOW-4228:
--

Commit e27492436aae56bf2119e953cf882723411e4667 in airflow's branch 
refs/heads/master from Tomme
[ https://gitbox.apache.org/repos/asf?p=airflow.git;h=e274924 ]

[AIRFLOW-4228] DatabricksRunNowOperator does not show up under airflow docs 
(#5171)

* Add DatabricksRunNowOperator to integration.rst documentation

> DatabricksRunNowOperator does not show up under airflow docs
> 
>
> Key: AIRFLOW-4228
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4228
> Project: Apache Airflow
> Issue Type: Bug
> Reporter: Thomas Dziedzic
> Assignee: Thomas Elvey
> Priority: Trivial
>
> [https://airflow.apache.org/_modules/airflow/contrib/operators/databricks_operator.html]
>  contains DatabricksRunNowOperator but when you visit 
> [https://airflow.apache.org/integration.html?highlight=databricks#databricks] 
> there is no mention of the DatabricksRunNowOperator even though it is 
> available and working.





[jira] [Commented] (AIRFLOW-4228) DatabricksRunNowOperator does not show up under airflow docs

2019-04-28 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-4228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16827841#comment-16827841
 ] 

ASF GitHub Bot commented on AIRFLOW-4228:
-

feng-tao commented on pull request #5171: [AIRFLOW-4228] 
DatabricksRunNowOperator missing from documentation
URL: https://github.com/apache/airflow/pull/5171
 
 
   
 








[GitHub] [airflow] feng-tao merged pull request #5198: [AIRFLOW-XXX] Update readme for Lyft

2019-04-28 Thread GitBox
feng-tao merged pull request #5198: [AIRFLOW-XXX] Update readme for Lyft
URL: https://github.com/apache/airflow/pull/5198
 
 
   




[GitHub] [airflow] feng-tao commented on issue #5198: [AIRFLOW-XXX] Update readme for Lyft

2019-04-28 Thread GitBox
feng-tao commented on issue #5198: [AIRFLOW-XXX] Update readme for Lyft
URL: https://github.com/apache/airflow/pull/5198#issuecomment-487348695
 
 
   PTAL @milton0825 




[GitHub] [airflow] feng-tao opened a new pull request #5198: [AIRFLOW-XXX] Update readme for Lyft

2019-04-28 Thread GitBox
feng-tao opened a new pull request #5198: [AIRFLOW-XXX] Update readme for Lyft
URL: https://github.com/apache/airflow/pull/5198
 
 
   Add a few more folks from Lyft.




[GitHub] [airflow] feng-tao commented on issue #5197: [AIRFLOW-XXX] Fix CVE-2019-11358

2019-04-28 Thread GitBox
feng-tao commented on issue #5197: [AIRFLOW-XXX] Fix CVE-2019-11358
URL: https://github.com/apache/airflow/pull/5197#issuecomment-487348001
 
 
   thanks @XD-DENG :)




[GitHub] [airflow] codecov-io commented on issue #5197: [AIRFLOW-XXX] Fix CVE-2019-11358

2019-04-28 Thread GitBox
codecov-io commented on issue #5197: [AIRFLOW-XXX] Fix CVE-2019-11358
URL: https://github.com/apache/airflow/pull/5197#issuecomment-487347426
 
 
   # [Codecov](https://codecov.io/gh/apache/airflow/pull/5197?src=pr=h1) Report
   > Merging [#5197](https://codecov.io/gh/apache/airflow/pull/5197?src=pr=desc) into [master](https://codecov.io/gh/apache/airflow/commit/df18b02d2abc98df2ef850f25b82f8202106a147?src=pr=desc) will **increase** coverage by `0.26%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/airflow/pull/5197/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/airflow/pull/5197?src=pr=tree)
   
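   ```diff
   @@            Coverage Diff             @@
   ##           master    #5197      +/-   ##
   ==========================================
   + Coverage   78.27%   78.54%   +0.26%
   ==========================================
     Files         469      469
     Lines       29896    29896
   ==========================================
   + Hits        23402    23482      +80
   + Misses       6494     6414      -80
   ```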
   
   
   | [Impacted Files](https://codecov.io/gh/apache/airflow/pull/5197?src=pr=tree) | Coverage Δ | |
   |---|---|---|
   | [airflow/models/taskinstance.py](https://codecov.io/gh/apache/airflow/pull/5197/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMvdGFza2luc3RhbmNlLnB5) | `92.42% <0%> (-0.18%)` | :arrow_down: |
   | [airflow/hooks/dbapi\_hook.py](https://codecov.io/gh/apache/airflow/pull/5197/diff?src=pr=tree#diff-YWlyZmxvdy9ob29rcy9kYmFwaV9ob29rLnB5) | `88.79% <0%> (+0.86%)` | :arrow_up: |
   | [airflow/models/connection.py](https://codecov.io/gh/apache/airflow/pull/5197/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMvY29ubmVjdGlvbi5weQ==) | `65.73% <0%> (+1.12%)` | :arrow_up: |
   | [airflow/hooks/hive\_hooks.py](https://codecov.io/gh/apache/airflow/pull/5197/diff?src=pr=tree#diff-YWlyZmxvdy9ob29rcy9oaXZlX2hvb2tzLnB5) | `74.93% <0%> (+1.86%)` | :arrow_up: |
   | [airflow/utils/sqlalchemy.py](https://codecov.io/gh/apache/airflow/pull/5197/diff?src=pr=tree#diff-YWlyZmxvdy91dGlscy9zcWxhbGNoZW15LnB5) | `80.95% <0%> (+4.76%)` | :arrow_up: |
   | [airflow/operators/mysql\_operator.py](https://codecov.io/gh/apache/airflow/pull/5197/diff?src=pr=tree#diff-YWlyZmxvdy9vcGVyYXRvcnMvbXlzcWxfb3BlcmF0b3IucHk=) | `100% <0%> (+100%)` | :arrow_up: |
   | [airflow/operators/mysql\_to\_hive.py](https://codecov.io/gh/apache/airflow/pull/5197/diff?src=pr=tree#diff-YWlyZmxvdy9vcGVyYXRvcnMvbXlzcWxfdG9faGl2ZS5weQ==) | `100% <0%> (+100%)` | :arrow_up: |
   
   --
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/airflow/pull/5197?src=pr=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/airflow/pull/5197?src=pr=footer). Last update [df18b02...b9beb58](https://codecov.io/gh/apache/airflow/pull/5197?src=pr=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   




[GitHub] [airflow] XD-DENG merged pull request #5197: [AIRFLOW-XXX] Fix CVE-2019-11358

2019-04-28 Thread GitBox
XD-DENG merged pull request #5197: [AIRFLOW-XXX] Fix CVE-2019-11358
URL: https://github.com/apache/airflow/pull/5197
 
 
   



