[jira] [Created] (AIRFLOW-4831) conf.has_option raises an error if the given section is missing instead of returning false

2019-06-20 Thread Hao Liang (JIRA)
Hao Liang created AIRFLOW-4831:
--

 Summary: conf.has_option raises an error if the given section is 
missing instead of returning false
 Key: AIRFLOW-4831
 URL: https://issues.apache.org/jira/browse/AIRFLOW-4831
 Project: Apache Airflow
  Issue Type: Improvement
  Components: core
Affects Versions: 1.10.3
Reporter: Hao Liang
Assignee: Hao Liang


Currently, conf.has_option raises an error if the given section is missing. I 
think it should return False if either the option or the section is missing, 
which is also the behavior of ConfigParser.has_option.
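For reference, the stdlib behavior the report points to can be checked directly. This is plain `configparser`, not Airflow's `conf` wrapper:

```python
from configparser import ConfigParser

parser = ConfigParser()
parser.read_string("[core]\nexecutor = SequentialExecutor\n")

# has_option returns False both for a missing option and for a missing
# section, rather than raising an exception.
print(parser.has_option("core", "executor"))      # True
print(parser.has_option("core", "no_such_opt"))   # False
print(parser.has_option("no_such_section", "x"))  # False
```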



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] [airflow] zhongjiajie opened a new pull request #5454: [AIRFLOW-XXX] Add next/prev ds not correct in faq

2019-06-20 Thread GitBox
zhongjiajie opened a new pull request #5454: [AIRFLOW-XXX] Add next/prev ds not 
correct in faq
URL: https://github.com/apache/airflow/pull/5454
 
 
   ### Description
   
   Add a note to Faq.md that `[next|prev]_ds[_nodash]` may not be correct
   
   ### Code Quality
   
   - [x] Passes `flake8`
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-4422) Stats about pool utilization

2019-06-20 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-4422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16869195#comment-16869195
 ] 

ASF GitHub Bot commented on AIRFLOW-4422:
-

milton0825 commented on pull request #5453: [AIRFLOW-4422] Pool utilization 
stats
URL: https://github.com/apache/airflow/pull/5453
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [X] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-4422
 - In case you are fixing a typo in the documentation you can prepend your 
commit with \[AIRFLOW-XXX\], code changes always need a Jira issue.
 - In case you are proposing a fundamental code change, you need to create 
an Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)).
 - In case you are adding a dependency, check if the license complies with 
the [ASF 3rd Party License 
Policy](https://www.apache.org/legal/resolved.html#category-x).
   
   ### Description
   
   - [ ] Here are some details about my PR, including screenshots of any UI 
changes:
   
   ### Tests
   
   - [ ] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   ### Commits
   
   - [ ] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [ ] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - All the public functions and the classes in the PR contain docstrings 
that explain what it does
 - If you implement backwards incompatible changes, please leave a note in 
the [Updating.md](https://github.com/apache/airflow/blob/master/UPDATING.md) so 
we can assign it to an appropriate release
   
   ### Code Quality
   
   - [ ] Passes `flake8`
   
 



> Stats about pool utilization
> 
>
> Key: AIRFLOW-4422
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4422
> Project: Apache Airflow
>  Issue Type: New Feature
>Reporter: Chao-Han Tsai
>Assignee: Chao-Han Tsai
>Priority: Major
>
> Currently we have stats around the number of starving tasks in the pool. We 
> should also add more pool stats around used_slots/open_slots.
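A sketch of what such per-pool gauges could look like. The `Stats` class, the `Pool` model, and the metric names below are illustrative stand-ins, not Airflow's actual implementation:

```python
# Illustrative stand-ins for a statsd-style wrapper and a pool model.
class Stats:
    gauges = {}

    @classmethod
    def gauge(cls, name, value):
        cls.gauges[name] = value

class Pool:
    def __init__(self, name, slots, running):
        self.pool = name
        self.slots = slots      # total slots configured for this pool
        self.running = running  # slots held by currently running tasks

    def used_slots(self):
        return self.running

    def open_slots(self):
        return self.slots - self.running

def emit_pool_metrics(pools):
    # one gauge per pool for used and open slots
    for p in pools:
        Stats.gauge('pool.used_slots.{}'.format(p.pool), p.used_slots())
        Stats.gauge('pool.open_slots.{}'.format(p.pool), p.open_slots())

emit_pool_metrics([Pool('default_pool', 128, 5)])
print(Stats.gauges)  # {'pool.used_slots.default_pool': 5, 'pool.open_slots.default_pool': 123}
```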





[GitHub] [airflow] milton0825 opened a new pull request #5453: [AIRFLOW-4422] Pool utilization stats

2019-06-20 Thread GitBox
milton0825 opened a new pull request #5453: [AIRFLOW-4422] Pool utilization 
stats
URL: https://github.com/apache/airflow/pull/5453
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [X] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-4422
 - In case you are fixing a typo in the documentation you can prepend your 
commit with \[AIRFLOW-XXX\], code changes always need a Jira issue.
 - In case you are proposing a fundamental code change, you need to create 
an Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)).
 - In case you are adding a dependency, check if the license complies with 
the [ASF 3rd Party License 
Policy](https://www.apache.org/legal/resolved.html#category-x).
   
   ### Description
   
   - [ ] Here are some details about my PR, including screenshots of any UI 
changes:
   
   ### Tests
   
   - [ ] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   ### Commits
   
   - [ ] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [ ] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - All the public functions and the classes in the PR contain docstrings 
that explain what it does
 - If you implement backwards incompatible changes, please leave a note in 
the [Updating.md](https://github.com/apache/airflow/blob/master/UPDATING.md) so 
we can assign it to an appropriate release
   
   ### Code Quality
   
   - [ ] Passes `flake8`
   




[GitHub] [airflow] jmcarp commented on issue #5259: [AIRFLOW-4478] Lazily instantiate default `Resources` object

2019-06-20 Thread GitBox
jmcarp commented on issue #5259: [AIRFLOW-4478] Lazily instantiate default 
`Resources` object
URL: https://github.com/apache/airflow/pull/5259#issuecomment-504288261
 
 
   Anything else I should update @potiuk?




[GitHub] [airflow] zhongjiajie commented on issue #5448: [AIRFLOW-4827] Remove compatible test for python 2

2019-06-20 Thread GitBox
zhongjiajie commented on issue #5448: [AIRFLOW-4827] Remove compatible test for 
python 2
URL: https://github.com/apache/airflow/pull/5448#issuecomment-504287974
 
 
   CI green, PTAL @XD-DENG thanks.




[GitHub] [airflow] codecov-io edited a comment on issue #5448: [AIRFLOW-4827] Remove compatible test for python 2

2019-06-20 Thread GitBox
codecov-io edited a comment on issue #5448: [AIRFLOW-4827] Remove compatible 
test for python 2
URL: https://github.com/apache/airflow/pull/5448#issuecomment-504164379
 
 
   # [Codecov](https://codecov.io/gh/apache/airflow/pull/5448?src=pr=h1) 
Report
   > Merging 
[#5448](https://codecov.io/gh/apache/airflow/pull/5448?src=pr=desc) into 
[master](https://codecov.io/gh/apache/airflow/commit/8a89175ad938ef1c3200384d1ed48105b61c64de?src=pr=desc)
 will **increase** coverage by `0.1%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/airflow/pull/5448/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/airflow/pull/5448?src=pr=tree)
   
   ```diff
   @@Coverage Diff@@
   ##   master#5448 +/-   ##
   =
   + Coverage   79.01%   79.11%   +0.1% 
   =
 Files 488  488 
 Lines   3056430564 
   =
   + Hits2415124182 +31 
   + Misses   6413 6382 -31
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/airflow/pull/5448?src=pr=tree) | 
Coverage Δ | |
   |---|---|---|
   | 
[airflow/utils/dag\_processing.py](https://codecov.io/gh/apache/airflow/pull/5448/diff?src=pr=tree#diff-YWlyZmxvdy91dGlscy9kYWdfcHJvY2Vzc2luZy5weQ==)
 | `60.27% <0%> (+1.02%)` | :arrow_up: |
   | 
[airflow/jobs/scheduler\_job.py](https://codecov.io/gh/apache/airflow/pull/5448/diff?src=pr=tree#diff-YWlyZmxvdy9qb2JzL3NjaGVkdWxlcl9qb2IucHk=)
 | `70.18% <0%> (+1.24%)` | :arrow_up: |
   | 
[airflow/executors/\_\_init\_\_.py](https://codecov.io/gh/apache/airflow/pull/5448/diff?src=pr=tree#diff-YWlyZmxvdy9leGVjdXRvcnMvX19pbml0X18ucHk=)
 | `66.66% <0%> (+4.16%)` | :arrow_up: |
   | 
[airflow/utils/sqlalchemy.py](https://codecov.io/gh/apache/airflow/pull/5448/diff?src=pr=tree#diff-YWlyZmxvdy91dGlscy9zcWxhbGNoZW15LnB5)
 | `80.95% <0%> (+4.76%)` | :arrow_up: |
   | 
[airflow/executors/sequential\_executor.py](https://codecov.io/gh/apache/airflow/pull/5448/diff?src=pr=tree#diff-YWlyZmxvdy9leGVjdXRvcnMvc2VxdWVudGlhbF9leGVjdXRvci5weQ==)
 | `100% <0%> (+52.38%)` | :arrow_up: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/airflow/pull/5448?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/airflow/pull/5448?src=pr=footer). 
Last update 
[8a89175...488c848](https://codecov.io/gh/apache/airflow/pull/5448?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   




[jira] [Closed] (AIRFLOW-4828) Remove param python_version in PythonVirtualenvOperator

2019-06-20 Thread zhongjiajie (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-4828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhongjiajie closed AIRFLOW-4828.

Resolution: Won't Fix

According to kurtqq's comment on GitHub:

> I don't think we should remove the `python_version` parameter.
> If I run Airflow on Python 3.5 and want to run a task with Python 3.7, I need 
> the `python_version` parameter.
> Moreover, Python 4 will be released in the future, and Airflow will likely 
> support both for a while.
> 
> I think `python_version` needs to stay; it's not unique to Python 2.7.
> 
> I also think that if someone wants to run something with Python 2.7 with the 
> `PythonVirtualenvOperator`, he is more than welcome. There is no need to 
> prevent the user from doing so. Airflow can just remove the test for this, so 
> if something is not working, it's the user's problem.

 

 

> Remove param python_version in PythonVirtualenvOperator
> ---
>
> Key: AIRFLOW-4828
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4828
> Project: Apache Airflow
>  Issue Type: Sub-task
>  Components: operators
>Affects Versions: 1.10.3
>Reporter: zhongjiajie
>Assignee: zhongjiajie
>Priority: Major
>
> Remove param python_version in PythonVirtualenvOperator





[jira] [Commented] (AIRFLOW-4828) Remove param python_version in PythonVirtualenvOperator

2019-06-20 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-4828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16869114#comment-16869114
 ] 

ASF GitHub Bot commented on AIRFLOW-4828:
-

zhongjiajie commented on pull request #5449: [AIRFLOW-4828] Remove parameter 
python_version
URL: https://github.com/apache/airflow/pull/5449
 
 
   
 



> Remove param python_version in PythonVirtualenvOperator
> ---
>
> Key: AIRFLOW-4828
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4828
> Project: Apache Airflow
>  Issue Type: Sub-task
>  Components: operators
>Affects Versions: 1.10.3
>Reporter: zhongjiajie
>Assignee: zhongjiajie
>Priority: Major
>
> Remove param python_version in PythonVirtualenvOperator





[GitHub] [airflow] XD-DENG commented on issue #5449: [AIRFLOW-4828] Remove parameter python_version

2019-06-20 Thread GitBox
XD-DENG commented on issue #5449: [AIRFLOW-4828] Remove parameter python_version
URL: https://github.com/apache/airflow/pull/5449#issuecomment-504269383
 
 
   Thanks @zhongjiajie 




[GitHub] [airflow] zhongjiajie commented on a change in pull request #5448: [AIRFLOW-4827] Remove compatible test for python 2

2019-06-20 Thread GitBox
zhongjiajie commented on a change in pull request #5448: [AIRFLOW-4827] Remove 
compatible test for python 2
URL: https://github.com/apache/airflow/pull/5448#discussion_r296072721
 
 

 ##
 File path: tests/hooks/test_hive_hook.py
 ##
 @@ -460,23 +458,7 @@ def test_get_results_data(self):
         results = hook.get_results(query, schema=self.database)
         self.assertListEqual(results['data'], [(1, 1), (2, 2)])
 
-    @unittest.skipIf(NOT_ASSERTLOGS_VERSION < 3.4, 'assertLogs not support before python 3.4')
-    def test_to_csv_assertlogs(self):
-        hook = HiveServer2Hook()
-        query = "SELECT * FROM {}".format(self.table)
-        csv_filepath = 'query_results.csv'
-        with self.assertLogs() as cm:
-            hook.to_csv(query, csv_filepath, schema=self.database,
-                        delimiter=',', lineterminator='\n',
-                        output_header=True, fetch_size=2)
-        df = pd.read_csv(csv_filepath, sep=',')
-        self.assertListEqual(df.columns.tolist(), self.columns)
-        self.assertListEqual(df[self.columns[0]].values.tolist(), [1, 2])
-        self.assertEqual(len(df), 2)
-        self.assertIn('INFO:airflow.hooks.hive_hooks.HiveServer2Hook:'
-                      'Written 2 rows so far.', cm.output)
-
-    @unittest.skipIf(NOT_ASSERTLOGS_VERSION >= 3.4, 'test could cover by test_to_csv_assertlogs')
-    def test_to_csv_without_assertlogs(self):
 
 Review comment:
   You're right, I removed the wrong one.




[GitHub] [airflow] zhongjiajie closed pull request #5449: [AIRFLOW-4828] Remove parameter python_version

2019-06-20 Thread GitBox
zhongjiajie closed pull request #5449: [AIRFLOW-4828] Remove parameter 
python_version
URL: https://github.com/apache/airflow/pull/5449
 
 
   




[GitHub] [airflow] zhongjiajie commented on issue #5449: [AIRFLOW-4828] Remove parameter python_version

2019-06-20 Thread GitBox
zhongjiajie commented on issue #5449: [AIRFLOW-4828] Remove parameter 
python_version
URL: https://github.com/apache/airflow/pull/5449#issuecomment-504269187
 
 
   Will close it, thanks for the clarification @kurtqq @XD-DENG 






[GitHub] [airflow] zhongjiajie commented on a change in pull request #5448: [AIRFLOW-4827] Remove compatible test for python 2

2019-06-20 Thread GitBox
zhongjiajie commented on a change in pull request #5448: [AIRFLOW-4827] Remove 
compatible test for python 2
URL: https://github.com/apache/airflow/pull/5448#discussion_r296072633
 
 

 ##
 File path: tests/hooks/test_hive_hook.py
 ##
 @@ -460,23 +458,7 @@ def test_get_results_data(self):
         results = hook.get_results(query, schema=self.database)
         self.assertListEqual(results['data'], [(1, 1), (2, 2)])
 
-    @unittest.skipIf(NOT_ASSERTLOGS_VERSION < 3.4, 'assertLogs not support before python 3.4')
-    def test_to_csv_assertlogs(self):
-        hook = HiveServer2Hook()
-        query = "SELECT * FROM {}".format(self.table)
-        csv_filepath = 'query_results.csv'
-        with self.assertLogs() as cm:
-            hook.to_csv(query, csv_filepath, schema=self.database,
-                        delimiter=',', lineterminator='\n',
-                        output_header=True, fetch_size=2)
-        df = pd.read_csv(csv_filepath, sep=',')
-        self.assertListEqual(df.columns.tolist(), self.columns)
-        self.assertListEqual(df[self.columns[0]].values.tolist(), [1, 2])
-        self.assertEqual(len(df), 2)
-        self.assertIn('INFO:airflow.hooks.hive_hooks.HiveServer2Hook:'
-                      'Written 2 rows so far.', cm.output)
-
-    @unittest.skipIf(NOT_ASSERTLOGS_VERSION >= 3.4, 'test could cover by test_to_csv_assertlogs')
-    def test_to_csv_without_assertlogs(self):
 
 Review comment:
   You're right, that was a wrong removal.




[GitHub] [airflow] XD-DENG commented on issue #5449: [AIRFLOW-4828] Remove parameter python_version

2019-06-20 Thread GitBox
XD-DENG commented on issue #5449: [AIRFLOW-4828] Remove parameter python_version
URL: https://github.com/apache/airflow/pull/5449#issuecomment-504258657
 
 
   Agree with @kurtqq .
   
   - To my understanding, AIP-3 (drop support for Python 2) is about dropping 
Py2 support for Airflow itself only.
   - The minor version (the "x" in "3.x") matters.




[GitHub] [airflow] XD-DENG commented on a change in pull request #5448: [AIRFLOW-4827] Remove compatible test for python 2

2019-06-20 Thread GitBox
XD-DENG commented on a change in pull request #5448: [AIRFLOW-4827] Remove 
compatible test for python 2
URL: https://github.com/apache/airflow/pull/5448#discussion_r296070298
 
 

 ##
 File path: tests/hooks/test_hive_hook.py
 ##
 @@ -460,23 +458,7 @@ def test_get_results_data(self):
         results = hook.get_results(query, schema=self.database)
         self.assertListEqual(results['data'], [(1, 1), (2, 2)])
 
-    @unittest.skipIf(NOT_ASSERTLOGS_VERSION < 3.4, 'assertLogs not support before python 3.4')
-    def test_to_csv_assertlogs(self):
-        hook = HiveServer2Hook()
-        query = "SELECT * FROM {}".format(self.table)
-        csv_filepath = 'query_results.csv'
-        with self.assertLogs() as cm:
-            hook.to_csv(query, csv_filepath, schema=self.database,
-                        delimiter=',', lineterminator='\n',
-                        output_header=True, fetch_size=2)
-        df = pd.read_csv(csv_filepath, sep=',')
-        self.assertListEqual(df.columns.tolist(), self.columns)
-        self.assertListEqual(df[self.columns[0]].values.tolist(), [1, 2])
-        self.assertEqual(len(df), 2)
-        self.assertIn('INFO:airflow.hooks.hive_hooks.HiveServer2Hook:'
-                      'Written 2 rows so far.', cm.output)
-
-    @unittest.skipIf(NOT_ASSERTLOGS_VERSION >= 3.4, 'test could cover by test_to_csv_assertlogs')
-    def test_to_csv_without_assertlogs(self):
 
 Review comment:
   Not sure if you have removed the wrong test case?
   The test case you removed will be skipped if the Py version is < 3.4, i.e., this 
test case is the one for Py 3.4+, which should be kept.
   (**I haven't looked into the test case details. Just checking the 
`@unittest.skipIf` conditions.**)
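The `skipIf` semantics at issue can be illustrated in isolation: the decorated test is skipped when the condition is True, so `skipIf(version < 3.4, ...)` marks the test that should run on 3.4+.

```python
import unittest

class SkipIfDemo(unittest.TestCase):
    @unittest.skipIf(False, 'condition is False, so this test runs')
    def test_runs(self):
        self.assertTrue(True)

    @unittest.skipIf(True, 'condition is True, so this test is skipped')
    def test_skipped(self):
        self.fail('never executed')

# run the two tests in-process and inspect the result
suite = unittest.defaultTestLoader.loadTestsFromTestCase(SkipIfDemo)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(len(result.skipped), result.wasSuccessful())  # 1 True
```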




[jira] [Created] (AIRFLOW-4830) Timezone ignored if default_args used for multiple dags

2019-06-20 Thread Dean (JIRA)
Dean created AIRFLOW-4830:
-

 Summary: Timezone ignored if default_args used for multiple dags
 Key: AIRFLOW-4830
 URL: https://issues.apache.org/jira/browse/AIRFLOW-4830
 Project: Apache Airflow
  Issue Type: Bug
  Components: DAG
Affects Versions: 1.10.3
Reporter: Dean
 Attachments: Screen Shot 2019-06-20 at 4.43.16 PM.png, Screen Shot 
2019-06-20 at 4.45.15 PM.png

I created a {{default_args}} dict and passed it to two different DAGs. In the 
first DAG, the job was scheduled for 'America/Los_Angeles' as specified in the 
dict, but for the second it was in 'UTC' (it should also be 
'America/Los_Angeles').

{code:python}
from airflow import DAG
from airflow.operators.python_operator import PythonOperator
from datetime import datetime
import logging
from pendulum import timezone

def msg(s, **kwargs):
    logging.info('{}: {}'.format(s, str(kwargs['execution_date'])))

default_args = {
    'start_date': datetime(2018, 1, 1, tzinfo=timezone('America/Los_Angeles'))
}

dag1 = DAG('tz_test_shared_1',
           default_args=default_args,
           catchup=True,
           schedule_interval='0 0 * * *')

t1 = PythonOperator(
    task_id='t1',
    provide_context=True,
    op_args=['t1'],
    python_callable=msg,
    dag=dag1)

dag2 = DAG('tz_test_shared_2',
           default_args=default_args,
           catchup=True,
           schedule_interval='0 0 * * *')

t2 = PythonOperator(
    task_id='t2',
    provide_context=True,
    op_args=['t2'],
    python_callable=msg,
    dag=dag2)
{code}

See the resulting task execution times in Screen Shot 2019-06-20 at 4.45.15 
PM.png. One job happens at 08:00 UTC (as expected), but the other at 00:00 UTC.

Compare that to this version; the only differences are the DAG names and that 
{{default_args}} is repeated:

{code:python}
from airflow import DAG
from airflow.operators.python_operator import PythonOperator
from datetime import datetime
import logging
from pendulum import timezone

def msg(s, **kwargs):
    logging.info('{}: {}'.format(s, str(kwargs['execution_date'])))

default_args = {
    'start_date': datetime(2018, 1, 1, tzinfo=timezone('America/Los_Angeles'))
}

dag1 = DAG('tz_test_1',
           default_args=default_args,
           catchup=True,
           schedule_interval='0 0 * * *')

t1 = PythonOperator(
    task_id='t1',
    provide_context=True,
    op_args=['t1'],
    python_callable=msg,
    dag=dag1)

default_args = {
    'start_date': datetime(2018, 1, 1, tzinfo=timezone('America/Los_Angeles'))
}

dag2 = DAG('tz_test_2',
           default_args=default_args,
           catchup=True,
           schedule_interval='0 0 * * *')

t2 = PythonOperator(
    task_id='t2',
    provide_context=True,
    op_args=['t2'],
    python_callable=msg,
    dag=dag2)
{code}

See the resulting task execution times in Screen Shot 2019-06-20 at 4.43.16 
PM.png. Both happen at 08:00 UTC (as expected).
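One plausible mechanism, offered as an assumption rather than a confirmed diagnosis: the shared {{default_args}} dict is mutated in place by whichever DAG consumes it first, so the second DAG receives an already-normalized (naive) start_date. A minimal stdlib sketch of that aliasing hazard, with `build_dag` as a hypothetical stand-in for the DAG constructor:

```python
from datetime import datetime, timedelta, timezone

LA = timezone(timedelta(hours=-8))  # fixed-offset stand-in for America/Los_Angeles

default_args = {'start_date': datetime(2018, 1, 1, tzinfo=LA)}

def build_dag(args):
    # hypothetical in-place normalization to naive UTC; mutating the
    # caller's dict is the bug being illustrated
    args['start_date'] = args['start_date'].astimezone(timezone.utc).replace(tzinfo=None)
    return args['start_date']

first = build_dag(default_args)
print(first)                              # 2018-01-01 08:00:00
print(default_args['start_date'].tzinfo)  # None: the shared dict was mutated,
                                          # so a second DAG sees a naive datetime
```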





[GitHub] [airflow] dstandish commented on a change in pull request #5332: [AIRFLOW-3391] Upgrade pendulum to latest major version.

2019-06-20 Thread GitBox
dstandish commented on a change in pull request #5332: [AIRFLOW-3391] Upgrade 
pendulum to latest major version.
URL: https://github.com/apache/airflow/pull/5332#discussion_r296034481
 
 

 ##
 File path: airflow/utils/timezone.py
 ##
 @@ -89,10 +89,11 @@ def convert_to_utc(value):
 if not value:
 return value
 
-if not is_localized(value):
-value = pendulum.instance(value, TIMEZONE)
-
-return value.astimezone(utc)
+return (
 
 Review comment:
   OK so I finally installed py3.5 locally to see what’s going on.
   
   `Timezone.convert()` does not work right in pendulum 2.0 on python 3.5.  It 
does not respect `dst_rule`.  So the problem is with the structure of the test 
`test_following_previous_schedule` itself.
   
   In `test_following_previous_schedule`, `start` is defined like so:
   ```python
   start = local_tz.convert(datetime.datetime(2018, 10, 28, 2, 55), dst_rule=pendulum.PRE_TRANSITION)
   ```
   
   On py3.5 with pendulum 2.0, `start` gets the wrong utcoffset: 1 hour;  in 
all other combinations (i.e. 3.5/1.5, 3.6/1.5, 3.7/1.5, 3.6/2.0, 3.7/2.0), the 
offset is 2 hours (which is the correct offset).
   
   So, I think what we need to do is instead define `start` like so:
   ```python
   start = pendulum.datetime(2018, 10, 28, 2, 55, dst_rule=pendulum.PRE_TRANSITION, tz=local_tz)
   ```
   
   And additionally, we need to review the code, to see where we might be using 
`Timezone.convert`, and make changes as appropriate.
   
   Additional note:
   On py3.5 with pendulum 2.0 you can see that dst_rule is not respected in 
`convert`:
   ```python
   import pendulum, datetime

   local_tz = pendulum.timezone('Europe/Zurich')
   UTC = pendulum.timezone('UTC')
   start_post = local_tz.convert(datetime.datetime(2018, 10, 28, 2, 55), dst_rule=pendulum.POST_TRANSITION)
   start_pre = local_tz.convert(datetime.datetime(2018, 10, 28, 2, 55), dst_rule=pendulum.PRE_TRANSITION)
   print(start_post.isoformat())
   print(start_pre.isoformat())
   ```
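The same ambiguity can be shown with the stdlib alone (PEP 495 `fold`; `zoneinfo` requires Python 3.9+): `fold=0` corresponds to pendulum's PRE_TRANSITION and carries the +02:00 offset the test expects.

```python
from datetime import datetime
from zoneinfo import ZoneInfo

tz = ZoneInfo('Europe/Zurich')
# On 2018-10-28 clocks fall back from 03:00 CEST to 02:00 CET, so 02:55
# occurs twice; `fold` selects which occurrence is meant.
pre = datetime(2018, 10, 28, 2, 55, tzinfo=tz, fold=0)   # first pass (CEST)
post = datetime(2018, 10, 28, 2, 55, tzinfo=tz, fold=1)  # second pass (CET)
print(pre.utcoffset())   # 2:00:00
print(post.utcoffset())  # 1:00:00
```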
   




[jira] [Commented] (AIRFLOW-4791) SnowflakeOperator doesn't support schema argument

2019-06-20 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-4791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868988#comment-16868988
 ] 

ASF subversion and git services commented on AIRFLOW-4791:
--

Commit 8a89175ad938ef1c3200384d1ed48105b61c64de in airflow's branch 
refs/heads/master from Jake Gysland
[ https://gitbox.apache.org/repos/asf?p=airflow.git;h=8a89175 ]

AIRFLOW-4791 add "schema" keyword arg to SnowflakeOperator (#5415)

* added role, schema and warehouse

added additional connection params - the current hook does not allow for connecting 
with the correct role, i.e. security. The current hook requires full qualification of 
database.schema.table_name in the query - we should have the ability to have per 
schema/database connections in Airflow

* added extra connection parameters to the operator

Added role, warehouse and database parameters to the operator. Currently 
unable to use the operator without qualifying database.schema.table_name in the 
SQL statement. Proper security is not applied if the user id belongs to multiple 
roles, so role is not defined when connecting.

* AIRFLOW-4791 readability

* AIRFLOW-4791 add to (and improve) docstring
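The override pattern this commit describes (operator-level schema/role/warehouse taking precedence over connection-level defaults) can be sketched as follows; the function and parameter names are illustrative, not the hook's actual code:

```python
def resolve_conn_params(conn_extra, **overrides):
    # start from the connection's stored extras, then let explicit
    # operator-level values win; None means "not set, keep the default"
    params = dict(conn_extra)
    params.update({k: v for k, v in overrides.items() if v is not None})
    return params

conn_extra = {'warehouse': 'LOADING', 'role': 'LOADER', 'schema': 'PUBLIC'}
params = resolve_conn_params(conn_extra, schema='ANALYTICS', role=None)
print(params['schema'])  # ANALYTICS (operator override)
print(params['role'])    # LOADER (connection default kept)
```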


> SnowflakeOperator doesn't support schema argument
> -
>
> Key: AIRFLOW-4791
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4791
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: contrib, hooks, operators
>Affects Versions: 1.10.3
>Reporter: Jake Gysland
>Priority: Major
>
> {{airflow.contrib.snowflake_operator.SnowflakeOperator}} doesn't provide a 
> way to set the schema to use for the connection.
> There was a [PR opened on 
> GitHub|https://github.com/apache/airflow/pull/4306] that added this, but the 
> committer didn't make a Jira ticket or complete the PR message template 
> correctly, and the PR was later closed.







[jira] [Commented] (AIRFLOW-4791) SnowflakeOperator doesn't support schema argument

2019-06-20 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-4791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868987#comment-16868987
 ] 

ASF subversion and git services commented on AIRFLOW-4791:
--

Commit 8a89175ad938ef1c3200384d1ed48105b61c64de in airflow's branch 
refs/heads/master from Jake Gysland
[ https://gitbox.apache.org/repos/asf?p=airflow.git;h=8a89175 ]

AIRFLOW-4791 add "schema" keyword arg to SnowflakeOperator (#5415)

* added role, schema and warehouse

added addition connection params - current hook does not allow for connecting 
with correct role ie security.  current  hook requires full qualification of 
database.schema.table_name in query - we should have the ability to have per 
schema/database connections in Airflow

* added extra connection paramaters to operator

Added role, warehouse and database parameters to the operator.  Currently 
unable to use operator without qualifying the database.schema.table_name in the 
sql statement.  Proper security is not applied if user id belongs to multiple 
roles, so roles is not defined when connecting.

* AIRFLOW-4791 readability

* AIRFLOW-4791 add to (and improve) docstring


> SnowflakeOperator doesn't support schema argument
> -
>
> Key: AIRFLOW-4791
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4791
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: contrib, hooks, operators
>Affects Versions: 1.10.3
>Reporter: Jake Gysland
>Priority: Major
>
> {{airflow.contrib.snowflake_operator.SnowflakeOperator}} doesn't provide a 
> way to set the schema to use for the connection.
> There was a [PR opened on 
> [Github|https://github.com/apache/airflow/pull/4306] that added this, but the 
> committer didn't make a Jira ticket or complete the PR message template 
> correctly, and the PR was later closed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-4791) SnowflakeOperator doesn't support schema argument

2019-06-20 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-4791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868986#comment-16868986
 ] 

ASF GitHub Bot commented on AIRFLOW-4791:
-

jghoman commented on pull request #5415: AIRFLOW-4791 add "schema" keyword arg 
to SnowflakeOperator
URL: https://github.com/apache/airflow/pull/5415
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> SnowflakeOperator doesn't support schema argument
> -
>
> Key: AIRFLOW-4791
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4791
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: contrib, hooks, operators
>Affects Versions: 1.10.3
>Reporter: Jake Gysland
>Priority: Major
>
> {{airflow.contrib.snowflake_operator.SnowflakeOperator}} doesn't provide a 
> way to set the schema to use for the connection.
> There was a [PR opened on 
> [Github|https://github.com/apache/airflow/pull/4306] that added this, but the 
> committer didn't make a Jira ticket or complete the PR message template 
> correctly, and the PR was later closed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] [airflow] jghoman merged pull request #5415: AIRFLOW-4791 add "schema" keyword arg to SnowflakeOperator

2019-06-20 Thread GitBox
jghoman merged pull request #5415: AIRFLOW-4791 add "schema" keyword arg to 
SnowflakeOperator
URL: https://github.com/apache/airflow/pull/5415
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] jghoman commented on issue #5415: AIRFLOW-4791 add "schema" keyword arg to SnowflakeOperator

2019-06-20 Thread GitBox
jghoman commented on issue #5415: AIRFLOW-4791 add "schema" keyword arg to 
SnowflakeOperator
URL: https://github.com/apache/airflow/pull/5415#issuecomment-504222581
 
 
   +1.  Thanks, Jake.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] dstandish commented on a change in pull request #5332: [AIRFLOW-3391] Upgrade pendulum to latest major version.

2019-06-20 Thread GitBox
dstandish commented on a change in pull request #5332: [AIRFLOW-3391] Upgrade 
pendulum to latest major version.
URL: https://github.com/apache/airflow/pull/5332#discussion_r296034481
 
 

 ##
 File path: airflow/utils/timezone.py
 ##
 @@ -89,10 +89,11 @@ def convert_to_utc(value):
 if not value:
 return value
 
-if not is_localized(value):
-value = pendulum.instance(value, TIMEZONE)
-
-return value.astimezone(utc)
+return (
 
 Review comment:
   OK, so I finally installed py3.5 locally to see what's going on.
   
   `Timezone.convert()` does not work right in pendulum 2.0 on python 3.5.  It 
does not respect `dst_rule`.  So the problem is with the structure of the test 
`test_following_previous_schedule` itself.
   
   In `test_following_previous_schedule`, `start` is defined like so:
   ```python
   start = local_tz.convert(datetime.datetime(2018, 10, 28, 2, 55), dst_rule=pendulum.PRE_TRANSITION)
   ```
   
   On py3.5 with pendulum 2.0, `start` gets the wrong utcoffset: 1 hour;  in 
all other combinations (i.e. 3.5/1.5, 3.6/1.5, 3.7/1.5, 3.6/2.0, 3.7/2.0), the 
offset is 2 hours (which is the correct offset).
   
   So, I think what we need to do is instead define `start` like so:
   ```python
   start = pendulum.datetime(2018, 10, 28, 2, 55, dst_rule=pendulum.PRE_TRANSITION, tz=local_tz)
   ```
   
   And additionally, we need to review the code, to see where we might be using 
`Timezone.convert`, and make changes as appropriate.
   
   Additional note:
   On py3.5 with pendulum 2.0 you can see that dst_rule is not respected in 
`convert`:
   ```python
   import pendulum, datetime
   
   local_tz = pendulum.timezone('Europe/Zurich')
   UTC = pendulum.timezone('UTC')
   start_post = local_tz.convert(datetime.datetime(2018, 10, 28, 2, 55), 
dst_rule=pendulum.POST_TRANSITION)
   start_pre = local_tz.convert(datetime.datetime(2018, 10, 28, 2, 55), 
dst_rule=pendulum.PRE_TRANSITION)
   print(start_post.isoformat())
   print(start_pre.isoformat())
   ```
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] kpathak13 commented on a change in pull request #5356: [AIRFLOW-4363] Fix Json encoding error when retrieving `status` from cli in docker operator

2019-06-20 Thread GitBox
kpathak13 commented on a change in pull request #5356: [AIRFLOW-4363] Fix Json 
encoding error when retrieving `status` from cli in docker operator
URL: https://github.com/apache/airflow/pull/5356#discussion_r296029509
 
 

 ##
 File path: airflow/operators/docker_operator.py
 ##
 @@ -198,7 +198,7 @@ def execute(self, context):
 
 if self.force_pull or len(self.cli.images(name=self.image)) == 0:
 
 Review comment:
   ```suggestion
        if self.force_pull:
            ignore = ["Downloading", "Extracting", "Waiting"]
            self.log.info('Force pulling docker image %s', self.image)
            for output in self.cli.pull(self.image, stream=True, decode=True):
                if 'status' in output and not any(w in output['status'] for w in ignore):
                    self.log.info("%s", output['status'])
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] kpathak13 commented on a change in pull request #5356: [AIRFLOW-4363] Fix Json encoding error when retrieving `status` from cli in docker operator

2019-06-20 Thread GitBox
kpathak13 commented on a change in pull request #5356: [AIRFLOW-4363] Fix Json 
encoding error when retrieving `status` from cli in docker operator
URL: https://github.com/apache/airflow/pull/5356#discussion_r296028365
 
 

 ##
 File path: airflow/operators/docker_operator.py
 ##
 @@ -198,7 +198,7 @@ def execute(self, context):
 
 if self.force_pull or len(self.cli.images(name=self.image)) == 0:
 self.log.info('Pulling docker image %s', self.image)
-for l in self.cli.pull(self.image, stream=True):
+for l in self.cli.pull(self.image, stream=True, decode=True):
 output = json.loads(l.decode('utf-8').strip())
 
 Review comment:
   ```suggestion
        if self.force_pull:
            ignore = ["Downloading", "Extracting", "Waiting"]
            self.log.info('Force pulling docker image %s', self.image)
            for output in self.cli.pull(self.image, stream=True, decode=True):
                if 'status' in output and not any(w in output['status'] for w in ignore):
                    self.log.info("%s", output['status'])
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] kpathak13 commented on a change in pull request #5356: [AIRFLOW-4363] Fix Json encoding error when retrieving `status` from cli in docker operator

2019-06-20 Thread GitBox
kpathak13 commented on a change in pull request #5356: [AIRFLOW-4363] Fix Json 
encoding error when retrieving `status` from cli in docker operator
URL: https://github.com/apache/airflow/pull/5356#discussion_r296028365
 
 

 ##
 File path: airflow/operators/docker_operator.py
 ##
 @@ -198,7 +198,7 @@ def execute(self, context):
 
 if self.force_pull or len(self.cli.images(name=self.image)) == 0:
 self.log.info('Pulling docker image %s', self.image)
-for l in self.cli.pull(self.image, stream=True):
+for l in self.cli.pull(self.image, stream=True, decode=True):
 output = json.loads(l.decode('utf-8').strip())
 
 Review comment:
   ```suggestion
        if self.force_pull:
            ignore = ["Downloading", "Extracting", "Waiting"]
            self.log.info('Force pulling docker image %s', self.image)
            for output in self.cli.pull(self.image, stream=True, decode=True):
                if 'status' in output and not any(w in output['status'] for w in ignore):
                    self.log.info("%s", output['status'])
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services




[GitHub] [airflow] kpathak13 commented on a change in pull request #5356: [AIRFLOW-4363] Fix Json encoding error when retrieving `status` from cli in docker operator

2019-06-20 Thread GitBox
kpathak13 commented on a change in pull request #5356: [AIRFLOW-4363] Fix Json 
encoding error when retrieving `status` from cli in docker operator
URL: https://github.com/apache/airflow/pull/5356#discussion_r296027561
 
 

 ##
 File path: airflow/operators/docker_operator.py
 ##
 @@ -198,7 +198,7 @@ def execute(self, context):
 
 if self.force_pull or len(self.cli.images(name=self.image)) == 0:
 
 Review comment:
   Do you need the `len(self.cli.images(name=self.image)) == 0` check? I think 
that is the default behavior.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] OmerJog commented on issue #1942: [AIRFLOW-697] Add exclusion of tasks.

2019-06-20 Thread GitBox
OmerJog commented on issue #1942: [AIRFLOW-697] Add exclusion of tasks.
URL: https://github.com/apache/airflow/pull/1942#issuecomment-504178925
 
 
   @YangLeoZhao if you implemented it and can share it with the community, it 
would be appreciated.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-1148) Airflow cannot handle datetime(6) column values(execution_time, start_date, end_date)

2019-06-20 Thread jack (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-1148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868928#comment-16868928
 ] 

jack commented on AIRFLOW-1148:
---

Can you check this against newer airflow version?

> Airflow cannot handle datetime(6) column values(execution_time, start_date, 
> end_date)
> -
>
> Key: AIRFLOW-1148
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1148
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: DagRun
>Affects Versions: 1.8.0
> Environment: sql_alchemy_conn: cloudSQL via cloud_sql_proxy
> celery broker: amazon SQS
>Reporter: Maoya Sato
>Priority: Major
>
> Airflow cannot handle datetime(6) column values (execution_date, start_date, 
> end_date etc..)
> {code}
> mysql> select dag_id, execution_date from dag_run;
> +----------+------------------------+
> | dag_id   | execution_date         |
> +----------+------------------------+
> | test_dag | 2017-04-26 13:15:00.00 |
> +----------+------------------------+
> {code}
> {code}
> >>> from airflow import settings
> >>> session = settings.Session()
> >>> from airflow.models import DagRun
> >>> dag = session.query(DagRun).filter_by(dag_id='test_dag').first()
> >>> dag.execution_date
> >>>
> {code}
> execution_date gets None though it should be like datetime(2017, 4, 26, 13, 
> 15)
> The reason, as far as I can tell, is that datetime(6) is the cause: if I try with a 
> datetime column without fractional seconds precision, it works.
> It has something to do with this migration(adding fsp to datetime column)
> https://github.com/apache/incubator-airflow/blob/master/airflow/migrations/versions/4addfa1236f1_add_fractional_seconds_to_mysql_tables.py
> I've created a simple dag (python2)
> {code}
> import airflow
> from airflow import DAG
> from airflow.operators.bash_operator import BashOperator
> from datetime import timedelta, datetime
> default_args = {
> 'owner': 'airflow',
> 'depends_on_past': False,
> 'start_date': datetime(2017, 4, 26, 13, 15),
> 'retries': 1,
> 'retry_delay': timedelta(minutes=5),
> 'queue': 'airflow-dev',
> 'end_date': datetime(2017, 4, 27, 0, 0)
> }
> dag = DAG(
> 'test_dag',
> default_args=default_args,
> description='A simple tutorial DAG',
> schedule_interval=timedelta(minutes=1))
> t1 = BashOperator(
> task_id='print_date',
> bash_command='date',
> dag=dag)
> {code}
> Error below occurs
> {code}
> {jobs.py:354} DagFileProcessor3 ERROR - Got an exception! Propagating...
> Traceback (most recent call last):
>   File "/usr/local/lib/python2.7/dist-packages/airflow/jobs.py", line 346, in 
> helper
> pickle_dags)
>   File "/usr/local/lib/python2.7/dist-packages/airflow/utils/db.py", line 53, 
> in wrapper
> result = func(*args, **kwargs)
>   File "/usr/local/lib/python2.7/dist-packages/airflow/jobs.py", line 1583, 
> in process_file
> self._process_dags(dagbag, dags, ti_keys_to_schedule)
>   File "/usr/local/lib/python2.7/dist-packages/airflow/jobs.py", line 1173, 
> in _process_dags
> dag_run = self.create_dag_run(dag)
>   File "/usr/local/lib/python2.7/dist-packages/airflow/utils/db.py", line 53, 
> in wrapper
> result = func(*args, **kwargs)
>   File "/usr/local/lib/python2.7/dist-packages/airflow/jobs.py", line 803, in 
> create_dag_run
> while next_run_date <= last_run.execution_date:
> TypeError: can't compare datetime.datetime to NoneType
> {code}
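
A quick plain-Python check (independent of Airflow) shows the stored DATETIME(6) value itself round-trips unambiguously, which points at the driver/ORM layer rather than the data:

```python
from datetime import datetime

# MySQL DATETIME(6) serializes with six fractional digits; strptime with %f
# parses the value back losslessly.
stored = '2017-04-26 13:15:00.000000'
value = datetime.strptime(stored, '%Y-%m-%d %H:%M:%S.%f')
assert value == datetime(2017, 4, 26, 13, 15)
```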



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-1446) Have option to clear cross dag dependencies if you clear a task

2019-06-20 Thread jack (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-1446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868911#comment-16868911
 ] 

jack commented on AIRFLOW-1446:
---

This can be an extension of https://issues.apache.org/jira/browse/AIRFLOW-4005 

> Have option to clear cross dag dependencies if you clear a task
> ---
>
> Key: AIRFLOW-1446
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1446
> Project: Apache Airflow
>  Issue Type: Improvement
>Affects Versions: 1.8.0, 2.0.0
>Reporter: Angela Zhang
>Priority: Major
>
> If you have a task A that depends on another task B in a different Dag using 
> ExternalTaskSensor, when someone clears task B, we should also show option to 
> clear & re-run task A as well. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-3146) TimeDeltaSensor - add end time stamp

2019-06-20 Thread jack (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jack updated AIRFLOW-3146:
--
Labels: sensors  (was: )

> TimeDeltaSensor  - add end time stamp
> -
>
> Key: AIRFLOW-3146
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3146
> Project: Apache Airflow
>  Issue Type: Wish
>Reporter: jack
>Priority: Minor
>  Labels: sensors
>
> Currently the TimeDeltaSensor takes these arguments:
> {code}
>     :param delta: time length to wait after execution_date before succeeding
>     :type delta: datetime.timedelta
>     """
>
>     @apply_defaults
>     def __init__(self, delta, *args, **kwargs):
>         super(TimeDeltaSensor, self).__init__(*args, **kwargs)
>         self.delta = delta
> {code}
>  
> *The problem:*
> The operator assumes that the user would like to wait execution_date + delta.
> This isn't always the case, The user might want to wait delta after the task 
> has finished running.
>  
> Let's take this example:
> 2016-01-01 can only start running on 2016-01-02. The timedelta here
> represents the time after the execution period has closed.
> Now, say the task ran for +4 hours.+ So if we set
> {code:java}
> delta = 5 minutes
> {code}
> it actually doesn't wait at all. The sensor is useless in that scenario. What 
> the user may have intended is:
>  
> execution_date + duration to complete task + delta.
>  
> Or in simple words: Timestamp of execution end + delta.
>  
> *My suggestion:*
> Add another Boolean parameter which will choose whether the delta is counted 
> from execution_date or from execution_date + duration of the task
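
The difference between the two interpretations above can be made concrete with plain datetime arithmetic (the 4-hour duration is the illustrative figure from the issue, not Airflow behavior):

```python
from datetime import datetime, timedelta

execution_date = datetime(2016, 1, 1)
task_duration = timedelta(hours=4)   # how long the task ran, per the example
delta = timedelta(minutes=5)

# Current behavior: the sensor targets execution_date + delta, which has
# already passed by the time a 4-hour task finishes.
current_target = execution_date + delta

# Suggested alternative: count delta from when execution actually ended.
proposed_target = execution_date + task_duration + delta

assert proposed_target - current_target == task_duration
```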



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] [airflow] codecov-io commented on issue #5448: [AIRFLOW-4827] Remove compatible test for python 2

2019-06-20 Thread GitBox
codecov-io commented on issue #5448: [AIRFLOW-4827] Remove compatible test for 
python 2
URL: https://github.com/apache/airflow/pull/5448#issuecomment-504164379
 
 
   # [Codecov](https://codecov.io/gh/apache/airflow/pull/5448?src=pr=h1) 
Report
   > Merging 
[#5448](https://codecov.io/gh/apache/airflow/pull/5448?src=pr=desc) into 
[master](https://codecov.io/gh/apache/airflow/commit/7bacddea45ff235f8b722d76d8c56eb7604792a4?src=pr=desc)
 will **decrease** coverage by `<.01%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/airflow/pull/5448/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/airflow/pull/5448?src=pr=tree)
   
   ```diff
    @@            Coverage Diff             @@
    ##           master    #5448      +/-   ##
    ==========================================
    - Coverage   79.12%   79.12%   -0.01%     
    ==========================================
      Files         488      488              
      Lines       30553    30553              
    ==========================================
    - Hits        24176    24175       -1     
    - Misses       6377     6378       +1
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/airflow/pull/5448?src=pr=tree) | 
Coverage Δ | |
   |---|---|---|
   | 
[airflow/hooks/hive\_hooks.py](https://codecov.io/gh/apache/airflow/pull/5448/diff?src=pr=tree#diff-YWlyZmxvdy9ob29rcy9oaXZlX2hvb2tzLnB5)
 | `77.46% <0%> (-0.26%)` | :arrow_down: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/airflow/pull/5448?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
    > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/airflow/pull/5448?src=pr=footer). 
Last update 
[7bacdde...ec05b9e](https://codecov.io/gh/apache/airflow/pull/5448?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] ryw commented on issue #2460: [AIRFLOW-1424] make the next execution date of DAGs visible

2019-06-20 Thread GitBox
ryw commented on issue #2460: [AIRFLOW-1424] make the next execution date of 
DAGs visible
URL: https://github.com/apache/airflow/pull/2460#issuecomment-504161646
 
 
   @ashb is this on your radar to move forward? Looks like a nice feature.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-4829) EMR job flow and step sensor should provide a reason why the job failed

2019-06-20 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-4829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868887#comment-16868887
 ] 

ASF GitHub Bot commented on AIRFLOW-4829:
-

jzucker2 commented on pull request #5452: [AIRFLOW-4829] More descriptive 
exceptions for EMR sensors
URL: https://github.com/apache/airflow/pull/5452
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [ ] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-4829\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-4829
 - In case you are fixing a typo in the documentation you can prepend your 
commit with \[AIRFLOW-4829\], code changes always need a Jira issue.
 - In case you are proposing a fundamental code change, you need to create 
an Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)).
 - In case you are adding a dependency, check if the license complies with 
the [ASF 3rd Party License 
Policy](https://www.apache.org/legal/resolved.html#category-x).
   
   ### Description
   
   - [ ] Here are some details about my PR, including screenshots of any UI 
changes:
   
   ### Tests
   
   - [ ] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   No tests because it simply adds more info to the existing sensors that 
already fetch the data. It adds no new functionality, just explains more in 
cases of failure
   ### Commits
   
   - [ ] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [ ] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - All the public functions and the classes in the PR contain docstrings 
that explain what it does
 - If you implement backwards incompatible changes, please leave a note in 
the [Updating.md](https://github.com/apache/airflow/blob/master/UPDATING.md) so 
we can assign it to an appropriate release
   
   ### Code Quality
   
   - [ ] Passes `flake8`
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> EMR job flow and step sensor should provide a reason why the job failed
> ---
>
> Key: AIRFLOW-4829
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4829
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: contrib
>Affects Versions: 1.10.3
>Reporter: Jordan Zucker
>Assignee: Jordan Zucker
>Priority: Minor
>
> Currently, when using the EmrJobFlowSensor and the EmrStepSensor there is an 
> exception raised when a cluster fails. But no information is provided on what 
> the failure was. The sensor that raises the exception already has the info in 
> the boto3 response from the EMR api and this could be easily provided by 
> extending the exception message with more details. This could be made a 
> boolean or just always provided.
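
A sketch of the kind of message extension proposed, assuming the `Cluster.Status.StateChangeReason` shape of boto3's EMR `describe_cluster` response (the `failure_message` helper itself is hypothetical):

```python
def failure_message(response):
    """Build a descriptive failure message from an EMR DescribeCluster response."""
    reason = response['Cluster']['Status'].get('StateChangeReason', {})
    return 'EMR job failed: {}: {}'.format(
        reason.get('Code', 'UNKNOWN'),
        reason.get('Message', 'no details provided'),
    )

# Illustrative response payload in the boto3 describe_cluster shape:
response = {'Cluster': {'Status': {
    'State': 'TERMINATED_WITH_ERRORS',
    'StateChangeReason': {'Code': 'BOOTSTRAP_FAILURE',
                          'Message': 'bootstrap action 1 failed'},
}}}
print(failure_message(response))
```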



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] [airflow] jzucker2 opened a new pull request #5452: [AIRFLOW-4829] More descriptive exceptions for EMR sensors

2019-06-20 Thread GitBox
jzucker2 opened a new pull request #5452: [AIRFLOW-4829] More descriptive 
exceptions for EMR sensors
URL: https://github.com/apache/airflow/pull/5452
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [ ] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-4829\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-4829
 - In case you are fixing a typo in the documentation you can prepend your 
commit with \[AIRFLOW-4829\], code changes always need a Jira issue.
 - In case you are proposing a fundamental code change, you need to create 
an Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)).
 - In case you are adding a dependency, check if the license complies with 
the [ASF 3rd Party License 
Policy](https://www.apache.org/legal/resolved.html#category-x).
   
   ### Description
   
   - [ ] Here are some details about my PR, including screenshots of any UI 
changes:
   
   ### Tests
   
   - [ ] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   No tests, because it simply adds more info to the existing sensors that 
already fetch the data. It adds no new functionality; it just explains more in 
cases of failure.
   ### Commits
   
   - [ ] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [ ] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - All the public functions and the classes in the PR contain docstrings 
that explain what it does
 - If you implement backwards incompatible changes, please leave a note in 
the [Updating.md](https://github.com/apache/airflow/blob/master/UPDATING.md) so 
we can assign it to an appropriate release
   
   ### Code Quality
   
   - [ ] Passes `flake8`
   




[jira] [Updated] (AIRFLOW-4829) EMR job flow and step sensor should provide a reason why the job failed

2019-06-20 Thread Jordan Zucker (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-4829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jordan Zucker updated AIRFLOW-4829:
---
Description: Currently, when using the EmrJobFlowSensor and the 
EmrStepSensor there is an exception raised when a cluster fails. But no 
information is provided on what the failure was. The sensor that raises the 
exception already has the info in the boto3 response from the EMR api and this 
could be easily provided by extending the exception message with more details. 
This could be made a boolean or just always provided.  (was: Currently, when 
using the EmrJobFlowSensor there is an exception raised when a cluster fails. 
But no information is provided on what the failure was. The sensor that raises 
the exception already has the info in the boto3 response from the EMR api and 
this could be easily provided by extending the exception message with more 
details. This could be made a boolean or just always provided.)
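
The improvement described above can be sketched in isolation. This is a hedged illustration, not the actual sensor code: it assumes a `boto3`-style `describe_cluster` response dict (the `Cluster.Status.StateChangeReason` shape matches the EMR API), and the helper name `describe_failure` is hypothetical.

```python
def describe_failure(emr_response):
    """Build a descriptive error message from an EMR DescribeCluster-style
    response dict (sketch; field names follow the boto3 EMR API shape)."""
    status = emr_response["Cluster"]["Status"]
    reason = status.get("StateChangeReason", {})
    return "EMR job failed. State: {}; Code: {}; Message: {}".format(
        status.get("State", "UNKNOWN"),
        reason.get("Code", "unknown"),
        reason.get("Message", "no message provided"),
    )

# Example response fragment, shaped like a describe_cluster result
response = {
    "Cluster": {
        "Status": {
            "State": "TERMINATED_WITH_ERRORS",
            "StateChangeReason": {
                "Code": "BOOTSTRAP_FAILURE",
                "Message": "Master instance failed to bootstrap",
            },
        }
    }
}
```

A sensor raising `AirflowException(describe_failure(response))` would surface the failure reason instead of a bare error.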

> EMR job flow and step sensor should provide a reason why the job failed
> ---
>
> Key: AIRFLOW-4829
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4829
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: contrib
>Affects Versions: 1.10.3
>Reporter: Jordan Zucker
>Assignee: Jordan Zucker
>Priority: Minor
>
> Currently, when using the EmrJobFlowSensor and the EmrStepSensor there is an 
> exception raised when a cluster fails. But no information is provided on what 
> the failure was. The sensor that raises the exception already has the info in 
> the boto3 response from the EMR api and this could be easily provided by 
> extending the exception message with more details. This could be made a 
> boolean or just always provided.





[jira] [Updated] (AIRFLOW-4829) EMR job flow and step sensor should provide a reason why the job failed

2019-06-20 Thread Jordan Zucker (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-4829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jordan Zucker updated AIRFLOW-4829:
---
Summary: EMR job flow and step sensor should provide a reason why the job 
failed  (was: EMR job flow sensor should provide a reason why the job failed)

> EMR job flow and step sensor should provide a reason why the job failed
> ---
>
> Key: AIRFLOW-4829
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4829
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: contrib
>Affects Versions: 1.10.3
>Reporter: Jordan Zucker
>Assignee: Jordan Zucker
>Priority: Minor
>
> Currently, when using the EmrJobFlowSensor there is an exception raised when 
> a cluster fails. But no information is provided on what the failure was. The 
> sensor that raises the exception already has the info in the boto3 response 
> from the EMR api and this could be easily provided by extending the exception 
> message with more details. This could be made a boolean or just always 
> provided.





[jira] [Commented] (AIRFLOW-4805) Add py_file as templated field in DataflowPythonOperator

2019-06-20 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-4805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868872#comment-16868872
 ] 

ASF GitHub Bot commented on AIRFLOW-4805:
-

eladkal commented on pull request #5451: [AIRFLOW-4805] Add py_file as 
templated field in DataflowPythonOperator
URL: https://github.com/apache/airflow/pull/5451
 
 
   
   ### Jira
   
   - [ ] My PR addresses the following
 https://issues.apache.org/jira/browse/AIRFLOW-4805
   
   ### Description
   
   - Add py_file as templated field in DataflowPythonOperator
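
   What "templated field" means here can be shown with a toy stand-in for 
Airflow's Jinja templating. This is a sketch only, not the real operator or 
rendering engine; the class and context handling are simplified assumptions.

```python
class TemplatedOperator:
    """Toy stand-in for Airflow's template_fields mechanism (sketch only)."""
    template_fields = ("py_file",)  # the PR proposes adding py_file to this tuple

    def __init__(self, py_file):
        self.py_file = py_file

    def render_templates(self, context):
        # Airflow renders each field in template_fields through Jinja; here we
        # only substitute the {{ ds }} macro to illustrate the effect.
        for field in self.template_fields:
            value = getattr(self, field)
            setattr(self, field, value.replace("{{ ds }}", context["ds"]))

op = TemplatedOperator(py_file="gs://bucket/pipelines/job_{{ ds }}.py")
op.render_templates({"ds": "2019-06-20"})
```

   With `py_file` templated, users can point the operator at a date-stamped 
pipeline file per DAG run.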
   
   ### Tests
   
   - Not needed 
   
   ### Commits
   
   - [ ] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [ ] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - All the public functions and the classes in the PR contain docstrings 
that explain what it does
 - If you implement backwards incompatible changes, please leave a note in 
the [Updating.md](https://github.com/apache/airflow/blob/master/UPDATING.md) so 
we can assign it to an appropriate release
   
   ### Code Quality
   
   - [ ] Passes `flake8`
   
 



> Add py_file as templated field in DataflowPythonOperator
> 
>
> Key: AIRFLOW-4805
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4805
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: contrib
>Affects Versions: 1.9.0, 1.10.1, 1.10.2
>Reporter: Wilson Lian
>Priority: Minor
>






[GitHub] [airflow] eladkal opened a new pull request #5451: [AIRFLOW-4805] Add py_file as templated field in DataflowPythonOperator

2019-06-20 Thread GitBox
eladkal opened a new pull request #5451: [AIRFLOW-4805] Add py_file as 
templated field in DataflowPythonOperator
URL: https://github.com/apache/airflow/pull/5451
 
 
   
   ### Jira
   
   - [ ] My PR addresses the following
 https://issues.apache.org/jira/browse/AIRFLOW-4805
   
   ### Description
   
   - Add py_file as templated field in DataflowPythonOperator
   
   ### Tests
   
   - Not needed 
   
   ### Commits
   
   - [ ] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [ ] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - All the public functions and the classes in the PR contain docstrings 
that explain what it does
 - If you implement backwards incompatible changes, please leave a note in 
the [Updating.md](https://github.com/apache/airflow/blob/master/UPDATING.md) so 
we can assign it to an appropriate release
   
   ### Code Quality
   
   - [ ] Passes `flake8`
   




[jira] [Created] (AIRFLOW-4829) EMR job flow sensor should provide a reason why the job failed

2019-06-20 Thread Jordan Zucker (JIRA)
Jordan Zucker created AIRFLOW-4829:
--

 Summary: EMR job flow sensor should provide a reason why the job 
failed
 Key: AIRFLOW-4829
 URL: https://issues.apache.org/jira/browse/AIRFLOW-4829
 Project: Apache Airflow
  Issue Type: Improvement
  Components: contrib
Affects Versions: 1.10.3
Reporter: Jordan Zucker
Assignee: Jordan Zucker


Currently, when using the EmrJobFlowSensor there is an exception raised when a 
cluster fails. But no information is provided on what the failure was. The 
sensor that raises the exception already has the info in the boto3 response 
from the EMR api and this could be easily provided by extending the exception 
message with more details. This could be made a boolean or just always provided.





[GitHub] [airflow] codecov-io commented on issue #5447: [AIRFLOW-4826] Fix warning in resetdb command

2019-06-20 Thread GitBox
codecov-io commented on issue #5447: [AIRFLOW-4826] Fix warning in resetdb 
command
URL: https://github.com/apache/airflow/pull/5447#issuecomment-504147830
 
 
   # [Codecov](https://codecov.io/gh/apache/airflow/pull/5447?src=pr=h1) 
Report
   > Merging 
[#5447](https://codecov.io/gh/apache/airflow/pull/5447?src=pr=desc) into 
[master](https://codecov.io/gh/apache/airflow/commit/7bacddea45ff235f8b722d76d8c56eb7604792a4?src=pr=desc)
 will **decrease** coverage by `0.01%`.
   > The diff coverage is `100%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/airflow/pull/5447/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/airflow/pull/5447?src=pr=tree)
   
    ```diff
    @@            Coverage Diff            @@
    ##           master    #5447      +/-   ##
    ==========================================
    - Coverage   79.12%   79.11%    -0.02%
    ==========================================
      Files         488      488
      Lines       30553    30562        +9
    ==========================================
    + Hits        24176    24179        +3
    - Misses       6377     6383        +6
    ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/airflow/pull/5447?src=pr=tree) | 
Coverage Δ | |
   |---|---|---|
   | 
[airflow/utils/db.py](https://codecov.io/gh/apache/airflow/pull/5447/diff?src=pr=tree#diff-YWlyZmxvdy91dGlscy9kYi5weQ==)
 | `90.09% <100%> (-0.01%)` | :arrow_down: |
   | 
[airflow/www/views.py](https://codecov.io/gh/apache/airflow/pull/5447/diff?src=pr=tree#diff-YWlyZmxvdy93d3cvdmlld3MucHk=)
 | `75.71% <0%> (-0.22%)` | :arrow_down: |
   | 
[airflow/models/taskinstance.py](https://codecov.io/gh/apache/airflow/pull/5447/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMvdGFza2luc3RhbmNlLnB5)
 | `93.02% <0%> (-0.17%)` | :arrow_down: |
   | 
[airflow/jobs/backfill\_job.py](https://codecov.io/gh/apache/airflow/pull/5447/diff?src=pr=tree#diff-YWlyZmxvdy9qb2JzL2JhY2tmaWxsX2pvYi5weQ==)
 | `91.41% <0%> (-0.16%)` | :arrow_down: |
   | 
[airflow/jobs/scheduler\_job.py](https://codecov.io/gh/apache/airflow/pull/5447/diff?src=pr=tree#diff-YWlyZmxvdy9qb2JzL3NjaGVkdWxlcl9qb2IucHk=)
 | `70.18% <0%> (-0.14%)` | :arrow_down: |
   | 
[airflow/api/common/experimental/pool.py](https://codecov.io/gh/apache/airflow/pull/5447/diff?src=pr=tree#diff-YWlyZmxvdy9hcGkvY29tbW9uL2V4cGVyaW1lbnRhbC9wb29sLnB5)
 | `100% <0%> (ø)` | :arrow_up: |
   | 
[airflow/models/baseoperator.py](https://codecov.io/gh/apache/airflow/pull/5447/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMvYmFzZW9wZXJhdG9yLnB5)
 | `94.44% <0%> (+0.01%)` | :arrow_up: |
   | 
[airflow/models/pool.py](https://codecov.io/gh/apache/airflow/pull/5447/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMvcG9vbC5weQ==)
 | `97.05% <0%> (+0.08%)` | :arrow_up: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/airflow/pull/5447?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
    > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/airflow/pull/5447?src=pr=footer). 
Last update 
[7bacdde...94bf29d](https://codecov.io/gh/apache/airflow/pull/5447?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   




[GitHub] [airflow] HaloKo4 commented on issue #2206: [AIRFLOW-922] Update PrestoHook to enable synchronous execution

2019-06-20 Thread GitBox
HaloKo4 commented on issue #2206: [AIRFLOW-922] Update PrestoHook to enable 
synchronous execution
URL: https://github.com/apache/airflow/pull/2206#issuecomment-504138100
 
 
   @patrickmckenna is there a chance you will continue to work on this? It's a 
shame that this amazing work would go to waste. This PR is important: without it 
we cannot schedule Presto jobs on Airflow, as everything is considered a 
success, and we cannot set dependencies. 




[GitHub] [airflow] codecov-io edited a comment on issue #5255: [AIRFLOW-4462] Fix incorrect datetime column types when using MSSQL backend

2019-06-20 Thread GitBox
codecov-io edited a comment on issue #5255: [AIRFLOW-4462] Fix incorrect 
datetime column types when using MSSQL backend
URL: https://github.com/apache/airflow/pull/5255#issuecomment-490256665
 
 
   # [Codecov](https://codecov.io/gh/apache/airflow/pull/5255?src=pr=h1) 
Report
   > Merging 
[#5255](https://codecov.io/gh/apache/airflow/pull/5255?src=pr=desc) into 
[master](https://codecov.io/gh/apache/airflow/commit/2fd7567070db5cedbfd2ea83951ffa0868415739?src=pr=desc)
 will **decrease** coverage by `0.1%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/airflow/pull/5255/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/airflow/pull/5255?src=pr=tree)
   
    ```diff
    @@            Coverage Diff            @@
    ##           master    #5255      +/-   ##
    ==========================================
    - Coverage   78.76%   78.66%    -0.11%
    ==========================================
      Files         481      470       -11
      Lines       30215    30006      -209
    ==========================================
    - Hits        23800    23603      -197
    + Misses       6415     6403       -12
    ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/airflow/pull/5255?src=pr=tree) | 
Coverage Δ | |
   |---|---|---|
   | 
[airflow/operators/pig\_operator.py](https://codecov.io/gh/apache/airflow/pull/5255/diff?src=pr=tree#diff-YWlyZmxvdy9vcGVyYXRvcnMvcGlnX29wZXJhdG9yLnB5)
 | `0% <0%> (-76.93%)` | :arrow_down: |
   | 
[airflow/security/utils.py](https://codecov.io/gh/apache/airflow/pull/5255/diff?src=pr=tree#diff-YWlyZmxvdy9zZWN1cml0eS91dGlscy5weQ==)
 | `26.92% <0%> (-23.08%)` | :arrow_down: |
   | 
[airflow/contrib/operators/dataproc\_operator.py](https://codecov.io/gh/apache/airflow/pull/5255/diff?src=pr=tree#diff-YWlyZmxvdy9jb250cmliL29wZXJhdG9ycy9kYXRhcHJvY19vcGVyYXRvci5weQ==)
 | `69.83% <0%> (-13.99%)` | :arrow_down: |
   | 
[airflow/config\_templates/airflow\_local\_settings.py](https://codecov.io/gh/apache/airflow/pull/5255/diff?src=pr=tree#diff-YWlyZmxvdy9jb25maWdfdGVtcGxhdGVzL2FpcmZsb3dfbG9jYWxfc2V0dGluZ3MucHk=)
 | `76.47% <0%> (-1.91%)` | :arrow_down: |
   | 
[...ample\_dags/example\_branch\_python\_dop\_operator\_3.py](https://codecov.io/gh/apache/airflow/pull/5255/diff?src=pr=tree#diff-YWlyZmxvdy9leGFtcGxlX2RhZ3MvZXhhbXBsZV9icmFuY2hfcHl0aG9uX2RvcF9vcGVyYXRvcl8zLnB5)
 | `73.33% <0%> (-1.67%)` | :arrow_down: |
   | 
[airflow/example\_dags/example\_xcom.py](https://codecov.io/gh/apache/airflow/pull/5255/diff?src=pr=tree#diff-YWlyZmxvdy9leGFtcGxlX2RhZ3MvZXhhbXBsZV94Y29tLnB5)
 | `60.86% <0%> (-1.64%)` | :arrow_down: |
   | 
[airflow/dag/base\_dag.py](https://codecov.io/gh/apache/airflow/pull/5255/diff?src=pr=tree#diff-YWlyZmxvdy9kYWcvYmFzZV9kYWcucHk=)
 | `66.66% <0%> (-1.34%)` | :arrow_down: |
   | 
[airflow/settings.py](https://codecov.io/gh/apache/airflow/pull/5255/diff?src=pr=tree#diff-YWlyZmxvdy9zZXR0aW5ncy5weQ==)
 | `83.87% <0%> (-0.96%)` | :arrow_down: |
   | 
[airflow/hooks/hive\_hooks.py](https://codecov.io/gh/apache/airflow/pull/5255/diff?src=pr=tree#diff-YWlyZmxvdy9ob29rcy9oaXZlX2hvb2tzLnB5)
 | `75% <0%> (-0.95%)` | :arrow_down: |
   | 
[airflow/example\_dags/example\_trigger\_target\_dag.py](https://codecov.io/gh/apache/airflow/pull/5255/diff?src=pr=tree#diff-YWlyZmxvdy9leGFtcGxlX2RhZ3MvZXhhbXBsZV90cmlnZ2VyX3RhcmdldF9kYWcucHk=)
 | `91.66% <0%> (-0.65%)` | :arrow_down: |
   | ... and [121 
more](https://codecov.io/gh/apache/airflow/pull/5255/diff?src=pr=tree-more) 
| |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/airflow/pull/5255?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
    > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/airflow/pull/5255?src=pr=footer). 
Last update 
[2fd7567...09b22e9](https://codecov.io/gh/apache/airflow/pull/5255?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   




[GitHub] [airflow] kurtqq commented on issue #5450: [AIRFLOW-4775] - fix incorrect parameter order in GceHook

2019-06-20 Thread GitBox
kurtqq commented on issue #5450: [AIRFLOW-4775] - fix incorrect parameter order 
in GceHook
URL: https://github.com/apache/airflow/pull/5450#issuecomment-504120947
 
 
   @mik-laj I'm not really sure what you are asking me to test.
   The tests of `wait_for_operation_to_complete` are fine and complete. This 
bug happened because an optional argument was missing in one of the function calls.




[GitHub] [airflow] mik-laj commented on issue #5450: [AIRFLOW-4775] - fix incorrect parameter order in GceHook

2019-06-20 Thread GitBox
mik-laj commented on issue #5450: [AIRFLOW-4775] - fix incorrect parameter 
order in GceHook
URL: https://github.com/apache/airflow/pull/5450#issuecomment-504117094
 
 
   Can you extend the tests so that a similar case is detected in the future?




[GitHub] [airflow] kurtqq commented on issue #5449: [AIRFLOW-4828] Remove parameter python_version

2019-06-20 Thread GitBox
kurtqq commented on issue #5449: [AIRFLOW-4828] Remove parameter python_version
URL: https://github.com/apache/airflow/pull/5449#issuecomment-504115895
 
 
   I don't think we should remove the `python_version` parameter. 
   If I run Airflow on Python 3.5 and want to run a task with Python 3.7, I need 
the `python_version` parameter. 
Moreover, Python 4 will be released in the future, and Airflow will likely 
support both for a while.
   
   I think `python_version` needs to stay; it's not unique to Python 2.7.
   
   I also think that if someone wants to run something with Python 2.7 via the 
`PythonVirtualenvOperator`, they are more than welcome to. There is no need to 
prevent the user from doing so. Airflow can just remove the test for this, so if 
something is not working it's the user's problem.




[jira] [Commented] (AIRFLOW-4775) GceHook ommited num_retries parameter

2019-06-20 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-4775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868748#comment-16868748
 ] 

ASF GitHub Bot commented on AIRFLOW-4775:
-

kurtqq commented on pull request #5450: [AIRFLOW-4775] - fix incorrect 
parameter order in GceHook (#1)
URL: https://github.com/apache/airflow/pull/5450
 
 
   PR https://github.com/apache/airflow/pull/5117 introduced `num_retries` for 
GCP. It seems one function call was not changed accordingly, which resulted in 
the `num_retries` value being skipped.
   This PR fixes it.
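   
   The failure mode described — an optional argument dropped at one call site — 
can be illustrated in isolation. The signature below is hypothetical and only 
mirrors the general shape of the hook's retry-aware wait helper, not the actual 
GceHook code.

```python
DEFAULT_NUM_RETRIES = 5

def wait_for_operation_to_complete(operation_name, project_id=None,
                                   num_retries=DEFAULT_NUM_RETRIES):
    """Hypothetical helper: waits on an operation, retrying API polls."""
    return {"operation": operation_name, "project": project_id,
            "retries": num_retries}

# Buggy call site: num_retries is never forwarded, so the user-configured
# retry count is silently replaced by the default.
buggy = wait_for_operation_to_complete("op-1", project_id="my-project")

# Fixed call site: forward the configured value explicitly, by keyword.
fixed = wait_for_operation_to_complete("op-1", project_id="my-project",
                                       num_retries=10)
```

   Because the parameter is optional, the omission raises no error, which is why 
only a test asserting the forwarded value would have caught it.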
   
   
   
   ### Jira
   
 - https://issues.apache.org/jira/browse/AIRFLOW-4775
 



> GceHook ommited num_retries parameter
> -
>
> Key: AIRFLOW-4775
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4775
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: gcp
>Affects Versions: 1.10.3
>Reporter: Kamil Bregula
>Priority: Major
>
> The _check_global_operation_status method is not correctly executed in the 
> GceHook hook. The num_retries parameter has been omitted.





[GitHub] [airflow] kurtqq opened a new pull request #5450: [AIRFLOW-4775] - fix incorrect parameter order in GceHook (#1)

2019-06-20 Thread GitBox
kurtqq opened a new pull request #5450: [AIRFLOW-4775] - fix incorrect 
parameter order in GceHook (#1)
URL: https://github.com/apache/airflow/pull/5450
 
 
   PR https://github.com/apache/airflow/pull/5117 introduced `num_retries` for 
GCP. It seems one function call was not changed accordingly, which resulted in 
the `num_retries` value being skipped.
   This PR fixes it.
   
   
   
   ### Jira
   
 - https://issues.apache.org/jira/browse/AIRFLOW-4775




[jira] [Resolved] (AIRFLOW-4591) Tag tasks with default pool

2019-06-20 Thread Tao Feng (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-4591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Feng resolved AIRFLOW-4591.
---
   Resolution: Fixed
Fix Version/s: 2.0.0

> Tag tasks with default pool
> ---
>
> Key: AIRFLOW-4591
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4591
> Project: Apache Airflow
>  Issue Type: New Feature
>  Components: core
>Affects Versions: 2.0.0
>Reporter: Chao-Han Tsai
>Assignee: Chao-Han Tsai
>Priority: Major
> Fix For: 2.0.0
>
>
> Currently the number of running tasks without a pool specified will be 
> limited by `non_pooled_task_slot_count`. It limits the number of tasks 
> launched per scheduler loop but does not limit the number of tasks running in 
> parallel.
> This ticket proposes that we assign tasks without a pool specified to default 
> pool which limits the number of running tasks in parallel.





[jira] [Commented] (AIRFLOW-4591) Tag tasks with default pool

2019-06-20 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-4591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868739#comment-16868739
 ] 

ASF subversion and git services commented on AIRFLOW-4591:
--

Commit 2c99ec624bd66e9fa38e9f0087d46ef4d7f05aec in airflow's branch 
refs/heads/master from Chao-Han Tsai
[ https://gitbox.apache.org/repos/asf?p=airflow.git;h=2c99ec6 ]

[AIRFLOW-4591] Make default_pool a real pool (#5349)

`non_pooled_task_slot_count` and `non_pooled_backfill_task_slot_count`
are removed in favor of a real pool, e.g. `default_pool`.

By default tasks are running in `default_pool`.
`default_pool` is initialized with 128 slots and user can change the
number of slots through UI/CLI. `default_pool` cannot be removed.
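
The slot-based limiting described in the commit message can be sketched with a
toy pool. This is an illustration only; the real Airflow scheduler tracks slot
usage in its metadata database, and the class below is not Airflow code.

```python
class Pool:
    """Toy model of an Airflow pool: at most `slots` tasks run at once."""
    def __init__(self, name, slots):
        self.name = name
        self.slots = slots
        self.running = 0

    def try_acquire(self):
        # A task only starts if a slot is free; otherwise it stays queued.
        if self.running < self.slots:
            self.running += 1
            return True
        return False

    def release(self):
        # Finished tasks free their slot for queued tasks.
        self.running -= 1

# default_pool ships with 128 slots; use 2 here to show the limit quickly.
default_pool = Pool("default_pool", slots=2)
results = [default_pool.try_acquire() for _ in range(3)]
```

The third acquisition fails until one of the running tasks releases its slot,
which is the parallelism cap that `non_pooled_task_slot_count` could not enforce.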

> Tag tasks with default pool
> ---
>
> Key: AIRFLOW-4591
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4591
> Project: Apache Airflow
>  Issue Type: New Feature
>  Components: core
>Affects Versions: 2.0.0
>Reporter: Chao-Han Tsai
>Assignee: Chao-Han Tsai
>Priority: Major
>
> Currently the number of running tasks without a pool specified will be 
> limited by `non_pooled_task_slot_count`. It limits the number of tasks 
> launched per scheduler loop but does not limit the number of tasks running in 
> parallel.
> This ticket proposes that we assign tasks without a pool specified to default 
> pool which limits the number of running tasks in parallel.





[jira] [Reopened] (AIRFLOW-4591) Tag tasks with default pool

2019-06-20 Thread Tao Feng (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-4591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Feng reopened AIRFLOW-4591:
---

> Tag tasks with default pool
> ---
>
> Key: AIRFLOW-4591
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4591
> Project: Apache Airflow
>  Issue Type: New Feature
>  Components: core
>Affects Versions: 2.0.0
>Reporter: Chao-Han Tsai
>Assignee: Chao-Han Tsai
>Priority: Major
>
> Currently the number of running tasks without a pool specified will be 
> limited by `non_pooled_task_slot_count`. It limits the number of tasks 
> launched per scheduler loop but does not limit the number of tasks running in 
> parallel.
> This ticket proposes that we assign tasks without a pool specified to default 
> pool which limits the number of running tasks in parallel.





[jira] [Resolved] (AIRFLOW-4591) Tag tasks with default pool

2019-06-20 Thread Tao Feng (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-4591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Feng resolved AIRFLOW-4591.
---
Resolution: Fixed

> Tag tasks with default pool
> ---
>
> Key: AIRFLOW-4591
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4591
> Project: Apache Airflow
>  Issue Type: New Feature
>  Components: core
>Affects Versions: 2.0.0
>Reporter: Chao-Han Tsai
>Assignee: Chao-Han Tsai
>Priority: Major
>
> Currently the number of running tasks without a pool specified will be 
> limited by `non_pooled_task_slot_count`. It limits the number of tasks 
> launched per scheduler loop but does not limit the number of tasks running in 
> parallel.
> This ticket proposes that we assign tasks without a pool specified to default 
> pool which limits the number of running tasks in parallel.





[GitHub] [airflow] feng-tao commented on issue #5349: [AIRFLOW-4591] Make default_pool a real pool

2019-06-20 Thread GitBox
feng-tao commented on issue #5349: [AIRFLOW-4591] Make default_pool a real pool
URL: https://github.com/apache/airflow/pull/5349#issuecomment-504109989
 
 
   ship




[GitHub] [airflow] feng-tao merged pull request #5349: [AIRFLOW-4591] Make default_pool a real pool

2019-06-20 Thread GitBox
feng-tao merged pull request #5349: [AIRFLOW-4591] Make default_pool a real pool
URL: https://github.com/apache/airflow/pull/5349
 
 
   




[jira] [Commented] (AIRFLOW-4591) Tag tasks with default pool

2019-06-20 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-4591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868738#comment-16868738
 ] 

ASF GitHub Bot commented on AIRFLOW-4591:
-

feng-tao commented on pull request #5349: [AIRFLOW-4591] Make default_pool a 
real pool
URL: https://github.com/apache/airflow/pull/5349
 
 
   
 



> Tag tasks with default pool
> ---
>
> Key: AIRFLOW-4591
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4591
> Project: Apache Airflow
>  Issue Type: New Feature
>  Components: core
>Affects Versions: 2.0.0
>Reporter: Chao-Han Tsai
>Assignee: Chao-Han Tsai
>Priority: Major
>
> Currently the number of running tasks without a pool specified will be 
> limited by `non_pooled_task_slot_count`. It limits the number of tasks 
> launched per scheduler loop but does not limit the number of tasks running in 
> parallel.
> This ticket proposes that we assign tasks without a pool specified to default 
> pool which limits the number of running tasks in parallel.





[GitHub] [airflow] milton0825 commented on issue #5349: [AIRFLOW-4591] Make default_pool a real pool

2019-06-20 Thread GitBox
milton0825 commented on issue #5349: [AIRFLOW-4591] Make default_pool a real 
pool
URL: https://github.com/apache/airflow/pull/5349#issuecomment-504109166
 
 
   @feng-tao I have resolved the conflicts and now CI is green.




[jira] [Commented] (AIRFLOW-4825) BigQueryOperator execute a list of SQL queries doesn't work

2019-06-20 Thread jack (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-4825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868729#comment-16868729
 ] 

jack commented on AIRFLOW-4825:
---

It doesn't support a list.

The BigQuery hook supports only str:

[https://github.com/apache/airflow/blob/7bacddea45ff235f8b722d76d8c56eb7604792a4/airflow/contrib/hooks/bigquery_hook.py#L657]

Maybe it was intended that the operator loop over the list of strings and
execute them one by one?
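A sequential wrapper of that kind could look like the sketch below. This is a hypothetical illustration using sqlite3 (not the BigQuery hook itself): it accepts either a single statement or a list, and runs a list one query at a time in list order.

```python
import sqlite3

def run_queries(sql):
    # Accept a single statement or a list of statements; a list is executed
    # one by one, in list order, mirroring what the operator docs suggest.
    statements = [sql] if isinstance(sql, str) else list(sql)
    conn = sqlite3.connect(":memory:")
    try:
        return [conn.execute(s).fetchall() for s in statements]
    finally:
        conn.close()

print(run_queries(["select 1", "select 2"]))  # -> [[(1,)], [(2,)]]
```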

[~kaxilnaik] might know.

> BigQueryOperator execute a list of SQL queries doesn't work
> ---
>
> Key: AIRFLOW-4825
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4825
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: operators
>Affects Versions: 1.10.2
>Reporter: Evgeny
>Priority: Major
>
> The documentation of bigquery_operator says that I can send to the field sql 
> in BigQueryOperator a list of strings where each string is a SQL query.
> 1. When I'm trying to run a DAG with the code below, I'm receiving the error 
> "TypeError: query argument must have a type (<type 'str'>,) not <type 'list'>".
> sql_1 = """select 1 from ...
> sql_2 = """select 2 from ...
> list_of_queries = list([sql_1,sql_2])
> updates = BigQueryOperator(
>  task_id="Updates_\{0}".format(new_profile_name),
>  sql=list_of_queries,
>  allow_large_results=True,
>  use_legacy_sql=False,
>  bigquery_conn_id="bigquery_default",
>  dag=dag
> )
> 2. Is the execution order of the queries identical to the list order?





[GitHub] [airflow] OmerJog edited a comment on issue #5448: [AIRFLOW-4196] Remove compatible test for python 2

2019-06-20 Thread GitBox
OmerJog edited a comment on issue #5448: [AIRFLOW-4196] Remove compatible test 
for python 2
URL: https://github.com/apache/airflow/pull/5448#issuecomment-504103212
 
 
   Shouldn't you target https://issues.apache.org/jira/browse/AIRFLOW-4827?




[GitHub] [airflow] OmerJog commented on issue #5448: [AIRFLOW-4196] Remove compatible test for python 2

2019-06-20 Thread GitBox
OmerJog commented on issue #5448: [AIRFLOW-4196] Remove compatible test for 
python 2
URL: https://github.com/apache/airflow/pull/5448#issuecomment-504103212
 
 
   Shouldn't you target https://issues.apache.org/jira/browse/AIRFLOW-4827?




[jira] [Commented] (AIRFLOW-4196) AIP-3 Drop support for Python 2

2019-06-20 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-4196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868714#comment-16868714
 ] 

ASF GitHub Bot commented on AIRFLOW-4196:
-

zhongjiajie commented on pull request #5449: [AIRFLOW-4196] Remove parameter 
python_version
URL: https://github.com/apache/airflow/pull/5449
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [x] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-4196
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI 
changes:
   
   Part of AIP-3 Drop support for Python 2
   
   ### Code Quality
   
   - [x] Passes `flake8`
   
 



> AIP-3 Drop support for Python 2
> ---
>
> Key: AIRFLOW-4196
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4196
> Project: Apache Airflow
>  Issue Type: Task
>  Components: core
>Reporter: Fokko Driesprong
>Priority: Major
> Fix For: 2.0.0
>
>
> https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-3+Drop+support+for+Python+2





[GitHub] [airflow] zhongjiajie opened a new pull request #5449: [AIRFLOW-4196] Remove parameter python_version

2019-06-20 Thread GitBox
zhongjiajie opened a new pull request #5449: [AIRFLOW-4196] Remove parameter 
python_version
URL: https://github.com/apache/airflow/pull/5449
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [x] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-4196
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI 
changes:
   
   Part of AIP-3 Drop support for Python 2
   
   ### Code Quality
   
   - [x] Passes `flake8`
   




[jira] [Created] (AIRFLOW-4828) Remove param python_version in PythonVirtualenvOperator

2019-06-20 Thread zhongjiajie (JIRA)
zhongjiajie created AIRFLOW-4828:


 Summary: Remove param python_version in PythonVirtualenvOperator
 Key: AIRFLOW-4828
 URL: https://issues.apache.org/jira/browse/AIRFLOW-4828
 Project: Apache Airflow
  Issue Type: Sub-task
  Components: operators
Affects Versions: 1.10.3
Reporter: zhongjiajie
Assignee: zhongjiajie


Remove param python_version in PythonVirtualenvOperator





[jira] [Commented] (AIRFLOW-4196) AIP-3 Drop support for Python 2

2019-06-20 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-4196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868683#comment-16868683
 ] 

ASF GitHub Bot commented on AIRFLOW-4196:
-

zhongjiajie commented on pull request #5448: [AIRFLOW-4196] Remove compatible 
test for python 2
URL: https://github.com/apache/airflow/pull/5448
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [x] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-4196
   
   ### Code Quality
   
   - [x] Passes `flake8`
   
 



> AIP-3 Drop support for Python 2
> ---
>
> Key: AIRFLOW-4196
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4196
> Project: Apache Airflow
>  Issue Type: Task
>  Components: core
>Reporter: Fokko Driesprong
>Priority: Major
> Fix For: 2.0.0
>
>
> https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-3+Drop+support+for+Python+2





[GitHub] [airflow] zhongjiajie opened a new pull request #5448: [AIRFLOW-4196] Remove compatible test for python 2

2019-06-20 Thread GitBox
zhongjiajie opened a new pull request #5448: [AIRFLOW-4196] Remove compatible 
test for python 2
URL: https://github.com/apache/airflow/pull/5448
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [x] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-4196
   
   ### Code Quality
   
   - [x] Passes `flake8`
   




[jira] [Created] (AIRFLOW-4827) Remove compatible test for python 2

2019-06-20 Thread zhongjiajie (JIRA)
zhongjiajie created AIRFLOW-4827:


 Summary: Remove compatible test for python 2
 Key: AIRFLOW-4827
 URL: https://issues.apache.org/jira/browse/AIRFLOW-4827
 Project: Apache Airflow
  Issue Type: Sub-task
  Components: tests
Affects Versions: 1.10.3
Reporter: zhongjiajie
Assignee: zhongjiajie


Remove compatible test for python 2





[jira] [Commented] (AIRFLOW-4826) Resetdb command throws warning in new version of alembic

2019-06-20 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-4826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868659#comment-16868659
 ] 

ASF GitHub Bot commented on AIRFLOW-4826:
-

thomashillyer commented on pull request #5447: [AIRFLOW-4826] Fix warning in 
resetdb command
URL: https://github.com/apache/airflow/pull/5447
 
 
   Change engine to connection because the new alembic version emits a warning;
   a Connection is expected.
   
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [x] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title.
 - https://issues.apache.org/jira/browse/AIRFLOW-4826
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI 
changes:
- Change the resetdb command to use `settings.engine.connect()` instead of 
just `settings.engine` because the new version of alembic throws a warning 
about expecting a connection instead of an engine.
   
   ### Tests
   
   - [x] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
- existing tests cover
   ### Commits
   
   - [x] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [ ] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - All the public functions and the classes in the PR contain docstrings 
that explain what it does
 - If you implement backwards incompatible changes, please leave a note in 
the [Updating.md](https://github.com/apache/airflow/blob/master/UPDATING.md) so 
we can assign it to an appropriate release
   
   ### Code Quality
   
   - [x] Passes `flake8`
   
 



> Resetdb command throws warning in new version of alembic
> 
>
> Key: AIRFLOW-4826
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4826
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: database
>Affects Versions: 1.10.3
>Reporter: Thomas Hillyer
>Assignee: Thomas Hillyer
>Priority: Minor
> Fix For: 1.10.4
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> Resetdb still works but connection should be updated
> {code:java}
> $ airflow resetdb
> {...}
> [2019-06-20 09:52:58,165] \{__init__.py:51} INFO - Using executor 
> CeleryExecutor
> DB: postgres+psycopg2://USER:PASS@SERVER/DBNAME
> This will drop existing tables if they exist. Proceed? (y/n)y
> [2019-06-20 09:53:00,869] \{db.py:370} INFO - Dropping tables that exist
> /path/to/ve/lib64/python3.6/site-packages/alembic/util/messaging.py:69: 
> UserWarning: 'connection' argument to configure() is expected to be a 
> sqlalchemy.engine.Connection instance, got 
> Engine(postgres+psycopg2://USER:PASS@SERVER/DBNAME)
>  warnings.warn(msg)
> [2019-06-20 09:53:01,010] \{migration.py:117} INFO - Context impl 
> PostgresqlImpl.
> {...}
> {code}





[GitHub] [airflow] thomashillyer opened a new pull request #5447: [AIRFLOW-4826] Fix warning in resetdb command

2019-06-20 Thread GitBox
thomashillyer opened a new pull request #5447: [AIRFLOW-4826] Fix warning in 
resetdb command
URL: https://github.com/apache/airflow/pull/5447
 
 
   Change engine to connection because the new alembic version emits a warning;
   a Connection is expected.
   
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [x] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title.
 - https://issues.apache.org/jira/browse/AIRFLOW-4826
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI 
changes:
- Change the resetdb command to use `settings.engine.connect()` instead of 
just `settings.engine` because the new version of alembic throws a warning 
about expecting a connection instead of an engine.
   
   ### Tests
   
   - [x] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
- existing tests cover
   ### Commits
   
   - [x] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [ ] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - All the public functions and the classes in the PR contain docstrings 
that explain what it does
 - If you implement backwards incompatible changes, please leave a note in 
the [Updating.md](https://github.com/apache/airflow/blob/master/UPDATING.md) so 
we can assign it to an appropriate release
   
   ### Code Quality
   
   - [x] Passes `flake8`
   




[jira] [Created] (AIRFLOW-4826) Resetdb command throws warning in new version of alembic

2019-06-20 Thread Thomas Hillyer (JIRA)
Thomas Hillyer created AIRFLOW-4826:
---

 Summary: Resetdb command throws warning in new version of alembic
 Key: AIRFLOW-4826
 URL: https://issues.apache.org/jira/browse/AIRFLOW-4826
 Project: Apache Airflow
  Issue Type: Bug
  Components: database
Affects Versions: 1.10.3
Reporter: Thomas Hillyer
Assignee: Thomas Hillyer
 Fix For: 1.10.4


Resetdb still works but connection should be updated
{code:java}
$ airflow resetdb
{...}
[2019-06-20 09:52:58,165] \{__init__.py:51} INFO - Using executor CeleryExecutor
DB: postgres+psycopg2://USER:PASS@SERVER/DBNAME
This will drop existing tables if they exist. Proceed? (y/n)y
[2019-06-20 09:53:00,869] \{db.py:370} INFO - Dropping tables that exist
/path/to/ve/lib64/python3.6/site-packages/alembic/util/messaging.py:69: 
UserWarning: 'connection' argument to configure() is expected to be a 
sqlalchemy.engine.Connection instance, got 
Engine(postgres+psycopg2://USER:PASS@SERVER/DBNAME)
 warnings.warn(msg)
[2019-06-20 09:53:01,010] \{migration.py:117} INFO - Context impl 
PostgresqlImpl.

{...}
{code}





[jira] [Created] (AIRFLOW-4825) BigQueryOperator execute a list of SQL queries doesn't work

2019-06-20 Thread Evgeny (JIRA)
Evgeny created AIRFLOW-4825:
---

 Summary: BigQueryOperator execute a list of SQL queries doesn't 
work
 Key: AIRFLOW-4825
 URL: https://issues.apache.org/jira/browse/AIRFLOW-4825
 Project: Apache Airflow
  Issue Type: Bug
  Components: operators
Affects Versions: 1.10.2
Reporter: Evgeny


The documentation of bigquery_operator says that I can send to the field sql in 
BigQueryOperator a list of strings where each string is a SQL query.

1. When I'm trying to run a DAG with the code below, I'm receiving the error 
"TypeError: query argument must have a type (<type 'str'>,) not <type 'list'>".

sql_1 = """select 1 from ...

sql_2 = """select 2 from ...

list_of_queries = list([sql_1,sql_2])
updates = BigQueryOperator(
 task_id="Updates_\{0}".format(new_profile_name),
 sql=list_of_queries,
 allow_large_results=True,
 use_legacy_sql=False,
 bigquery_conn_id="bigquery_default",
 dag=dag
)

2. Is the execution order of the queries identical to the list order?





[jira] [Created] (AIRFLOW-4824) MySqlHook needs to override DbApiHook.get_uri to pull in extra for charset=utf-8 during create_engine

2019-06-20 Thread Lola Slade (JIRA)
Lola Slade created AIRFLOW-4824:
---

 Summary: MySqlHook needs to override DbApiHook.get_uri to pull in 
extra for charset=utf-8 during create_engine
 Key: AIRFLOW-4824
 URL: https://issues.apache.org/jira/browse/AIRFLOW-4824
 Project: Apache Airflow
  Issue Type: Bug
  Components: hooks
Affects Versions: 1.10.3
Reporter: Lola Slade


When using the engine from a MySqlHook in other code (such as Pandas), the 
engine returned from the create_engine function is missing the charset=utf-8 
option.

This issue was reported here: 
[https://stackoverflow.com/questions/46084744/how-to-explicitly-declare-charset-utf8-for-airflow-connections]

conn = MySqlHook(mysql_conn_id='conn_id')
engine = conn.get_sqlalchemy_engine()

I can see that the code in function *get_uri* in dbapi_hook.py does not use the 
charset = utf8 information from the extra section and that mysql_hook.py does 
not override the function.
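One way such an override could merge the extras into the URI is sketched below. The helper name `build_uri` is made up for illustration; this is not the actual hook code, just the string-building idea of appending `extra` options (such as charset) that `get_uri` currently drops.

```python
import json
from urllib.parse import urlencode

def build_uri(host, login, password, schema, extra_json=None):
    # Hypothetical helper: append connection "extra" options (e.g. charset)
    # to the SQLAlchemy URI as a query string.
    uri = "mysql://{}:{}@{}/{}".format(login, password, host, schema)
    extras = json.loads(extra_json) if extra_json else {}
    if extras:
        uri += "?" + urlencode(extras)
    return uri

print(build_uri("db", "user", "pw", "sales", '{"charset": "utf8"}'))
# -> mysql://user:pw@db/sales?charset=utf8
```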





[jira] [Commented] (AIRFLOW-4584) Error when using ssh operator to execute a sh script from a remote server

2019-06-20 Thread Ash Berlin-Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-4584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868534#comment-16868534
 ] 

Ash Berlin-Taylor commented on AIRFLOW-4584:


Does that .kjb file get printed by the commands you run by any chance?
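The "invalid continuation byte" failure in the log below is what Python raises when bytes like 0xe9 ("é" in Latin-1) are decoded as UTF-8. A general reproduction and permissive workaround (a Python sketch, not a proposed Airflow change):

```python
raw = b"op\xe9ration"  # Latin-1 encoded output, e.g. a French log line

try:
    raw.decode("utf-8")
except UnicodeDecodeError as exc:
    print(exc)  # the "can't decode byte 0xe9 ... invalid continuation byte" error

# Permissive alternatives when the remote encoding is unknown:
print(raw.decode("utf-8", errors="replace"))  # replaces the bad byte with U+FFFD
print(raw.decode("latin-1"))                  # -> 'opération'
```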

> Error when using ssh operator to execute a sh script from a remote server
> ---
>
> Key: AIRFLOW-4584
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4584
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: operators
>Affects Versions: 1.10.2
>Reporter: W Hasnaoui
>Priority: Major
>
> hello guys;
> I need your help please. I'm new to Apache Airflow, and I'm trying to use the 
> SSH operator to execute a shell script from a remote server; my code looks like 
> this:
>  
> t4 = SSHOperator(
>     ssh_conn_id='test_ssh',
>     task_id= 'Execute_transfert',
>     command="""sh 'scripts/jwi/test.sh'""",
>     dag=dag )
>  
> The only thing is that inside my script (test.sh) I call a Pentaho job (.kjb 
> extension); the command line is:
> LOGFILE="/xxx2/xxx3/logs/migxxx__`date "+%Y-%m-%d-%H%M"`.log"
> JOBFILE="/xxx2/xxx3/xxx4/migxxx/avxxx.kjb"
> PDI_LEVEL=Detailed
> /folder1/folder2/kitchen.sh -file:$JOBFILE -level:$PDI_LEVEL -logfile:$LOGFILE
> When running, after establishing the connection to the remote server, the 
> execution failed; a snapshot of the log:
> {{[2019-05-27 20:02:02,651] \{logging_mixin.py:95} INFO - [2019-05-27 
> 20:02:02,651] \{transport.py:1746} INFO - Connected (version 2.0, client 
> OpenSSH_4.3) }}
> {{[2019-05-27 20:02:05,877] \{logging_mixin.py:95} INFO - [2019-05-27 
> 20:02:05,877] \{transport.py:1746} INFO - Authentication (publickey) failed. 
> }}
> {{[2019-05-27 20:02:05,897] \{logging_mixin.py:95} INFO - [2019-05-27 
> 20:02:05,897] \{transport.py:1746} INFO - Authentication (password) 
> successful! }}
> {{[2019-05-27 20:02:06,640] \{ssh_operator.py:133} INFO - INFO 27-05 
> 18:22:07,371 - Using "/tmp/vfs_cache" as temporary files store. }}
> {{[2019-05-27 20:02:06,777] \{models.py:1788} ERROR - SSH operator error: 
> 'utf8' codec can't decode byte 0xe9 in position 63: invalid continuation byte 
> }}
> {{Traceback (most recent call last): }}
> {{File "/usr/lib/python2.7/site-packages/airflow/models.py", line 1657, in 
> _run_raw_task }}
> {{result = task_copy.execute(context=context)}}
> {{ File 
> "/usr/lib/python2.7/site-packages/airflow/contrib/operators/ssh_operator.py", 
> line 167, in execute}}
> {{ raise AirflowException("SSH operator error: \{0}".format(str(e)))}}
> {{ AirflowException: SSH operator error: 'utf8' codec can't decode byte 0xe9 
> in position 63: invalid continuation byte }}
> {{[2019-05-27 20:02:06,780] \{models.py:1817} INFO - All retries failed; 
> marking task as FAILED [2019-05-27 20:02:06,795] \{base_task_runner.py:101} 
> INFO - Job 1146: Subtask Execute_transfert Traceback (most recent call last): 
> }}
> {{[2019-05-27 20:02:06,796] \{base_task_runner.py:101} INFO - Job 1146: 
> Subtask Execute_transfert File "/usr/bin/airflow", line 32, in <module> 
> [2019-05-27 20:02:06,796] \{base_task_runner.py:101} INFO - Job 1146: Subtask 
> Execute_transfert args.func(args) }}
> {{[2019-05-27 20:02:06,796] \{base_task_runner.py:101} INFO - Job 1146: 
> Subtask Execute_transfert File 
> "/usr/lib/python2.7/site-packages/airflow/utils/cli.py", line 74, in wrapper 
> [2019-05-27 20:02:06,797] \{base_task_runner.py:101} INFO - Job 1146: Subtask 
> Execute_transfert return f(*args, **kwargs) }}
> {{[2019-05-27 20:02:06,797] \{base_task_runner.py:101} INFO - Job 1146: 
> Subtask Execute_transfert File 
> "/usr/lib/python2.7/site-packages/airflow/bin/cli.py", line 526, in run 
> [2019-05-27 20:02:06,798] \{base_task_runner.py:101} INFO - Job 1146: Subtask 
> Execute_transfert _run(args, dag, ti) }}
> {{[2019-05-27 20:02:06,798] \{base_task_runner.py:101} INFO - Job 1146: 
> Subtask Execute_transfert File 
> "/usr/lib/python2.7/site-packages/airflow/bin/cli.py", line 445, in _run 
> [2019-05-27 20:02:06,798] \{base_task_runner.py:101} INFO - Job 1146: Subtask 
> Execute_transfert pool=args.pool, }}
> {{[2019-05-27 20:02:06,799] \{base_task_runner.py:101} INFO - Job 1146: 
> Subtask Execute_transfert File 
> "/usr/lib/python2.7/site-packages/airflow/utils/db.py", line 73, in wrapper 
> [2019-05-27 20:02:06,799] \{base_task_runner.py:101} INFO - Job 1146: Subtask 
> Execute_transfert return func(*args, **kwargs) }}
> {{[2019-05-27 20:02:06,799] \{base_task_runner.py:101} INFO - Job 1146: 
> Subtask Execute_transfert File 
> "/usr/lib/python2.7/site-packages/airflow/models.py", line 1657, in 
> _run_raw_task [2019-05-27 20:02:06,799] \{base_task_runner.py:101} INFO - Job 
> 1146: Subtask Execute_transfert result = task_copy.execute(context=context) }}
> {{[2019-05-27 20:02:06,800] 

[GitHub] [airflow] XD-DENG commented on a change in pull request #5435: [AIRFLOW-4759] Don't error when marking sucessful run as failed

2019-06-20 Thread GitBox
XD-DENG commented on a change in pull request #5435: [AIRFLOW-4759] Don't error 
when marking sucessful run as failed
URL: https://github.com/apache/airflow/pull/5435#discussion_r295795875
 
 

 ##
 File path: airflow/api/common/experimental/mark_tasks.py
 ##
 @@ -337,6 +340,9 @@ def set_dag_run_state_to_failed(dag, execution_date, 
commit=False, session=None)
 task.dag = dag
 tasks.append(task)
 
+if not tasks:
+return []
+
 
 Review comment:
   This check may not be necessary as you already handled it inside 
`set_state()`?




[GitHub] [airflow] XD-DENG commented on a change in pull request #5435: [AIRFLOW-4759] Don't error when marking sucessful run as failed

2019-06-20 Thread GitBox
XD-DENG commented on a change in pull request #5435: [AIRFLOW-4759] Don't error 
when marking sucessful run as failed
URL: https://github.com/apache/airflow/pull/5435#discussion_r295795875
 
 

 ##
 File path: airflow/api/common/experimental/mark_tasks.py
 ##
 @@ -337,6 +340,9 @@ def set_dag_run_state_to_failed(dag, execution_date, 
commit=False, session=None)
 task.dag = dag
 tasks.append(task)
 
+if not tasks:
+return []
+
 
 Review comment:
   This check may not be necessary as it's already handled inside `set_state()`?




[jira] [Updated] (AIRFLOW-4409) UI duration view can be broken by task_fail null duration column

2019-06-20 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-4409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor updated AIRFLOW-4409:
---
Fix Version/s: (was: 2.0.0)
   1.10.4

> UI duration view can be broken by task_fail null duration column
> -
>
> Key: AIRFLOW-4409
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4409
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: ui
>Affects Versions: 1.10.0
>Reporter: Yingbo Wang
>Assignee: Yingbo Wang
>Priority: Minor
> Fix For: 1.10.4
>
> Attachments: Screen Shot 2019-04-24 at 1.54.53 PM.png
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> Airflow DAG UI has a view "task duration". Due to our recent migration, there 
> are some records in the task_fail table which have a null duration (caused by a 
> missing start date). This is affecting the UI view of task duration:  !Screen 
> Shot 2019-04-24 at 1.54.53 PM.png!





[GitHub] [airflow] kaxil merged pull request #5446: Add Crealytics to the list of Airflow users

2019-06-20 Thread GitBox
kaxil merged pull request #5446: Add Crealytics to the list of Airflow users
URL: https://github.com/apache/airflow/pull/5446
 
 
   




[GitHub] [airflow] manu-crealytics opened a new pull request #5446: Add Crealytics to the list of Airflow users

2019-06-20 Thread GitBox
manu-crealytics opened a new pull request #5446: Add Crealytics to the list of 
Airflow users
URL: https://github.com/apache/airflow/pull/5446
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   N/A, just changing the README.
   
   ### Description
   
   Simply adding Crealytics to the list of companies that use Apache Airflow.
   
   ### Tests
   
   N/A, just changing the README.
   
   ### Commits
   
   - [x] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   N/A, just changing the README.
   
   ### Code Quality
   
   N/A, just changing the README.




[GitHub] [airflow] chaimt commented on a change in pull request #4633: AIRFLOW-3791: Dataflow

2019-06-20 Thread GitBox
chaimt commented on a change in pull request #4633: AIRFLOW-3791: Dataflow
URL: https://github.com/apache/airflow/pull/4633#discussion_r295766277
 
 

 ##
 File path: airflow/contrib/operators/dataflow_operator.py
 ##
 @@ -118,6 +118,8 @@ def __init__(
 delegate_to=None,
 poll_sleep=10,
 job_class=None,
+check_if_running=None,
+multiple_jobs=None,
 
 Review comment:
   You can have a pipeline that spans more than one job; this will wait until 
all jobs have finished.




[jira] [Comment Edited] (AIRFLOW-4801) trigger dag got when dag is zipped

2019-06-20 Thread Rinat Abdullin (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-4801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868447#comment-16868447
 ] 

Rinat Abdullin edited comment on AIRFLOW-4801 at 6/20/19 11:32 AM:
---

It looks like this PR handles the problem: 
https://github.com/apache/airflow/pull/5404


was (Author: abdullin):
[~mithril] could you, please, try renaming bug to mention the fatal crash? 
Perhaps this would bring up the attention.

 

I've investigated the code a little. It appears that the culprit is in this 
part of the `process_file` method (located in models/__init__.py in v1.10.3):


{code:python}
# if the source file no longer exists in the DB or in the filesystem,
# return an empty list
# todo: raise exception?
if filepath is None or not os.path.isfile(filepath):
return found_dags
{code}

process_file is invoked when the dag is triggered, and it is passed a 
non-existent path created by joining the path to the archive and the relative 
file path within the archive. This silently returns no DAGs.

Throwing an exception would've been, indeed, a way to alert to the problem.
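The silent path can be demonstrated directly: a path "inside" a zip archive is not a regular file, so the `os.path.isfile` guard quoted above returns early. A minimal reproduction (not Airflow code):

```python
import os
import tempfile
import zipfile

with tempfile.TemporaryDirectory() as tmp:
    archive = os.path.join(tmp, "dags.zip")
    with zipfile.ZipFile(archive, "w") as zf:
        zf.writestr("test.py", "# dag module inside the archive")

    inner = os.path.join(archive, "test.py")  # .../dags.zip/test.py
    print(os.path.isfile(inner))        # False -> process_file returns no DAGs
    print(zipfile.is_zipfile(archive))  # True -> a zip-aware check would catch it
```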
 
PS: It looks like this PR handles the problem: 
https://github.com/apache/airflow/pull/5404
 

> trigger dag got when dag is zipped 
> ---
>
> Key: AIRFLOW-4801
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4801
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: DagRun
>Affects Versions: 1.10.3
>Reporter: kasim
>Priority: Major
>
> I have dags.zip like below :  
>  
> {code:java}
> Archive: dags.zip
> extracting: jobs.zip
> extracting: libs.zip
> inflating: log4j.properties
> inflating: main.py# spark-submit python file
> creating: resources/
> creating: resources/result/
> creating: resources/train/
> inflating: resources/train/sale_count.parquet
> inflating: resources/words.txt
> inflating: salecount.py   # a dag containing SparkSubmitOperator
> inflating: test.py# a very simple one, like the airflow example 
> {code}
>  
> Always got an error when clicking `trigger dag` on test.py
>  
> {code:java}
>   / (  ()   )  \___
>  /( (  (  )   _))  )   )\
>(( (   )()  )   (   )  )
>  ((/  ( _(   )   (   _) ) (  () )  )
> ( (  ( (_)   (((   )  .((_ ) .  )_
>( (  )(  (  ))   ) . ) (   )
>   (  (   (  (   ) (  _  ( _) ).  ) . ) ) ( )
>   ( (  (   ) (  )   (  )) ) _)(   )  )  )
>  ( (  ( \ ) ((_  ( ) ( )  )   ) )  )) ( )
>   (  (   (  (   (_ ( ) ( _)  ) (  )  )   )
>  ( (  ( (  (  ) (_  )  ) )  _)   ) _( ( )
>   ((  (   )(( _)   _) _(_ (  (_ )
>(_((__(_(__(( ( ( |  ) ) ) )_))__))_)___)
>((__)\\||lll|l||///  \_))
> (   /(/ (  )  ) )\   )
>   (( ( ( | | ) ) )\   )
>(   /(| / ( )) ) ) )) )
>  ( ( _(|)_) )
>   (  ||\(|(|)|/|| )
> (|(||(||))
>   ( //|/l|||)|\\ \ )
> (/ / //  /|//\\  \ \  \ _)
> ---
> Node: dc09
> ---
> Traceback (most recent call last):
>   File 
> "/opt/anaconda3/envs/airflow/lib/python3.6/site-packages/flask/app.py", line 
> 2311, in wsgi_app
> response = self.full_dispatch_request()
>   File 
> "/opt/anaconda3/envs/airflow/lib/python3.6/site-packages/flask/app.py", line 
> 1834, in full_dispatch_request
> rv = self.handle_user_exception(e)
>   File 
> "/opt/anaconda3/envs/airflow/lib/python3.6/site-packages/flask/app.py", line 
> 1737, in handle_user_exception
> reraise(exc_type, exc_value, tb)
>   File 
> "/opt/anaconda3/envs/airflow/lib/python3.6/site-packages/flask/_compat.py", 
> line 36, in reraise
> raise value
>   File 
> "/opt/anaconda3/envs/airflow/lib/python3.6/site-packages/flask/app.py", line 
> 1832, in full_dispatch_request
> rv = self.dispatch_request()
>   File 
> "/opt/anaconda3/envs/airflow/lib/python3.6/site-packages/flask/app.py", line 
> 1818, in dispatch_request
> return self.view_functions[rule.endpoint](**req.view_args)
>   File 
> "/opt/anaconda3/envs/airflow/lib/python3.6/site-packages/flask_admin/base.py",
>  line 69, in inner
> return self._run_view(f, *args, **kwargs)
>   File 
> "/opt/anaconda3/envs/airflow/lib/python3.6/site-packages/flask_admin/base.py",
>  

[jira] [Commented] (AIRFLOW-4823) Add the ability to toggle creation of default connections on deployment

2019-06-20 Thread jagan kumar (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-4823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868458#comment-16868458
 ] 

jagan kumar commented on AIRFLOW-4823:
--

[https://github.com/apache/airflow/pull/5443]

> Add the ability to toggle creation of default connections on deployment
> ---
>
> Key: AIRFLOW-4823
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4823
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.10.3
>Reporter: Richard Jarvis
>Priority: Minor
>
> It would be desirable to let the user decide whether default connections are 
> created when Airflow is deployed.
>  
> Suggested: add a new property to the config file.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (AIRFLOW-4048) HttpSensor provide context to response_check

2019-06-20 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-4048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor resolved AIRFLOW-4048.

Resolution: Fixed

> HttpSensor provide context to response_check
> 
>
> Key: AIRFLOW-4048
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4048
> Project: Apache Airflow
>  Issue Type: Improvement
>Reporter: raphael auv
>Assignee: raphael auv
>Priority: Minor
> Fix For: 1.10.4
>
>
> Currently, response_check does not receive the context as a parameter; with a 
> simple provide_context option on the constructor, we could pass the context to 
> the response_check function.
> actual code :
> {code:java}
>...
>  
>def poke(self, context):
> self.log.info('Poking: %s', self.endpoint)
> try:
> response = self.hook.run(self.endpoint,
>  data=self.request_params,
>  headers=self.headers,
>  extra_options=self.extra_options)
> if self.response_check:
> # run content check on response
> return self.response_check(response)
> {code}
>  
>  future code:
> {code:java}
>def __init__(self,
>  endpoint,
>  http_conn_id='http_default',
>  method='GET',
>  request_params=None,
>  headers=None,
>  response_check=None,
>  extra_options=None,
>  provide_context=False, *args, **kwargs):
>     super(HttpSensor, self).__init__(*args, **kwargs)
> ...
> self.provide_context = provide_context
> def poke(self, context):
> self.log.info('Poking: %s', self.endpoint)
> try:
> response = self.hook.run(self.endpoint,
>  data=self.request_params,
>  headers=self.headers,
>  extra_options=self.extra_options)
> if self.response_check:
> # run content check on response
> response_check_kwargs = {}
>if self.provide_context:
>response_check_kwargs["context"] = context
>return self.response_check(response, **response_check_kwargs)
> {code}
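As a hedged illustration of how the proposed option would be used, a response_check that consults the task context might look like the following (the function name and endpoint payload shape are invented for the example):

```python
# Illustrative response_check receiving the context the sensor would pass
# in when provide_context=True. The payload shape is an assumption.
def check_has_data_for_ds(response, context):
    """Return True once the endpoint reports data for the run's date."""
    payload = response.json()
    ds = context["ds"]  # execution date as "YYYY-MM-DD"
    return ds in payload.get("available_dates", [])
```

The sensor would then be constructed with `response_check=check_has_data_for_ds` and `provide_context=True`.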





[jira] [Commented] (AIRFLOW-4048) HttpSensor provide context to response_check

2019-06-20 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-4048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868456#comment-16868456
 ] 

ASF subversion and git services commented on AIRFLOW-4048:
--

Commit 8b0a1ab9cdc16a35fc060cb6c906e37ea54e78f7 in airflow's branch 
refs/heads/master from raphaelauv
[ https://gitbox.apache.org/repos/asf?p=airflow.git;h=8b0a1ab ]

[AIRFLOW-4048] http_sensor provide Context to response_check (#4890)



> HttpSensor provide context to response_check
> 
>
> Key: AIRFLOW-4048
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4048
> Project: Apache Airflow
>  Issue Type: Improvement
>Reporter: raphael auv
>Assignee: raphael auv
>Priority: Minor
> Fix For: 1.10.4
>
>





[jira] [Commented] (AIRFLOW-4048) HttpSensor provide context to response_check

2019-06-20 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-4048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868455#comment-16868455
 ] 

ASF GitHub Bot commented on AIRFLOW-4048:
-

ashb commented on pull request #4890: [AIRFLOW-4048] HttpSensor provide-context 
to response_check
URL: https://github.com/apache/airflow/pull/4890
 
 
   
 



> HttpSensor provide context to response_check
> 
>
> Key: AIRFLOW-4048
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4048
> Project: Apache Airflow
>  Issue Type: Improvement
>Reporter: raphael auv
>Assignee: raphael auv
>Priority: Minor
> Fix For: 1.10.4
>
>





[GitHub] [airflow] ashb merged pull request #4890: [AIRFLOW-4048] HttpSensor provide-context to response_check

2019-06-20 Thread GitBox
ashb merged pull request #4890: [AIRFLOW-4048] HttpSensor provide-context to 
response_check
URL: https://github.com/apache/airflow/pull/4890
 
 
   




[jira] [Comment Edited] (AIRFLOW-4801) trigger dag got when dag is zipped

2019-06-20 Thread Rinat Abdullin (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-4801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868447#comment-16868447
 ] 

Rinat Abdullin edited comment on AIRFLOW-4801 at 6/20/19 11:17 AM:
---

[~mithril] could you please try renaming the bug to mention the fatal crash? 
Perhaps that would bring more attention.

 

I've investigated the code a little. It appears that the culprit is in this 
part of the `process_file` method (located in models/__init__.py in the v 
1.10.3):


{code:python}
# if the source file no longer exists in the DB or in the filesystem,
# return an empty list
# todo: raise exception?
if filepath is None or not os.path.isfile(filepath):
return found_dags
{code}

process_file is invoked when the dag is triggered, and it is passed a 
non-existent path created by joining the path to the archive with the relative 
file path inside the archive. This silently returns no DAGs.

Throwing an exception would indeed have been a way to alert users to the problem.
 
PS: It looks like this PR handles the problem: 
https://github.com/apache/airflow/pull/5404
 


was (Author: abdullin):
[~mithril] could you please try renaming the bug to mention the fatal crash? 
Perhaps that would bring more attention.

 

I've investigated the code a little. It appears that the culprit is in this 
part of the `process_file` method (located in models/__init__.py in the v 
1.10.3):


{code:python}
# if the source file no longer exists in the DB or in the filesystem,
# return an empty list
# todo: raise exception?
if filepath is None or not os.path.isfile(filepath):
return found_dags
{code}

process_file is invoked when the dag is triggered, and it is passed a 
non-existent path created by joining the path to the archive with the relative 
file path inside the archive. This silently returns no DAGs.

Throwing an exception would indeed have been a way to alert users to the problem.
 

 

> trigger dag got when dag is zipped 
> ---
>
> Key: AIRFLOW-4801
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4801
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: DagRun
>Affects Versions: 1.10.3
>Reporter: kasim
>Priority: Major
>

[jira] [Commented] (AIRFLOW-4801) trigger dag got when dag is zipped

2019-06-20 Thread Rinat Abdullin (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-4801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868447#comment-16868447
 ] 

Rinat Abdullin commented on AIRFLOW-4801:
-

[~mithril] could you please try renaming the bug to mention the fatal crash? 
Perhaps that would bring more attention.

 

I've investigated the code a little. It appears that the culprit is in this 
part of the `process_file` method (located in models/__init__.py in the v 
1.10.3):


{code:python}
# if the source file no longer exists in the DB or in the filesystem,
# return an empty list
# todo: raise exception?
if filepath is None or not os.path.isfile(filepath):
return found_dags
{code}

process_file is invoked when the dag is triggered, and it is passed a 
non-existent path created by joining the path to the archive with the relative 
file path inside the archive. This silently returns no DAGs.

Throwing an exception would indeed have been a way to alert users to the problem.
 

 

> trigger dag got when dag is zipped 
> ---
>
> Key: AIRFLOW-4801
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4801
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: DagRun
>Affects Versions: 1.10.3
>Reporter: kasim
>Priority: Major
>

[GitHub] [airflow] OmerJog commented on issue #5229: [AIRFLOW-XXX] Links to Pendulum in macros.rst

2019-06-20 Thread GitBox
OmerJog commented on issue #5229: [AIRFLOW-XXX] Links to Pendulum in macros.rst
URL: https://github.com/apache/airflow/pull/5229#issuecomment-503948120
 
 
   @mik-laj ready to merge?




[GitHub] [airflow] Jagankmr3 commented on issue #5443: [AIRFLOW-4823] modified the source to have create_default_connections flag in config

2019-06-20 Thread GitBox
Jagankmr3 commented on issue #5443: [AIRFLOW-4823] modified the source to have 
create_default_connections flag in config
URL: https://github.com/apache/airflow/pull/5443#issuecomment-503947605
 
 
   Updated default_airflow.cfg to add a "create_default_connection" flag; the 
same variable is handled in airflow/airflow/utils/db.py to decide whether to 
create default connections when Airflow is deployed.
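
A minimal sketch of that toggle (the flag name follows the PR description; the helper shape, config section, and stand-in arguments are hypothetical):

```python
# Hypothetical sketch of gating default-connection creation behind a config
# flag; merge_conn and the connection list stand in for Airflow's own.
def seed_default_connections(conf, merge_conn, default_connections):
    """Create default connections only when the config flag allows it."""
    if conf.getboolean('core', 'create_default_connection', fallback=True):
        for conn in default_connections:
            merge_conn(conn)
```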




[jira] [Updated] (AIRFLOW-800) Initialize default Google BigQuery Connection with valid conn_type

2019-06-20 Thread Kamil Bregula (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kamil Bregula updated AIRFLOW-800:
--
Component/s: gcp

> Initialize default Google BigQuery Connection with valid conn_type
> --
>
> Key: AIRFLOW-800
> URL: https://issues.apache.org/jira/browse/AIRFLOW-800
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: gcp, utils
>Reporter: Wilson Lian
>Assignee: Kaxil Naik
>Priority: Minor
> Fix For: 2.0.0
>
>
> {{airflow initdb}} creates a connection with conn_id='bigquery_default' and 
> conn_type='bigquery'. However, bigquery is not a valid conn_type, according 
> to models.Connection._types, and BigQuery connections should use the 
> google_cloud_platform conn_type.





[jira] [Updated] (AIRFLOW-3495) DataProcSparkSqlOperator and DataProcHiveOperator should raise error when query and query_uri are both provided

2019-06-20 Thread Kamil Bregula (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kamil Bregula updated AIRFLOW-3495:
---
Component/s: gcp

> DataProcSparkSqlOperator and DataProcHiveOperator should raise error when 
> query and query_uri are both provided
> ---
>
> Key: AIRFLOW-3495
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3495
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: contrib, gcp
>Reporter: Wilson Lian
>Priority: Minor
>
> Exactly 1 of the query and query_uri params will be used. It should be an 
> error to provide more than one. Fixing this will make cases like 
> [this|https://stackoverflow.com/questions/53424091/unable-to-query-using-file-in-data-proc-hive-operator]
>  less confusing.





[jira] [Updated] (AIRFLOW-3143) Support auto-zone in DataprocClusterCreateOperator

2019-06-20 Thread Kamil Bregula (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kamil Bregula updated AIRFLOW-3143:
---
Component/s: gcp

> Support auto-zone in DataprocClusterCreateOperator
> --
>
> Key: AIRFLOW-3143
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3143
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: contrib, gcp, operators
>Reporter: Wilson Lian
>Assignee: Joel Croteau
>Priority: Minor
> Fix For: 1.10.4
>
>
> [Dataproc 
> Auto-zone|https://cloud.google.com/dataproc/docs/concepts/configuring-clusters/auto-zone]
>  allows users to omit the zone when creating a cluster, and the service will 
> pick a zone in the chosen region.
> Providing an empty string or None for `zone` would match up with how users 
> would request auto-zone via direct API access, but as-is the 
> DataprocClusterCreateOperator makes a bad API request when such values are 
> passed.
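
A hedged sketch of the behavior the ticket asks for (not the operator's actual code; the config-building helper is invented): leave the zone field out of the cluster config when no zone is given, so the service picks one within the chosen region.

```python
# Hypothetical sketch: omit the zoneUri field from the cluster config when
# no zone is given, so Dataproc auto-zone placement chooses the zone.
def build_gce_cluster_config(zone=None):
    """Build a gceClusterConfig dict, leaving out zoneUri for auto-zone."""
    config = {'gceClusterConfig': {}}
    if zone:  # treats both None and '' as "let the service choose"
        config['gceClusterConfig']['zoneUri'] = zone
    return config
```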





[jira] [Updated] (AIRFLOW-3550) GKEClusterHook doesn't use gcp_conn_id

2019-06-20 Thread Kamil Bregula (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kamil Bregula updated AIRFLOW-3550:
---
Component/s: gcp

> GKEClusterHook doesn't use gcp_conn_id
> --
>
> Key: AIRFLOW-3550
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3550
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: contrib, gcp
>Affects Versions: 1.10.0, 1.10.1
>Reporter: Wilson Lian
>Priority: Major
> Fix For: 1.10.2
>
>
> The hook doesn't inherit from GoogleCloudBaseHook. API calls are made using 
> the default service account (if present).





[jira] [Updated] (AIRFLOW-3401) Properly encode templated fields in Cloud Pub/Sub example DAG

2019-06-20 Thread Kamil Bregula (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kamil Bregula updated AIRFLOW-3401:
---
Component/s: gcp

> Properly encode templated fields in Cloud Pub/Sub example DAG
> -
>
> Key: AIRFLOW-3401
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3401
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: contrib, documentation, gcp
>Reporter: Wilson Lian
>Priority: Trivial
>  Labels: examples
>
> Context: 
> [https://groups.google.com/d/msg/cloud-composer-discuss/McHHu582G7o/7N66GrwsBAAJ|http://example.com]





[jira] [Created] (AIRFLOW-4823) Add the ability to toggle creation of default connections on deployment

2019-06-20 Thread Richard Jarvis (JIRA)
Richard Jarvis created AIRFLOW-4823:
---

 Summary: Add the ability to toggle creation of default connections 
on deployment
 Key: AIRFLOW-4823
 URL: https://issues.apache.org/jira/browse/AIRFLOW-4823
 Project: Apache Airflow
  Issue Type: Improvement
  Components: core
Affects Versions: 1.10.3
Reporter: Richard Jarvis


It would be desirable to let the user decide whether default connections are 
created when Airflow is deployed.

 

Suggested: add a new property to the config file.





[jira] [Commented] (AIRFLOW-4805) Add py_file as templated field in DataflowPythonOperator

2019-06-20 Thread Kamil Bregula (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-4805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868360#comment-16868360
 ] 

Kamil Bregula commented on AIRFLOW-4805:


[~wwlian] Can you add a GCP component to this ticket?

Reference:

[https://lists.apache.org/thread.html/87df3782647dbb3ed238e209f4919daf80cc385d796421b930950f36@%3Cdev.airflow.apache.org%3E]

> Add py_file as templated field in DataflowPythonOperator
> 
>
> Key: AIRFLOW-4805
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4805
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: contrib
>Affects Versions: 1.9.0, 1.10.1, 1.10.2
>Reporter: Wilson Lian
>Priority: Minor
>






[jira] [Commented] (AIRFLOW-4804) Enable CORS (Cross-Origin Request Sharing) for the API

2019-06-20 Thread Kamil Bregula (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-4804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868356#comment-16868356
 ] 

Kamil Bregula commented on AIRFLOW-4804:


[https://airflow.apache.org/howto/run-behind-proxy.html]

 

I recommend running Airflow behind Nginx and adding the appropriate CORS 
headers in Nginx.
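
As an illustrative sketch only (the upstream address, path, and allowed origin are assumptions, not a vetted configuration), the Nginx server block could attach the CORS headers like this:

```nginx
# Proxy the Airflow webserver and attach CORS headers; adjust the
# upstream address and allowed origin for your deployment.
location /api/ {
    proxy_pass http://localhost:8080;
    add_header Access-Control-Allow-Origin "http://localhost:4200" always;
    add_header Access-Control-Allow-Headers "Authorization, Content-Type" always;
    add_header Access-Control-Allow-Methods "GET, POST, OPTIONS" always;

    # Answer CORS preflight requests directly.
    if ($request_method = OPTIONS) {
        return 204;
    }
}
```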

> Enable CORS (Cross-Origin Request Sharing) for the API
> --
>
> Key: AIRFLOW-4804
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4804
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: api
>Affects Versions: 1.10.3
>Reporter: Srinivasa kalyan Sozhavaram
>Priority: Minor
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> Ability needed to call the Airflow API hosted on a server with SSL enabled. We 
> get an error when the API is integrated into a custom-built portal on HTTP:
>  
> {quote} *Access to XMLHttpRequest at 
> 'https:///api/experimental/dags//dag_runs' from origin 
> 'http://localhost:4200' has been blocked by CORS policy: Response to 
> preflight request doesn't pass access control check: No 
> 'Access-Control-Allow-Origin' header is present on the requested resource.*
> {quote}
>  
> Header was set with Access-Control-Allow-Origin as "***"




