[GitHub] yeluolei commented on issue #3675: [AIRFLOW-2834] fix build script for k8s docker

2018-08-30 Thread GitBox
yeluolei commented on issue #3675: [AIRFLOW-2834] fix build script for k8s 
docker
URL: 
https://github.com/apache/incubator-airflow/pull/3675#issuecomment-417559822
 
 
   @ashb 




[GitHub] gerardo commented on issue #3797: [AIRFLOW-2952] Splits CI into k8s + docker-compose

2018-08-30 Thread GitBox
gerardo commented on issue #3797: [AIRFLOW-2952] Splits CI into k8s + 
docker-compose
URL: 
https://github.com/apache/incubator-airflow/pull/3797#issuecomment-417558995
 
 
   @dimberman we missed [creating the airflow database for postgresql](https://github.com/apache/incubator-airflow/blob/b7f63c59d75ad21d210a72bd6212e5a7b2c6f25b/.travis.yml#L104)




[GitHub] gerardo commented on issue #3797: [AIRFLOW-2952] Splits CI into k8s + docker-compose

2018-08-30 Thread GitBox
gerardo commented on issue #3797: [AIRFLOW-2952] Splits CI into k8s + 
docker-compose
URL: 
https://github.com/apache/incubator-airflow/pull/3797#issuecomment-417552202
 
 
   @dimberman I'm trying to figure out the simplest changes that can get this to work. So far:
   
   - `airflow initdb` is failing. It might be easier to [install postgres on the Travis CI host again](https://github.com/apache/incubator-airflow/blob/c37fc0b6ba19e3fe5656ae37cef9b59cef3c29e8/.travis.yml#L28).
   - After this, we'll need another value for [`backend_postgres`](https://github.com/apache/incubator-airflow/blob/a9705c21f1bbd5d79cbd92dee84673b34332dab8/tox.ini#L51) (or another variable altogether) when running the k8s tests. This one should point to [localhost instead](https://github.com/apache/incubator-airflow/blob/c37fc0b6ba19e3fe5656ae37cef9b59cef3c29e8/tox.ini#L49), roughly as sketched below.
   - The final error I see is `kinit: command not found`, but the script keeps running after that failure anyway.
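
   A minimal sketch of the first two items, assuming Travis's stock Postgres addon and a plain localhost connection string (illustrative; the actual change may look different):

   ```yaml
   # .travis.yml (sketch): reinstate Postgres on the Travis host and create the
   # airflow database before the k8s tox environment runs
   addons:
     postgresql: "9.6"
   before_script:
     - psql -c 'CREATE DATABASE airflow;' -U postgres

   # tox.ini / environment (sketch): a k8s-only backend value pointing at the
   # host's Postgres instead of the docker-compose service name
   # AIRFLOW__CORE__SQL_ALCHEMY_CONN=postgresql+psycopg2://postgres@localhost:5432/airflow
   ```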
   




[GitHub] dimberman commented on issue #3797: [AIRFLOW-2952] Splits CI into k8s + docker-compose

2018-08-30 Thread GitBox
dimberman commented on issue #3797: [AIRFLOW-2952] Splits CI into k8s + 
docker-compose
URL: 
https://github.com/apache/incubator-airflow/pull/3797#issuecomment-417545995
 
 
   @gerardo OK, it's now solidly back in the court of "getting tox to work". Kubeadm is able to build and deploy. PTAL and let me know how we can get these to pass.




[GitHub] dimberman commented on issue #3797: [AIRFLOW-2952] Splits CI into k8s + docker-compose

2018-08-30 Thread GitBox
dimberman commented on issue #3797: [AIRFLOW-2952] Splits CI into k8s + 
docker-compose
URL: 
https://github.com/apache/incubator-airflow/pull/3797#issuecomment-417541379
 
 
   cc: @feng-tao @kaxil just a warning: any PR merged right now is not being tested against Kubernetes.




[GitHub] codecov-io edited a comment on issue #3823: [AIRFLOW-2985] An operator for S3 object copying

2018-08-30 Thread GitBox
codecov-io edited a comment on issue #3823: [AIRFLOW-2985] An operator for S3 
object copying
URL: 
https://github.com/apache/incubator-airflow/pull/3823#issuecomment-417296502
 
 
   # [Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3823?src=pr&el=h1) Report
   > Merging [#3823](https://codecov.io/gh/apache/incubator-airflow/pull/3823?src=pr&el=desc) into [master](https://codecov.io/gh/apache/incubator-airflow/commit/82454477c57699ece5c6515ce85b7df0c0583571?src=pr&el=desc) will **not change** coverage.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/incubator-airflow/pull/3823/graphs/tree.svg?width=650&token=WdLKlKHOAU&height=150&src=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/3823?src=pr&el=tree)
   
   ```diff
   @@           Coverage Diff           @@
   ##           master    #3823   +/-   ##
   =======================================
     Coverage   77.43%   77.43%           
   =======================================
     Files         203      203           
     Lines       15840    15840           
   =======================================
     Hits        12266    12266           
     Misses       3574     3574
   ```
   
   --
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3823?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3823?src=pr&el=footer). Last update [8245447...63cf213](https://codecov.io/gh/apache/incubator-airflow/pull/3823?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).






[GitHub] XD-DENG commented on issue #3823: [AIRFLOW-2985] An operator for S3 object copying

2018-08-30 Thread GitBox
XD-DENG commented on issue #3823: [AIRFLOW-2985] An operator for S3 object 
copying
URL: 
https://github.com/apache/incubator-airflow/pull/3823#issuecomment-417539567
 
 
   Hi both @ashb @feng-tao, I have updated the code based on your earlier inputs:
   
   - Changed the way bucket/key are specified, to be consistent with existing S3 operators/sensors.
   - Added this class to `docs/code.rst`.
   - Updated the test cases (there are two cases covering different argument combinations).
   - Added a note in the comment (which will become documentation later) highlighting that the S3 connection used here must be able to access both the source and the destination bucket/key.
   
   CI passed. PTAL.
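
   For illustration, a hedged usage sketch; the class and parameter names below are assumptions inferred from the PR title and existing S3 operator conventions, not confirmed in this thread:

   ```python
   from datetime import datetime

   from airflow import DAG
   # assumed module path and class name for the new operator
   from airflow.contrib.operators.s3_copy_object_operator import S3CopyObjectOperator

   dag = DAG('s3_copy_demo', start_date=datetime(2018, 1, 1), schedule_interval=None)

   copy_object = S3CopyObjectOperator(
       task_id='copy_object',
       source_bucket_name='source-bucket',       # parameter names are illustrative
       source_bucket_key='path/to/source.csv',
       dest_bucket_name='dest-bucket',
       dest_bucket_key='path/to/dest.csv',
       # this single connection must be able to read the source AND write the destination
       aws_conn_id='aws_default',
       dag=dag,
   )
   ```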
   




[GitHub] dimberman commented on issue #3797: [AIRFLOW-2952] Splits CI into k8s + docker-compose

2018-08-30 Thread GitBox
dimberman commented on issue #3797: [AIRFLOW-2952] Splits CI into k8s + 
docker-compose
URL: 
https://github.com/apache/incubator-airflow/pull/3797#issuecomment-417538581
 
 
   @Fokko @bolkedebruin @gerardo I was able to get kubeadm to work with a local 
registry (that was a rough experience lol). I'm still running into some weird 
TOX issues (like being unable to find python 3.5) but progress!




[GitHub] feng-tao commented on issue #3820: [AIRFLOW-XXX] Fix Docstrings for Hooks/Operators

2018-08-30 Thread GitBox
feng-tao commented on issue #3820: [AIRFLOW-XXX] Fix Docstrings for 
Hooks/Operators
URL: 
https://github.com/apache/incubator-airflow/pull/3820#issuecomment-417536750
 
 
   lgtm @kaxil 




[jira] [Commented] (AIRFLOW-2983) Add prev_ds_nodash and next_ds_nodash macro

2018-08-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16598158#comment-16598158
 ] 

ASF GitHub Bot commented on AIRFLOW-2983:
-

feng-tao closed pull request #3821: [AIRFLOW-2983] Add prev_ds_nodash and 
next_ds_nodash macro
URL: https://github.com/apache/incubator-airflow/pull/3821
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/airflow/models.py b/airflow/models.py
index 55badf4828..93368e1f18 100755
--- a/airflow/models.py
+++ b/airflow/models.py
@@ -1815,12 +1815,16 @@ def get_template_context(self, session=None):
         next_execution_date = task.dag.following_schedule(self.execution_date)
 
         next_ds = None
+        next_ds_nodash = None
         if next_execution_date:
             next_ds = next_execution_date.strftime('%Y-%m-%d')
+            next_ds_nodash = next_ds.replace('-', '')
 
         prev_ds = None
+        prev_ds_nodash = None
         if prev_execution_date:
             prev_ds = prev_execution_date.strftime('%Y-%m-%d')
+            prev_ds_nodash = prev_ds.replace('-', '')
 
         ds_nodash = ds.replace('-', '')
         ts_nodash = ts.replace('-', '').replace(':', '')
@@ -1887,7 +1891,9 @@ def __repr__(self):
             'dag': task.dag,
             'ds': ds,
             'next_ds': next_ds,
+            'next_ds_nodash': next_ds_nodash,
             'prev_ds': prev_ds,
+            'prev_ds_nodash': prev_ds_nodash,
             'ds_nodash': ds_nodash,
             'ts': ts,
             'ts_nodash': ts_nodash,
diff --git a/docs/code.rst b/docs/code.rst
index 80ec76193f..c9e9b3d431 100644
--- a/docs/code.rst
+++ b/docs/code.rst
@@ -242,12 +242,14 @@ Variable                                Description
 =====================================   ====================================
 ``{{ ds }}``                            the execution date as ``YYYY-MM-DD``
 ``{{ ds_nodash }}``                     the execution date as ``YYYYMMDD``
-``{{ prev_ds }}``                       the previous execution date as ``YYYY-MM-DD``.
+``{{ prev_ds }}``                       the previous execution date as ``YYYY-MM-DD``
                                         if ``{{ ds }}`` is ``2016-01-08`` and ``schedule_interval`` is ``@weekly``,
-                                        ``{{ prev_ds }}`` will be ``2016-01-01``.
-``{{ next_ds }}``                       the next execution date as ``YYYY-MM-DD``.
+                                        ``{{ prev_ds }}`` will be ``2016-01-01``
+``{{ prev_ds_nodash }}``                the previous execution date as ``YYYYMMDD`` if exists, else ``None``
+``{{ next_ds }}``                       the next execution date as ``YYYY-MM-DD``
                                         if ``{{ ds }}`` is ``2016-01-01`` and ``schedule_interval`` is ``@weekly``,
-                                        ``{{ prev_ds }}`` will be ``2016-01-08``.
+                                        ``{{ prev_ds }}`` will be ``2016-01-08``
+``{{ next_ds_nodash }}``                the next execution date as ``YYYYMMDD`` if exists, else ``None``
 ``{{ yesterday_ds }}``                  yesterday's date as ``YYYY-MM-DD``
 ``{{ yesterday_ds_nodash }}``           yesterday's date as ``YYYYMMDD``
 ``{{ tomorrow_ds }}``                   tomorrow's date as ``YYYY-MM-DD``
diff --git a/tests/core.py b/tests/core.py
index 8df6312eeb..f8b8691912 100644
--- a/tests/core.py
+++ b/tests/core.py
@@ -626,6 +626,35 @@ def __bool__(self):
             dag=self.dag)
         t.resolve_template_files()
 
+    def test_task_get_template(self):
+        TI = models.TaskInstance
+        ti = TI(
+            task=self.runme_0, execution_date=DEFAULT_DATE)
+        ti.dag = self.dag_bash
+        ti.run(ignore_ti_state=True)
+        context = ti.get_template_context()
+
+        # DEFAULT DATE is 2015-01-01
+        self.assertEquals(context['ds'], '2015-01-01')
+        self.assertEquals(context['ds_nodash'], '20150101')
+
+        # next_ds is 2015-01-02 as the dag interval is daily
+        self.assertEquals(context['next_ds'], '2015-01-02')
+        self.assertEquals(context['next_ds_nodash'], '20150102')
+
+        # prev_ds is 2014-12-31 as the dag interval is daily
+        self.assertEquals(context['prev_ds'], '2014-12-31')
+        self.assertEquals(context['prev_ds_nodash'], '20141231')
+
+        self.assertEquals(context['ts'], '2015-01-01T00:00:00+00:00')
+        self.assertEquals(context['ts_nodash'], '20150101T000000+0000')
+
+        self.assertEquals(context['yesterday_ds'], '2014-12-31')
+        self.assertEquals(context['yesterday_ds_nodash'], '20141231')
+
+        self.assertEquals(context['tomorrow_ds'], '2015-01-02')
+        self.assertEquals(context['tomorrow_ds_nodash'], '20150102')
+
     def test_import_examples(self):
         self.assertEqual(len(self.dagbag.dags), NUM_EXAMPLE_DAGS)
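
For readers skimming the thread, a hedged usage sketch of the two new macros (hypothetical DAG, not part of the PR):

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.bash_operator import BashOperator

dag = DAG('nodash_macro_demo', start_date=datetime(2018, 1, 1),
          schedule_interval='@daily')

# prev_ds_nodash / next_ds_nodash render as YYYYMMDD strings, or None when
# there is no previous/next schedule to derive them from
echo_dates = BashOperator(
    task_id='echo_dates',
    bash_command='echo "prev={{ prev_ds_nodash }} next={{ next_ds_nodash }}"',
    dag=dag,
)
```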


[GitHub] feng-tao commented on issue #3821: [AIRFLOW-2983] Add prev_ds_nodash and next_ds_nodash macro

2018-08-30 Thread GitBox
feng-tao commented on issue #3821: [AIRFLOW-2983] Add prev_ds_nodash and 
next_ds_nodash macro
URL: 
https://github.com/apache/incubator-airflow/pull/3821#issuecomment-417536518
 
 
   Hey @kaxil @r39132, a test to check the template context has been added. Merging it now.




[GitHub] codecov-io edited a comment on issue #3821: [AIRFLOW-2983] Add prev_ds_nodash and next_ds_nodash macro

2018-08-30 Thread GitBox
codecov-io edited a comment on issue #3821: [AIRFLOW-2983] Add prev_ds_nodash 
and next_ds_nodash macro
URL: 
https://github.com/apache/incubator-airflow/pull/3821#issuecomment-417186769
 
 
   # [Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3821?src=pr&el=h1) Report
   > Merging [#3821](https://codecov.io/gh/apache/incubator-airflow/pull/3821?src=pr&el=desc) into [master](https://codecov.io/gh/apache/incubator-airflow/commit/82454477c57699ece5c6515ce85b7df0c0583571?src=pr&el=desc) will **increase** coverage by `<.01%`.
   > The diff coverage is `100%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/incubator-airflow/pull/3821/graphs/tree.svg?width=650&token=WdLKlKHOAU&height=150&src=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/3821?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #3821      +/-   ##
   ==========================================
   + Coverage   77.43%   77.44%   +<.01%     
   ==========================================
     Files         203      203             
     Lines       15840    15844      +4     
   ==========================================
   + Hits        12266    12270      +4     
     Misses       3574     3574
   ```
   
   | [Impacted Files](https://codecov.io/gh/apache/incubator-airflow/pull/3821?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [airflow/models.py](https://codecov.io/gh/apache/incubator-airflow/pull/3821/diff?src=pr&el=tree#diff-YWlyZmxvdy9tb2RlbHMucHk=) | `88.78% <100%> (+0.01%)` | :arrow_up: |
   
   --
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3821?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3821?src=pr&el=footer). Last update [8245447...c78c818](https://codecov.io/gh/apache/incubator-airflow/pull/3821?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).






[GitHub] feng-tao commented on issue #3821: [AIRFLOW-2983] Add prev_ds_nodash and next_ds_nodash macro

2018-08-30 Thread GitBox
feng-tao commented on issue #3821: [AIRFLOW-2983] Add prev_ds_nodash and 
next_ds_nodash macro
URL: 
https://github.com/apache/incubator-airflow/pull/3821#issuecomment-417515885
 
 
   @kaxil , thanks for the comment. I added a unit test to check. Will wait for 
CI to finish.




[GitHub] gerardo commented on a change in pull request #3797: [AIRFLOW-2952] Splits CI into k8s + docker-compose

2018-08-30 Thread GitBox
gerardo commented on a change in pull request #3797: [AIRFLOW-2952] Splits CI 
into k8s + docker-compose
URL: https://github.com/apache/incubator-airflow/pull/3797#discussion_r214217582
 
 

 ##
 File path: .travis.yml
 ##
 @@ -26,14 +26,14 @@ env:
 - TRAVIS_CACHE=$HOME/.travis_cache/
   matrix:
 - TOX_ENV=flake8
-- TOX_ENV=py27-backend_mysql
-- TOX_ENV=py27-backend_sqlite
-- TOX_ENV=py27-backend_postgres
-- TOX_ENV=py35-backend_mysql PYTHON_VERSION=3
-- TOX_ENV=py35-backend_sqlite PYTHON_VERSION=3
-- TOX_ENV=py35-backend_postgres PYTHON_VERSION=3
-- TOX_ENV=py27-backend_postgres KUBERNETES_VERSION=v1.9.0
-- TOX_ENV=py35-backend_postgres KUBERNETES_VERSION=v1.10.0 PYTHON_VERSION=3
+- TOX_ENV=py27-backend_mysql-env_docker
+- TOX_ENV=py27-backend_sqlite-env_docker
+- TOX_ENV=py27-backend_postgres-env_docker
+- TOX_ENV=py35-backend_mysql-env_docker PYTHON_VERSION=3
+- TOX_ENV=py35-backend_sqlite-env_ddocker PYTHON_VERSION=3
 
 Review comment:
   There's a typo here




[GitHub] ndmar commented on issue #2708: [AIRFLOW-1746] Add a Nomad operator to trigger job from Airflow

2018-08-30 Thread GitBox
ndmar commented on issue #2708: [AIRFLOW-1746] Add a Nomad operator to trigger 
job from Airflow
URL: 
https://github.com/apache/incubator-airflow/pull/2708#issuecomment-417502998
 
 
   @etrabelsi @Fokko Just discovered this PR and am pretty excited about it, as we just started using Airflow on Nomad. This'll greatly simplify our deployment setup. Seems like it's almost there; I'm more than happy/willing to help in any way to push this across the finish line!




[GitHub] codecov-io edited a comment on issue #3825: [AIRFLOW-2989] Add param to set bootDiskType in Dataproc Op

2018-08-30 Thread GitBox
codecov-io edited a comment on issue #3825: [AIRFLOW-2989] Add param to set 
bootDiskType in Dataproc Op
URL: 
https://github.com/apache/incubator-airflow/pull/3825#issuecomment-417496304
 
 
   # [Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3825?src=pr&el=h1) Report
   > Merging [#3825](https://codecov.io/gh/apache/incubator-airflow/pull/3825?src=pr&el=desc) into [master](https://codecov.io/gh/apache/incubator-airflow/commit/82454477c57699ece5c6515ce85b7df0c0583571?src=pr&el=desc) will **not change** coverage.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/incubator-airflow/pull/3825/graphs/tree.svg?width=650&token=WdLKlKHOAU&height=150&src=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/3825?src=pr&el=tree)
   
   ```diff
   @@           Coverage Diff           @@
   ##           master    #3825   +/-   ##
   =======================================
     Coverage   77.43%   77.43%           
   =======================================
     Files         203      203           
     Lines       15840    15840           
   =======================================
     Hits        12266    12266           
     Misses       3574     3574
   ```
   
   --
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3825?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3825?src=pr&el=footer). Last update [8245447...486efa8](https://codecov.io/gh/apache/incubator-airflow/pull/3825?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).





[jira] [Commented] (AIRFLOW-2548) Output Plugin Import Errors to WebUI

2018-08-30 Thread Jimmy Cao (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16598028#comment-16598028
 ] 

Jimmy Cao commented on AIRFLOW-2548:


This makes sense to me. Are you working on a PR?

> Output Plugin Import Errors to WebUI
> 
>
> Key: AIRFLOW-2548
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2548
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Andy Cooper
>Priority: Major
> Fix For: 2.0.0
>
>
> All,
>  
> We currently output all DAG import errors to the web UI. I propose we do the 
> same with plugin errors as well. This will provide a better user experience 
> by bubbling up all errors to the web UI instead of hiding them in stdout.
>  
> Proposal...
>  * Extend models.ImportError to have a "type" field to distinguish between 
> error types.
>  * Prevent SchedulerJob methods from clearing out and pulling from 
> models.ImportError if type = 'plugin'
>  * Create new ImportError records in plugins_manager.py for each plugin that 
> fails to import
>  * Prompt the user in views.py with plugin ImportErrors, specifying that they 
> need to fix the plugin and restart the webserver to resolve.
>  
> Does this seem reasonable to everyone? I'd be interested in taking on this 
> work if needed
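
A minimal sketch of the proposed "type" field, assuming a plain SQLAlchemy model (illustrative only, not Airflow's actual models.ImportError):

```python
from sqlalchemy import Column, DateTime, Integer, String, Text
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

class ImportErrorRecord(Base):
    __tablename__ = 'import_error'
    id = Column(Integer, primary_key=True)
    timestamp = Column(DateTime)
    filename = Column(String(1024))
    stacktrace = Column(Text)
    # proposed field: lets the scheduler skip rows where type == 'plugin' and
    # lets the web UI label plugin failures separately from DAG failures
    type = Column(String(32), default='dag')
```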





[GitHub] codecov-io edited a comment on issue #3825: [AIRFLOW-2989] Add param to set bootDiskType in Dataproc Op

2018-08-30 Thread GitBox
codecov-io edited a comment on issue #3825: [AIRFLOW-2989] Add param to set 
bootDiskType in Dataproc Op
URL: 
https://github.com/apache/incubator-airflow/pull/3825#issuecomment-417496304
 
 
   # [Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3825?src=pr&el=h1) Report
   > Merging [#3825](https://codecov.io/gh/apache/incubator-airflow/pull/3825?src=pr&el=desc) into [master](https://codecov.io/gh/apache/incubator-airflow/commit/82454477c57699ece5c6515ce85b7df0c0583571?src=pr&el=desc) will **decrease** coverage by `0.28%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/incubator-airflow/pull/3825/graphs/tree.svg?width=650&token=WdLKlKHOAU&height=150&src=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/3825?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #3825      +/-   ##
   ==========================================
   - Coverage   77.43%   77.15%   -0.29%     
   ==========================================
     Files         203      203             
     Lines       15840    15840             
   ==========================================
   - Hits        12266    12221     -45     
   - Misses       3574     3619     +45
   ```
   
   | [Impacted Files](https://codecov.io/gh/apache/incubator-airflow/pull/3825?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [airflow/hooks/hdfs\_hook.py](https://codecov.io/gh/apache/incubator-airflow/pull/3825/diff?src=pr&el=tree#diff-YWlyZmxvdy9ob29rcy9oZGZzX2hvb2sucHk=) | `27.5% <0%> (-65%)` | :arrow_down: |
   | [airflow/utils/decorators.py](https://codecov.io/gh/apache/incubator-airflow/pull/3825/diff?src=pr&el=tree#diff-YWlyZmxvdy91dGlscy9kZWNvcmF0b3JzLnB5) | `85.41% <0%> (-6.25%)` | :arrow_down: |
   | [airflow/hooks/dbapi\_hook.py](https://codecov.io/gh/apache/incubator-airflow/pull/3825/diff?src=pr&el=tree#diff-YWlyZmxvdy9ob29rcy9kYmFwaV9ob29rLnB5) | `79.67% <0%> (-3.26%)` | :arrow_down: |
   | [airflow/task/task\_runner/base\_task\_runner.py](https://codecov.io/gh/apache/incubator-airflow/pull/3825/diff?src=pr&el=tree#diff-YWlyZmxvdy90YXNrL3Rhc2tfcnVubmVyL2Jhc2VfdGFza19ydW5uZXIucHk=) | `77.96% <0%> (-1.7%)` | :arrow_down: |
   | [airflow/operators/docker\_operator.py](https://codecov.io/gh/apache/incubator-airflow/pull/3825/diff?src=pr&el=tree#diff-YWlyZmxvdy9vcGVyYXRvcnMvZG9ja2VyX29wZXJhdG9yLnB5) | `96.51% <0%> (-1.17%)` | :arrow_down: |
   | [airflow/www\_rbac/app.py](https://codecov.io/gh/apache/incubator-airflow/pull/3825/diff?src=pr&el=tree#diff-YWlyZmxvdy93d3dfcmJhYy9hcHAucHk=) | `96.66% <0%> (-1.12%)` | :arrow_down: |
   | [airflow/configuration.py](https://codecov.io/gh/apache/incubator-airflow/pull/3825/diff?src=pr&el=tree#diff-YWlyZmxvdy9jb25maWd1cmF0aW9uLnB5) | `82.96% <0%> (-1.12%)` | :arrow_down: |
   | [airflow/www/app.py](https://codecov.io/gh/apache/incubator-airflow/pull/3825/diff?src=pr&el=tree#diff-YWlyZmxvdy93d3cvYXBwLnB5) | `98.97% <0%> (-1.03%)` | :arrow_down: |
   | [airflow/bin/cli.py](https://codecov.io/gh/apache/incubator-airflow/pull/3825/diff?src=pr&el=tree#diff-YWlyZmxvdy9iaW4vY2xpLnB5) | `64.53% <0%> (-0.26%)` | :arrow_down: |
   | [airflow/www\_rbac/views.py](https://codecov.io/gh/apache/incubator-airflow/pull/3825/diff?src=pr&el=tree#diff-YWlyZmxvdy93d3dfcmJhYy92aWV3cy5weQ==) | `72.47% <0%> (-0.15%)` | :arrow_down: |
   | ... and [1 more](https://codecov.io/gh/apache/incubator-airflow/pull/3825/diff?src=pr&el=tree-more) | |
   
   --
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3825?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3825?src=pr&el=footer). Last update [8245447...486efa8](https://codecov.io/gh/apache/incubator-airflow/pull/3825?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).






[jira] [Commented] (AIRFLOW-2988) GCP Dataflow hook should specifically run python2

2018-08-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16598001#comment-16598001
 ] 

ASF GitHub Bot commented on AIRFLOW-2988:
-

jcao219 opened a new pull request #3826: [AIRFLOW-2988] Run specifically 
python2 for dataflow
URL: https://github.com/apache/incubator-airflow/pull/3826
 
 
   Apache Beam does not yet support python3, so it's best to run dataflow
   jobs with python2 specifically until python3 support is complete
   (BEAM-1251), in case the user's 'python' in PATH is python3.
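
   A hedged sketch of the idea (illustrative; not the PR's actual diff):

   ```python
   import subprocess

   def launch_dataflow_job(py_file, py_options=None, job_args=None):
       # invoke an explicit python2 interpreter: "python" on PATH may resolve
       # to python3, which Apache Beam does not support yet (BEAM-1251)
       cmd = ['python2'] + (py_options or []) + [py_file] + (job_args or [])
       return subprocess.Popen(cmd)
   ```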
   
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [ ] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-XXX
 - In case you are fixing a typo in the documentation you can prepend your 
commit with \[AIRFLOW-XXX\], code changes always need a Jira issue.
   
   ### Description
   
   - [ ] Here are some details about my PR, including screenshots of any UI 
changes:
   
   ### Tests
   
   - [ ] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason: 
   
   ### Commits
   
   - [ ] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [ ] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - When adding new operators/hooks/sensors, the autoclass documentation 
generation needs to be added.
   
   ### Code Quality
   
   - [ ] Passes `git diff upstream/master -u -- "*.py" | flake8 --diff`
   




> GCP Dataflow hook should specifically run python2
> -
>
> Key: AIRFLOW-2988
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2988
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: Dataflow, gcp, hooks
>Reporter: Jimmy Cao
>Priority: Major
>
> Currently the GCP dataflow hook invokes 'python' 
> [here|https://github.com/apache/incubator-airflow/blob/c3939c8e721870d263997e7aeaebc28e678d544b/airflow/contrib/hooks/gcp_dataflow_hook.py#L239].
>   This can fail if the user's 'python' in PATH starts python 3, which Apache 
> Beam does not yet support (see BEAM-1251).
> It should be changed to 'python2' to ensure that Apache Beam is run with the 
> correct version of Python. 







[jira] [Commented] (AIRFLOW-2989) No Parameter to change bootDiskType for DataprocClusterCreateOperator

2018-08-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16597996#comment-16597996
 ] 

ASF GitHub Bot commented on AIRFLOW-2989:
-

kaxil opened a new pull request #3825: [AIRFLOW-2989] Add param to set 
bootDiskType in Dataproc Op
URL: https://github.com/apache/incubator-airflow/pull/3825
 
 
   
   
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [x] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-2989
   
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI 
changes:
   - Add param to set bootDiskType for master and worker nodes in 
`DataprocClusterCreateOperator`
   
   ### Tests
   
   - [x] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   Modified `DataprocClusterCreateOperatorTest`
   
   ### Commits
   
   - [x] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [x] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - When adding new operators/hooks/sensors, the autoclass documentation 
generation needs to be added.
   
   ### Code Quality
   
   - [x] Passes `git diff upstream/master -u -- "*.py" | flake8 --diff`
   




> No Parameter to change bootDiskType for DataprocClusterCreateOperator 
> --
>
> Key: AIRFLOW-2989
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2989
> Project: Apache Airflow
>  Issue Type: New Feature
>  Components: contrib, gcp
>Affects Versions: 1.9.0, 1.10.0
>Reporter: Kaxil Naik
>Assignee: Kaxil Naik
>Priority: Minor
> Fix For: 1.10.1
>
>
> Currently, we cannot set the Primary disk type for master and worker to 
> `pd-ssd` for DataprocClusterCreateOperator.
> Google API: 
> https://cloud.google.com/dataproc/docs/reference/rest/v1/projects.regions.clusters#diskconfig
> Related StackOverflow Issue: 
> https://stackoverflow.com/questions/52090315/airflow-dataprocclustercreateoperator/52092942#52092942
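
For illustration, a hedged usage sketch; the `master_disk_type` / `worker_disk_type` parameter names are assumptions, not confirmed by this thread:

```python
from datetime import datetime

from airflow import DAG
from airflow.contrib.operators.dataproc_operator import DataprocClusterCreateOperator

dag = DAG('dataproc_demo', start_date=datetime(2018, 1, 1), schedule_interval=None)

create_cluster = DataprocClusterCreateOperator(
    task_id='create_cluster',
    cluster_name='demo-cluster',
    project_id='my-gcp-project',    # illustrative project and zone
    zone='us-central1-a',
    num_workers=2,
    master_disk_type='pd-ssd',      # assumed param; maps to diskConfig.bootDiskType
    worker_disk_type='pd-ssd',
    dag=dag,
)
```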







[GitHub] kaxil commented on issue #3825: [AIRFLOW-2989] Add param to set bootDiskType in Dataproc Op

2018-08-30 Thread GitBox
kaxil commented on issue #3825: [AIRFLOW-2989] Add param to set bootDiskType in 
Dataproc Op
URL: 
https://github.com/apache/incubator-airflow/pull/3825#issuecomment-417488271
 
 
   cc @Fokko @fenglu-g 




[jira] [Created] (AIRFLOW-2989) No Parameter to change bootDiskType for DataprocClusterCreateOperator

2018-08-30 Thread Kaxil Naik (JIRA)
Kaxil Naik created AIRFLOW-2989:
---

 Summary: No Parameter to change bootDiskType for 
DataprocClusterCreateOperator 
 Key: AIRFLOW-2989
 URL: https://issues.apache.org/jira/browse/AIRFLOW-2989
 Project: Apache Airflow
  Issue Type: New Feature
  Components: contrib, gcp
Affects Versions: 1.9.0, 1.10.0
Reporter: Kaxil Naik
Assignee: Kaxil Naik
 Fix For: 1.10.1


Currently, we cannot set the Primary disk type for master and worker to 
`pd-ssd` for DataprocClusterCreateOperator.

Google API: 
https://cloud.google.com/dataproc/docs/reference/rest/v1/projects.regions.clusters#diskconfig

Related StackOverflow Issue: 
https://stackoverflow.com/questions/52090315/airflow-dataprocclustercreateoperator/52092942#52092942





[GitHub] codecov-io edited a comment on issue #3817: [AIRFLOW-2974] Extended Databricks hook with cluster operation

2018-08-30 Thread GitBox
codecov-io edited a comment on issue #3817: [AIRFLOW-2974] Extended Databricks 
hook with cluster operation
URL: 
https://github.com/apache/incubator-airflow/pull/3817#issuecomment-416701478
 
 
   # [Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3817?src=pr&el=h1) Report
   > Merging [#3817](https://codecov.io/gh/apache/incubator-airflow/pull/3817?src=pr&el=desc) into [master](https://codecov.io/gh/apache/incubator-airflow/commit/82454477c57699ece5c6515ce85b7df0c0583571?src=pr&el=desc) will **not change** coverage.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/incubator-airflow/pull/3817/graphs/tree.svg?width=650&token=WdLKlKHOAU&height=150&src=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/3817?src=pr&el=tree)
   
   ```diff
   @@           Coverage Diff           @@
   ##           master    #3817   +/-   ##
   =======================================
     Coverage   77.43%   77.43%           
   =======================================
     Files         203      203           
     Lines       15840    15840           
   =======================================
     Hits        12266    12266           
     Misses       3574     3574
   ```
   
   --
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3817?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3817?src=pr&el=footer). Last update [8245447...e70aa98](https://codecov.io/gh/apache/incubator-airflow/pull/3817?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).




[jira] [Created] (AIRFLOW-2988) GCP Dataflow hook should specifically run python2

2018-08-30 Thread Jimmy Cao (JIRA)
Jimmy Cao created AIRFLOW-2988:
--

 Summary: GCP Dataflow hook should specifically run python2
 Key: AIRFLOW-2988
 URL: https://issues.apache.org/jira/browse/AIRFLOW-2988
 Project: Apache Airflow
  Issue Type: Improvement
  Components: Dataflow, gcp, hooks
Reporter: Jimmy Cao


Currently the GCP dataflow hook invokes 'python' 
[here|https://github.com/apache/incubator-airflow/blob/c3939c8e721870d263997e7aeaebc28e678d544b/airflow/contrib/hooks/gcp_dataflow_hook.py#L239].
  This can fail if the user's 'python' in PATH starts python 3, which Apache 
Beam does not yet support (see BEAM-1251).

It should be changed to 'python2' to ensure that Apache Beam is run with the 
correct version of Python. 





[GitHub] xnuinside commented on issue #3733: [AIRFLOW-491] Add cache parameter in BigQuery query method - with 'api_resource_configs'

2018-08-30 Thread GitBox
xnuinside commented on issue #3733: [AIRFLOW-491] Add cache parameter in 
BigQuery query method - with 'api_resource_configs'
URL: 
https://github.com/apache/incubator-airflow/pull/3733#issuecomment-417475558
 
 
   @kaxil, check pls. And thanks in advance!


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] kaxil commented on issue #3821: [AIRFLOW-2983] Add prev_ds_nodash and next_ds_nodash macro

2018-08-30 Thread GitBox
kaxil commented on issue #3821: [AIRFLOW-2983] Add prev_ds_nodash and 
next_ds_nodash macro
URL: 
https://github.com/apache/incubator-airflow/pull/3821#issuecomment-417475014
 
 
   LGTM. @feng-tao  Can we add a simple test for this?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] codecov-io commented on issue #3820: [AIRFLOW-XXX] Fix Docstrings for Hooks/Operators

2018-08-30 Thread GitBox
codecov-io commented on issue #3820: [AIRFLOW-XXX] Fix Docstrings for 
Hooks/Operators
URL: 
https://github.com/apache/incubator-airflow/pull/3820#issuecomment-417466806
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3820?src=pr=h1)
 Report
   > Merging 
[#3820](https://codecov.io/gh/apache/incubator-airflow/pull/3820?src=pr=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/82454477c57699ece5c6515ce85b7df0c0583571?src=pr=desc)
 will **decrease** coverage by `<.01%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/3820/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/3820?src=pr=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #3820      +/-   ##
   ==========================================
   - Coverage   77.43%   77.43%   -0.01%     
   ==========================================
     Files         203      203             
     Lines       15840    15840             
   ==========================================
   - Hits        12266    12265       -1     
   - Misses       3574     3575       +1
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/incubator-airflow/pull/3820?src=pr=tree) 
| Coverage Δ | |
   |---|---|---|
   | 
[airflow/sensors/s3\_key\_sensor.py](https://codecov.io/gh/apache/incubator-airflow/pull/3820/diff?src=pr=tree#diff-YWlyZmxvdy9zZW5zb3JzL3MzX2tleV9zZW5zb3IucHk=)
 | `30.3% <ø> (ø)` | :arrow_up: |
   | 
[airflow/sensors/s3\_prefix\_sensor.py](https://codecov.io/gh/apache/incubator-airflow/pull/3820/diff?src=pr=tree#diff-YWlyZmxvdy9zZW5zb3JzL3MzX3ByZWZpeF9zZW5zb3IucHk=)
 | `0% <ø> (ø)` | :arrow_up: |
   | 
[airflow/operators/s3\_file\_transform\_operator.py](https://codecov.io/gh/apache/incubator-airflow/pull/3820/diff?src=pr=tree#diff-YWlyZmxvdy9vcGVyYXRvcnMvczNfZmlsZV90cmFuc2Zvcm1fb3BlcmF0b3IucHk=)
 | `93.87% <ø> (ø)` | :arrow_up: |
   | 
[airflow/models.py](https://codecov.io/gh/apache/incubator-airflow/pull/3820/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMucHk=)
 | `88.72% <ø> (-0.05%)` | :arrow_down: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3820?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3820?src=pr=footer).
 Last update 
[8245447...583f2b7](https://codecov.io/gh/apache/incubator-airflow/pull/3820?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] dimberman commented on issue #3797: [AIRFLOW-2952] Splits CI into k8s + docker-compose

2018-08-30 Thread GitBox
dimberman commented on issue #3797: [AIRFLOW-2952] Splits CI into k8s + 
docker-compose
URL: 
https://github.com/apache/incubator-airflow/pull/3797#issuecomment-417466718
 
 
   @gerardo I agree that it would be a pain, but it's going to REALLY hurt if 
we merge PRs for a couple of weeks and then can't track down what broke the k8s 
executor when it restarts. Definitely please try on a different branch.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] gerardo commented on issue #3797: [AIRFLOW-2952] Splits CI into k8s + docker-compose

2018-08-30 Thread GitBox
gerardo commented on issue #3797: [AIRFLOW-2952] Splits CI into k8s + 
docker-compose
URL: 
https://github.com/apache/incubator-airflow/pull/3797#issuecomment-417461794
 
 
   @dimberman I can take a stab at making this work in a separate branch if you 
want. This is definitely a blocker, but reverting sounds like even more work.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (AIRFLOW-2963) Error parsing AIRFLOW_CONN_ URI

2018-08-30 Thread Casandra julie mitchell (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Casandra julie mitchell updated AIRFLOW-2963:
-
Attachment: (was: 2811fb90a1302h.txt)

> Error parsing AIRFLOW_CONN_ URI
> ---
>
> Key: AIRFLOW-2963
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2963
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: boto3, configuration
>Affects Versions: 1.9.0, 1.10.0
>Reporter: Leonardo de Campos Almeida
>Assignee: Casandra julie mitchell
>Priority: Minor
>  Labels: easyfix
>
> I'm using the environment variable AIRFLOW_CONN_ to define my connection to 
> AWS, but my AWS secret access key has a slash in it, e.g.:
> {code:java}
> s3://login:pass/word@bucket
> {code}
> The problem is that the method *BaseHook._get_connection_from_env* doesn't 
> accept this URI as a valid URI. When it finds the /, it assumes that the 
> path starts there, so it returns:
>  * host: login
>  * port: pass
>  * path: word
> and ignores the rest, so I get an error, because pass is not a valid port 
> number.
> So I tried to pass the URI quoted:
> {code:java}
> s3://login:pass%2Fword@bucket
> {code}
> But then the values are not being unquoted correctly, and the AwsHook 
> tries to use pass%2Fword as the secret access key.
> I took a look at the method that parses the URI, and it only unquotes 
> the host, manually:
> {code:java}
> def parse_from_uri(self, uri):
>     temp_uri = urlparse(uri)
>     hostname = temp_uri.hostname or ''
>     if '%2f' in hostname:
>         hostname = hostname.replace('%2f', '/').replace('%2F', '/')
>     conn_type = temp_uri.scheme
>     if conn_type == 'postgresql':
>         conn_type = 'postgres'
>     self.conn_type = conn_type
>     self.host = hostname
>     self.schema = temp_uri.path[1:]
>     self.login = temp_uri.username
>     self.password = temp_uri.password
>     self.port = temp_uri.port
> {code}
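
A quick check in a Python 3 interpreter illustrates why the quoted form still 
fails without explicit unquoting (Python 2's urlparse behaves the same way 
here):

{code:python}
from urllib.parse import urlparse, unquote

parsed = urlparse('s3://login:pass%2Fword@bucket')
print(parsed.password)           # 'pass%2Fword' -- userinfo is not unquoted
print(unquote(parsed.password))  # 'pass/word' -- each field must be unquoted explicitly
{code}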



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-2963) Error parsing AIRFLOW_CONN_ URI

2018-08-30 Thread Casandra julie mitchell (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Casandra julie mitchell updated AIRFLOW-2963:
-
Attachment: 2811fb90a1302h.txt

> Error parsing AIRFLOW_CONN_ URI
> ---
>
> Key: AIRFLOW-2963
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2963
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: boto3, configuration
>Affects Versions: 1.9.0, 1.10.0
>Reporter: Leonardo de Campos Almeida
>Assignee: Casandra julie mitchell
>Priority: Minor
>  Labels: easyfix
> Attachments: 2811fb90a1302h.txt
>
>
> I'm using the environment variable AIRFLOW_CONN_ to define my connection to 
> AWS, but my AWS secret access key has a slash in it, e.g.:
> {code:java}
> s3://login:pass/word@bucket
> {code}
> The problem is that the method *BaseHook._get_connection_from_env* doesn't 
> accept this URI as a valid URI. When it finds the /, it assumes that the 
> path starts there, so it returns:
>  * host: login
>  * port: pass
>  * path: word
> and ignores the rest, so I get an error, because pass is not a valid port 
> number.
> So I tried to pass the URI quoted:
> {code:java}
> s3://login:pass%2Fword@bucket
> {code}
> But then the values are not being unquoted correctly, and the AwsHook 
> tries to use pass%2Fword as the secret access key.
> I took a look at the method that parses the URI, and it only unquotes 
> the host, manually:
> {code:java}
> def parse_from_uri(self, uri):
>     temp_uri = urlparse(uri)
>     hostname = temp_uri.hostname or ''
>     if '%2f' in hostname:
>         hostname = hostname.replace('%2f', '/').replace('%2F', '/')
>     conn_type = temp_uri.scheme
>     if conn_type == 'postgresql':
>         conn_type = 'postgres'
>     self.conn_type = conn_type
>     self.host = hostname
>     self.schema = temp_uri.path[1:]
>     self.login = temp_uri.username
>     self.password = temp_uri.password
>     self.port = temp_uri.port
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (AIRFLOW-2963) Error parsing AIRFLOW_CONN_ URI

2018-08-30 Thread Casandra julie mitchell (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Casandra julie mitchell reassigned AIRFLOW-2963:


Assignee: Casandra julie mitchell

> Error parsing AIRFLOW_CONN_ URI
> ---
>
> Key: AIRFLOW-2963
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2963
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: boto3, configuration
>Affects Versions: 1.9.0, 1.10.0
>Reporter: Leonardo de Campos Almeida
>Assignee: Casandra julie mitchell
>Priority: Minor
>  Labels: easyfix
>
> I'm using the environment variable AIRFLOW_CONN_ to define my connection to 
> AWS, but my AWS secret access key has a slash in it, e.g.:
> {code:java}
> s3://login:pass/word@bucket
> {code}
> The problem is that the method *BaseHook._get_connection_from_env* doesn't 
> accept this URI as a valid URI. When it finds the /, it assumes that the 
> path starts there, so it returns:
>  * host: login
>  * port: pass
>  * path: word
> and ignores the rest, so I get an error, because pass is not a valid port 
> number.
> So I tried to pass the URI quoted:
> {code:java}
> s3://login:pass%2Fword@bucket
> {code}
> But then the values are not being unquoted correctly, and the AwsHook 
> tries to use pass%2Fword as the secret access key.
> I took a look at the method that parses the URI, and it only unquotes 
> the host, manually:
> {code:java}
> def parse_from_uri(self, uri):
>     temp_uri = urlparse(uri)
>     hostname = temp_uri.hostname or ''
>     if '%2f' in hostname:
>         hostname = hostname.replace('%2f', '/').replace('%2F', '/')
>     conn_type = temp_uri.scheme
>     if conn_type == 'postgresql':
>         conn_type = 'postgres'
>     self.conn_type = conn_type
>     self.host = hostname
>     self.schema = temp_uri.path[1:]
>     self.login = temp_uri.username
>     self.password = temp_uri.password
>     self.port = temp_uri.port
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-2900) Code not visible for Packaged DAGs

2018-08-30 Thread Kaxil Naik (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kaxil Naik updated AIRFLOW-2900:

Affects Version/s: (was: Airflow 1.9.0)
   1.10.0
   1.9.0

> Code not visible for Packaged DAGs
> --
>
> Key: AIRFLOW-2900
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2900
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: webapp, webserver
>Affects Versions: 1.9.0, 1.10.0
>Reporter: Jacob Biesinger
>Assignee: Jacob Biesinger
>Priority: Minor
> Fix For: 1.10.1
>
>
> Packaged DAGs are present on the server as ZIP files. The [rendering 
> code|https://github.com/apache/incubator-airflow/blob/a29fe350164937b28f525b46f7aecbc309665e5a/airflow/www/views.py#L668]
>  is not aware of zip files and fails to show the code for packaged DAGs.
>  
> Easy fix: if .zip appears as a suffix in the path components, attempt to open 
> the file using ZipFile.
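
A minimal sketch of that approach, assuming an archive path of the form 
/dags/my_dags.zip/my_dag.py (the merged helper, open_maybe_zipped, appears in 
the PR diff further below):

{code:python}
import os
import zipfile

def read_maybe_zipped(path):
    # If a component of the path ends in .zip, treat everything up to and
    # including it as an archive and read the member inside it; otherwise
    # read the file normally.
    parts = path.split(os.sep)
    for i, part in enumerate(parts):
        if part.endswith('.zip'):
            archive = os.sep.join(parts[:i + 1])
            member = '/'.join(parts[i + 1:])
            with zipfile.ZipFile(archive) as zf:
                return zf.read(member).decode('utf-8')
    with open(path) as f:
        return f.read()
{code}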



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-2900) Code not visible for Packaged DAGs

2018-08-30 Thread Kaxil Naik (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kaxil Naik updated AIRFLOW-2900:

Fix Version/s: 1.10.1

> Code not visible for Packaged DAGs
> --
>
> Key: AIRFLOW-2900
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2900
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: webapp, webserver
>Affects Versions: Airflow 1.9.0
>Reporter: Jacob Biesinger
>Assignee: Jacob Biesinger
>Priority: Minor
> Fix For: 1.10.1
>
>
> Packaged DAGs are present on the server as ZIP files. The [rendering 
> code|https://github.com/apache/incubator-airflow/blob/a29fe350164937b28f525b46f7aecbc309665e5a/airflow/www/views.py#L668]
>  is not aware of zip files and fails to show the code for packaged DAGs.
>  
> Easy fix: if .zip appears as a suffix in the path components, attempt to open 
> the file using ZipFile.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (AIRFLOW-2900) Code not visible for Packaged DAGs

2018-08-30 Thread Kaxil Naik (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kaxil Naik resolved AIRFLOW-2900.
-
Resolution: Fixed

Resolved by https://github.com/apache/incubator-airflow/pull/3749

> Code not visible for Packaged DAGs
> --
>
> Key: AIRFLOW-2900
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2900
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: webapp, webserver
>Affects Versions: Airflow 1.9.0
>Reporter: Jacob Biesinger
>Assignee: Jacob Biesinger
>Priority: Minor
>
> Packaged DAGs are present on the server as ZIP files. The [rendering 
> code|https://github.com/apache/incubator-airflow/blob/a29fe350164937b28f525b46f7aecbc309665e5a/airflow/www/views.py#L668]
>  is not aware of zip files and fails to show the code for packaged DAGs.
>  
> Easy fix: if .zip appears as a suffix in the path components, attempt to open 
> the file using ZipFile.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] kaxil closed pull request #3749: [AIRFLOW-2900] Show code for packaged DAGs

2018-08-30 Thread GitBox
kaxil closed pull request #3749: [AIRFLOW-2900] Show code for packaged DAGs
URL: https://github.com/apache/incubator-airflow/pull/3749
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/airflow/models.py b/airflow/models.py
index 94e18794d6..ddf3094567 100755
--- a/airflow/models.py
+++ b/airflow/models.py
@@ -337,7 +337,8 @@ def process_file(self, filepath, only_if_updated=True, safe_mode=True):
             return found_dags
 
         mods = []
-        if not zipfile.is_zipfile(filepath):
+        is_zipfile = zipfile.is_zipfile(filepath)
+        if not is_zipfile:
             if safe_mode and os.path.isfile(filepath):
                 with open(filepath, 'rb') as f:
                     content = f.read()
@@ -409,7 +410,7 @@ def process_file(self, filepath, only_if_updated=True, safe_mode=True):
                 if isinstance(dag, DAG):
                     if not dag.full_filepath:
                         dag.full_filepath = filepath
-                        if dag.fileloc != filepath:
+                        if dag.fileloc != filepath and not is_zipfile:
                             dag.fileloc = filepath
                     try:
                         dag.is_subdag = False
diff --git a/airflow/www/utils.py b/airflow/www/utils.py
index 9ce114d5ed..e85bc5909a 100644
--- a/airflow/www/utils.py
+++ b/airflow/www/utils.py
@@ -20,17 +20,21 @@
 # flake8: noqa: E402
 import inspect
 from future import standard_library
-standard_library.install_aliases()
+standard_library.install_aliases()  # noqa: E402
 from builtins import str, object
 
 from cgi import escape
 from io import BytesIO as IO
 import functools
 import gzip
+import io
 import json
+import os
+import re
 import time
 import wtforms
 from wtforms.compat import text_type
+import zipfile
 
 from flask import after_this_request, request, Response
 from flask_admin.model import filters
@@ -372,6 +376,22 @@ def zipper(response):
     return view_func
 
 
+def open_maybe_zipped(f, mode='r'):
+    """
+    Opens the given file. If the path contains a folder with a .zip suffix,
+    then the folder is treated as a zip archive, opening the file inside
+    the archive.
+
+    :return: a file object, as in `open`, or as in `ZipFile.open`.
+    """
+
+    _, archive, filename = re.search(
+        r'((.*\.zip){})?(.*)'.format(re.escape(os.sep)), f).groups()
+    if archive and zipfile.is_zipfile(archive):
+        return zipfile.ZipFile(archive, mode=mode).open(filename)
+    else:
+        return io.open(f, mode=mode)
+
+
 def make_cache_key(*args, **kwargs):
     """
     Used by cache to get a unique key per URL
diff --git a/airflow/www/views.py b/airflow/www/views.py
index e1a7caa8bb..aa2530e458 100644
--- a/airflow/www/views.py
+++ b/airflow/www/views.py
@@ -661,7 +661,7 @@ def code(self):
         dag = dagbag.get_dag(dag_id)
         title = dag_id
         try:
-            with open(dag.fileloc, 'r') as f:
+            with wwwutils.open_maybe_zipped(dag.fileloc, 'r') as f:
                 code = f.read()
             html_code = highlight(
                 code, lexers.PythonLexer(), HtmlFormatter(linenos=True))
diff --git a/airflow/www_rbac/utils.py b/airflow/www_rbac/utils.py
index a0e9258eae..0176a5312c 100644
--- a/airflow/www_rbac/utils.py
+++ b/airflow/www_rbac/utils.py
@@ -26,6 +26,10 @@
 import wtforms
 import bleach
 import markdown
+import re
+import zipfile
+import os
+import io
 
 from builtins import str
 from past.builtins import basestring
@@ -202,6 +206,22 @@ def json_response(obj):
         mimetype="application/json")
 
 
+def open_maybe_zipped(f, mode='r'):
+    """
+    Opens the given file. If the path contains a folder with a .zip suffix,
+    then the folder is treated as a zip archive, opening the file inside
+    the archive.
+
+    :return: a file object, as in `open`, or as in `ZipFile.open`.
+    """
+
+    _, archive, filename = re.search(
+        r'((.*\.zip){})?(.*)'.format(re.escape(os.sep)), f).groups()
+    if archive and zipfile.is_zipfile(archive):
+        return zipfile.ZipFile(archive, mode=mode).open(filename)
+    else:
+        return io.open(f, mode=mode)
+
+
 def make_cache_key(*args, **kwargs):
     """
     Used by cache to get a unique key per URL
diff --git a/airflow/www_rbac/views.py b/airflow/www_rbac/views.py
index d011724cc6..3dc3400968 100644
--- a/airflow/www_rbac/views.py
+++ b/airflow/www_rbac/views.py
@@ -400,7 +400,7 @@ def code(self):
         dag = dagbag.get_dag(dag_id)
         title = dag_id
         try:
-            with open(dag.fileloc, 'r') as f:
+            with wwwutils.open_maybe_zipped(dag.fileloc, 'r') as f:
                 code = f.read()
             html_code = highlight(
                 code, lexers.PythonLexer(), 

[jira] [Commented] (AIRFLOW-2900) Code not visible for Packaged DAGs

2018-08-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16597881#comment-16597881
 ] 

ASF GitHub Bot commented on AIRFLOW-2900:
-

kaxil closed pull request #3749: [AIRFLOW-2900] Show code for packaged DAGs
URL: https://github.com/apache/incubator-airflow/pull/3749
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/airflow/models.py b/airflow/models.py
index 94e18794d6..ddf3094567 100755
--- a/airflow/models.py
+++ b/airflow/models.py
@@ -337,7 +337,8 @@ def process_file(self, filepath, only_if_updated=True, safe_mode=True):
             return found_dags
 
         mods = []
-        if not zipfile.is_zipfile(filepath):
+        is_zipfile = zipfile.is_zipfile(filepath)
+        if not is_zipfile:
             if safe_mode and os.path.isfile(filepath):
                 with open(filepath, 'rb') as f:
                     content = f.read()
@@ -409,7 +410,7 @@ def process_file(self, filepath, only_if_updated=True, safe_mode=True):
                 if isinstance(dag, DAG):
                     if not dag.full_filepath:
                         dag.full_filepath = filepath
-                        if dag.fileloc != filepath:
+                        if dag.fileloc != filepath and not is_zipfile:
                             dag.fileloc = filepath
                     try:
                         dag.is_subdag = False
diff --git a/airflow/www/utils.py b/airflow/www/utils.py
index 9ce114d5ed..e85bc5909a 100644
--- a/airflow/www/utils.py
+++ b/airflow/www/utils.py
@@ -20,17 +20,21 @@
 # flake8: noqa: E402
 import inspect
 from future import standard_library
-standard_library.install_aliases()
+standard_library.install_aliases()  # noqa: E402
 from builtins import str, object
 
 from cgi import escape
 from io import BytesIO as IO
 import functools
 import gzip
+import io
 import json
+import os
+import re
 import time
 import wtforms
 from wtforms.compat import text_type
+import zipfile
 
 from flask import after_this_request, request, Response
 from flask_admin.model import filters
@@ -372,6 +376,22 @@ def zipper(response):
     return view_func
 
 
+def open_maybe_zipped(f, mode='r'):
+    """
+    Opens the given file. If the path contains a folder with a .zip suffix,
+    then the folder is treated as a zip archive, opening the file inside
+    the archive.
+
+    :return: a file object, as in `open`, or as in `ZipFile.open`.
+    """
+
+    _, archive, filename = re.search(
+        r'((.*\.zip){})?(.*)'.format(re.escape(os.sep)), f).groups()
+    if archive and zipfile.is_zipfile(archive):
+        return zipfile.ZipFile(archive, mode=mode).open(filename)
+    else:
+        return io.open(f, mode=mode)
+
+
 def make_cache_key(*args, **kwargs):
     """
     Used by cache to get a unique key per URL
diff --git a/airflow/www/views.py b/airflow/www/views.py
index e1a7caa8bb..aa2530e458 100644
--- a/airflow/www/views.py
+++ b/airflow/www/views.py
@@ -661,7 +661,7 @@ def code(self):
         dag = dagbag.get_dag(dag_id)
         title = dag_id
         try:
-            with open(dag.fileloc, 'r') as f:
+            with wwwutils.open_maybe_zipped(dag.fileloc, 'r') as f:
                 code = f.read()
             html_code = highlight(
                 code, lexers.PythonLexer(), HtmlFormatter(linenos=True))
diff --git a/airflow/www_rbac/utils.py b/airflow/www_rbac/utils.py
index a0e9258eae..0176a5312c 100644
--- a/airflow/www_rbac/utils.py
+++ b/airflow/www_rbac/utils.py
@@ -26,6 +26,10 @@
 import wtforms
 import bleach
 import markdown
+import re
+import zipfile
+import os
+import io
 
 from builtins import str
 from past.builtins import basestring
@@ -202,6 +206,22 @@ def json_response(obj):
         mimetype="application/json")
 
 
+def open_maybe_zipped(f, mode='r'):
+    """
+    Opens the given file. If the path contains a folder with a .zip suffix,
+    then the folder is treated as a zip archive, opening the file inside
+    the archive.
+
+    :return: a file object, as in `open`, or as in `ZipFile.open`.
+    """
+
+    _, archive, filename = re.search(
+        r'((.*\.zip){})?(.*)'.format(re.escape(os.sep)), f).groups()
+    if archive and zipfile.is_zipfile(archive):
+        return zipfile.ZipFile(archive, mode=mode).open(filename)
+    else:
+        return io.open(f, mode=mode)
+
+
 def make_cache_key(*args, **kwargs):
     """
     Used by cache to get a unique key per URL
diff --git a/airflow/www_rbac/views.py b/airflow/www_rbac/views.py
index d011724cc6..3dc3400968 100644
--- a/airflow/www_rbac/views.py
+++ b/airflow/www_rbac/views.py
@@ -400,7 +400,7 @@ def code(self):
         dag = dagbag.get_dag(dag_id)
         title = dag_id
 

[GitHub] kaxil commented on issue #3749: [AIRFLOW-2900] Show code for packaged DAGs

2018-08-30 Thread GitBox
kaxil commented on issue #3749: [AIRFLOW-2900] Show code for packaged DAGs
URL: 
https://github.com/apache/incubator-airflow/pull/3749#issuecomment-417451770
 
 
   Awesome, Great work.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (AIRFLOW-2981) TypeError in dataflow operators when using GCS jar or py_file

2018-08-30 Thread Jeffrey Payne (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeffrey Payne updated AIRFLOW-2981:
---
Description: 
The {{GoogleCloudBucketHelper.google_cloud_to_local}} function attempts to 
compare a list to an int, resulting in the TypeError, with:
{noformat}
...
path_components = file_name[self.GCS_PREFIX_LENGTH:].split('/')
if path_components < 2:
...
{noformat}
This should be {{if len(path_components) < 2:}}.

Also, fix {{if file_size > 0:}} in same function...

  was:
The {{GoogleCloudBucketHelper.google_cloud_to_local}} function attempts to 
compare a list to an int, resulting in the TypeError, with:
{noformat}
...
path_components = file_name[self.GCS_PREFIX_LENGTH:].split('/')
if path_components < 2:
...
{noformat}
This should be {{if len(path_components) < 2:}}.


>  TypeError in dataflow operators when using GCS jar or py_file
> --
>
> Key: AIRFLOW-2981
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2981
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: contrib, Dataflow
>Affects Versions: 1.9.0, 1.10
>Reporter: Jeffrey Payne
>Assignee: Jeffrey Payne
>Priority: Major
>
> The {{GoogleCloudBucketHelper.google_cloud_to_local}} function attempts to 
> compare a list to an int, resulting in a TypeError:
> {noformat}
> ...
> path_components = file_name[self.GCS_PREFIX_LENGTH:].split('/')
> if path_components < 2:
> ...
> {noformat}
> This should be {{if len(path_components) < 2:}}.
> Also, fix {{if file_size > 0:}} in the same function.
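
For clarity, a standalone sketch of the corrected check; the GCS_PREFIX_LENGTH 
value is an assumption for this sketch:

{code:python}
GCS_PREFIX_LENGTH = 5  # assumed to be len('gs://')

def split_gcs_object(file_name):
    # Comparing a list to an int raises TypeError on Python 3 (and silently
    # compares by type name on Python 2); len() makes the intent explicit.
    path_components = file_name[GCS_PREFIX_LENGTH:].split('/')
    if len(path_components) < 2:
        raise ValueError('Invalid GCS path: {}'.format(file_name))
    return path_components[0], '/'.join(path_components[1:])
{code}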



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work started] (AIRFLOW-2981) TypeError in dataflow operators when using GCS jar or py_file

2018-08-30 Thread Jeffrey Payne (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on AIRFLOW-2981 started by Jeffrey Payne.
--
>  TypeError in dataflow operators when using GCS jar or py_file
> --
>
> Key: AIRFLOW-2981
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2981
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: contrib, Dataflow
>Affects Versions: 1.9.0, 1.10
>Reporter: Jeffrey Payne
>Assignee: Jeffrey Payne
>Priority: Major
>
> The {{GoogleCloudBucketHelper.google_cloud_to_local}} function attempts to 
> compare a list to an int, resulting in a TypeError:
> {noformat}
> ...
> path_components = file_name[self.GCS_PREFIX_LENGTH:].split('/')
> if path_components < 2:
> ...
> {noformat}
> This should be {{if len(path_components) < 2:}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] feng-tao commented on issue #3821: [AIRFLOW-2983] Add prev_ds_nodash and next_ds_nodash macro

2018-08-30 Thread GitBox
feng-tao commented on issue #3821: [AIRFLOW-2983] Add prev_ds_nodash and 
next_ds_nodash macro
URL: 
https://github.com/apache/incubator-airflow/pull/3821#issuecomment-417446554
 
 
   Thanks @r39132  for the feedback. I updated the description in the pr and 
jira and let me know if it looks ok to you.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-2986) Airflow Worker does not reach sqs

2018-08-30 Thread Shivakumar Gopalakrishnan (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16597832#comment-16597832
 ] 

Shivakumar Gopalakrishnan commented on AIRFLOW-2986:


Yes, I have.

In fact, I have loaded the proxies in the airflow script into 
os.environ["http_proxy"], https_proxy and no_proxy.

Also, the scheduler is able to write to the queue; only the worker is not able 
to read the queue.

I have checked the proxy logs, and they do show a tunnel connection to 
[eu-west-1.queue.amazonaws.com|https://eu-west-1.queue.amazonaws.com/] port 443.

The only thing that comes to mind is this line from the worker banner:
- ** -- .> transport: sqs://localhost//

Is it looking for a queue by the name of localhost? I tried debugging, but I 
was not able to figure out where this is being set.
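
One possible explanation, offered as an assumption rather than a confirmed 
diagnosis: with Celery's SQS transport, a broker URL of just sqs:// 
(credentials taken from the environment) is rendered in the worker banner as 
sqs://localhost//, so that line need not mean a queue named localhost. A 
sketch of a typical broker configuration (values illustrative):

{code:python}
# Celery configuration sketch for an SQS broker; values are illustrative.
broker_url = 'sqs://'  # often displayed as sqs://localhost// in the banner
broker_transport_options = {
    'region': 'eu-west-1',
    'visibility_timeout': 3600,
}
{code}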

 

> Airflow Worker does not reach sqs
> -
>
> Key: AIRFLOW-2986
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2986
> Project: Apache Airflow
>  Issue Type: Bug
> Environment: amazon linux
>Reporter: Shivakumar Gopalakrishnan
>Priority: Major
>
> I am running the airflow worker service. The service is not able to connect 
> to the sqs
> The scheduler is able to reach and write to the queue
> Proxies are fine; I have implemented this in both python 2.7 and 3.5 same 
> issue
> Copy of the log is below
> {code}
> starting airflow-worker...
> /data/share/airflow
> /data/share/airflow/airflow.cfg
> [2018-08-30 15:41:44,367] \{settings.py:146} DEBUG - Setting up DB connection 
> pool (PID 12304)
> [2018-08-30 15:41:44,367] \{settings.py:174} INFO - setting.configure_orm(): 
> Using pool settings. pool_size=5, pool_recycle=1800
> [2018-08-30 15:41:44,468] \{__init__.py:42} DEBUG - Cannot import due to 
> doesn't look like a module path
> [2018-08-30 15:41:44,875] \{__init__.py:51} INFO - Using executor 
> CeleryExecutor
> [2018-08-30 15:41:44,886] \{cli_action_loggers.py:40} DEBUG - Adding 
>  to pre execution callback
> [2018-08-30 15:41:44,995] \{cli_action_loggers.py:64} DEBUG - Calling 
> callbacks: []
> [2018-08-30 15:41:45,768] \{settings.py:146} DEBUG - Setting up DB connection 
> pool (PID 12308)
> [2018-08-30 15:41:45,768] \{settings.py:174} INFO - setting.configure_orm(): 
> Using pool settings. pool_size=5, pool_recycle=1800
> [2018-08-30 15:41:45,883] \{__init__.py:42} DEBUG - Cannot import due to 
> doesn't look like a module path
> [2018-08-30 15:41:46,345] \{__init__.py:51} INFO - Using executor 
> CeleryExecutor
> [2018-08-30 15:41:46,358] \{cli_action_loggers.py:40} DEBUG - Adding 
>  to pre execution callback
> [2018-08-30 15:41:46,476] \{cli_action_loggers.py:64} DEBUG - Calling 
> callbacks: []
> Starting flask
> [2018-08-30 15:41:46,519] \{_internal.py:88} INFO - * Running on 
> http://0.0.0.0:8793/ (Press CTRL+C to quit)
> [2018-08-30 15:43:58,779: CRITICAL/MainProcess] Unrecoverable error: 
> Exception('Request Empty body HTTP 599 Failed to connect to 
> eu-west-1.queue.amazonaws.com port 443: Connection timed out (None)',)
> Traceback (most recent call last):
>  File "/usr/local/lib/python3.5/site-packages/celery/worker/worker.py", line 
> 207, in start
>  self.blueprint.start(self)
>  File "/usr/local/lib/python3.5/site-packages/celery/bootsteps.py", line 119, 
> in start
>  step.start(parent)
>  File "/usr/local/lib/python3.5/site-packages/celery/bootsteps.py", line 370, 
> in start
>  return self.obj.start()
>  File 
> "/usr/local/lib/python3.5/site-packages/celery/worker/consumer/consumer.py", 
> line 316, in start
>  blueprint.start(self)
>  File "/usr/local/lib/python3.5/site-packages/celery/bootsteps.py", line 119, 
> in start
>  step.start(parent)
>  File 
> "/usr/local/lib/python3.5/site-packages/celery/worker/consumer/consumer.py", 
> line 592, in start
>  c.loop(*c.loop_args())
>  File "/usr/local/lib/python3.5/site-packages/celery/worker/loops.py", line 
> 91, in asynloop
>  next(loop)
>  File "/usr/local/lib/python3.5/site-packages/kombu/asynchronous/hub.py", 
> line 354, in create_loop
>  cb(*cbargs)
>  File 
> "/usr/local/lib/python3.5/site-packages/kombu/asynchronous/http/curl.py", 
> line 114, in on_writable
>  return self._on_event(fd, _pycurl.CSELECT_OUT)
>  File 
> "/usr/local/lib/python3.5/site-packages/kombu/asynchronous/http/curl.py", 
> line 124, in _on_event
>  self._process_pending_requests()
>  File 
> "/usr/local/lib/python3.5/site-packages/kombu/asynchronous/http/curl.py", 
> line 132, in _process_pending_requests
>  self._process(curl, errno, reason)
>  File 
> "/usr/local/lib/python3.5/site-packages/kombu/asynchronous/http/curl.py", 
> line 178, in _process
>  buffer=buffer, effective_url=effective_url, error=error,
>  File "/usr/local/lib/python3.5/site-packages/vine/promises.py", line 150, in 
> __call__
>  svpending(*ca, **ck)
>  File 

[jira] [Commented] (AIRFLOW-2986) Airflow Worker does not reach sqs

2018-08-30 Thread Ash Berlin-Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16597828#comment-16597828
 ] 

Ash Berlin-Taylor commented on AIRFLOW-2986:


You mentioned you have a proxy? Does airflow worker run such that 
{{https_proxy}} environment variable is set?

> Airflow Worker does not reach sqs
> -
>
> Key: AIRFLOW-2986
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2986
> Project: Apache Airflow
>  Issue Type: Bug
> Environment: amazon linux
>Reporter: Shivakumar Gopalakrishnan
>Priority: Major
>
> I am running the airflow worker service. The service is not able to connect 
> to the sqs
> The scheduler is able to reach and write to the queue
> Proxies are fine; I have implemented this in both python 2.7 and 3.5 same 
> issue
> Copy of the log is below
> {code}
> starting airflow-worker...
> /data/share/airflow
> /data/share/airflow/airflow.cfg
> [2018-08-30 15:41:44,367] \{settings.py:146} DEBUG - Setting up DB connection 
> pool (PID 12304)
> [2018-08-30 15:41:44,367] \{settings.py:174} INFO - setting.configure_orm(): 
> Using pool settings. pool_size=5, pool_recycle=1800
> [2018-08-30 15:41:44,468] \{__init__.py:42} DEBUG - Cannot import due to 
> doesn't look like a module path
> [2018-08-30 15:41:44,875] \{__init__.py:51} INFO - Using executor 
> CeleryExecutor
> [2018-08-30 15:41:44,886] \{cli_action_loggers.py:40} DEBUG - Adding 
>  to pre execution callback
> [2018-08-30 15:41:44,995] \{cli_action_loggers.py:64} DEBUG - Calling 
> callbacks: []
> [2018-08-30 15:41:45,768] \{settings.py:146} DEBUG - Setting up DB connection 
> pool (PID 12308)
> [2018-08-30 15:41:45,768] \{settings.py:174} INFO - setting.configure_orm(): 
> Using pool settings. pool_size=5, pool_recycle=1800
> [2018-08-30 15:41:45,883] \{__init__.py:42} DEBUG - Cannot import due to 
> doesn't look like a module path
> [2018-08-30 15:41:46,345] \{__init__.py:51} INFO - Using executor 
> CeleryExecutor
> [2018-08-30 15:41:46,358] \{cli_action_loggers.py:40} DEBUG - Adding 
>  to pre execution callback
> [2018-08-30 15:41:46,476] \{cli_action_loggers.py:64} DEBUG - Calling 
> callbacks: []
> Starting flask
> [2018-08-30 15:41:46,519] \{_internal.py:88} INFO - * Running on 
> http://0.0.0.0:8793/ (Press CTRL+C to quit)
> [2018-08-30 15:43:58,779: CRITICAL/MainProcess] Unrecoverable error: 
> Exception('Request Empty body HTTP 599 Failed to connect to 
> eu-west-1.queue.amazonaws.com port 443: Connection timed out (None)',)
> Traceback (most recent call last):
>  File "/usr/local/lib/python3.5/site-packages/celery/worker/worker.py", line 
> 207, in start
>  self.blueprint.start(self)
>  File "/usr/local/lib/python3.5/site-packages/celery/bootsteps.py", line 119, 
> in start
>  step.start(parent)
>  File "/usr/local/lib/python3.5/site-packages/celery/bootsteps.py", line 370, 
> in start
>  return self.obj.start()
>  File 
> "/usr/local/lib/python3.5/site-packages/celery/worker/consumer/consumer.py", 
> line 316, in start
>  blueprint.start(self)
>  File "/usr/local/lib/python3.5/site-packages/celery/bootsteps.py", line 119, 
> in start
>  step.start(parent)
>  File 
> "/usr/local/lib/python3.5/site-packages/celery/worker/consumer/consumer.py", 
> line 592, in start
>  c.loop(*c.loop_args())
>  File "/usr/local/lib/python3.5/site-packages/celery/worker/loops.py", line 
> 91, in asynloop
>  next(loop)
>  File "/usr/local/lib/python3.5/site-packages/kombu/asynchronous/hub.py", 
> line 354, in create_loop
>  cb(*cbargs)
>  File 
> "/usr/local/lib/python3.5/site-packages/kombu/asynchronous/http/curl.py", 
> line 114, in on_writable
>  return self._on_event(fd, _pycurl.CSELECT_OUT)
>  File 
> "/usr/local/lib/python3.5/site-packages/kombu/asynchronous/http/curl.py", 
> line 124, in _on_event
>  self._process_pending_requests()
>  File 
> "/usr/local/lib/python3.5/site-packages/kombu/asynchronous/http/curl.py", 
> line 132, in _process_pending_requests
>  self._process(curl, errno, reason)
>  File 
> "/usr/local/lib/python3.5/site-packages/kombu/asynchronous/http/curl.py", 
> line 178, in _process
>  buffer=buffer, effective_url=effective_url, error=error,
>  File "/usr/local/lib/python3.5/site-packages/vine/promises.py", line 150, in 
> __call__
>  svpending(*ca, **ck)
>  File "/usr/local/lib/python3.5/site-packages/vine/promises.py", line 143, in 
> __call__
>  return self.throw()
>  File "/usr/local/lib/python3.5/site-packages/vine/promises.py", line 140, in 
> __call__
>  retval = fun(*final_args, **final_kwargs)
>  File "/usr/local/lib/python3.5/site-packages/vine/funtools.py", line 100, in 
> _transback
>  return callback(ret)
>  File "/usr/local/lib/python3.5/site-packages/vine/promises.py", line 143, in 
> __call__
>  return self.throw()
> File "/usr/local/lib/python3.5/site-packages/vine/promises.py", line 140, in 

[GitHub] r39132 commented on issue #1875: [AIRFLOW-620] Add log refresh button to TI's log view page

2018-08-30 Thread GitBox
r39132 commented on issue #1875: [AIRFLOW-620] Add log refresh button to TI's 
log view page
URL: 
https://github.com/apache/incubator-airflow/pull/1875#issuecomment-417429399
 
 
   @msumit What do you want to do with this PR? Close and re-open a fresh one 
or rebase and ask for a review. If I don't hear back in a few days, I'll close 
this.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] r39132 commented on issue #3821: [AIRFLOW-2983] Add prev_ds_nodash and next_ds_nodash macro

2018-08-30 Thread GitBox
r39132 commented on issue #3821: [AIRFLOW-2983] Add prev_ds_nodash and 
next_ds_nodash macro
URL: 
https://github.com/apache/incubator-airflow/pull/3821#issuecomment-417428702
 
 
   @feng-tao can you provide a description for your change to the JIRA? The 
problem statement should be apparent to anyone in the community. 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] xnuinside commented on a change in pull request #3733: [AIRFLOW-491] Add cache parameter in BigQuery query method - with 'api_resource_configs'

2018-08-30 Thread GitBox
xnuinside commented on a change in pull request #3733: [AIRFLOW-491] Add cache 
parameter in BigQuery query method - with 'api_resource_configs'
URL: https://github.com/apache/incubator-airflow/pull/3733#discussion_r213896240
 
 

 ##
 File path: airflow/contrib/hooks/bigquery_hook.py
 ##
 @@ -566,95 +612,108 @@ def run_query(self,
                           'Airflow.',
                           category=DeprecationWarning)
 
-        if sql is None:
-            raise TypeError('`BigQueryBaseCursor.run_query` missing 1 required '
-                            'positional argument: `sql`')
+        if not sql and not configuration['query'].get('query', None):
+            raise TypeError('`BigQueryBaseCursor.run_query` '
+                            'missing 1 required positional argument: `sql`')
+
+        # BigQuery also allows you to define how you want a table's schema
+        # to change as a side effect of a query job for more details:
+        # https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs#configuration.query.schemaUpdateOptions
 
-        # BigQuery also allows you to define how you want a table's schema to change
-        # as a side effect of a query job
-        # for more details:
-        #   https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs#configuration.query.schemaUpdateOptions
         allowed_schema_update_options = [
             'ALLOW_FIELD_ADDITION', "ALLOW_FIELD_RELAXATION"
         ]
-        if not set(allowed_schema_update_options).issuperset(
-                set(schema_update_options)):
-            raise ValueError(
-                "{0} contains invalid schema update options. "
-                "Please only use one or more of the following options: {1}"
-                .format(schema_update_options, allowed_schema_update_options))
 
-        if use_legacy_sql is None:
-            use_legacy_sql = self.use_legacy_sql
+        if not set(allowed_schema_update_options
+                   ).issuperset(set(schema_update_options)):
+            raise ValueError("{0} contains invalid schema update options. "
+                             "Please only use one or more of the following "
+                             "options: {1}"
+                             .format(schema_update_options,
+                                     allowed_schema_update_options))
 
-        configuration = {
-            'query': {
-                'query': sql,
-                'useLegacySql': use_legacy_sql,
-                'maximumBillingTier': maximum_billing_tier,
-                'maximumBytesBilled': maximum_bytes_billed,
-                'priority': priority
-            }
-        }
+        if schema_update_options:
+            if write_disposition not in ["WRITE_APPEND", "WRITE_TRUNCATE"]:
+                raise ValueError("schema_update_options is only "
+                                 "allowed if write_disposition is "
+                                 "'WRITE_APPEND' or 'WRITE_TRUNCATE'.")
 
         if destination_dataset_table:
-            if '.' not in destination_dataset_table:
-                raise ValueError(
-                    'Expected destination_dataset_table name in the format of '
-                    '<dataset>.<table>. Got: {}'.format(
-                        destination_dataset_table))
             destination_project, destination_dataset, destination_table = \
                 _split_tablename(table_input=destination_dataset_table,
                                  default_project_id=self.project_id)
-            configuration['query'].update({
-                'allowLargeResults': allow_large_results,
-                'flattenResults': flatten_results,
-                'writeDisposition': write_disposition,
-                'createDisposition': create_disposition,
-                'destinationTable': {
-                    'projectId': destination_project,
-                    'datasetId': destination_dataset,
-                    'tableId': destination_table,
-                }
-            })
-        if udf_config:
-            if not isinstance(udf_config, list):
-                raise TypeError("udf_config argument must have a type 'list'"
-                                " not {}".format(type(udf_config)))
-            configuration['query'].update({
-                'userDefinedFunctionResources': udf_config
-            })
 
-        if query_params:
-            if self.use_legacy_sql:
-                raise ValueError("Query parameters are not allowed when using "
-                                 "legacy SQL")
-            else:
-                configuration['query']['queryParameters'] = query_params
+            destination_dataset_table = {
+                'projectId': destination_project,
+                'datasetId': destination_dataset,
+                'tableId': destination_table,
+            }
 
-        if labels:
-            configuration['labels'] = labels
+        query_param_list = [
+            (sql, 

[jira] [Commented] (AIRFLOW-491) Add cache parameter in BigQuery query method

2018-08-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16597733#comment-16597733
 ] 

ASF GitHub Bot commented on AIRFLOW-491:


xnuinside opened a new pull request #3733: [AIRFLOW-491] Add cache parameter in 
BigQuery query method - with 'api_resource_configs'
URL: https://github.com/apache/incubator-airflow/pull/3733
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [x] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-491
 - In case you are fixing a typo in the documentation you can prepend your 
commit with \[AIRFLOW-XXX\], code changes always need a Jira issue.
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI 
changes:
   
   Added "useQueryCache" from job BQ configuration 
https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/query 
   
   ### Tests
   
   - [x] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   ### Commits
   
   - [x] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [x] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - When adding new operators/hooks/sensors, the autoclass documentation 
generation needs to be added.
   
   ### Code Quality
   
   - [x] Passes `git diff upstream/master -u -- "*.py" | flake8 --diff`
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add cache parameter in BigQuery query method
> 
>
> Key: AIRFLOW-491
> URL: https://issues.apache.org/jira/browse/AIRFLOW-491
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: contrib, gcp
>Affects Versions: Airflow 1.7.1
>Reporter: Chris Riccomini
>Assignee: Iuliia Volkova
>Priority: Major
> Fix For: Airflow 1.8
>
>
> The current BigQuery query() method does not have a user_query_cache 
> parameter. This param always defaults to true (see 
> [here|https://cloud.google.com/bigquery/docs/reference/v2/jobs#configuration.query]).
>  I'd like to disable query caching for some data consistency checks.
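
Based on this PR's api_resource_configs parameter, usage might look like the 
following sketch (the surrounding hook setup is illustrative):

{code:python}
from airflow.contrib.hooks.bigquery_hook import BigQueryHook

hook = BigQueryHook(bigquery_conn_id='bigquery_default', use_legacy_sql=False)
cursor = hook.get_conn().cursor()
cursor.run_query(
    sql='SELECT COUNT(*) FROM `my-project.my_dataset.my_table`',
    api_resource_configs={'query': {'useQueryCache': False}},  # skip cached results
)
{code}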



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] xnuinside opened a new pull request #3733: [AIRFLOW-491] Add cache parameter in BigQuery query method - with 'api_resource_configs'

2018-08-30 Thread GitBox
xnuinside opened a new pull request #3733: [AIRFLOW-491] Add cache parameter in 
BigQuery query method - with 'api_resource_configs'
URL: https://github.com/apache/incubator-airflow/pull/3733
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [x] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-491
 - In case you are fixing a typo in the documentation you can prepend your 
commit with \[AIRFLOW-XXX\], code changes always need a Jira issue.
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI 
changes:
   
   Added "useQueryCache" from job BQ configuration 
https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/query 
   
   ### Tests
   
   - [x] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   ### Commits
   
   - [x] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [x] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - When adding new operators/hooks/sensors, the autoclass documentation 
generation needs to be added.
   
   ### Code Quality
   
   - [x] Passes `git diff upstream/master -u -- "*.py" | flake8 --diff`
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-491) Add cache parameter in BigQuery query method

2018-08-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16597730#comment-16597730
 ] 

ASF GitHub Bot commented on AIRFLOW-491:


xnuinside closed pull request #3733: [AIRFLOW-491] Add cache parameter in 
BigQuery query method - with 'api_resource_configs'
URL: https://github.com/apache/incubator-airflow/pull/3733
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/airflow/contrib/hooks/bigquery_hook.py b/airflow/contrib/hooks/bigquery_hook.py
index e4c0653bfe..e4957b3831 100644
--- a/airflow/contrib/hooks/bigquery_hook.py
+++ b/airflow/contrib/hooks/bigquery_hook.py
@@ -24,6 +24,7 @@
 
 import time
 from builtins import range
+from copy import deepcopy
 
 from past.builtins import basestring
 
@@ -195,10 +196,19 @@ class BigQueryBaseCursor(LoggingMixin):
     PEP 249 cursor isn't needed.
     """
 
-    def __init__(self, service, project_id, use_legacy_sql=True):
+    def __init__(self,
+                 service,
+                 project_id,
+                 use_legacy_sql=True,
+                 api_resource_configs=None):
+
         self.service = service
         self.project_id = project_id
         self.use_legacy_sql = use_legacy_sql
+        if api_resource_configs:
+            _validate_value("api_resource_configs", api_resource_configs, dict)
+        self.api_resource_configs = api_resource_configs \
+            if api_resource_configs else {}
         self.running_job_id = None
 
     def create_empty_table(self,
@@ -238,8 +248,7 @@ def create_empty_table(self,
 
         :return:
         """
-        if time_partitioning is None:
-            time_partitioning = dict()
+
         project_id = project_id if project_id is not None else self.project_id
 
         table_resource = {
@@ -473,11 +482,11 @@ def create_external_table(self,
     def run_query(self,
                   bql=None,
                   sql=None,
-                  destination_dataset_table=False,
+                  destination_dataset_table=None,
                   write_disposition='WRITE_EMPTY',
                   allow_large_results=False,
                   flatten_results=False,
-                  udf_config=False,
+                  udf_config=None,
                   use_legacy_sql=None,
                   maximum_billing_tier=None,
                   maximum_bytes_billed=None,
@@ -486,7 +495,8 @@ def run_query(self,
                   labels=None,
                   schema_update_options=(),
                   priority='INTERACTIVE',
-                  time_partitioning=None):
+                  time_partitioning=None,
+                  api_resource_configs=None):
         """
         Executes a BigQuery SQL query. Optionally persists results in a BigQuery
         table. See here:
@@ -550,12 +560,22 @@ def run_query(self,
         :type time_partitioning: dict
 
         """
+        if not api_resource_configs:
+            api_resource_configs = self.api_resource_configs
+        else:
+            _validate_value('api_resource_configs',
+                            api_resource_configs, dict)
+        configuration = deepcopy(api_resource_configs)
+        if 'query' not in configuration:
+            configuration['query'] = {}
+
+        else:
+            _validate_value("api_resource_configs['query']",
+                            configuration['query'], dict)
 
-        # TODO remove `bql` in Airflow 2.0 - Jira: [AIRFLOW-2513]
-        if time_partitioning is None:
-            time_partitioning = {}
         sql = bql if sql is None else sql
 
+        # TODO remove `bql` in Airflow 2.0 - Jira: [AIRFLOW-2513]
         if bql:
             import warnings
             warnings.warn('Deprecated parameter `bql` used in '
@@ -566,95 +586,109 @@ def run_query(self,
                           'Airflow.',
                           category=DeprecationWarning)
 
-        if sql is None:
-            raise TypeError('`BigQueryBaseCursor.run_query` missing 1 required '
-                            'positional argument: `sql`')
+        if sql is None and not configuration['query'].get('query', None):
+            raise TypeError('`BigQueryBaseCursor.run_query` '
+                            'missing 1 required positional argument: `sql`')
 
         # BigQuery also allows you to define how you want a table's schema to change
         # as a side effect of a query job
         # for more details:
         #   https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs#configuration.query.schemaUpdateOptions
+
         allowed_schema_update_options = [
             'ALLOW_FIELD_ADDITION', "ALLOW_FIELD_RELAXATION"
         ]
-        if not 

[GitHub] xnuinside closed pull request #3733: [AIRFLOW-491] Add cache parameter in BigQuery query method - with 'api_resource_configs'

2018-08-30 Thread GitBox
xnuinside closed pull request #3733: [AIRFLOW-491] Add cache parameter in 
BigQuery query method - with 'api_resource_configs'
URL: https://github.com/apache/incubator-airflow/pull/3733
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/airflow/contrib/hooks/bigquery_hook.py b/airflow/contrib/hooks/bigquery_hook.py
index e4c0653bfe..e4957b3831 100644
--- a/airflow/contrib/hooks/bigquery_hook.py
+++ b/airflow/contrib/hooks/bigquery_hook.py
@@ -24,6 +24,7 @@

 import time
 from builtins import range
+from copy import deepcopy

 from past.builtins import basestring

@@ -195,10 +196,19 @@ class BigQueryBaseCursor(LoggingMixin):
     PEP 249 cursor isn't needed.
     """

-    def __init__(self, service, project_id, use_legacy_sql=True):
+    def __init__(self,
+                 service,
+                 project_id,
+                 use_legacy_sql=True,
+                 api_resource_configs=None):
+
         self.service = service
         self.project_id = project_id
         self.use_legacy_sql = use_legacy_sql
+        if api_resource_configs:
+            _validate_value("api_resource_configs", api_resource_configs, dict)
+        self.api_resource_configs = api_resource_configs \
+            if api_resource_configs else {}
         self.running_job_id = None

     def create_empty_table(self,
@@ -238,8 +248,7 @@ def create_empty_table(self,

         :return:
         """
-        if time_partitioning is None:
-            time_partitioning = dict()
+
         project_id = project_id if project_id is not None else self.project_id

         table_resource = {
@@ -473,11 +482,11 @@ def create_external_table(self,
     def run_query(self,
                   bql=None,
                   sql=None,
-                  destination_dataset_table=False,
+                  destination_dataset_table=None,
                   write_disposition='WRITE_EMPTY',
                   allow_large_results=False,
                   flatten_results=False,
-                  udf_config=False,
+                  udf_config=None,
                   use_legacy_sql=None,
                   maximum_billing_tier=None,
                   maximum_bytes_billed=None,
@@ -486,7 +495,8 @@ def run_query(self,
                   labels=None,
                   schema_update_options=(),
                   priority='INTERACTIVE',
-                  time_partitioning=None):
+                  time_partitioning=None,
+                  api_resource_configs=None):
         """
         Executes a BigQuery SQL query. Optionally persists results in a BigQuery
         table. See here:
@@ -550,12 +560,22 @@ def run_query(self,
         :type time_partitioning: dict

         """
+        if not api_resource_configs:
+            api_resource_configs = self.api_resource_configs
+        else:
+            _validate_value('api_resource_configs',
+                            api_resource_configs, dict)
+        configuration = deepcopy(api_resource_configs)
+        if 'query' not in configuration:
+            configuration['query'] = {}
+
+        else:
+            _validate_value("api_resource_configs['query']",
+                            configuration['query'], dict)

-        # TODO remove `bql` in Airflow 2.0 - Jira: [AIRFLOW-2513]
-        if time_partitioning is None:
-            time_partitioning = {}
         sql = bql if sql is None else sql

+        # TODO remove `bql` in Airflow 2.0 - Jira: [AIRFLOW-2513]
         if bql:
             import warnings
             warnings.warn('Deprecated parameter `bql` used in '
@@ -566,95 +586,109 @@ def run_query(self,
                           'Airflow.',
                           category=DeprecationWarning)

-        if sql is None:
-            raise TypeError('`BigQueryBaseCursor.run_query` missing 1 required '
-                            'positional argument: `sql`')
+        if sql is None and not configuration['query'].get('query', None):
+            raise TypeError('`BigQueryBaseCursor.run_query` '
+                            'missing 1 required positional argument: `sql`')

         # BigQuery also allows you to define how you want a table's schema to change
         # as a side effect of a query job
         # for more details:
         #   https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs#configuration.query.schemaUpdateOptions
+
         allowed_schema_update_options = [
             'ALLOW_FIELD_ADDITION', "ALLOW_FIELD_RELAXATION"
         ]
-        if not set(allowed_schema_update_options).issuperset(
-                set(schema_update_options)):
-            raise ValueError(
-                "{0} contains invalid schema update options. "
-                "Please only use one or more of the following options: 

[GitHub] tedmiston edited a comment on issue #3656: [AIRFLOW-2803] Fix all ESLint issues

2018-08-30 Thread GitBox
tedmiston edited a comment on issue #3656: [AIRFLOW-2803] Fix all ESLint issues
URL: 
https://github.com/apache/incubator-airflow/pull/3656#issuecomment-417341304
 
 
   @r39132 Yes, I've been working on the revisions discussed above and 
anticipate pushing up the next pass later today or tomorrow.
   
   Edit: The `npm run lint` command is still valid.  I've updated CONTRIBUTING.md 
locally with a note on a workaround for using `npm run lint:fix` with the 
Jinja plugin.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Created] (AIRFLOW-2987) "About" page version info is not available

2018-08-30 Thread Frank Maritato (JIRA)
Frank Maritato created AIRFLOW-2987:
---

 Summary: "About" page version info is not available
 Key: AIRFLOW-2987
 URL: https://issues.apache.org/jira/browse/AIRFLOW-2987
 Project: Apache Airflow
  Issue Type: Bug
Affects Versions: 1.10.0
Reporter: Frank Maritato
 Attachments: Screen Shot 2018-08-30 at 10.17.52 AM.png

From the Airflow 1.10.0 UI, click About and the resulting page shows Version 
and Git Version as "Not available".

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] feng-tao edited a comment on issue #3823: [AIRFLOW-2985] An operator for S3 object copying

2018-08-30 Thread GitBox
feng-tao edited a comment on issue #3823: [AIRFLOW-2985] An operator for S3 
object copying
URL: 
https://github.com/apache/incubator-airflow/pull/3823#issuecomment-417384630
 
 
   Thanks for the info. Then I think we should add a TODO / note in the comments 
about the limitation on cross-bucket copying.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] feng-tao commented on issue #3823: [AIRFLOW-2985] An operator for S3 object copying

2018-08-30 Thread GitBox
feng-tao commented on issue #3823: [AIRFLOW-2985] An operator for S3 object 
copying
URL: 
https://github.com/apache/incubator-airflow/pull/3823#issuecomment-417384630
 
 
   Thanks for the info. Then I think we should mention a TODO / note in the 
code about the limitation on cross-bucket copying.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-2986) Airflow Worker does not reach sqs

2018-08-30 Thread Shivakumar Gopalakrishnan (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16597631#comment-16597631
 ] 

Shivakumar Gopalakrishnan commented on AIRFLOW-2986:


curl https://eu-west-1.queue.amazonaws.com

It reaches the environment.

In fact, I have assigned IAM roles to the machine and I am able to read and write.

 

> Airflow Worker does not reach sqs
> -
>
> Key: AIRFLOW-2986
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2986
> Project: Apache Airflow
>  Issue Type: Bug
> Environment: amazon linux
>Reporter: Shivakumar Gopalakrishnan
>Priority: Major
>
> I am running the Airflow worker service. The service is not able to connect 
> to SQS.
> The scheduler is able to reach and write to the queue.
> Proxies are fine; I have tried this with both Python 2.7 and 3.5; same 
> issue.
> Copy of the log is below.
> {code}
> starting airflow-worker...
> /data/share/airflow
> /data/share/airflow/airflow.cfg
> [2018-08-30 15:41:44,367] \{settings.py:146} DEBUG - Setting up DB connection 
> pool (PID 12304)
> [2018-08-30 15:41:44,367] \{settings.py:174} INFO - setting.configure_orm(): 
> Using pool settings. pool_size=5, pool_recycle=1800
> [2018-08-30 15:41:44,468] \{__init__.py:42} DEBUG - Cannot import due to 
> doesn't look like a module path
> [2018-08-30 15:41:44,875] \{__init__.py:51} INFO - Using executor 
> CeleryExecutor
> [2018-08-30 15:41:44,886] \{cli_action_loggers.py:40} DEBUG - Adding 
>  to pre execution callback
> [2018-08-30 15:41:44,995] \{cli_action_loggers.py:64} DEBUG - Calling 
> callbacks: []
> [2018-08-30 15:41:45,768] \{settings.py:146} DEBUG - Setting up DB connection 
> pool (PID 12308)
> [2018-08-30 15:41:45,768] \{settings.py:174} INFO - setting.configure_orm(): 
> Using pool settings. pool_size=5, pool_recycle=1800
> [2018-08-30 15:41:45,883] \{__init__.py:42} DEBUG - Cannot import due to 
> doesn't look like a module path
> [2018-08-30 15:41:46,345] \{__init__.py:51} INFO - Using executor 
> CeleryExecutor
> [2018-08-30 15:41:46,358] \{cli_action_loggers.py:40} DEBUG - Adding 
>  to pre execution callback
> [2018-08-30 15:41:46,476] \{cli_action_loggers.py:64} DEBUG - Calling 
> callbacks: []
> Starting flask
> [2018-08-30 15:41:46,519] \{_internal.py:88} INFO - * Running on 
> http://0.0.0.0:8793/ (Press CTRL+C to quit)
> [2018-08-30 15:43:58,779: CRITICAL/MainProcess] Unrecoverable error: 
> Exception('Request Empty body HTTP 599 Failed to connect to 
> eu-west-1.queue.amazonaws.com port 443: Connection timed out (None)',)
> Traceback (most recent call last):
>  File "/usr/local/lib/python3.5/site-packages/celery/worker/worker.py", line 
> 207, in start
>  self.blueprint.start(self)
>  File "/usr/local/lib/python3.5/site-packages/celery/bootsteps.py", line 119, 
> in start
>  step.start(parent)
>  File "/usr/local/lib/python3.5/site-packages/celery/bootsteps.py", line 370, 
> in start
>  return self.obj.start()
>  File 
> "/usr/local/lib/python3.5/site-packages/celery/worker/consumer/consumer.py", 
> line 316, in start
>  blueprint.start(self)
>  File "/usr/local/lib/python3.5/site-packages/celery/bootsteps.py", line 119, 
> in start
>  step.start(parent)
>  File 
> "/usr/local/lib/python3.5/site-packages/celery/worker/consumer/consumer.py", 
> line 592, in start
>  c.loop(*c.loop_args())
>  File "/usr/local/lib/python3.5/site-packages/celery/worker/loops.py", line 
> 91, in asynloop
>  next(loop)
>  File "/usr/local/lib/python3.5/site-packages/kombu/asynchronous/hub.py", 
> line 354, in create_loop
>  cb(*cbargs)
>  File 
> "/usr/local/lib/python3.5/site-packages/kombu/asynchronous/http/curl.py", 
> line 114, in on_writable
>  return self._on_event(fd, _pycurl.CSELECT_OUT)
>  File 
> "/usr/local/lib/python3.5/site-packages/kombu/asynchronous/http/curl.py", 
> line 124, in _on_event
>  self._process_pending_requests()
>  File 
> "/usr/local/lib/python3.5/site-packages/kombu/asynchronous/http/curl.py", 
> line 132, in _process_pending_requests
>  self._process(curl, errno, reason)
>  File 
> "/usr/local/lib/python3.5/site-packages/kombu/asynchronous/http/curl.py", 
> line 178, in _process
>  buffer=buffer, effective_url=effective_url, error=error,
>  File "/usr/local/lib/python3.5/site-packages/vine/promises.py", line 150, in 
> __call__
>  svpending(*ca, **ck)
>  File "/usr/local/lib/python3.5/site-packages/vine/promises.py", line 143, in 
> __call__
>  return self.throw()
>  File "/usr/local/lib/python3.5/site-packages/vine/promises.py", line 140, in 
> __call__
>  retval = fun(*final_args, **final_kwargs)
>  File "/usr/local/lib/python3.5/site-packages/vine/funtools.py", line 100, in 
> _transback
>  return callback(ret)
>  File "/usr/local/lib/python3.5/site-packages/vine/promises.py", line 143, in 
> __call__
>  return self.throw()
> File 

[jira] [Commented] (AIRFLOW-2986) Airflow Worker does not reach sqs

2018-08-30 Thread Ash Berlin-Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16597626#comment-16597626
 ] 

Ash Berlin-Taylor commented on AIRFLOW-2986:


This sounds like a problem with the AWS networking for your worker instances. 
If you SSH to the instance can you run {{curl 
https://eu-west-1.queue.amazonaws.com}}?
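
A minimal check along the same lines from Python, assuming boto3 and the 
instance's IAM credentials are available; the region matches the error above, 
but the call itself is only an example:

{code:python}
import boto3

# If HTTPS egress to SQS is blocked, this times out exactly like the Celery
# transport does; if networking is fine, it returns the queue listing.
sqs = boto3.client('sqs', region_name='eu-west-1')
print(sqs.list_queues())
{code}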

> Airflow Worker does not reach sqs
> -
>
> Key: AIRFLOW-2986
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2986
> Project: Apache Airflow
>  Issue Type: Bug
> Environment: amazon linux
>Reporter: Shivakumar Gopalakrishnan
>Priority: Major
>
> I am running the Airflow worker service. The service is not able to connect 
> to SQS.
> The scheduler is able to reach and write to the queue.
> Proxies are fine; I have tried this with both Python 2.7 and 3.5; same 
> issue.
> Copy of the log is below.
> {code}
> starting airflow-worker...
> /data/share/airflow
> /data/share/airflow/airflow.cfg
> [2018-08-30 15:41:44,367] \{settings.py:146} DEBUG - Setting up DB connection 
> pool (PID 12304)
> [2018-08-30 15:41:44,367] \{settings.py:174} INFO - setting.configure_orm(): 
> Using pool settings. pool_size=5, pool_recycle=1800
> [2018-08-30 15:41:44,468] \{__init__.py:42} DEBUG - Cannot import due to 
> doesn't look like a module path
> [2018-08-30 15:41:44,875] \{__init__.py:51} INFO - Using executor 
> CeleryExecutor
> [2018-08-30 15:41:44,886] \{cli_action_loggers.py:40} DEBUG - Adding 
>  to pre execution callback
> [2018-08-30 15:41:44,995] \{cli_action_loggers.py:64} DEBUG - Calling 
> callbacks: []
> [2018-08-30 15:41:45,768] \{settings.py:146} DEBUG - Setting up DB connection 
> pool (PID 12308)
> [2018-08-30 15:41:45,768] \{settings.py:174} INFO - setting.configure_orm(): 
> Using pool settings. pool_size=5, pool_recycle=1800
> [2018-08-30 15:41:45,883] \{__init__.py:42} DEBUG - Cannot import due to 
> doesn't look like a module path
> [2018-08-30 15:41:46,345] \{__init__.py:51} INFO - Using executor 
> CeleryExecutor
> [2018-08-30 15:41:46,358] \{cli_action_loggers.py:40} DEBUG - Adding 
>  to pre execution callback
> [2018-08-30 15:41:46,476] \{cli_action_loggers.py:64} DEBUG - Calling 
> callbacks: []
> Starting flask
> [2018-08-30 15:41:46,519] \{_internal.py:88} INFO - * Running on 
> http://0.0.0.0:8793/ (Press CTRL+C to quit)
> [2018-08-30 15:43:58,779: CRITICAL/MainProcess] Unrecoverable error: 
> Exception('Request Empty body HTTP 599 Failed to connect to 
> eu-west-1.queue.amazonaws.com port 443: Connection timed out (None)',)
> Traceback (most recent call last):
>  File "/usr/local/lib/python3.5/site-packages/celery/worker/worker.py", line 
> 207, in start
>  self.blueprint.start(self)
>  File "/usr/local/lib/python3.5/site-packages/celery/bootsteps.py", line 119, 
> in start
>  step.start(parent)
>  File "/usr/local/lib/python3.5/site-packages/celery/bootsteps.py", line 370, 
> in start
>  return self.obj.start()
>  File 
> "/usr/local/lib/python3.5/site-packages/celery/worker/consumer/consumer.py", 
> line 316, in start
>  blueprint.start(self)
>  File "/usr/local/lib/python3.5/site-packages/celery/bootsteps.py", line 119, 
> in start
>  step.start(parent)
>  File 
> "/usr/local/lib/python3.5/site-packages/celery/worker/consumer/consumer.py", 
> line 592, in start
>  c.loop(*c.loop_args())
>  File "/usr/local/lib/python3.5/site-packages/celery/worker/loops.py", line 
> 91, in asynloop
>  next(loop)
>  File "/usr/local/lib/python3.5/site-packages/kombu/asynchronous/hub.py", 
> line 354, in create_loop
>  cb(*cbargs)
>  File 
> "/usr/local/lib/python3.5/site-packages/kombu/asynchronous/http/curl.py", 
> line 114, in on_writable
>  return self._on_event(fd, _pycurl.CSELECT_OUT)
>  File 
> "/usr/local/lib/python3.5/site-packages/kombu/asynchronous/http/curl.py", 
> line 124, in _on_event
>  self._process_pending_requests()
>  File 
> "/usr/local/lib/python3.5/site-packages/kombu/asynchronous/http/curl.py", 
> line 132, in _process_pending_requests
>  self._process(curl, errno, reason)
>  File 
> "/usr/local/lib/python3.5/site-packages/kombu/asynchronous/http/curl.py", 
> line 178, in _process
>  buffer=buffer, effective_url=effective_url, error=error,
>  File "/usr/local/lib/python3.5/site-packages/vine/promises.py", line 150, in 
> __call__
>  svpending(*ca, **ck)
>  File "/usr/local/lib/python3.5/site-packages/vine/promises.py", line 143, in 
> __call__
>  return self.throw()
>  File "/usr/local/lib/python3.5/site-packages/vine/promises.py", line 140, in 
> __call__
>  retval = fun(*final_args, **final_kwargs)
>  File "/usr/local/lib/python3.5/site-packages/vine/funtools.py", line 100, in 
> _transback
>  return callback(ret)
>  File "/usr/local/lib/python3.5/site-packages/vine/promises.py", line 143, in 
> __call__
>  return self.throw()
> File 

[jira] [Updated] (AIRFLOW-2986) Airflow Worker does not reach sqs

2018-08-30 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor updated AIRFLOW-2986:
---
Description: 
I am running the Airflow worker service. The service is not able to connect to 
SQS.

The scheduler is able to reach and write to the queue.

Proxies are fine; I have tried this with both Python 2.7 and 3.5; same issue.

Copy of the log is below.

{code}
starting airflow-worker...
/data/share/airflow
/data/share/airflow/airflow.cfg
[2018-08-30 15:41:44,367] \{settings.py:146} DEBUG - Setting up DB connection 
pool (PID 12304)
[2018-08-30 15:41:44,367] \{settings.py:174} INFO - setting.configure_orm(): 
Using pool settings. pool_size=5, pool_recycle=1800
[2018-08-30 15:41:44,468] \{__init__.py:42} DEBUG - Cannot import due to 
doesn't look like a module path
[2018-08-30 15:41:44,875] \{__init__.py:51} INFO - Using executor CeleryExecutor
[2018-08-30 15:41:44,886] \{cli_action_loggers.py:40} DEBUG - Adding  to pre execution callback
[2018-08-30 15:41:44,995] \{cli_action_loggers.py:64} DEBUG - Calling 
callbacks: []
[2018-08-30 15:41:45,768] \{settings.py:146} DEBUG - Setting up DB connection 
pool (PID 12308)
[2018-08-30 15:41:45,768] \{settings.py:174} INFO - setting.configure_orm(): 
Using pool settings. pool_size=5, pool_recycle=1800
[2018-08-30 15:41:45,883] \{__init__.py:42} DEBUG - Cannot import due to 
doesn't look like a module path
[2018-08-30 15:41:46,345] \{__init__.py:51} INFO - Using executor CeleryExecutor
[2018-08-30 15:41:46,358] \{cli_action_loggers.py:40} DEBUG - Adding  to pre execution callback
[2018-08-30 15:41:46,476] \{cli_action_loggers.py:64} DEBUG - Calling 
callbacks: []
Starting flask
[2018-08-30 15:41:46,519] \{_internal.py:88} INFO - * Running on 
http://0.0.0.0:8793/ (Press CTRL+C to quit)
[2018-08-30 15:43:58,779: CRITICAL/MainProcess] Unrecoverable error: 
Exception('Request Empty body HTTP 599 Failed to connect to 
eu-west-1.queue.amazonaws.com port 443: Connection timed out (None)',)
Traceback (most recent call last):
 File "/usr/local/lib/python3.5/site-packages/celery/worker/worker.py", line 
207, in start
 self.blueprint.start(self)
 File "/usr/local/lib/python3.5/site-packages/celery/bootsteps.py", line 119, 
in start
 step.start(parent)
 File "/usr/local/lib/python3.5/site-packages/celery/bootsteps.py", line 370, 
in start
 return self.obj.start()
 File 
"/usr/local/lib/python3.5/site-packages/celery/worker/consumer/consumer.py", 
line 316, in start
 blueprint.start(self)
 File "/usr/local/lib/python3.5/site-packages/celery/bootsteps.py", line 119, 
in start
 step.start(parent)
 File 
"/usr/local/lib/python3.5/site-packages/celery/worker/consumer/consumer.py", 
line 592, in start
 c.loop(*c.loop_args())
 File "/usr/local/lib/python3.5/site-packages/celery/worker/loops.py", line 91, 
in asynloop
 next(loop)
 File "/usr/local/lib/python3.5/site-packages/kombu/asynchronous/hub.py", line 
354, in create_loop
 cb(*cbargs)
 File "/usr/local/lib/python3.5/site-packages/kombu/asynchronous/http/curl.py", 
line 114, in on_writable
 return self._on_event(fd, _pycurl.CSELECT_OUT)
 File "/usr/local/lib/python3.5/site-packages/kombu/asynchronous/http/curl.py", 
line 124, in _on_event
 self._process_pending_requests()
 File "/usr/local/lib/python3.5/site-packages/kombu/asynchronous/http/curl.py", 
line 132, in _process_pending_requests
 self._process(curl, errno, reason)
 File "/usr/local/lib/python3.5/site-packages/kombu/asynchronous/http/curl.py", 
line 178, in _process
 buffer=buffer, effective_url=effective_url, error=error,
 File "/usr/local/lib/python3.5/site-packages/vine/promises.py", line 150, in 
__call__
 svpending(*ca, **ck)
 File "/usr/local/lib/python3.5/site-packages/vine/promises.py", line 143, in 
__call__
 return self.throw()
 File "/usr/local/lib/python3.5/site-packages/vine/promises.py", line 140, in 
__call__
 retval = fun(*final_args, **final_kwargs)
 File "/usr/local/lib/python3.5/site-packages/vine/funtools.py", line 100, in 
_transback
 return callback(ret)
 File "/usr/local/lib/python3.5/site-packages/vine/promises.py", line 143, in 
__call__
 return self.throw()

File "/usr/local/lib/python3.5/site-packages/vine/promises.py", line 140, in 
__call__
 retval = fun(*final_args, **final_kwargs)
 File "/usr/local/lib/python3.5/site-packages/vine/funtools.py", line 98, in 
_transback
 callback.throw()
 File "/usr/local/lib/python3.5/site-packages/vine/funtools.py", line 96, in 
_transback
 ret = filter_(*args + (ret,), **kwargs)
 File 
"/usr/local/lib/python3.5/site-packages/kombu/asynchronous/aws/connection.py", 
line 233, in _on_list_ready
 raise self._for_status(response, response.read())
Exception: Request Empty body HTTP 599 Failed to connect to 
eu-west-1.queue.amazonaws.com port 443: Connection timed out (None)

-- celery@ip-10-92-19-197 v4.1.1 (latentcall)

[jira] [Created] (AIRFLOW-2986) Airflow Worker does not reach sqs

2018-08-30 Thread Shivakumar Gopalakrishnan (JIRA)
Shivakumar Gopalakrishnan created AIRFLOW-2986:
--

 Summary: Airflow Worker does not reach sqs
 Key: AIRFLOW-2986
 URL: https://issues.apache.org/jira/browse/AIRFLOW-2986
 Project: Apache Airflow
  Issue Type: Bug
 Environment: amazon linux
Reporter: Shivakumar Gopalakrishnan


I am running the Airflow worker service. The service is not able to connect to 
SQS.

The scheduler is able to reach and write to the queue.

Proxies are fine; I have tried this with both Python 2.7 and 3.5; same issue.

Copy of the log is below.

starting airflow-worker...
/data/share/airflow
/data/share/airflow/airflow.cfg
[2018-08-30 15:41:44,367] \{settings.py:146} DEBUG - Setting up DB connection 
pool (PID 12304)
[2018-08-30 15:41:44,367] \{settings.py:174} INFO - setting.configure_orm(): 
Using pool settings. pool_size=5, pool_recycle=1800
[2018-08-30 15:41:44,468] \{__init__.py:42} DEBUG - Cannot import due to 
doesn't look like a module path
[2018-08-30 15:41:44,875] \{__init__.py:51} INFO - Using executor CeleryExecutor
[2018-08-30 15:41:44,886] \{cli_action_loggers.py:40} DEBUG - Adding  to pre execution callback
[2018-08-30 15:41:44,995] \{cli_action_loggers.py:64} DEBUG - Calling 
callbacks: []
[2018-08-30 15:41:45,768] \{settings.py:146} DEBUG - Setting up DB connection 
pool (PID 12308)
[2018-08-30 15:41:45,768] \{settings.py:174} INFO - setting.configure_orm(): 
Using pool settings. pool_size=5, pool_recycle=1800
[2018-08-30 15:41:45,883] \{__init__.py:42} DEBUG - Cannot import due to 
doesn't look like a module path
[2018-08-30 15:41:46,345] \{__init__.py:51} INFO - Using executor CeleryExecutor
[2018-08-30 15:41:46,358] \{cli_action_loggers.py:40} DEBUG - Adding  to pre execution callback
[2018-08-30 15:41:46,476] \{cli_action_loggers.py:64} DEBUG - Calling 
callbacks: []
Starting flask
[2018-08-30 15:41:46,519] \{_internal.py:88} INFO - * Running on 
http://0.0.0.0:8793/ (Press CTRL+C to quit)
[2018-08-30 15:43:58,779: CRITICAL/MainProcess] Unrecoverable error: 
Exception('Request Empty body HTTP 599 Failed to connect to 
eu-west-1.queue.amazonaws.com port 443: Connection timed out (None)',)
Traceback (most recent call last):
 File "/usr/local/lib/python3.5/site-packages/celery/worker/worker.py", line 
207, in start
 self.blueprint.start(self)
 File "/usr/local/lib/python3.5/site-packages/celery/bootsteps.py", line 119, 
in start
 step.start(parent)
 File "/usr/local/lib/python3.5/site-packages/celery/bootsteps.py", line 370, 
in start
 return self.obj.start()
 File 
"/usr/local/lib/python3.5/site-packages/celery/worker/consumer/consumer.py", 
line 316, in start
 blueprint.start(self)
 File "/usr/local/lib/python3.5/site-packages/celery/bootsteps.py", line 119, 
in start
 step.start(parent)
 File 
"/usr/local/lib/python3.5/site-packages/celery/worker/consumer/consumer.py", 
line 592, in start
 c.loop(*c.loop_args())
 File "/usr/local/lib/python3.5/site-packages/celery/worker/loops.py", line 91, 
in asynloop
 next(loop)
 File "/usr/local/lib/python3.5/site-packages/kombu/asynchronous/hub.py", line 
354, in create_loop
 cb(*cbargs)
 File "/usr/local/lib/python3.5/site-packages/kombu/asynchronous/http/curl.py", 
line 114, in on_writable
 return self._on_event(fd, _pycurl.CSELECT_OUT)
 File "/usr/local/lib/python3.5/site-packages/kombu/asynchronous/http/curl.py", 
line 124, in _on_event
 self._process_pending_requests()
 File "/usr/local/lib/python3.5/site-packages/kombu/asynchronous/http/curl.py", 
line 132, in _process_pending_requests
 self._process(curl, errno, reason)
 File "/usr/local/lib/python3.5/site-packages/kombu/asynchronous/http/curl.py", 
line 178, in _process
 buffer=buffer, effective_url=effective_url, error=error,
 File "/usr/local/lib/python3.5/site-packages/vine/promises.py", line 150, in 
__call__
 svpending(*ca, **ck)
 File "/usr/local/lib/python3.5/site-packages/vine/promises.py", line 143, in 
__call__
 return self.throw()
 File "/usr/local/lib/python3.5/site-packages/vine/promises.py", line 140, in 
__call__
 retval = fun(*final_args, **final_kwargs)
 File "/usr/local/lib/python3.5/site-packages/vine/funtools.py", line 100, in 
_transback
 return callback(ret)
 File "/usr/local/lib/python3.5/site-packages/vine/promises.py", line 143, in 
__call__
 return self.throw()

File "/usr/local/lib/python3.5/site-packages/vine/promises.py", line 140, in 
__call__
 retval = fun(*final_args, **final_kwargs)
 File "/usr/local/lib/python3.5/site-packages/vine/funtools.py", line 98, in 
_transback
 callback.throw()
 File "/usr/local/lib/python3.5/site-packages/vine/funtools.py", line 96, in 
_transback
 ret = filter_(*args + (ret,), **kwargs)
 File 
"/usr/local/lib/python3.5/site-packages/kombu/asynchronous/aws/connection.py", 
line 233, in _on_list_ready
 raise self._for_status(response, response.read())
Exception: Request Empty body HTTP 599 Failed to connect to 

[GitHub] XD-DENG commented on issue #3823: [AIRFLOW-2985] An operator for S3 object copying

2018-08-30 Thread GitBox
XD-DENG commented on issue #3823: [AIRFLOW-2985] An operator for S3 object 
copying
URL: 
https://github.com/apache/incubator-airflow/pull/3823#issuecomment-417363095
 
 
   Thanks @ashb for adding up & clarifying. My intention for this PR is to 
provide the same `copy_object()` feature as in `boto3`, for which the current 
implementation should suffice.
   
   BTW, I have made changes based on your earlier reviews. Will re-push to this 
PR after my isolated tests pass.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services
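
For reference, a minimal sketch of the underlying boto3 call this PR mirrors; 
bucket and key names are made up, and the caller's single set of credentials 
must be able to read the source and write the destination:

```python
import boto3

s3 = boto3.client('s3')

# copy_object is a server-side copy; it works across buckets only when one
# principal can access both sides (the cross-IAM limitation discussed above).
s3.copy_object(
    CopySource={'Bucket': 'source-bucket', 'Key': 'data/input.csv'},
    Bucket='dest-bucket',
    Key='data/input-copy.csv',
)
```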


[GitHub] jakebiesinger commented on issue #3749: [AIRFLOW-2900] Show code for packaged DAGs

2018-08-30 Thread GitBox
jakebiesinger commented on issue #3749: [AIRFLOW-2900] Show code for packaged 
DAGs
URL: 
https://github.com/apache/incubator-airflow/pull/3749#issuecomment-417359964
 
 
   Thanks, @kaxil. Looks like I had a bad merge when I squashed my commit.  
Should be working now.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] jakahn commented on issue #3805: [AIRFLOW-2062] Add per-connection KMS encryption.

2018-08-30 Thread GitBox
jakahn commented on issue #3805: [AIRFLOW-2062] Add per-connection KMS 
encryption.
URL: 
https://github.com/apache/incubator-airflow/pull/3805#issuecomment-417359277
 
 
   @bolkedebruin @gerardo, thanks for the feedback.
   
   The intent is that this implementation _should_ be able to support other 
KMSs in the future; what aspects were you concerned about regarding Amazon KMS 
integration? For example, an AWS KMS Hook could be added later (similar to 
`GcpKmsHook` now) following the `KmsApiHook` interface (in addition to 
supporting any AWS-specific features), and then added to the list of supported 
KMSs in `get_kms_hook` (`models.py`, line 883 in this PR). You should then be 
able to choose between AWS and GCP KMS on a per-connection basis. 
   
   The reason the `kms_*` fields are not stored as part of the `extra` field is 
so that you can encrypt *any* connection via KMS managed credentials (not just 
Google connections). Since other connections may not use JSON extras, we didn't 
want to mess with their extra data.
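
   To make the extension point concrete, a hedged sketch of what an AWS 
counterpart might look like; `AwsKmsHook` is hypothetical (only `GcpKmsHook` 
exists in this PR), though the boto3 KMS calls themselves are the real AWS API:

   ```python
   import base64

   import boto3


   class AwsKmsHook:
       """Hypothetical AWS analogue of GcpKmsHook, for illustration only."""

       def __init__(self, region_name='us-east-1'):
           self.client = boto3.client('kms', region_name=region_name)

       def decrypt(self, ciphertext_b64):
           # AWS KMS infers the key from the ciphertext blob itself.
           resp = self.client.decrypt(
               CiphertextBlob=base64.b64decode(ciphertext_b64))
           return resp['Plaintext']
   ```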


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] feng-tao commented on issue #3823: [AIRFLOW-2985] An operator for S3 object copying

2018-08-30 Thread GitBox
feng-tao commented on issue #3823: [AIRFLOW-2985] An operator for S3 object 
copying
URL: 
https://github.com/apache/incubator-airflow/pull/3823#issuecomment-417358938
 
 
   Different S3 buckets may only be accessible by different IAM roles. How does 
this operator work with different IAM roles? 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] dimberman commented on issue #3797: [AIRFLOW-2952] Splits CI into k8s + docker-compose

2018-08-30 Thread GitBox
dimberman commented on issue #3797: [AIRFLOW-2952] Splits CI into k8s + 
docker-compose
URL: 
https://github.com/apache/incubator-airflow/pull/3797#issuecomment-417349648
 
 
   @Fokko @gerardo Quick update. I've still been running into weird minikube 
issues and have been unable to get the CI to build properly. This is now 
blocking me from implementing/PRing fixes for the k8sExecutor, and the bug 
reports are starting to pile up. Could we revert the dockerized CI and then 
re-merge it once we get it working with k8s?
   
   I'm working with the k8s-kubeadm-dind folks, as I think the best way forward 
might be to switch to that.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] tedmiston commented on issue #3656: [AIRFLOW-2803] Fix all ESLint issues

2018-08-30 Thread GitBox
tedmiston commented on issue #3656: [AIRFLOW-2803] Fix all ESLint issues
URL: 
https://github.com/apache/incubator-airflow/pull/3656#issuecomment-417341304
 
 
   @r39132 Yes, I've been working on the revisions discussed above and 
anticipate pushing up the next pass later today or tomorrow.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] msumit closed pull request #3824: [AIRFLOW-XXX] Add Format to company list

2018-08-30 Thread GitBox
msumit closed pull request #3824: [AIRFLOW-XXX] Add Format to company list
URL: https://github.com/apache/incubator-airflow/pull/3824
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/README.md b/README.md
index 275f6cc600..e911225aee 100644
--- a/README.md
+++ b/README.md
@@ -153,6 +153,7 @@ Currently **officially** using Airflow:
 1. [eRevalue](https://www.datamaran.com) 
[[@hamedhsn](https://github.com/hamedhsn)]
 1. [evo.company](https://evo.company/) 
[[@orhideous](https://github.com/orhideous)]
 1. [Flipp](https://www.flipp.com) 
[[@sethwilsonwishabi](https://github.com/sethwilsonwishabi)]
+1. [Format](https://www.format.com) [[@format](https://github.com/4ormat) & 
[@jasonicarter](https://github.com/jasonicarter)]
 1. [FreshBooks](https://github.com/freshbooks) 
[[@DinoCow](https://github.com/DinoCow)]
 1. [Fundera](https://fundera.com) 
[[@andyxhadji](https://github.com/andyxhadji)]
 1. [G Adventures](https://gadventures.com) 
[[@samuelmullin](https://github.com/samuelmullin)]


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] codecov-io edited a comment on issue #3824: [AIRFLOW-XXX] Add Format to company list

2018-08-30 Thread GitBox
codecov-io edited a comment on issue #3824: [AIRFLOW-XXX] Add Format to company 
list
URL: 
https://github.com/apache/incubator-airflow/pull/3824#issuecomment-417325998
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3824?src=pr=h1)
 Report
   > Merging 
[#3824](https://codecov.io/gh/apache/incubator-airflow/pull/3824?src=pr=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/274f093da42d300b4295b5489013a65439fc11e4?src=pr=desc)
 will **not change** coverage.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/3824/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/3824?src=pr=tree)
   
   ```diff
   @@   Coverage Diff   @@
   ##   master#3824   +/-   ##
   ===
 Coverage   77.41%   77.41%   
   ===
 Files 203  203   
 Lines   1582115821   
   ===
 Hits1224812248   
 Misses   3573 3573
   ```
   
   
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3824?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3824?src=pr=footer).
 Last update 
[274f093...085113f](https://codecov.io/gh/apache/incubator-airflow/pull/3824?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] codecov-io commented on issue #3824: [AIRFLOW-XXX] Add Format to company list

2018-08-30 Thread GitBox
codecov-io commented on issue #3824: [AIRFLOW-XXX] Add Format to company list
URL: 
https://github.com/apache/incubator-airflow/pull/3824#issuecomment-417325998
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3824?src=pr=h1)
 Report
   > Merging 
[#3824](https://codecov.io/gh/apache/incubator-airflow/pull/3824?src=pr=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/274f093da42d300b4295b5489013a65439fc11e4?src=pr=desc)
 will **not change** coverage.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/3824/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/3824?src=pr=tree)
   
   ```diff
   @@   Coverage Diff   @@
   ##   master#3824   +/-   ##
   ===
 Coverage   77.41%   77.41%   
   ===
 Files 203  203   
 Lines   1582115821   
   ===
 Hits1224812248   
 Misses   3573 3573
   ```
   
   
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3824?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3824?src=pr=footer).
 Last update 
[274f093...085113f](https://codecov.io/gh/apache/incubator-airflow/pull/3824?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] mascah edited a comment on issue #2946: [AIRFLOW-1927] Convert naive datetimes for TaskInstances

2018-08-30 Thread GitBox
mascah edited a comment on issue #2946: [AIRFLOW-1927] Convert naive datetimes 
for TaskInstances
URL: 
https://github.com/apache/incubator-airflow/pull/2946#issuecomment-417314553
 
 
   @Fokko @bolkedebruin Can you guys please confirm that this was intended to 
change the output of macros? 
   
   In 1.9, {{ ts_nodash }} would output `20160317T000000`; in 1.10 it outputs 
`20160317T000000+0000`. I would consider this a breaking change, since 
previously generated file names using the result of this macro are now 
incompatible. Am I missing something?
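
   A quick stdlib illustration of the difference being described, assuming the 
root cause is that execution dates became timezone-aware in 1.10 (the macro 
applies the same strip-dashes-and-colons transformation to isoformat output):

   ```python
   from datetime import datetime, timezone

   def ts_nodash(d):
       # Same transformation the macro applies to the execution date.
       return d.isoformat().replace('-', '').replace(':', '')

   print(ts_nodash(datetime(2016, 3, 17)))                       # 20160317T000000
   print(ts_nodash(datetime(2016, 3, 17, tzinfo=timezone.utc)))  # 20160317T000000+0000
   ```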


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] mascah commented on issue #2946: [AIRFLOW-1927] Convert naive datetimes for TaskInstances

2018-08-30 Thread GitBox
mascah commented on issue #2946: [AIRFLOW-1927] Convert naive datetimes for 
TaskInstances
URL: 
https://github.com/apache/incubator-airflow/pull/2946#issuecomment-417314553
 
 
   @Fokko @bolkedebruin Can you guys please confirm that this was intended to 
change the output of macros? 
   
   In 1.9, {{ ts_nodash }} would output `20160317T000000`; in 1.10 it outputs 
`20160317T000000+0000`. I would consider this a breaking change, since 
previously generated files using the result of this macro are now incompatible. 
Am I missing something?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] jasonicarter opened a new pull request #3824: [AIRFLOW-XXX] Add Format to company list

2018-08-30 Thread GitBox
jasonicarter opened a new pull request #3824: [AIRFLOW-XXX] Add Format to 
company list
URL: https://github.com/apache/incubator-airflow/pull/3824
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [x] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-XXX
 - In case you are fixing a typo in the documentation you can prepend your 
commit with \[AIRFLOW-XXX\], code changes always need a Jira issue.
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI 
changes:
   Add Format to the official list of companies in README.md
   
   ### Tests
   
   - [x] None. Update to README.md
   
   ### Commits
   
   - [x] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [x] None. Update to README.md
   
   ### Code Quality
   
   - [x] Not a code change
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Comment Edited] (AIRFLOW-2966) KubernetesExecutor + namespace quotas kills scheduler if the pod can't be launched

2018-08-30 Thread Roland de Boo (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16597385#comment-16597385
 ] 

Roland de Boo edited comment on AIRFLOW-2966 at 8/30/18 12:40 PM:
--

Colleague of John here. Some additional info:
 * Updated to 1.10.0 and retried, same issue remains
 * Last observation in the log (not mentioned above):

{{[2018-08-30 12:19:46,967] \{jobs.py:1585} INFO - Exited execute loop}}

In the Pod I can see 2 other threads remaining, but they don't seem to do 
anything.

{{$ ps -ef}}

{{airflow 16 1 0 12:19 ? 00:00:02 /usr/local/bin/python /usr/local/bin/airflow 
scheduler -n -1}}
 {{airflow 38 16 0 12:19 ? 00:00:00 /usr/local/bin/python 
/usr/local/bin/airflow scheduler -n -1}}

The Pod is stuck but does not exit. So we need to kill it by hand.

If we increase the quota on the namespace, nothing happens to the scheduler.

 

Steps to reproduce:

Set a pod quota on your namespace. First count the current number of pods and 
set the quota to that value.
{code:java}
apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-resources
spec:
  hard:
pods: "4"{code}
 Then try to schedule a task.

 


was (Author: rdeboo):
Colleague of John here. Some additional info:
 * Updated to 1.10.0 and retried, same issue remains
 * Last observation in the log (not mentioned above):

{{[2018-08-30 12:19:46,967] \{jobs.py:1585} INFO - Exited execute loop}}

In the Pod I can see 2 other threads remaining, but they don't seem to do 
anything.

{{$ ps -ef}}

{{airflow 16 1 0 12:19 ? 00:00:02 /usr/local/bin/python /usr/local/bin/airflow 
scheduler -n -1}}
 {{airflow 38 16 0 12:19 ? 00:00:00 /usr/local/bin/python 
/usr/local/bin/airflow scheduler -n -1}}

The Pod is stuck but does not exit. So we need to kill it by hand.

 

If we increase the quota on the namespace, nothing happens to the scheduler.

 

 

> KubernetesExecutor + namespace quotas kills scheduler if the pod can't be 
> launched
> --
>
> Key: AIRFLOW-2966
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2966
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: scheduler
>Affects Versions: 1.10
> Environment: Kubernetes 1.9.8
>Reporter: John Hofman
>Priority: Major
>
> When running Airflow in Kubernetes with the KubernetesExecutor and resource 
> quotas set on the namespace Airflow is deployed in, the scheduler gets an 
> ApiException if it tries to launch a pod that exceeds the namespace limits, 
> and this crashes the scheduler.
> This stack trace is an example of the ApiException from the kubernetes client:
> {code:java}
> [2018-08-27 09:51:08,516] {pod_launcher.py:58} ERROR - Exception when 
> attempting to create Namespaced Pod.
> Traceback (most recent call last):
> File "/src/apache-airflow/airflow/contrib/kubernetes/pod_launcher.py", line 
> 55, in run_pod_async
> resp = self._client.create_namespaced_pod(body=req, namespace=pod.namespace)
> File 
> "/usr/local/lib/python3.6/site-packages/kubernetes/client/apis/core_v1_api.py",
>  line 6057, in create_namespaced_pod
> (data) = self.create_namespaced_pod_with_http_info(namespace, body, **kwargs)
> File 
> "/usr/local/lib/python3.6/site-packages/kubernetes/client/apis/core_v1_api.py",
>  line 6142, in create_namespaced_pod_with_http_info
> collection_formats=collection_formats)
> File 
> "/usr/local/lib/python3.6/site-packages/kubernetes/client/api_client.py", 
> line 321, in call_api
> _return_http_data_only, collection_formats, _preload_content, 
> _request_timeout)
> File 
> "/usr/local/lib/python3.6/site-packages/kubernetes/client/api_client.py", 
> line 155, in __call_api
> _request_timeout=_request_timeout)
> File 
> "/usr/local/lib/python3.6/site-packages/kubernetes/client/api_client.py", 
> line 364, in request
> body=body)
> File "/usr/local/lib/python3.6/site-packages/kubernetes/client/rest.py", line 
> 266, in POST
> body=body)
> File "/usr/local/lib/python3.6/site-packages/kubernetes/client/rest.py", line 
> 222, in request
> raise ApiException(http_resp=r)
> kubernetes.client.rest.ApiException: (403)
> Reason: Forbidden
> HTTP response headers: HTTPHeaderDict({'Audit-Id': 
> 'b00e2cbb-bdb2-41f3-8090-824aee79448c', 'Content-Type': 'application/json', 
> 'Date': 'Mon, 27 Aug 2018 09:51:08 GMT', 'Content-Length': '410'})
> HTTP response body: 
> {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"pods
>  \"podname-ec366e89ef934d91b2d3ffe96234a725\" is forbidden: exceeded quota: 
> compute-resources, requested: limits.memory=4Gi, used: limits.memory=6508Mi, 
> limited: 
> limits.memory=10Gi","reason":"Forbidden","details":{"name":"podname-ec366e89ef934d91b2d3ffe96234a725","kind":"pods"},"code":403}{code}
>  
> I would expect the scheduler to catch the Exception and at least mark the 
> task as failed, or better yet retry the task later.

[GitHub] codecov-io edited a comment on issue #3733: [AIRFLOW-491] Add cache parameter in BigQuery query method - with 'api_resource_configs'

2018-08-30 Thread GitBox
codecov-io edited a comment on issue #3733: [AIRFLOW-491] Add cache parameter 
in BigQuery query method - with 'api_resource_configs'
URL: 
https://github.com/apache/incubator-airflow/pull/3733#issuecomment-413105867
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3733?src=pr=h1)
 Report
   > Merging 
[#3733](https://codecov.io/gh/apache/incubator-airflow/pull/3733?src=pr=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/ac9033db0981ae1f770a8bdb5597055751ab15bd?src=pr=desc)
 will **decrease** coverage by `<.01%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/3733/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/3733?src=pr=tree)
   
   ```diff
   @@Coverage Diff@@
   ##   master   #3733  +/-   ##
   =
   - Coverage   77.41%   77.4%   -0.01% 
   =
 Files 203 203  
 Lines   15817   15817  
   =
   - Hits12244   12243   -1 
   - Misses   35733574   +1
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/incubator-airflow/pull/3733?src=pr=tree) 
| Coverage Δ | |
   |---|---|---|
   | 
[airflow/models.py](https://codecov.io/gh/apache/incubator-airflow/pull/3733/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMucHk=)
 | `88.74% <0%> (-0.05%)` | :arrow_down: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3733?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3733?src=pr=footer).
 Last update 
[ac9033d...07ee01b](https://codecov.io/gh/apache/incubator-airflow/pull/3733?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services



[GitHub] bolkedebruin commented on issue #3526: [AIRFLOW-2651] Add file system hooks with a common interface

2018-08-30 Thread GitBox
bolkedebruin commented on issue #3526: [AIRFLOW-2651] Add file system hooks 
with a common interface
URL: 
https://github.com/apache/incubator-airflow/pull/3526#issuecomment-417304220
 
 
   @jrderuiter I like the possibilities that this will deliver, but I think 
some architectural updates are required. The `lineage` improvements basically 
allow for the same kind of functionality, and these changes will need to tie in 
with them. Maybe a discussion offline or on the mailing list (or an 
improvement proposal) could speed this up?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Comment Edited] (AIRFLOW-2966) KubernetesExecutor + namespace quotas kills scheduler if the pod can't be launched

2018-08-30 Thread Roland de Boo (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16597385#comment-16597385
 ] 

Roland de Boo edited comment on AIRFLOW-2966 at 8/30/18 12:35 PM:
--

Colleague of John here. Some additional info:
 * Updated to 1.10.0 and retried, same issue remains
 * Last observation in the log (not mentioned above):

{{[2018-08-30 12:19:46,967] \{jobs.py:1585} INFO - Exited execute loop}}

In the Pod I can see 2 other threads remaining, but they don't seem to do 
anything.

{{$ ps -ef}}

{{airflow 16 1 0 12:19 ? 00:00:02 /usr/local/bin/python /usr/local/bin/airflow 
scheduler -n -1}}
 {{airflow 38 16 0 12:19 ? 00:00:00 /usr/local/bin/python 
/usr/local/bin/airflow scheduler -n -1}}

The Pod is stuck but does not exit. So we need to kill it by hand.

 

If we increase the quota on the namespace, nothing happens to the scheduler.

 

 


was (Author: rdeboo):
Colleague of John here. Some additional info:
 * Updated to 1.10.0 and retried, same issue remains
 * Last observation in the log (not mentioned above):

{{[2018-08-30 12:19:46,967] \{jobs.py:1585} INFO - Exited execute loop}}

In the Pod I can see 2 other threads remaining, but they don't seem to do 
anything.

{{$ ps -ef}}

{{airflow 16 1 0 12:19 ? 00:00:02 /usr/local/bin/python /usr/local/bin/airflow 
scheduler -n -1}}
{{airflow 38 16 0 12:19 ? 00:00:00 /usr/local/bin/python /usr/local/bin/airflow 
scheduler -n -1}}

The Pod is stuck but does not exit. So we need to kill it by hand.

 

 

> KubernetesExecutor + namespace quotas kills scheduler if the pod can't be 
> launched
> --
>
> Key: AIRFLOW-2966
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2966
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: scheduler
>Affects Versions: 1.10
> Environment: Kubernetes 1.9.8
>Reporter: John Hofman
>Priority: Major
>
> When running Airflow in Kubernetes with the KubernetesExecutor and resource 
> quotas set on the namespace Airflow is deployed in, the scheduler gets an 
> ApiException if it tries to launch a pod that exceeds the namespace limits, 
> and this crashes the scheduler.
> This stack trace is an example of the ApiException from the kubernetes client:
> {code:java}
> [2018-08-27 09:51:08,516] {pod_launcher.py:58} ERROR - Exception when 
> attempting to create Namespaced Pod.
> Traceback (most recent call last):
> File "/src/apache-airflow/airflow/contrib/kubernetes/pod_launcher.py", line 
> 55, in run_pod_async
> resp = self._client.create_namespaced_pod(body=req, namespace=pod.namespace)
> File 
> "/usr/local/lib/python3.6/site-packages/kubernetes/client/apis/core_v1_api.py",
>  line 6057, in create_namespaced_pod
> (data) = self.create_namespaced_pod_with_http_info(namespace, body, **kwargs)
> File 
> "/usr/local/lib/python3.6/site-packages/kubernetes/client/apis/core_v1_api.py",
>  line 6142, in create_namespaced_pod_with_http_info
> collection_formats=collection_formats)
> File 
> "/usr/local/lib/python3.6/site-packages/kubernetes/client/api_client.py", 
> line 321, in call_api
> _return_http_data_only, collection_formats, _preload_content, 
> _request_timeout)
> File 
> "/usr/local/lib/python3.6/site-packages/kubernetes/client/api_client.py", 
> line 155, in __call_api
> _request_timeout=_request_timeout)
> File 
> "/usr/local/lib/python3.6/site-packages/kubernetes/client/api_client.py", 
> line 364, in request
> body=body)
> File "/usr/local/lib/python3.6/site-packages/kubernetes/client/rest.py", line 
> 266, in POST
> body=body)
> File "/usr/local/lib/python3.6/site-packages/kubernetes/client/rest.py", line 
> 222, in request
> raise ApiException(http_resp=r)
> kubernetes.client.rest.ApiException: (403)
> Reason: Forbidden
> HTTP response headers: HTTPHeaderDict({'Audit-Id': 
> 'b00e2cbb-bdb2-41f3-8090-824aee79448c', 'Content-Type': 'application/json', 
> 'Date': 'Mon, 27 Aug 2018 09:51:08 GMT', 'Content-Length': '410'})
> HTTP response body: 
> {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"pods
>  \"podname-ec366e89ef934d91b2d3ffe96234a725\" is forbidden: exceeded quota: 
> compute-resources, requested: limits.memory=4Gi, used: limits.memory=6508Mi, 
> limited: 
> limits.memory=10Gi","reason":"Forbidden","details":{"name":"podname-ec366e89ef934d91b2d3ffe96234a725","kind":"pods"},"code":403}{code}
>  
> I would expect the scheduler to catch the Exception and at least mark the 
> task as failed, or better yet retry the task later.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
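
A hedged sketch of the kind of guard being requested above; the wrapper name 
and fallback behaviour are illustrative rather than the actual Airflow fix, 
though `ApiException` and its `status` attribute are the real kubernetes-client 
API:

{code:python}
from kubernetes.client.rest import ApiException


def run_pod_async_safe(client, req, namespace):
    """Illustrative wrapper: fail the task, not the scheduler, on quota errors."""
    try:
        return client.create_namespaced_pod(body=req, namespace=namespace)
    except ApiException as e:
        if e.status == 403:  # e.g. "exceeded quota" on the namespace
            # Signal failure to the executor so the task can be marked failed
            # or retried later, instead of raising into the scheduler loop.
            return None
        raise
{code}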


[GitHub] codecov-io edited a comment on issue #3733: [AIRFLOW-491] Add cache parameter in BigQuery query method - with 'api_resource_configs'

2018-08-30 Thread GitBox
codecov-io edited a comment on issue #3733: [AIRFLOW-491] Add cache parameter 
in BigQuery query method - with 'api_resource_configs'
URL: 
https://github.com/apache/incubator-airflow/pull/3733#issuecomment-413105867
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3733?src=pr=h1)
 Report
   > Merging 
[#3733](https://codecov.io/gh/apache/incubator-airflow/pull/3733?src=pr=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/ac9033db0981ae1f770a8bdb5597055751ab15bd?src=pr=desc)
 will **decrease** coverage by `0.04%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/3733/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/3733?src=pr=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #3733      +/-   ##
   ==========================================
   - Coverage   77.41%   77.36%   -0.05%     
   ==========================================
     Files         203      203              
     Lines       15817    15817              
   ==========================================
   - Hits        12244    12237       -7     
   - Misses       3573     3580       +7
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/incubator-airflow/pull/3733?src=pr=tree) 
| Coverage Δ | |
   |---|---|---|
   | 
[airflow/utils/sqlalchemy.py](https://codecov.io/gh/apache/incubator-airflow/pull/3733/diff?src=pr=tree#diff-YWlyZmxvdy91dGlscy9zcWxhbGNoZW15LnB5)
 | `75.71% <0%> (-5.72%)` | :arrow_down: |
   | 
[airflow/www/views.py](https://codecov.io/gh/apache/incubator-airflow/pull/3733/diff?src=pr=tree#diff-YWlyZmxvdy93d3cvdmlld3MucHk=)
 | `69.18% <0%> (-0.13%)` | :arrow_down: |
   | 
[airflow/models.py](https://codecov.io/gh/apache/incubator-airflow/pull/3733/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMucHk=)
 | `88.74% <0%> (-0.05%)` | :arrow_down: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3733?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3733?src=pr=footer).
 Last update 
[ac9033d...07ee01b](https://codecov.io/gh/apache/incubator-airflow/pull/3733?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-2966) KubernetesExecutor + namespace quotas kills scheduler if the pod can't be launched

2018-08-30 Thread Roland de Boo (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16597385#comment-16597385
 ] 

Roland de Boo commented on AIRFLOW-2966:


Colleague of John here. Some additional info:
 * Updated to 1.10.0 and retried, same issue remains
 * Last observation in the log (not mentioned above):

{{[2018-08-30 12:19:46,967] {jobs.py:1585} INFO - Exited execute loop}}

In the Pod I can see two other scheduler processes remaining, but they don't seem to do anything.

{{$ ps -ef}}

{{airflow 16 1 0 12:19 ? 00:00:02 /usr/local/bin/python /usr/local/bin/airflow scheduler -n -1}}
{{airflow 38 16 0 12:19 ? 00:00:00 /usr/local/bin/python /usr/local/bin/airflow scheduler -n -1}}

The Pod is stuck but does not exit, so we need to kill it by hand.

 

 

> KubernetesExecutor + namespace quotas kills scheduler if the pod can't be 
> launched
> --
>
> Key: AIRFLOW-2966
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2966
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: scheduler
>Affects Versions: 1.10
> Environment: Kubernetes 1.9.8
>Reporter: John Hofman
>Priority: Major
>
> When running Airflow in Kubernetes with the KubernetesExecutor and resource
> quotas set on the namespace Airflow is deployed in, the scheduler gets an
> ApiException if it tries to launch a pod that exceeds the namespace limits,
> and that exception crashes the scheduler.
> This stack trace is an example of the ApiException from the kubernetes client:
> {code:java}
> [2018-08-27 09:51:08,516] {pod_launcher.py:58} ERROR - Exception when attempting to create Namespaced Pod.
> Traceback (most recent call last):
>   File "/src/apache-airflow/airflow/contrib/kubernetes/pod_launcher.py", line 55, in run_pod_async
>     resp = self._client.create_namespaced_pod(body=req, namespace=pod.namespace)
>   File "/usr/local/lib/python3.6/site-packages/kubernetes/client/apis/core_v1_api.py", line 6057, in create_namespaced_pod
>     (data) = self.create_namespaced_pod_with_http_info(namespace, body, **kwargs)
>   File "/usr/local/lib/python3.6/site-packages/kubernetes/client/apis/core_v1_api.py", line 6142, in create_namespaced_pod_with_http_info
>     collection_formats=collection_formats)
>   File "/usr/local/lib/python3.6/site-packages/kubernetes/client/api_client.py", line 321, in call_api
>     _return_http_data_only, collection_formats, _preload_content, _request_timeout)
>   File "/usr/local/lib/python3.6/site-packages/kubernetes/client/api_client.py", line 155, in __call_api
>     _request_timeout=_request_timeout)
>   File "/usr/local/lib/python3.6/site-packages/kubernetes/client/api_client.py", line 364, in request
>     body=body)
>   File "/usr/local/lib/python3.6/site-packages/kubernetes/client/rest.py", line 266, in POST
>     body=body)
>   File "/usr/local/lib/python3.6/site-packages/kubernetes/client/rest.py", line 222, in request
>     raise ApiException(http_resp=r)
> kubernetes.client.rest.ApiException: (403)
> Reason: Forbidden
> HTTP response headers: HTTPHeaderDict({'Audit-Id': 'b00e2cbb-bdb2-41f3-8090-824aee79448c', 'Content-Type': 'application/json', 'Date': 'Mon, 27 Aug 2018 09:51:08 GMT', 'Content-Length': '410'})
> HTTP response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"pods \"podname-ec366e89ef934d91b2d3ffe96234a725\" is forbidden: exceeded quota: compute-resources, requested: limits.memory=4Gi, used: limits.memory=6508Mi, limited: limits.memory=10Gi","reason":"Forbidden","details":{"name":"podname-ec366e89ef934d91b2d3ffe96234a725","kind":"pods"},"code":403}
> {code}
>  
> I would expect the scheduler to catch the Exception and at least mark the 
> task as failed, or better yet retry the task later.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] bolkedebruin closed pull request #3761: Subdag inherit runid *do not merge*

2018-08-30 Thread GitBox
bolkedebruin closed pull request #3761: Subdag inherit runid *do not merge*
URL: https://github.com/apache/incubator-airflow/pull/3761
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/airflow/jobs.py b/airflow/jobs.py
index e7fff3114f..4c82e722cb 100644
--- a/airflow/jobs.py
+++ b/airflow/jobs.py
@@ -1922,6 +1922,7 @@ def __init__(
             ignore_task_deps=False,
             pool=None,
             delay_on_limit_secs=1.0,
+            run_id_template=None,
             *args, **kwargs):
         self.dag = dag
         self.dag_id = dag.dag_id
@@ -1934,6 +1935,9 @@ def __init__(
         self.ignore_task_deps = ignore_task_deps
         self.pool = pool
         self.delay_on_limit_secs = delay_on_limit_secs
+        self.run_id_template = BackfillJob.ID_FORMAT_PREFIX
+        if run_id_template:
+            self.run_id_template = run_id_template
         super(BackfillJob, self).__init__(*args, **kwargs)
 
     def _update_counters(self, ti_status):
@@ -2023,7 +2027,7 @@ def _get_dag_run(self, run_date, session=None):
         :type session: Session
         :return: a DagRun in state RUNNING or None
         """
-        run_id = BackfillJob.ID_FORMAT_PREFIX.format(run_date.isoformat())
+        run_id = self.run_id_template.format(run_date.isoformat())
 
         # consider max_active_runs but ignore when running subdags
         respect_dag_max_active_limit = (True
diff --git a/airflow/models.py b/airflow/models.py
old mode 100755
new mode 100644
index 3e296eb58b..90546f5940
--- a/airflow/models.py
+++ b/airflow/models.py
@@ -3675,7 +3675,9 @@ def run(
             ignore_task_deps=False,
             ignore_first_depends_on_past=False,
             pool=None,
-            delay_on_limit_secs=1.0):
+            delay_on_limit_secs=1.0,
+            run_id_template=None
+    ):
         """
         Runs the DAG.
 
@@ -3703,6 +3705,8 @@ def run(
         :param delay_on_limit_secs: Time in seconds to wait before next attempt to run
             dag run when max_active_runs limit has been reached
         :type delay_on_limit_secs: float
+        :param run_id_template: Template for the run_id to be formatted with the execution date
+        :type run_id_template: string
         """
         from airflow.jobs import BackfillJob
         if not executor and local:
@@ -3720,7 +3724,9 @@ def run(
             ignore_task_deps=ignore_task_deps,
             ignore_first_depends_on_past=ignore_first_depends_on_past,
             pool=pool,
-            delay_on_limit_secs=delay_on_limit_secs)
+            delay_on_limit_secs=delay_on_limit_secs,
+            run_id_template=run_id_template
+        )
         job.run()
 
     def cli(self):
diff --git a/airflow/operators/subdag_operator.py b/airflow/operators/subdag_operator.py
index 9445c4c96d..369c645ed7 100644
--- a/airflow/operators/subdag_operator.py
+++ b/airflow/operators/subdag_operator.py
@@ -87,6 +87,11 @@ def __init__(
 
     def execute(self, context):
         ed = context['execution_date']
+        # Use the parent's run id as a template for the subdag dag run's run_id
+        run_id = context['run_id']
+        run_id_template = run_id + '.{0}'
         self.subdag.run(
             start_date=ed, end_date=ed, donot_pickle=True,
-            executor=self.executor)
+            executor=self.executor,
+            run_id_template=run_id_template
+        )
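
Read end to end, the patch makes a subdag run inherit its parent's run_id. A small illustrative sketch of the resulting string logic (run_id values invented):

```python
# Mirrors the template logic added to SubDagOperator.execute above.
parent_run_id = "scheduled__2018-08-30T00:00:00"   # invented example
run_id_template = parent_run_id + '.{0}'
subdag_run_id = run_id_template.format("2018-08-30T00:00:00")
print(subdag_run_id)
# -> scheduled__2018-08-30T00:00:00.2018-08-30T00:00:00
```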


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] bolkedebruin commented on issue #3761: Subdag inherit runid *do not merge*

2018-08-30 Thread GitBox
bolkedebruin commented on issue #3761: Subdag inherit runid *do not merge*
URL: 
https://github.com/apache/incubator-airflow/pull/3761#issuecomment-417302471
 
 
   please run this on your own CI and discuss it on the mailing list. CI costs 
Apache Infra money.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] xnuinside commented on a change in pull request #3733: [AIRFLOW-491] Add cache parameter in BigQuery query method - with 'api_resource_configs'

2018-08-30 Thread GitBox
xnuinside commented on a change in pull request #3733: [AIRFLOW-491] Add cache 
parameter in BigQuery query method - with 'api_resource_configs'
URL: https://github.com/apache/incubator-airflow/pull/3733#discussion_r213894129
 
 

 ##
 File path: airflow/contrib/hooks/bigquery_hook.py
 ##
 @@ -566,95 +612,108 @@ def run_query(self,
                  'Airflow.',
                  category=DeprecationWarning)
 
-        if sql is None:
-            raise TypeError('`BigQueryBaseCursor.run_query` missing 1 required '
-                            'positional argument: `sql`')
+        if not sql and not configuration['query'].get('query', None):
+            raise TypeError('`BigQueryBaseCursor.run_query` '
+                            'missing 1 required positional argument: `sql`')
+
+        # BigQuery also allows you to define how you want a table's schema
+        # to change as a side effect of a query job for more details:
+        # https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs#configuration.query.schemaUpdateOptions
 
-        # BigQuery also allows you to define how you want a table's schema to change
-        # as a side effect of a query job
-        # for more details:
-        #   https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs#configuration.query.schemaUpdateOptions
         allowed_schema_update_options = [
             'ALLOW_FIELD_ADDITION', "ALLOW_FIELD_RELAXATION"
         ]
-        if not set(allowed_schema_update_options).issuperset(
-                set(schema_update_options)):
-            raise ValueError(
-                "{0} contains invalid schema update options. "
-                "Please only use one or more of the following options: {1}"
-                .format(schema_update_options, allowed_schema_update_options))
 
-        if use_legacy_sql is None:
-            use_legacy_sql = self.use_legacy_sql
+        if not set(allowed_schema_update_options
+                   ).issuperset(set(schema_update_options)):
+            raise ValueError("{0} contains invalid schema update options. "
+                             "Please only use one or more of the following "
+                             "options: {1}"
+                             .format(schema_update_options,
+                                     allowed_schema_update_options))
 
-        configuration = {
-            'query': {
-                'query': sql,
-                'useLegacySql': use_legacy_sql,
-                'maximumBillingTier': maximum_billing_tier,
-                'maximumBytesBilled': maximum_bytes_billed,
-                'priority': priority
-            }
-        }
+        if schema_update_options:
+            if write_disposition not in ["WRITE_APPEND", "WRITE_TRUNCATE"]:
+                raise ValueError("schema_update_options is only "
+                                 "allowed if write_disposition is "
+                                 "'WRITE_APPEND' or 'WRITE_TRUNCATE'.")
 
         if destination_dataset_table:
-            if '.' not in destination_dataset_table:
-                raise ValueError(
-                    'Expected destination_dataset_table name in the format of '
-                    '<dataset>.<table>. Got: {}'.format(
-                        destination_dataset_table))
             destination_project, destination_dataset, destination_table = \
                 _split_tablename(table_input=destination_dataset_table,
                                  default_project_id=self.project_id)
-            configuration['query'].update({
-                'allowLargeResults': allow_large_results,
-                'flattenResults': flatten_results,
-                'writeDisposition': write_disposition,
-                'createDisposition': create_disposition,
-                'destinationTable': {
-                    'projectId': destination_project,
-                    'datasetId': destination_dataset,
-                    'tableId': destination_table,
-                }
-            })
-        if udf_config:
-            if not isinstance(udf_config, list):
-                raise TypeError("udf_config argument must have a type 'list'"
-                                " not {}".format(type(udf_config)))
-            configuration['query'].update({
-                'userDefinedFunctionResources': udf_config
-            })
 
-        if query_params:
-            if self.use_legacy_sql:
-                raise ValueError("Query parameters are not allowed when using "
-                                 "legacy SQL")
-            else:
-                configuration['query']['queryParameters'] = query_params
+            destination_dataset_table = {
+                'projectId': destination_project,
+                'datasetId': destination_dataset,
+                'tableId': destination_table,
+            }
 
-        if labels:
-            configuration['labels'] = labels
+        query_param_list = [
+            (sql, 
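
The quoted diff is truncated here, but the surrounding change is about assembling a job configuration dict inside `run_query`. A hedged sketch of such a dict; the key names follow the public BigQuery jobs API, `useQueryCache` is the cache flag the PR title refers to, and the query text is invented:

```python
# Illustrative only: a configuration of the shape run_query assembles.
configuration = {
    'query': {
        'query': 'SELECT COUNT(*) FROM `project.dataset.table`',  # invented
        'useLegacySql': False,
        'useQueryCache': True,   # the cache parameter exposed by this PR
        'priority': 'INTERACTIVE',
    }
}
```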

[jira] [Commented] (AIRFLOW-2779) Verify and correct licenses

2018-08-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16597382#comment-16597382
 ] 

ASF GitHub Bot commented on AIRFLOW-2779:
-

bolkedebruin closed pull request #3803: [AIRFLOW-2779] Restore Copyright notice 
of GHE auth backend
URL: https://github.com/apache/incubator-airflow/pull/3803
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Verify and correct licenses
> ---
>
> Key: AIRFLOW-2779
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2779
> Project: Apache Airflow
>  Issue Type: Improvement
>Reporter: Bolke de Bruin
>Priority: Major
>  Labels: licenses
> Fix For: 1.10.0
>
>
> 1. ./airflow/security/utils.py
> 2. ./airflow/security/kerberos.py
> 3. ./airflow/www_rbac/static/jqClock.min.js
> 4. ./airflow/www/static/bootstrap3-typeahead.min.js
> 5. ./apache-airflow-1.10.0rc2+incubating/scripts/ci/flake8_diff.sh
> 6. https://www.apache.org/legal/resolved.html#optional
> 7. ./docs/license.rst
> 8. airflow/contrib/auth/backends/google_auth.py
> 9. /airflow/contrib/auth/backends/github_enterprise_auth.py
> 10. /airflow/contrib/hooks/ssh_hook.py
> 11. /airflow/minihivecluster.py
>
> These files [1][2] seem to be 3rd-party ALv2-licensed files that refer to a
> NOTICE file; the information in that NOTICE file (at the very least the
> copyright info) should be in your NOTICE file. This should also be noted in
> LICENSE.
>  
> LICENSE is:
> - missing jQuery clock [3] and typeahead [4]; as they are ALv2 it’s not
> required to list them, but it’s a good idea to do so.
> - missing the license for this [5]
> - this file [7] oddly has © 2016 GitHub, Inc. at the bottom of it
>  
>  * Year in NOTICE is not correct: “2016 and onwards” isn’t valid, as
> copyright has an expiry date
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] bolkedebruin closed pull request #3803: [AIRFLOW-2779] Restore Copyright notice of GHE auth backend

2018-08-30 Thread GitBox
bolkedebruin closed pull request #3803: [AIRFLOW-2779] Restore Copyright notice 
of GHE auth backend
URL: https://github.com/apache/incubator-airflow/pull/3803
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-2984) Cannot convert naive_datetime when task has a naive start_date/end_date

2018-08-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16597377#comment-16597377
 ] 

ASF GitHub Bot commented on AIRFLOW-2984:
-

bolkedebruin closed pull request #3822: [AIRFLOW-2984] Convert operator dates 
to UTC
URL: https://github.com/apache/incubator-airflow/pull/3822
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/airflow/models.py b/airflow/models.py
index 55badf4828..94e18794d6 100755
--- a/airflow/models.py
+++ b/airflow/models.py
@@ -2413,10 +2413,17 @@ def __init__(
         self.email = email
         self.email_on_retry = email_on_retry
         self.email_on_failure = email_on_failure
+
         self.start_date = start_date
         if start_date and not isinstance(start_date, datetime):
             self.log.warning("start_date for %s isn't datetime.datetime", self)
+        elif start_date:
+            self.start_date = timezone.convert_to_utc(start_date)
+
         self.end_date = end_date
+        if end_date:
+            self.end_date = timezone.convert_to_utc(end_date)
+
         if not TriggerRule.is_valid(trigger_rule):
             raise AirflowException(
                 "The trigger_rule must be one of {all_triggers},"
diff --git a/docs/timezone.rst b/docs/timezone.rst
index 9e8598e2ed..fe44ecfbb9 100644
--- a/docs/timezone.rst
+++ b/docs/timezone.rst
@@ -2,23 +2,23 @@ Time zones
 ==========
 
 Support for time zones is enabled by default. Airflow stores datetime information in UTC internally and in the database.
-It allows you to run your DAGs with time zone dependent schedules. At the moment Airflow does not convert them to the 
-end user’s time zone in the user interface. There it will always be displayed in UTC. Also templates used in Operators 
+It allows you to run your DAGs with time zone dependent schedules. At the moment Airflow does not convert them to the
+end user’s time zone in the user interface. There it will always be displayed in UTC. Also templates used in Operators
 are not converted. Time zone information is exposed and it is up to the writer of DAG what do with it.
 
-This is handy if your users live in more than one time zone and you want to display datetime information according to 
+This is handy if your users live in more than one time zone and you want to display datetime information according to
 each user’s wall clock.
 
-Even if you are running Airflow in only one time zone it is still good practice to store data in UTC in your database 
-(also before Airflow became time zone aware this was also to recommended or even required setup). The main reason is 
-Daylight Saving Time (DST). Many countries have a system of DST, where clocks are moved forward in spring and backward 
-in autumn. If you’re working in local time, you’re likely to encounter errors twice a year, when the transitions 
-happen. (The pendulum and pytz documentation discusses these issues in greater detail.) This probably doesn’t matter 
-for a simple DAG, but it’s a problem if you are in, for example, financial services where you have end of day 
-deadlines to meet. 
+Even if you are running Airflow in only one time zone it is still good practice to store data in UTC in your database
+(also before Airflow became time zone aware this was also to recommended or even required setup). The main reason is
+Daylight Saving Time (DST). Many countries have a system of DST, where clocks are moved forward in spring and backward
+in autumn. If you’re working in local time, you’re likely to encounter errors twice a year, when the transitions
+happen. (The pendulum and pytz documentation discusses these issues in greater detail.) This probably doesn’t matter
+for a simple DAG, but it’s a problem if you are in, for example, financial services where you have end of day
+deadlines to meet.
 
-The time zone is set in `airflow.cfg`. By default it is set to utc, but you change it to use the system’s settings or 
-an arbitrary IANA time zone, e.g. `Europe/Amsterdam`. It is dependent on `pendulum`, which is more accurate than `pytz`. 
+The time zone is set in `airflow.cfg`. By default it is set to utc, but you change it to use the system’s settings or
+an arbitrary IANA time zone, e.g. `Europe/Amsterdam`. It is dependent on `pendulum`, which is more accurate than `pytz`.
 Pendulum is installed when you install Airflow.
 
 Please note that the Web UI currently only runs in UTC.
@@ -28,8 +28,8 @@ Concepts
 Naïve and aware datetime objects
 --------------------------------
 
-Python’s datetime.datetime objects have a tzinfo attribute that can be used to store time zone information, 
-represented as an instance of a subclass of 
[GitHub] bolkedebruin closed pull request #3822: [AIRFLOW-2984] Convert operator dates to UTC

2018-08-30 Thread GitBox
bolkedebruin closed pull request #3822: [AIRFLOW-2984] Convert operator dates 
to UTC
URL: https://github.com/apache/incubator-airflow/pull/3822
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/airflow/models.py b/airflow/models.py
index 55badf4828..94e18794d6 100755
--- a/airflow/models.py
+++ b/airflow/models.py
@@ -2413,10 +2413,17 @@ def __init__(
         self.email = email
         self.email_on_retry = email_on_retry
         self.email_on_failure = email_on_failure
+
         self.start_date = start_date
         if start_date and not isinstance(start_date, datetime):
             self.log.warning("start_date for %s isn't datetime.datetime", self)
+        elif start_date:
+            self.start_date = timezone.convert_to_utc(start_date)
+
         self.end_date = end_date
+        if end_date:
+            self.end_date = timezone.convert_to_utc(end_date)
+
         if not TriggerRule.is_valid(trigger_rule):
             raise AirflowException(
                 "The trigger_rule must be one of {all_triggers},"
diff --git a/docs/timezone.rst b/docs/timezone.rst
index 9e8598e2ed..fe44ecfbb9 100644
--- a/docs/timezone.rst
+++ b/docs/timezone.rst
@@ -2,23 +2,23 @@ Time zones
 ==========
 
 Support for time zones is enabled by default. Airflow stores datetime information in UTC internally and in the database.
-It allows you to run your DAGs with time zone dependent schedules. At the moment Airflow does not convert them to the 
-end user’s time zone in the user interface. There it will always be displayed in UTC. Also templates used in Operators 
+It allows you to run your DAGs with time zone dependent schedules. At the moment Airflow does not convert them to the
+end user’s time zone in the user interface. There it will always be displayed in UTC. Also templates used in Operators
 are not converted. Time zone information is exposed and it is up to the writer of DAG what do with it.
 
-This is handy if your users live in more than one time zone and you want to display datetime information according to 
+This is handy if your users live in more than one time zone and you want to display datetime information according to
 each user’s wall clock.
 
-Even if you are running Airflow in only one time zone it is still good practice to store data in UTC in your database 
-(also before Airflow became time zone aware this was also to recommended or even required setup). The main reason is 
-Daylight Saving Time (DST). Many countries have a system of DST, where clocks are moved forward in spring and backward 
-in autumn. If you’re working in local time, you’re likely to encounter errors twice a year, when the transitions 
-happen. (The pendulum and pytz documentation discusses these issues in greater detail.) This probably doesn’t matter 
-for a simple DAG, but it’s a problem if you are in, for example, financial services where you have end of day 
-deadlines to meet. 
+Even if you are running Airflow in only one time zone it is still good practice to store data in UTC in your database
+(also before Airflow became time zone aware this was also to recommended or even required setup). The main reason is
+Daylight Saving Time (DST). Many countries have a system of DST, where clocks are moved forward in spring and backward
+in autumn. If you’re working in local time, you’re likely to encounter errors twice a year, when the transitions
+happen. (The pendulum and pytz documentation discusses these issues in greater detail.) This probably doesn’t matter
+for a simple DAG, but it’s a problem if you are in, for example, financial services where you have end of day
+deadlines to meet.
 
-The time zone is set in `airflow.cfg`. By default it is set to utc, but you change it to use the system’s settings or 
-an arbitrary IANA time zone, e.g. `Europe/Amsterdam`. It is dependent on `pendulum`, which is more accurate than `pytz`. 
+The time zone is set in `airflow.cfg`. By default it is set to utc, but you change it to use the system’s settings or
+an arbitrary IANA time zone, e.g. `Europe/Amsterdam`. It is dependent on `pendulum`, which is more accurate than `pytz`.
 Pendulum is installed when you install Airflow.
 
 Please note that the Web UI currently only runs in UTC.
@@ -28,8 +28,8 @@ Concepts
 Naïve and aware datetime objects
 --------------------------------
 
-Python’s datetime.datetime objects have a tzinfo attribute that can be used to store time zone information, 
-represented as an instance of a subclass of datetime.tzinfo. When this attribute is set and describes an offset, 
+Python’s datetime.datetime objects have a tzinfo attribute that can be used to store time zone information,
+represented as an instance of a subclass of datetime.tzinfo. When 
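
For context on the models.py change above, a hedged sketch of a naive-to-UTC conversion using pendulum directly; `airflow.utils.timezone.convert_to_utc` wraps similar logic, with the default timezone taken from `airflow.cfg` (the Amsterdam zone here is an invented example):

```python
from datetime import datetime

import pendulum

naive_start = datetime(2018, 8, 30, 9, 0)         # no tzinfo attached
local_tz = pendulum.timezone("Europe/Amsterdam")  # invented default zone
aware_utc = pendulum.instance(naive_start, tz=local_tz).in_timezone("UTC")
print(aware_utc)  # 2018-08-30 07:00:00+00:00 (CEST is UTC+2 in August)
```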

[GitHub] ashb commented on a change in pull request #3823: [AIRFLOW-2985] An operator for S3 object copying

2018-08-30 Thread GitBox
ashb commented on a change in pull request #3823: [AIRFLOW-2985] An operator 
for S3 object copying
URL: https://github.com/apache/incubator-airflow/pull/3823#discussion_r214007887
 
 

 ##
 File path: airflow/contrib/operators/s3_copy_object_operator.py
 ##
 @@ -0,0 +1,84 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from pprint import pformat
+
+from airflow.hooks.S3_hook import S3Hook
+from airflow.models import BaseOperator
+from airflow.utils.decorators import apply_defaults
+
+
+class S3CopyObjectOperator(BaseOperator):
 
 Review comment:
   Please add a link to this class in docs/code.rst


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] ashb commented on a change in pull request #3823: [AIRFLOW-2985] An operator for S3 object copying

2018-08-30 Thread GitBox
ashb commented on a change in pull request #3823: [AIRFLOW-2985] An operator 
for S3 object copying
URL: https://github.com/apache/incubator-airflow/pull/3823#discussion_r214007593
 
 

 ##
 File path: airflow/contrib/operators/s3_copy_object_operator.py
 ##
 @@ -0,0 +1,84 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from pprint import pformat
+
+from airflow.hooks.S3_hook import S3Hook
+from airflow.models import BaseOperator
+from airflow.utils.decorators import apply_defaults
+
+
+class S3CopyObjectOperator(BaseOperator):
+"""
+Creates a copy of an object that is already stored in S3.
+
+:param dest_s3_bucket: The name of the bucket to copy to
+:type dest_s3_bucket: str
+:param dest_s3_key: The name of the key to copy to
+:type dest_s3_key: str
+:param source_s3_bucket: The name of the source bucket
+:type source_s3_bucket: str
+:param source_s3_key: Key name of the source object
+:type source_s3_key: str
+:param source_s3_version_id: Version ID of the source object (OPTIONAL)
+:type source_s3_version_id: str
+:param s3_conn_id: Connection id of the S3 connection to use
+:type s3_conn_id: str
+:parame verify: Whether or not to verify SSL certificates for S3 connetion.
+By default SSL certificates are verified.
+You can provide the following values:
+- False: do not validate SSL certificates. SSL will still be used,
+ but SSL certificates will not be
+ verified.
+- path/to/cert/bundle.pem: A filename of the CA cert bundle to uses.
+ You can specify this argument if you want to use a different
+ CA cert bundle than the one used by botocore.
+This is also applicable to ``dest_verify``.
+:type verify: bool or str
+"""
+
+@apply_defaults
+def __init__(
+self,
+dest_s3_bucket,
+dest_s3_key,
+source_s3_bucket,
 
 Review comment:
   It would be nice to support a single `s3://bucket/key` style parameter here 
like we do in the other S3 ops/sensors 
https://github.com/apache/incubator-airflow/blob/ac9033db0981ae1f770a8bdb5597055751ab15bd/airflow/sensors/s3_key_sensor.py#L34-L39
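
   A hedged sketch of that single-parameter style: split an `s3://bucket/key` URL into its bucket and key. Airflow's `S3Hook.parse_s3_url` offers similar behaviour; this standalone version is illustrative only.

   ```python
   from urllib.parse import urlparse

   def parse_s3_url(s3url):
       # Split "s3://bucket/path/to/key" into ("bucket", "path/to/key").
       parsed = urlparse(s3url)
       if parsed.scheme != "s3" or not parsed.netloc:
           raise ValueError("Expected an s3://bucket/key style URL, got: %s" % s3url)
       return parsed.netloc, parsed.path.lstrip("/")

   bucket, key = parse_s3_url("s3://my-bucket/path/to/object.csv")  # invented names
   ```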


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] codecov-io edited a comment on issue #3823: [AIRFLOW-2985] An operator for S3 object copying

2018-08-30 Thread GitBox
codecov-io edited a comment on issue #3823: [AIRFLOW-2985] An operator for S3 
object copying
URL: 
https://github.com/apache/incubator-airflow/pull/3823#issuecomment-417296502
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3823?src=pr=h1)
 Report
   > Merging 
[#3823](https://codecov.io/gh/apache/incubator-airflow/pull/3823?src=pr=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/ac9033db0981ae1f770a8bdb5597055751ab15bd?src=pr=desc)
 will **not change** coverage.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/3823/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/3823?src=pr=tree)
   
   ```diff
   @@           Coverage Diff           @@
   ##           master    #3823   +/-   ##
   =======================================
     Coverage   77.41%   77.41%           
   =======================================
     Files         203      203           
     Lines       15817    15817           
   =======================================
     Hits        12244    12244           
     Misses       3573     3573
   ```
   
   
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3823?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3823?src=pr=footer).
 Last update 
[ac9033d...4fc1df4](https://codecov.io/gh/apache/incubator-airflow/pull/3823?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] codecov-io commented on issue #3823: [AIRFLOW-2985] An operator for S3 object copying

2018-08-30 Thread GitBox
codecov-io commented on issue #3823: [AIRFLOW-2985] An operator for S3 object 
copying
URL: 
https://github.com/apache/incubator-airflow/pull/3823#issuecomment-417296502
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3823?src=pr=h1)
 Report
   > Merging 
[#3823](https://codecov.io/gh/apache/incubator-airflow/pull/3823?src=pr=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/ac9033db0981ae1f770a8bdb5597055751ab15bd?src=pr=desc)
 will **not change** coverage.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/3823/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/3823?src=pr=tree)
   
   ```diff
   @@           Coverage Diff           @@
   ##           master    #3823   +/-   ##
   =======================================
     Coverage   77.41%   77.41%           
   =======================================
     Files         203      203           
     Lines       15817    15817           
   =======================================
     Hits        12244    12244           
     Misses       3573     3573
   ```
   
   
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3823?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3823?src=pr=footer).
 Last update 
[ac9033d...4fc1df4](https://codecov.io/gh/apache/incubator-airflow/pull/3823?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-2985) An operator for S3 object copying [boto3.client.copy_object()]

2018-08-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16597338#comment-16597338
 ] 

ASF GitHub Bot commented on AIRFLOW-2985:
-

XD-DENG opened a new pull request #3823: [AIRFLOW-2985] An operator for S3 
object copying
URL: https://github.com/apache/incubator-airflow/pull/3823
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [x] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-2985
 - In case you are fixing a typo in the documentation you can prepend your 
commit with \[AIRFLOW-XXX\], code changes always need a Jira issue.
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI 
changes:
   
   Currently we don't have an operator in Airflow to help copy objects within 
S3, while this is a quite common use case when we deal with the data in S3.
   
   Under the hood, this operator is using `boto3.client.copy_object()`.
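
   As a hedged illustration of that underlying call (bucket and key names invented; `copy_object` is the documented boto3 S3 client method):

   ```python
   import boto3

   # Illustrative only: copy one S3 object to another bucket/key.
   s3 = boto3.client("s3")
   s3.copy_object(
       Bucket="dest-bucket",
       Key="dest/key.csv",
       CopySource={"Bucket": "src-bucket", "Key": "src/key.csv"},
   )
   ```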
   
   ### Tests
   
   - [x] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   Test case has been added
   
   ### Commits
   
   - [x] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [x] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - When adding new operators/hooks/sensors, the autoclass documentation 
generation needs to be added.
   
   ### Code Quality
   
   - [x] Passes `git diff upstream/master -u -- "*.py" | flake8 --diff`
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> An operator for S3 object copying [boto3.client.copy_object()]
> --
>
> Key: AIRFLOW-2985
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2985
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: operators
>Reporter: Xiaodong DENG
>Assignee: Xiaodong DENG
>Priority: Minor
>
> Currently we don't have an operator in Airflow to help copy objects within 
> S3, while this is a quite common use case when we deal with the data in S3.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] XD-DENG opened a new pull request #3823: [AIRFLOW-2985] An operator for S3 object copying

2018-08-30 Thread GitBox
XD-DENG opened a new pull request #3823: [AIRFLOW-2985] An operator for S3 
object copying
URL: https://github.com/apache/incubator-airflow/pull/3823
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [x] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-2985
 - In case you are fixing a typo in the documentation you can prepend your 
commit with \[AIRFLOW-XXX\], code changes always need a Jira issue.
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI 
changes:
   
   Currently we don't have an operator in Airflow to help copy objects within 
S3, while this is a quite common use case when we deal with the data in S3.
   
   Under the hood, this operator is using `boto3.client.copy_object()`.
   
   ### Tests
   
   - [x] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   Test case has been added
   
   ### Commits
   
   - [x] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [x] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - When adding new operators/hooks/sensors, the autoclass documentation 
generation needs to be added.
   
   ### Code Quality
   
   - [x] Passes `git diff upstream/master -u -- "*.py" | flake8 --diff`
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Created] (AIRFLOW-2985) An operator for S3 object copying [boto3.client.copy_object()]

2018-08-30 Thread Xiaodong DENG (JIRA)
Xiaodong DENG created AIRFLOW-2985:
--

 Summary: An operator for S3 object copying 
[boto3.client.copy_object()]
 Key: AIRFLOW-2985
 URL: https://issues.apache.org/jira/browse/AIRFLOW-2985
 Project: Apache Airflow
  Issue Type: Improvement
  Components: operators
Reporter: Xiaodong DENG
Assignee: Xiaodong DENG


Currently we don't have an operator in Airflow to help copy objects within S3, 
while this is a quite common use case when we deal with the data in S3.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] codecov-io edited a comment on issue #3733: [AIRFLOW-491] Add cache parameter in BigQuery query method - with 'api_resource_configs'

2018-08-30 Thread GitBox
codecov-io edited a comment on issue #3733: [AIRFLOW-491] Add cache parameter 
in BigQuery query method - with 'api_resource_configs'
URL: 
https://github.com/apache/incubator-airflow/pull/3733#issuecomment-413105867
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3733?src=pr=h1)
 Report
   > Merging 
[#3733](https://codecov.io/gh/apache/incubator-airflow/pull/3733?src=pr=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/ac9033db0981ae1f770a8bdb5597055751ab15bd?src=pr=desc)
 will **not change** coverage.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/3733/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/3733?src=pr=tree)
   
   ```diff
   @@           Coverage Diff           @@
   ##           master    #3733   +/-   ##
   =======================================
     Coverage   77.41%   77.41%           
   =======================================
     Files         203      203           
     Lines       15817    15817           
   =======================================
     Hits        12244    12244           
     Misses       3573     3573
   ```
   
   
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3733?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3733?src=pr=footer).
 Last update 
[ac9033d...3e7742d](https://codecov.io/gh/apache/incubator-airflow/pull/3733?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

