[jira] [Created] (AIRFLOW-2931) Make Quick Start work when user doesn't have `cryptography` installed

2018-08-21 Thread Mitchell Lloyd (JIRA)
Mitchell Lloyd created AIRFLOW-2931:
---

 Summary: Make Quick Start work when user doesn't have 
`cryptography` installed
 Key: AIRFLOW-2931
 URL: https://issues.apache.org/jira/browse/AIRFLOW-2931
 Project: Apache Airflow
  Issue Type: Improvement
Affects Versions: Airflow 1.8
Reporter: Mitchell Lloyd


Following the [Quick Start 
guide|https://airflow.incubator.apache.org/start.html] I ran into the following 
error when running airflow initdb:
{code:java}
[2018-08-21 21:31:38,602] {__init__.py:45} INFO - Using executor SequentialExecutor
DB: sqlite:Users/mitch/projects/airflow/airflow.db
[2018-08-21 21:31:38,716] {db.py:312} INFO - Creating tables
INFO [alembic.runtime.migration] Context impl SQLiteImpl.
INFO [alembic.runtime.migration] Will assume non-transactional DDL.
[2018-08-21 21:31:38,792] {models.py:643} ERROR - Failed to load fernet while encrypting value, using non-encrypted value.
Traceback (most recent call last):
  File "/Users/mitch/.pyenv/versions/3.6.6/lib/python3.6/site-packages/airflow/models.py", line 105, in get_fernet
    return Fernet(configuration.get('core', 'FERNET_KEY').encode('utf-8'))
  File "/Users/mitch/.pyenv/versions/3.6.6/lib/python3.6/site-packages/cryptography/fernet.py", line 34, in __init__
    key = base64.urlsafe_b64decode(key)
  File "/Users/mitch/.pyenv/versions/3.6.6/lib/python3.6/base64.py", line 133, in urlsafe_b64decode
    return b64decode(s)
  File "/Users/mitch/.pyenv/versions/3.6.6/lib/python3.6/base64.py", line 87, in b64decode
    return binascii.a2b_base64(s)
binascii.Error: Incorrect padding{code}
 My airflow.cfg file contained:
{code:java}
# Secret key to save connection passwords in the db
fernet_key = cryptography_not_found_storing_passwords_in_plain_text{code}

[~kevcampb] helped me on Gitter and directed me to [this link|https://bcb.github.io/airflow/fernet-key], which resolved my issue.

It seems that the Quick Start instructions should be updated to include this fernet key generation information, or perhaps there is a way to ensure that `cryptography` is installed alongside apache-airflow and a key is generated automatically.
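For reference, generating a valid key takes only a couple of lines once `cryptography` is installed (a sketch along the lines of the linked page):
{code:python}
from cryptography.fernet import Fernet

# Prints a url-safe base64-encoded 32-byte key suitable for the
# fernet_key setting in airflow.cfg.
print(Fernet.generate_key().decode())
{code}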



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] YingboWang commented on a change in pull request #3740: [AIRFLOW-2888] Remove shell=True and bash from task launch

2018-08-21 Thread GitBox
YingboWang commented on a change in pull request #3740: [AIRFLOW-2888] Remove 
shell=True and bash from task launch
URL: https://github.com/apache/incubator-airflow/pull/3740#discussion_r211827451
 
 

 ##
 File path: airflow/executors/celery_executor.py
 ##
 @@ -84,7 +84,7 @@ def execute_async(self, key, command,
         self.log.info("[celery] queuing {key} through celery, "
                       "queue={queue}".format(**locals()))
         self.tasks[key] = execute_command.apply_async(
-            args=[command], queue=queue)
+            args=command, queue=queue)
 
 Review comment:
   I would like to. 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] feng-tao commented on a change in pull request #3740: [AIRFLOW-2888] Remove shell=True and bash from task launch

2018-08-21 Thread GitBox
feng-tao commented on a change in pull request #3740: [AIRFLOW-2888] Remove 
shell=True and bash from task launch
URL: https://github.com/apache/incubator-airflow/pull/3740#discussion_r211821195
 
 

 ##
 File path: airflow/executors/celery_executor.py
 ##
 @@ -84,7 +84,7 @@ def execute_async(self, key, command,
         self.log.info("[celery] queuing {key} through celery, "
                       "queue={queue}".format(**locals()))
         self.tasks[key] = execute_command.apply_async(
-            args=[command], queue=queue)
+            args=command, queue=queue)
 
 Review comment:
   @YingboWang , good find. Do you want to create a pr for this issue?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Work started] (AIRFLOW-2930) scheduler exit when using celery executor

2018-08-21 Thread Yingbo Wang (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on AIRFLOW-2930 started by Yingbo Wang.

> scheduler exit when using celery executor
> -
>
> Key: AIRFLOW-2930
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2930
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: celery
>Reporter: Yingbo Wang
>Assignee: Yingbo Wang
>Priority: Major
>
> Caused by:
> [https://github.com/apache/incubator-airflow/pull/3740]
>  
> When using the CeleryExecutor, the scheduler exits after a DAG is activated.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (AIRFLOW-2930) scheduler exit when using celery executor

2018-08-21 Thread Yingbo Wang (JIRA)
Yingbo Wang created AIRFLOW-2930:


 Summary: scheduler exit when using celery executor
 Key: AIRFLOW-2930
 URL: https://issues.apache.org/jira/browse/AIRFLOW-2930
 Project: Apache Airflow
  Issue Type: Bug
  Components: celery
Reporter: Yingbo Wang
Assignee: Yingbo Wang


Caused by:

[https://github.com/apache/incubator-airflow/pull/3740]

 

When using the CeleryExecutor, the scheduler exits after a DAG is activated.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] feng-tao commented on issue #3754: [Airflow-1737] Handle `execution_date`s with fractional seconds in www/views

2018-08-21 Thread GitBox
feng-tao commented on issue #3754: [Airflow-1737] Handle `execution_date`s with 
fractional seconds in www/views
URL: 
https://github.com/apache/incubator-airflow/pull/3754#issuecomment-414893693
 
 
   @7yl4r , I don't think we update the 1.9 branch anymore.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] feng-tao commented on issue #3760: [AIRFLOW-2909] Deprecate airflow.operators.sensors module

2018-08-21 Thread GitBox
feng-tao commented on issue #3760: [AIRFLOW-2909] Deprecate 
airflow.operators.sensors module
URL: 
https://github.com/apache/incubator-airflow/pull/3760#issuecomment-414892947
 
 
   lgtm. @Fokko , I think those are from your local build dir. I can't find 
those in my repo.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] feng-tao removed a comment on issue #3760: [AIRFLOW-2909] Deprecate airflow.operators.sensors module

2018-08-21 Thread GitBox
feng-tao removed a comment on issue #3760: [AIRFLOW-2909] Deprecate 
airflow.operators.sensors module
URL: 
https://github.com/apache/incubator-airflow/pull/3760#issuecomment-414890984
 
 
   lgtm


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] feng-tao removed a comment on issue #3760: [AIRFLOW-2909] Deprecate airflow.operators.sensors module

2018-08-21 Thread GitBox
feng-tao removed a comment on issue #3760: [AIRFLOW-2909] Deprecate 
airflow.operators.sensors module
URL: 
https://github.com/apache/incubator-airflow/pull/3760#issuecomment-414891090
 
 
   @Fokko , I think the files are from your local build dir. I can't find them in my repo.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] feng-tao commented on issue #3760: [AIRFLOW-2909] Deprecate airflow.operators.sensors module

2018-08-21 Thread GitBox
feng-tao commented on issue #3760: [AIRFLOW-2909] Deprecate 
airflow.operators.sensors module
URL: 
https://github.com/apache/incubator-airflow/pull/3760#issuecomment-414891090
 
 
   @Fokko , I think the files are from your local build dir. I can't find them in my repo.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] feng-tao commented on issue #3760: [AIRFLOW-2909] Deprecate airflow.operators.sensors module

2018-08-21 Thread GitBox
feng-tao commented on issue #3760: [AIRFLOW-2909] Deprecate 
airflow.operators.sensors module
URL: 
https://github.com/apache/incubator-airflow/pull/3760#issuecomment-414890984
 
 
   lgtm


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] XD-DENG commented on issue #3773: [AIRFLOW-2921][AIRFLOW-2922] Fix two potential bugs in CeleryExecutor()

2018-08-21 Thread GitBox
XD-DENG commented on issue #3773: [AIRFLOW-2921][AIRFLOW-2922] Fix two 
potential bugs in CeleryExecutor()
URL: 
https://github.com/apache/incubator-airflow/pull/3773#issuecomment-414875711
 
 
   Thanks @yrqls21 ! Luckily Big `O` is not involved this time ;-)


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] ChengzhiZhao commented on issue #3730: [AIRFLOW-2882] Add import and export for pool cli using JSON

2018-08-21 Thread GitBox
ChengzhiZhao commented on issue #3730: [AIRFLOW-2882] Add import and export for 
pool cli using JSON
URL: 
https://github.com/apache/incubator-airflow/pull/3730#issuecomment-414874406
 
 
   I added a new JIRA ticket to add set and get methods to the Pool class so the CLI can give a more descriptive response: https://issues.apache.org/jira/browse/AIRFLOW-2929


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] yrqls21 commented on issue #3773: [AIRFLOW-2921][AIRFLOW-2922] Fix two potential bugs in CeleryExecutor()

2018-08-21 Thread GitBox
yrqls21 commented on issue #3773: [AIRFLOW-2921][AIRFLOW-2922] Fix two 
potential bugs in CeleryExecutor()
URL: 
https://github.com/apache/incubator-airflow/pull/3773#issuecomment-414874161
 
 
   Great catch; we actually have a patch at Airbnb that covers the 2nd bug (it involves more changes, and we're baking it in our env).
   
   I agree on the fix for the 1st bug and its test. Also agree that it is very hard to write a test to guard against the 2nd bug.
   
   Tyvm for the fixes, LGTM.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Created] (AIRFLOW-2929) Add get and set for pool class in models.py

2018-08-21 Thread Chengzhi Zhao (JIRA)
Chengzhi Zhao created AIRFLOW-2929:
--

 Summary: Add get and set for pool class in models.py
 Key: AIRFLOW-2929
 URL: https://issues.apache.org/jira/browse/AIRFLOW-2929
 Project: Apache Airflow
  Issue Type: Improvement
  Components: models
Affects Versions: 1.9.0
Reporter: Chengzhi Zhao
Assignee: Chengzhi Zhao


Currently the Pool class in models.py doesn't have get and set methods. I suggest we add them to make the class similar to the Variable/Connection classes; it will also make it easier for the CLI to return a more descriptive response, as discussed here:
https://github.com/apache/incubator-airflow/pull/3730#discussion_r210185544
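
A minimal sketch of what this could look like, modeled on the Variable classmethods (the method names and the use of @provide_session here are assumptions based on existing models.py conventions):
{code:python}
from airflow.utils.db import provide_session


class Pool(Base):
    # ... existing columns: pool, slots, description ...

    @classmethod
    @provide_session
    def get_pool(cls, name, session=None):
        # Hypothetical helper: return the pool row by name, or None.
        return session.query(cls).filter(cls.pool == name).first()

    @classmethod
    @provide_session
    def set_pool(cls, name, slots, description='', session=None):
        # Hypothetical helper: create or update a pool, mirroring Variable.set.
        pool = session.query(cls).filter(cls.pool == name).first() or cls(pool=name)
        pool.slots = slots
        pool.description = description
        session.add(pool)
        session.commit()
        return pool
{code}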



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] andrewmchen commented on a change in pull request #3570: [AIRFLOW-2709] Improve error handling in Databricks hook

2018-08-21 Thread GitBox
andrewmchen commented on a change in pull request #3570: [AIRFLOW-2709] Improve 
error handling in Databricks hook
URL: https://github.com/apache/incubator-airflow/pull/3570#discussion_r211802793
 
 

 ##
 File path: airflow/contrib/hooks/databricks_hook.py
 ##
 @@ -117,29 +123,51 @@ def _do_api_call(self, endpoint_info, json):
         else:
             raise AirflowException('Unexpected HTTP Method: ' + method)
 
-        for attempt_num in range(1, self.retry_limit + 1):
+        attempt_num = 1
+        while True:
             try:
                 response = request_func(
                     url,
                     json=json,
                     auth=auth,
                     headers=USER_AGENT_HEADER,
                     timeout=self.timeout_seconds)
-                if response.status_code == requests.codes.ok:
-                    return response.json()
-                else:
+                response.raise_for_status()
+                return response.json()
+            except (requests_exceptions.ConnectionError,
+                    requests_exceptions.Timeout) as e:
+                self._log_request_error(attempt_num, e)
+            except requests_exceptions.HTTPError as e:
+                response = e.response
+                if not self._retriable_error(response):
                     # In this case, the user probably made a mistake.
                     # Don't retry.
                     raise AirflowException('Response: {0}, Status Code: {1}'.format(
                         response.content, response.status_code))
-            except (requests_exceptions.ConnectionError,
-                    requests_exceptions.Timeout) as e:
-                self.log.error(
-                    'Attempt %s API Request to Databricks failed with reason: %s',
-                    attempt_num, e
-                )
-        raise AirflowException(('API requests to Databricks failed {} times. ' +
-                                'Giving up.').format(self.retry_limit))
+
+                self._log_request_error(attempt_num, e)
+
+            if attempt_num == self.retry_limit:
+                raise AirflowException(('API requests to Databricks failed {} times. ' +
+                                        'Giving up.').format(self.retry_limit))
+
+            attempt_num += 1
+            sleep(self.retry_delay)
+
+    def _log_request_error(self, attempt_num, error):
+        self.log.error(
+            'Attempt %s API Request to Databricks failed with reason: %s',
+            attempt_num, error
+        )
+
+    @staticmethod
+    def _retriable_error(response):
+        try:
+            error_code = response.json().get('error_code')
+            return error_code == 'TEMPORARILY_UNAVAILABLE'
 
 Review comment:
   Yup.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] YingboWang commented on a change in pull request #3740: [AIRFLOW-2888] Remove shell=True and bash from task launch

2018-08-21 Thread GitBox
YingboWang commented on a change in pull request #3740: [AIRFLOW-2888] Remove 
shell=True and bash from task launch
URL: https://github.com/apache/incubator-airflow/pull/3740#discussion_r211801472
 
 

 ##
 File path: airflow/executors/celery_executor.py
 ##
 @@ -84,7 +84,7 @@ def execute_async(self, key, command,
         self.log.info("[celery] queuing {key} through celery, "
                       "queue={queue}".format(**locals()))
         self.tasks[key] = execute_command.apply_async(
-            args=[command], queue=queue)
+            args=command, queue=queue)
 
 Review comment:
   @bolkedebruin 
   Just found an issue with this commit: the updated `apply_async` call will crash the scheduler when using the celery executor. Tested in a local environment.
   Concern for the update on line 87:
   >**Before**:
   >`args=[command]`, where command is a unicode string.
   > sample: `command = "airflow run dag323..."`, so `args = ["airflow run dag323..."]`
   >`execute_command("airflow run dag323...")` is called asynchronously
   >**After**:
   >`args=command`, where command is a list of short unicode strings.
   > sample: `command = ["airflow", "run", "dag323", ...]`, so `args = ["airflow", "run", "dag323", ...]`
   >`execute_command("airflow", "run", "dag323", ...)` **Error here**
   
   `execute_command.apply_async(args=command, queue=queue)` will pass a list of arguments rather than a single one to `execute_command`, which is defined as taking only one argument. The test in test_celery_executor.py currently has only one item in the test `command` list, so it cannot cover this case.
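   
   To illustrate with a minimal, hypothetical Celery sketch (not code from the PR; Celery's `args` parameter is the positional-argument tuple for the task):
   ```python
   from celery import Celery
   
   app = Celery('example')
   
   @app.task
   def execute_command(command):
       # The task is defined to take exactly one positional argument.
       print(command)
   
   command = ["airflow", "run", "dag323"]
   # `args` is the sequence of positional arguments for the task:
   execute_command.apply_async(args=[command])  # calls execute_command(["airflow", "run", "dag323"])
   execute_command.apply_async(args=command)    # calls execute_command("airflow", "run", "dag323") -> TypeError
   ```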


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] YingboWang commented on a change in pull request #3740: [AIRFLOW-2888] Remove shell=True and bash from task launch

2018-08-21 Thread GitBox
YingboWang commented on a change in pull request #3740: [AIRFLOW-2888] Remove 
shell=True and bash from task launch
URL: https://github.com/apache/incubator-airflow/pull/3740#discussion_r211801472
 
 

 ##
 File path: airflow/executors/celery_executor.py
 ##
 @@ -84,7 +84,7 @@ def execute_async(self, key, command,
         self.log.info("[celery] queuing {key} through celery, "
                       "queue={queue}".format(**locals()))
         self.tasks[key] = execute_command.apply_async(
-            args=[command], queue=queue)
+            args=command, queue=queue)
 
 Review comment:
   @bolkedebruin 
   Just found an issue with this commit: the updated `apply_async` call will crash the scheduler when using the celery executor. Tested in a local environment.
   Concern for the update on line 87:
   >**Before**:
   >`args=[command]`, where command is a unicode string.
   > sample: `command = "airflow run dag323..."`, so `args = ["airflow run dag323..."]`
   >`execute_command("airflow run dag323...")` is called asynchronously
   >**After**:
   >`args=command`, where command is a list of short unicode strings.
   > sample: `command = ["airflow", "run", "dag323", ...]`, so `args = ["airflow", "run", "dag323", ...]`
   >`execute_command("airflow", "run", "dag323", ...)` **Error here**
   
   `execute_command.apply_async(args=command, queue=queue)` will pass a list of arguments rather than a single one to `execute_command`, which is defined as taking only one argument. The test in test_celery_executor.py currently has only one item in the test `command` list, so it cannot cover this case.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] betabandido commented on a change in pull request #3570: [AIRFLOW-2709] Improve error handling in Databricks hook

2018-08-21 Thread GitBox
betabandido commented on a change in pull request #3570: [AIRFLOW-2709] Improve 
error handling in Databricks hook
URL: https://github.com/apache/incubator-airflow/pull/3570#discussion_r211783774
 
 

 ##
 File path: airflow/contrib/hooks/databricks_hook.py
 ##
 @@ -117,29 +123,51 @@ def _do_api_call(self, endpoint_info, json):
         else:
             raise AirflowException('Unexpected HTTP Method: ' + method)
 
-        for attempt_num in range(1, self.retry_limit + 1):
+        attempt_num = 1
+        while True:
             try:
                 response = request_func(
                     url,
                     json=json,
                     auth=auth,
                     headers=USER_AGENT_HEADER,
                     timeout=self.timeout_seconds)
-                if response.status_code == requests.codes.ok:
-                    return response.json()
-                else:
+                response.raise_for_status()
+                return response.json()
+            except (requests_exceptions.ConnectionError,
+                    requests_exceptions.Timeout) as e:
+                self._log_request_error(attempt_num, e)
+            except requests_exceptions.HTTPError as e:
+                response = e.response
+                if not self._retriable_error(response):
                     # In this case, the user probably made a mistake.
                     # Don't retry.
                     raise AirflowException('Response: {0}, Status Code: {1}'.format(
                         response.content, response.status_code))
-            except (requests_exceptions.ConnectionError,
-                    requests_exceptions.Timeout) as e:
-                self.log.error(
-                    'Attempt %s API Request to Databricks failed with reason: %s',
-                    attempt_num, e
-                )
-        raise AirflowException(('API requests to Databricks failed {} times. ' +
-                                'Giving up.').format(self.retry_limit))
+
+                self._log_request_error(attempt_num, e)
+
+            if attempt_num == self.retry_limit:
+                raise AirflowException(('API requests to Databricks failed {} times. ' +
+                                        'Giving up.').format(self.retry_limit))
+
+            attempt_num += 1
+            sleep(self.retry_delay)
+
+    def _log_request_error(self, attempt_num, error):
+        self.log.error(
+            'Attempt %s API Request to Databricks failed with reason: %s',
+            attempt_num, error
+        )
+
+    @staticmethod
+    def _retriable_error(response):
+        try:
+            error_code = response.json().get('error_code')
+            return error_code == 'TEMPORARILY_UNAVAILABLE'
 
 Review comment:
   So, you think it is safe to drop the test for `TEMPORARILY_UNAVAILABLE` code 
in the JSON response, and simply retry for any 5XX error?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] kaxil commented on a change in pull request #3777: Add Flipp to list of Airflow users

2018-08-21 Thread GitBox
kaxil commented on a change in pull request #3777: Add Flipp to list of Airflow 
users
URL: https://github.com/apache/incubator-airflow/pull/3777#discussion_r211774722
 
 

 ##
 File path: README.md
 ##
 @@ -144,6 +144,7 @@ Currently **officially** using Airflow:
 1. [Easy Taxi](http://www.easytaxi.com/) [[@caique-lima](https://github.com/caique-lima) & [@WesleyBatista](https://github.com/WesleyBatista) & [@diraol](https://github.com/diraol)]
 1. [eRevalue](https://www.datamaran.com) [[@hamedhsn](https://github.com/hamedhsn)]
 1. [evo.company](https://evo.company/) [[@orhideous](https://github.com/orhideous)]
+1. [Flipp] (https://www.flipp.com) [@sethwilsonwishabi]
 
 Review comment:
   It should be 
   ```
   [Flipp](https://www.flipp.com) 
[[@sethwilsonwishabi](https://github.com/sethwilsonwishabi)]
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] kaxil closed pull request #3778: [AIRFLOW-XXX] Fix some operator names in the docs

2018-08-21 Thread GitBox
kaxil closed pull request #3778: [AIRFLOW-XXX] Fix some operator names in the 
docs
URL: https://github.com/apache/incubator-airflow/pull/3778
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/docs/concepts.rst b/docs/concepts.rst
index 09c1293806..50c18c9b98 100644
--- a/docs/concepts.rst
+++ b/docs/concepts.rst
@@ -115,13 +115,12 @@ Airflow provides operators for many common tasks, including:
 - ``BashOperator`` - executes a bash command
 - ``PythonOperator`` - calls an arbitrary Python function
 - ``EmailOperator`` - sends an email
-- ``HTTPOperator`` - sends an HTTP request
+- ``SimpleHttpOperator`` - sends an HTTP request
 - ``MySqlOperator``, ``SqliteOperator``, ``PostgresOperator``, ``MsSqlOperator``, ``OracleOperator``, ``JdbcOperator``, etc. - executes a SQL command
 - ``Sensor`` - waits for a certain time, file, database row, S3 key, etc...
 
-
 In addition to these basic building blocks, there are many more specific
-operators: ``DockerOperator``, ``HiveOperator``, ``S3FileTransferOperator``,
+operators: ``DockerOperator``, ``HiveOperator``, ``S3FileTransformOperator``,
 ``PrestoToMysqlOperator``, ``SlackOperator``... you get the idea!
 
 The ``airflow/contrib/`` directory contains yet more operators built by the


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] codecov-io edited a comment on issue #3779: [AIRFLOW-2928] replace uuid1 with uuid4 for better randomness

2018-08-21 Thread GitBox
codecov-io edited a comment on issue #3779: [AIRFLOW-2928] replace uuid1 with 
uuid4 for better randomness
URL: 
https://github.com/apache/incubator-airflow/pull/3779#issuecomment-414834847
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3779?src=pr=h1)
 Report
   > Merging 
[#3779](https://codecov.io/gh/apache/incubator-airflow/pull/3779?src=pr=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/27309b13f17402eaa61d4e4fede8785effa8bbb7?src=pr=desc)
 will **decrease** coverage by `<.01%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/3779/graphs/tree.svg?height=150=650=WdLKlKHOAU=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/3779?src=pr=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #3779      +/-   ##
   ==========================================
   - Coverage   77.67%   77.67%   -0.01%
   ==========================================
     Files         204      204
     Lines       15840    15840
   ==========================================
   - Hits        12304    12303       -1
   - Misses       3536     3537       +1
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/incubator-airflow/pull/3779?src=pr=tree) 
| Coverage Δ | |
   |---|---|---|
   | 
[airflow/models.py](https://codecov.io/gh/apache/incubator-airflow/pull/3779/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMucHk=)
 | `88.74% <0%> (-0.05%)` | :arrow_down: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3779?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3779?src=pr=footer).
 Last update 
[27309b1...31c2c8d](https://codecov.io/gh/apache/incubator-airflow/pull/3779?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] codecov-io edited a comment on issue #3779: [AIRFLOW-2928] replace uuid1 with uuid4 for better randomness

2018-08-21 Thread GitBox
codecov-io edited a comment on issue #3779: [AIRFLOW-2928] replace uuid1 with 
uuid4 for better randomness
URL: 
https://github.com/apache/incubator-airflow/pull/3779#issuecomment-414834847
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3779?src=pr=h1)
 Report
   > Merging 
[#3779](https://codecov.io/gh/apache/incubator-airflow/pull/3779?src=pr=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/27309b13f17402eaa61d4e4fede8785effa8bbb7?src=pr=desc)
 will **decrease** coverage by `<.01%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/3779/graphs/tree.svg?height=150=pr=WdLKlKHOAU=650)](https://codecov.io/gh/apache/incubator-airflow/pull/3779?src=pr=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #3779      +/-   ##
   ==========================================
   - Coverage   77.67%   77.67%   -0.01%
   ==========================================
     Files         204      204
     Lines       15840    15840
   ==========================================
   - Hits        12304    12303       -1
   - Misses       3536     3537       +1
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/incubator-airflow/pull/3779?src=pr=tree) 
| Coverage Δ | |
   |---|---|---|
   | 
[airflow/models.py](https://codecov.io/gh/apache/incubator-airflow/pull/3779/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMucHk=)
 | `88.74% <0%> (-0.05%)` | :arrow_down: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3779?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3779?src=pr=footer).
 Last update 
[27309b1...31c2c8d](https://codecov.io/gh/apache/incubator-airflow/pull/3779?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] codecov-io commented on issue #3779: [AIRFLOW-2928] replace uuid1 with uuid4 for better randomness

2018-08-21 Thread GitBox
codecov-io commented on issue #3779: [AIRFLOW-2928] replace uuid1 with uuid4 
for better randomness
URL: 
https://github.com/apache/incubator-airflow/pull/3779#issuecomment-414834847
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3779?src=pr=h1)
 Report
   > Merging 
[#3779](https://codecov.io/gh/apache/incubator-airflow/pull/3779?src=pr=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/27309b13f17402eaa61d4e4fede8785effa8bbb7?src=pr=desc)
 will **decrease** coverage by `<.01%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/3779/graphs/tree.svg?width=650=150=pr=WdLKlKHOAU)](https://codecov.io/gh/apache/incubator-airflow/pull/3779?src=pr=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #3779      +/-   ##
   ==========================================
   - Coverage   77.67%   77.67%   -0.01%
   ==========================================
     Files         204      204
     Lines       15840    15840
   ==========================================
   - Hits        12304    12303       -1
   - Misses       3536     3537       +1
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/incubator-airflow/pull/3779?src=pr=tree) 
| Coverage Δ | |
   |---|---|---|
   | 
[airflow/models.py](https://codecov.io/gh/apache/incubator-airflow/pull/3779/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMucHk=)
 | `88.74% <0%> (-0.05%)` | :arrow_down: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3779?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3779?src=pr=footer).
 Last update 
[27309b1...31c2c8d](https://codecov.io/gh/apache/incubator-airflow/pull/3779?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] andrewmchen commented on a change in pull request #3570: [AIRFLOW-2709] Improve error handling in Databricks hook

2018-08-21 Thread GitBox
andrewmchen commented on a change in pull request #3570: [AIRFLOW-2709] Improve 
error handling in Databricks hook
URL: https://github.com/apache/incubator-airflow/pull/3570#discussion_r211770305
 
 

 ##
 File path: airflow/contrib/hooks/databricks_hook.py
 ##
 @@ -117,29 +123,51 @@ def _do_api_call(self, endpoint_info, json):
         else:
             raise AirflowException('Unexpected HTTP Method: ' + method)
 
-        for attempt_num in range(1, self.retry_limit + 1):
+        attempt_num = 1
+        while True:
             try:
                 response = request_func(
                     url,
                     json=json,
                     auth=auth,
                     headers=USER_AGENT_HEADER,
                     timeout=self.timeout_seconds)
-                if response.status_code == requests.codes.ok:
-                    return response.json()
-                else:
+                response.raise_for_status()
+                return response.json()
+            except (requests_exceptions.ConnectionError,
+                    requests_exceptions.Timeout) as e:
+                self._log_request_error(attempt_num, e)
+            except requests_exceptions.HTTPError as e:
+                response = e.response
+                if not self._retriable_error(response):
                     # In this case, the user probably made a mistake.
                     # Don't retry.
                     raise AirflowException('Response: {0}, Status Code: {1}'.format(
                         response.content, response.status_code))
-            except (requests_exceptions.ConnectionError,
-                    requests_exceptions.Timeout) as e:
-                self.log.error(
-                    'Attempt %s API Request to Databricks failed with reason: %s',
-                    attempt_num, e
-                )
-        raise AirflowException(('API requests to Databricks failed {} times. ' +
-                                'Giving up.').format(self.retry_limit))
+
+                self._log_request_error(attempt_num, e)
+
+            if attempt_num == self.retry_limit:
+                raise AirflowException(('API requests to Databricks failed {} times. ' +
+                                        'Giving up.').format(self.retry_limit))
+
+            attempt_num += 1
+            sleep(self.retry_delay)
+
+    def _log_request_error(self, attempt_num, error):
+        self.log.error(
+            'Attempt %s API Request to Databricks failed with reason: %s',
+            attempt_num, error
+        )
+
+    @staticmethod
+    def _retriable_error(response):
 
 Review comment:
   I think it's best to move it outside of the class here. See discussion in 
https://stackoverflow.com/questions/735975/static-methods-in-python.
   
   > Finally, use staticmethod() sparingly! There are very few situations where 
static-methods are necessary in Python, and I've seen them used many times 
where a separate "top-level" function would have been clearer.
   
   > You don't really need to use the staticmethod decorator. Just declaring a 
method (that doesn't expect the self parameter) and call it from the class. The 
decorator is only there in case you want to be able to call it from an instance 
as well (which was not what you wanted to do)
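   
   A quick sketch of the suggested alternative (hypothetical; a top-level helper instead of a `@staticmethod` on the hook class):
   ```python
   # Module-level helper in databricks_hook.py, outside the class.
   def _retryable_error(response):
       # `response` is assumed to behave like a requests.Response;
       # the error-code check itself is taken from the PR.
       return response.json().get('error_code') == 'TEMPORARILY_UNAVAILABLE'
   ```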


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] andrewmchen commented on a change in pull request #3570: [AIRFLOW-2709] Improve error handling in Databricks hook

2018-08-21 Thread GitBox
andrewmchen commented on a change in pull request #3570: [AIRFLOW-2709] Improve 
error handling in Databricks hook
URL: https://github.com/apache/incubator-airflow/pull/3570#discussion_r211769661
 
 

 ##
 File path: airflow/contrib/hooks/databricks_hook.py
 ##
 @@ -117,29 +123,51 @@ def _do_api_call(self, endpoint_info, json):
         else:
             raise AirflowException('Unexpected HTTP Method: ' + method)
 
-        for attempt_num in range(1, self.retry_limit + 1):
+        attempt_num = 1
+        while True:
             try:
                 response = request_func(
                     url,
                     json=json,
                     auth=auth,
                     headers=USER_AGENT_HEADER,
                     timeout=self.timeout_seconds)
-                if response.status_code == requests.codes.ok:
-                    return response.json()
-                else:
+                response.raise_for_status()
+                return response.json()
+            except (requests_exceptions.ConnectionError,
+                    requests_exceptions.Timeout) as e:
+                self._log_request_error(attempt_num, e)
+            except requests_exceptions.HTTPError as e:
+                response = e.response
+                if not self._retriable_error(response):
                     # In this case, the user probably made a mistake.
                     # Don't retry.
                     raise AirflowException('Response: {0}, Status Code: {1}'.format(
                         response.content, response.status_code))
-            except (requests_exceptions.ConnectionError,
-                    requests_exceptions.Timeout) as e:
-                self.log.error(
-                    'Attempt %s API Request to Databricks failed with reason: %s',
-                    attempt_num, e
-                )
-        raise AirflowException(('API requests to Databricks failed {} times. ' +
-                                'Giving up.').format(self.retry_limit))
+
+                self._log_request_error(attempt_num, e)
+
+            if attempt_num == self.retry_limit:
+                raise AirflowException(('API requests to Databricks failed {} times. ' +
+                                        'Giving up.').format(self.retry_limit))
+
+            attempt_num += 1
+            sleep(self.retry_delay)
+
+    def _log_request_error(self, attempt_num, error):
+        self.log.error(
+            'Attempt %s API Request to Databricks failed with reason: %s',
+            attempt_num, error
+        )
+
+    @staticmethod
+    def _retriable_error(response):
+        try:
+            error_code = response.json().get('error_code')
+            return error_code == 'TEMPORARILY_UNAVAILABLE'
 
 Review comment:
   Yeah that sounds correct and may be simpler than parsing the response into 
JSON.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-2928) Use uuid.uuid4 to create unique job name

2018-08-21 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16588015#comment-16588015
 ] 

ASF GitHub Bot commented on AIRFLOW-2928:
-

imsut opened a new pull request #3779: [AIRFLOW-2928] replace uuid1 with uuid4 
for better randomness
URL: https://github.com/apache/incubator-airflow/pull/3779
 
 
   ### Description
   
   Some components in Airflow use the first 8 bytes of uuid.uuid1 to generate a unique job name. The first 8 bytes, however, seem to come from the clock, so if uuid1 is called multiple times in a short time period, two ids will likely collide.
   uuid.uuid4 provides random values.
   
   ### Tests
   
   ```py
   Python 2.7.15 (default, Jun 17 2018, 12:46:58)
   [GCC 4.2.1 Compatible Apple LLVM 9.1.0 (clang-902.0.39.2)] on darwin
   Type "help", "copyright", "credits" or "license" for more information.
   >>> import uuid
   >>> for i in range(10):
   ...   uuid.uuid1()
   ...
   UUID('e8bc9959-a586-11e8-ab8c-8c859010d0c2')
   UUID('e8c254e3-a586-11e8-ac39-8c859010d0c2')
   UUID('e8c2560f-a586-11e8-8251-8c859010d0c2')
   UUID('e8c256c2-a586-11e8-994a-8c859010d0c2')
   UUID('e8c25759-a586-11e8-9ba6-8c859010d0c2')
   UUID('e8c257e6-a586-11e8-a854-8c859010d0c2')
   UUID('e8c2587d-a586-11e8-89e9-8c859010d0c2')
   UUID('e8c2590a-a586-11e8-a825-8c859010d0c2')
   UUID('e8c25994-a586-11e8-9421-8c859010d0c2')
   UUID('e8c25a21-a586-11e8-83fd-8c859010d0c2')
   >>> for i in range(10):
   ...   uuid.uuid4()
   ...
   UUID('f1eba69f-18ea-467e-a414-b18d67f34a51')
   UUID('aaa4e18e-d4e6-42c9-905c-3cde714c2741')
   UUID('82f55c27-69ae-474b-ab9a-afcc7891587c')
   UUID('fab63643-ad33-4307-837b-68444fce7240')
   UUID('c4efca6c-3d1b-436c-8b09-e9b7f55ccefb')
   UUID('58de3a76-9d98-4427-8232-d6d7df2a1904')
   UUID('4f0a55e8-1357-4697-a345-e60891685b00')
   UUID('0fed47a3-07b6-423e-ae2e-d821c440cb63')
   UUID('144b2c55-a9bd-431d-b536-239fb2048a5e')
   UUID('d47fd8a0-48e9-4dcc-87f7-42c022c309a8')
   ```
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Use uuid.uuid4 to create unique job name
> 
>
> Key: AIRFLOW-2928
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2928
> Project: Apache Airflow
>  Issue Type: Improvement
>Reporter: Ken Kawamoto
>Priority: Minor
>
> Some components in Airflow use the first 8 bytes of _uuid.uuid1_ to generate a unique job name. The first 8 bytes, however, seem to come from the clock, so if uuid1 is called multiple times in a short time period, two ids will likely collide.
> _uuid.uuid4_ provides random values.
> {code}
> Python 2.7.15 (default, Jun 17 2018, 12:46:58)
> [GCC 4.2.1 Compatible Apple LLVM 9.1.0 (clang-902.0.39.2)] on darwin
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import uuid
> >>> for i in range(10):
> ...   uuid.uuid1()
> ...
> UUID('e8bc9959-a586-11e8-ab8c-8c859010d0c2')
> UUID('e8c254e3-a586-11e8-ac39-8c859010d0c2')
> UUID('e8c2560f-a586-11e8-8251-8c859010d0c2')
> UUID('e8c256c2-a586-11e8-994a-8c859010d0c2')
> UUID('e8c25759-a586-11e8-9ba6-8c859010d0c2')
> UUID('e8c257e6-a586-11e8-a854-8c859010d0c2')
> UUID('e8c2587d-a586-11e8-89e9-8c859010d0c2')
> UUID('e8c2590a-a586-11e8-a825-8c859010d0c2')
> UUID('e8c25994-a586-11e8-9421-8c859010d0c2')
> UUID('e8c25a21-a586-11e8-83fd-8c859010d0c2')
> >>> for i in range(10):
> ...   uuid.uuid4()
> ...
> UUID('f1eba69f-18ea-467e-a414-b18d67f34a51')
> UUID('aaa4e18e-d4e6-42c9-905c-3cde714c2741')
> UUID('82f55c27-69ae-474b-ab9a-afcc7891587c')
> UUID('fab63643-ad33-4307-837b-68444fce7240')
> UUID('c4efca6c-3d1b-436c-8b09-e9b7f55ccefb')
> UUID('58de3a76-9d98-4427-8232-d6d7df2a1904')
> UUID('4f0a55e8-1357-4697-a345-e60891685b00')
> UUID('0fed47a3-07b6-423e-ae2e-d821c440cb63')
> UUID('144b2c55-a9bd-431d-b536-239fb2048a5e')
> UUID('d47fd8a0-48e9-4dcc-87f7-42c022c309a8')
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] betabandido commented on a change in pull request #3570: [AIRFLOW-2709] Improve error handling in Databricks hook

2018-08-21 Thread GitBox
betabandido commented on a change in pull request #3570: [AIRFLOW-2709] Improve 
error handling in Databricks hook
URL: https://github.com/apache/incubator-airflow/pull/3570#discussion_r211763481
 
 

 ##
 File path: airflow/contrib/hooks/databricks_hook.py
 ##
 @@ -117,29 +123,51 @@ def _do_api_call(self, endpoint_info, json):
         else:
             raise AirflowException('Unexpected HTTP Method: ' + method)
 
-        for attempt_num in range(1, self.retry_limit + 1):
+        attempt_num = 1
+        while True:
             try:
                 response = request_func(
                     url,
                     json=json,
                     auth=auth,
                     headers=USER_AGENT_HEADER,
                     timeout=self.timeout_seconds)
-                if response.status_code == requests.codes.ok:
-                    return response.json()
-                else:
+                response.raise_for_status()
+                return response.json()
+            except (requests_exceptions.ConnectionError,
+                    requests_exceptions.Timeout) as e:
+                self._log_request_error(attempt_num, e)
+            except requests_exceptions.HTTPError as e:
+                response = e.response
+                if not self._retriable_error(response):
                     # In this case, the user probably made a mistake.
                     # Don't retry.
                     raise AirflowException('Response: {0}, Status Code: {1}'.format(
                         response.content, response.status_code))
-            except (requests_exceptions.ConnectionError,
-                    requests_exceptions.Timeout) as e:
-                self.log.error(
-                    'Attempt %s API Request to Databricks failed with reason: %s',
-                    attempt_num, e
-                )
-        raise AirflowException(('API requests to Databricks failed {} times. ' +
-                                'Giving up.').format(self.retry_limit))
+
+                self._log_request_error(attempt_num, e)
+
+            if attempt_num == self.retry_limit:
+                raise AirflowException(('API requests to Databricks failed {} times. ' +
+                                        'Giving up.').format(self.retry_limit))
+
+            attempt_num += 1
+            sleep(self.retry_delay)
+
+    def _log_request_error(self, attempt_num, error):
+        self.log.error(
+            'Attempt %s API Request to Databricks failed with reason: %s',
+            attempt_num, error
+        )
+
+    @staticmethod
+    def _retriable_error(response):
 
 Review comment:
   I am tempted to leave it in the class since it is not used anywhere else, but I have no strong objection to moving it above the class definition.
   
   I will use `retryable` instead.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] imsut opened a new pull request #3779: [AIRFLOW-2928] replace uuid1 with uuid4 for better randomness

2018-08-21 Thread GitBox
imsut opened a new pull request #3779: [AIRFLOW-2928] replace uuid1 with uuid4 
for better randomness
URL: https://github.com/apache/incubator-airflow/pull/3779
 
 
   ### Description
   
   Some components in Airflow use the first 8 bytes of uuid.uuid1 to generate a unique job name. The first 8 bytes, however, seem to come from the clock, so if uuid1 is called multiple times in a short time period, two ids will likely collide.
   uuid.uuid4 provides random values.
   
   ### Tests
   
   ```py
   Python 2.7.15 (default, Jun 17 2018, 12:46:58)
   [GCC 4.2.1 Compatible Apple LLVM 9.1.0 (clang-902.0.39.2)] on darwin
   Type "help", "copyright", "credits" or "license" for more information.
   >>> import uuid
   >>> for i in range(10):
   ...   uuid.uuid1()
   ...
   UUID('e8bc9959-a586-11e8-ab8c-8c859010d0c2')
   UUID('e8c254e3-a586-11e8-ac39-8c859010d0c2')
   UUID('e8c2560f-a586-11e8-8251-8c859010d0c2')
   UUID('e8c256c2-a586-11e8-994a-8c859010d0c2')
   UUID('e8c25759-a586-11e8-9ba6-8c859010d0c2')
   UUID('e8c257e6-a586-11e8-a854-8c859010d0c2')
   UUID('e8c2587d-a586-11e8-89e9-8c859010d0c2')
   UUID('e8c2590a-a586-11e8-a825-8c859010d0c2')
   UUID('e8c25994-a586-11e8-9421-8c859010d0c2')
   UUID('e8c25a21-a586-11e8-83fd-8c859010d0c2')
   >>> for i in range(10):
   ...   uuid.uuid4()
   ...
   UUID('f1eba69f-18ea-467e-a414-b18d67f34a51')
   UUID('aaa4e18e-d4e6-42c9-905c-3cde714c2741')
   UUID('82f55c27-69ae-474b-ab9a-afcc7891587c')
   UUID('fab63643-ad33-4307-837b-68444fce7240')
   UUID('c4efca6c-3d1b-436c-8b09-e9b7f55ccefb')
   UUID('58de3a76-9d98-4427-8232-d6d7df2a1904')
   UUID('4f0a55e8-1357-4697-a345-e60891685b00')
   UUID('0fed47a3-07b6-423e-ae2e-d821c440cb63')
   UUID('144b2c55-a9bd-431d-b536-239fb2048a5e')
   UUID('d47fd8a0-48e9-4dcc-87f7-42c022c309a8')
   ```
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Created] (AIRFLOW-2928) Use uuid.uuid4 to create unique job name

2018-08-21 Thread Ken Kawamoto (JIRA)
Ken Kawamoto created AIRFLOW-2928:
-

 Summary: Use uuid.uuid4 to create unique job name
 Key: AIRFLOW-2928
 URL: https://issues.apache.org/jira/browse/AIRFLOW-2928
 Project: Apache Airflow
  Issue Type: Improvement
Reporter: Ken Kawamoto


Some components in Airflow use the first 8 bytes of _uuid.uuid1_ to generate a unique job name. The first 8 bytes, however, seem to come from the clock, so if uuid1 is called multiple times in a short time period, two ids will likely collide.
_uuid.uuid4_ provides random values.

{code}
Python 2.7.15 (default, Jun 17 2018, 12:46:58)
[GCC 4.2.1 Compatible Apple LLVM 9.1.0 (clang-902.0.39.2)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import uuid
>>> for i in range(10):
...   uuid.uuid1()
...
UUID('e8bc9959-a586-11e8-ab8c-8c859010d0c2')
UUID('e8c254e3-a586-11e8-ac39-8c859010d0c2')
UUID('e8c2560f-a586-11e8-8251-8c859010d0c2')
UUID('e8c256c2-a586-11e8-994a-8c859010d0c2')
UUID('e8c25759-a586-11e8-9ba6-8c859010d0c2')
UUID('e8c257e6-a586-11e8-a854-8c859010d0c2')
UUID('e8c2587d-a586-11e8-89e9-8c859010d0c2')
UUID('e8c2590a-a586-11e8-a825-8c859010d0c2')
UUID('e8c25994-a586-11e8-9421-8c859010d0c2')
UUID('e8c25a21-a586-11e8-83fd-8c859010d0c2')
>>> for i in range(10):
...   uuid.uuid4()
...
UUID('f1eba69f-18ea-467e-a414-b18d67f34a51')
UUID('aaa4e18e-d4e6-42c9-905c-3cde714c2741')
UUID('82f55c27-69ae-474b-ab9a-afcc7891587c')
UUID('fab63643-ad33-4307-837b-68444fce7240')
UUID('c4efca6c-3d1b-436c-8b09-e9b7f55ccefb')
UUID('58de3a76-9d98-4427-8232-d6d7df2a1904')
UUID('4f0a55e8-1357-4697-a345-e60891685b00')
UUID('0fed47a3-07b6-423e-ae2e-d821c440cb63')
UUID('144b2c55-a9bd-431d-b536-239fb2048a5e')
UUID('d47fd8a0-48e9-4dcc-87f7-42c022c309a8')
{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] betabandido commented on a change in pull request #3570: [AIRFLOW-2709] Improve error handling in Databricks hook

2018-08-21 Thread GitBox
betabandido commented on a change in pull request #3570: [AIRFLOW-2709] Improve 
error handling in Databricks hook
URL: https://github.com/apache/incubator-airflow/pull/3570#discussion_r211760792
 
 

 ##
 File path: airflow/contrib/hooks/databricks_hook.py
 ##
 @@ -117,29 +123,51 @@ def _do_api_call(self, endpoint_info, json):
         else:
             raise AirflowException('Unexpected HTTP Method: ' + method)
 
-        for attempt_num in range(1, self.retry_limit + 1):
+        attempt_num = 1
+        while True:
             try:
                 response = request_func(
                     url,
                     json=json,
                     auth=auth,
                     headers=USER_AGENT_HEADER,
                     timeout=self.timeout_seconds)
-                if response.status_code == requests.codes.ok:
-                    return response.json()
-                else:
+                response.raise_for_status()
+                return response.json()
+            except (requests_exceptions.ConnectionError,
+                    requests_exceptions.Timeout) as e:
+                self._log_request_error(attempt_num, e)
+            except requests_exceptions.HTTPError as e:
+                response = e.response
+                if not self._retriable_error(response):
                     # In this case, the user probably made a mistake.
                     # Don't retry.
                     raise AirflowException('Response: {0}, Status Code: {1}'.format(
                         response.content, response.status_code))
-            except (requests_exceptions.ConnectionError,
-                    requests_exceptions.Timeout) as e:
-                self.log.error(
-                    'Attempt %s API Request to Databricks failed with reason: %s',
-                    attempt_num, e
-                )
-        raise AirflowException(('API requests to Databricks failed {} times. ' +
-                                'Giving up.').format(self.retry_limit))
+
+                self._log_request_error(attempt_num, e)
+
+            if attempt_num == self.retry_limit:
+                raise AirflowException(('API requests to Databricks failed {} times. ' +
+                                        'Giving up.').format(self.retry_limit))
+
+            attempt_num += 1
+            sleep(self.retry_delay)
+
+    def _log_request_error(self, attempt_num, error):
+        self.log.error(
+            'Attempt %s API Request to Databricks failed with reason: %s',
+            attempt_num, error
+        )
+
+    @staticmethod
+    def _retriable_error(response):
+        try:
+            error_code = response.json().get('error_code')
+            return error_code == 'TEMPORARILY_UNAVAILABLE'
 
 Review comment:
   @andrewmchen I'm certainly not claiming that. What I meant is that we don't know what the HTTP status code was when we encountered the `TEMPORARILY_UNAVAILABLE` error (as we did not log the status code). I suppose it is most likely a `500` or something similar.
   
   Do you know what the exact status code is in this case?
   
   So, yes, the goal of this PR is mostly to add retries for the case when the `TEMPORARILY_UNAVAILABLE` error occurs.
   
   Having said that, we could also retry whenever the HTTP status code is >= 500. Does that sound correct?
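   
   For concreteness, that status-code-based check could look something like this (a hypothetical sketch, not code from the PR):
   ```python
   @staticmethod
   def _retryable_error(response):
       # Hypothetical alternative: treat any server-side (5xx) status as
       # transient and retry, instead of parsing the JSON body for
       # TEMPORARILY_UNAVAILABLE.
       return response.status_code >= 500
   ```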


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] andrewmchen commented on a change in pull request #3570: [AIRFLOW-2709] Improve error handling in Databricks hook

2018-08-21 Thread GitBox
andrewmchen commented on a change in pull request #3570: [AIRFLOW-2709] Improve 
error handling in Databricks hook
URL: https://github.com/apache/incubator-airflow/pull/3570#discussion_r211756760
 
 

 ##
 File path: airflow/contrib/hooks/databricks_hook.py
 ##
 @@ -117,29 +123,51 @@ def _do_api_call(self, endpoint_info, json):
         else:
             raise AirflowException('Unexpected HTTP Method: ' + method)
 
-        for attempt_num in range(1, self.retry_limit + 1):
+        attempt_num = 1
+        while True:
             try:
                 response = request_func(
                     url,
                     json=json,
                     auth=auth,
                     headers=USER_AGENT_HEADER,
                     timeout=self.timeout_seconds)
-                if response.status_code == requests.codes.ok:
-                    return response.json()
-                else:
+                response.raise_for_status()
+                return response.json()
+            except (requests_exceptions.ConnectionError,
+                    requests_exceptions.Timeout) as e:
+                self._log_request_error(attempt_num, e)
+            except requests_exceptions.HTTPError as e:
+                response = e.response
+                if not self._retriable_error(response):
                     # In this case, the user probably made a mistake.
                     # Don't retry.
                     raise AirflowException('Response: {0}, Status Code: {1}'.format(
                         response.content, response.status_code))
-            except (requests_exceptions.ConnectionError,
-                    requests_exceptions.Timeout) as e:
-                self.log.error(
-                    'Attempt %s API Request to Databricks failed with reason: %s',
-                    attempt_num, e
-                )
-        raise AirflowException(('API requests to Databricks failed {} times. ' +
-                                'Giving up.').format(self.retry_limit))
+
+                self._log_request_error(attempt_num, e)
+
+            if attempt_num == self.retry_limit:
+                raise AirflowException(('API requests to Databricks failed {} times. ' +
+                                        'Giving up.').format(self.retry_limit))
+
+            attempt_num += 1
+            sleep(self.retry_delay)
+
+    def _log_request_error(self, attempt_num, error):
+        self.log.error(
+            'Attempt %s API Request to Databricks failed with reason: %s',
+            attempt_num, error
+        )
+
+    @staticmethod
+    def _retriable_error(response):
 
 Review comment:
   nit: lift outside of the class and `retriable` -> `retryable`?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] andrewmchen commented on a change in pull request #3570: [AIRFLOW-2709] Improve error handling in Databricks hook

2018-08-21 Thread GitBox
andrewmchen commented on a change in pull request #3570: [AIRFLOW-2709] Improve 
error handling in Databricks hook
URL: https://github.com/apache/incubator-airflow/pull/3570#discussion_r211756631
 
 

 ##
 File path: airflow/contrib/hooks/databricks_hook.py
 ##
 @@ -117,29 +123,51 @@ def _do_api_call(self, endpoint_info, json):
         else:
             raise AirflowException('Unexpected HTTP Method: ' + method)
 
-        for attempt_num in range(1, self.retry_limit + 1):
+        attempt_num = 1
+        while True:
             try:
                 response = request_func(
                     url,
                     json=json,
                     auth=auth,
                     headers=USER_AGENT_HEADER,
                     timeout=self.timeout_seconds)
-                if response.status_code == requests.codes.ok:
-                    return response.json()
-                else:
+                response.raise_for_status()
+                return response.json()
+            except (requests_exceptions.ConnectionError,
+                    requests_exceptions.Timeout) as e:
+                self._log_request_error(attempt_num, e)
+            except requests_exceptions.HTTPError as e:
+                response = e.response
+                if not self._retriable_error(response):
                     # In this case, the user probably made a mistake.
                     # Don't retry.
                     raise AirflowException('Response: {0}, Status Code: {1}'.format(
                         response.content, response.status_code))
-            except (requests_exceptions.ConnectionError,
-                    requests_exceptions.Timeout) as e:
-                self.log.error(
-                    'Attempt %s API Request to Databricks failed with reason: %s',
-                    attempt_num, e
-                )
-        raise AirflowException(('API requests to Databricks failed {} times. ' +
-                                'Giving up.').format(self.retry_limit))
+
+                self._log_request_error(attempt_num, e)
+
+            if attempt_num == self.retry_limit:
+                raise AirflowException(('API requests to Databricks failed {} times. ' +
+                                        'Giving up.').format(self.retry_limit))
+
+            attempt_num += 1
+            sleep(self.retry_delay)
+
+    def _log_request_error(self, attempt_num, error):
+        self.log.error(
+            'Attempt %s API Request to Databricks failed with reason: %s',
+            attempt_num, error
+        )
+
+    @staticmethod
+    def _retriable_error(response):
+        try:
+            error_code = response.json().get('error_code')
+            return error_code == 'TEMPORARILY_UNAVAILABLE'
 
 Review comment:
   Sorry, just to clarify: you're not claiming that the HTTP response code can be 200 with a non-JSON response body, right? If so, that is not a known issue.
   
   If the fix is to start catching `TEMPORARILY_UNAVAILABLE` error codes in the response, then I do think that the ValueError check is reasonable. I do think we should say it is retryable, though.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] andrewmchen commented on a change in pull request #3570: [AIRFLOW-2709] Improve error handling in Databricks hook

2018-08-21 Thread GitBox
andrewmchen commented on a change in pull request #3570: [AIRFLOW-2709] Improve 
error handling in Databricks hook
URL: https://github.com/apache/incubator-airflow/pull/3570#discussion_r211753269
 
 

 ##
 File path: airflow/contrib/hooks/databricks_hook.py
 ##
 @@ -117,29 +123,51 @@ def _do_api_call(self, endpoint_info, json):
 else:
 raise AirflowException('Unexpected HTTP Method: ' + method)
 
-for attempt_num in range(1, self.retry_limit + 1):
+attempt_num = 1
+while True:
 try:
 response = request_func(
 url,
 json=json,
 auth=auth,
 headers=USER_AGENT_HEADER,
 timeout=self.timeout_seconds)
-if response.status_code == requests.codes.ok:
-return response.json()
-else:
+response.raise_for_status()
+return response.json()
+except (requests_exceptions.ConnectionError,
+requests_exceptions.Timeout) as e:
+self._log_request_error(attempt_num, e)
+except requests_exceptions.HTTPError as e:
+response = e.response
+if not self._retriable_error(response):
 # In this case, the user probably made a mistake.
 # Don't retry.
 raise AirflowException('Response: {0}, Status Code: 
{1}'.format(
 response.content, response.status_code))
-except (requests_exceptions.ConnectionError,
-requests_exceptions.Timeout) as e:
-self.log.error(
-'Attempt %s API Request to Databricks failed with reason: 
%s',
-attempt_num, e
-)
-raise AirflowException(('API requests to Databricks failed {} times. ' 
+
-   'Giving up.').format(self.retry_limit))
+
+self._log_request_error(attempt_num, e)
+
+if attempt_num == self.retry_limit:
+raise AirflowException(('API requests to Databricks failed {} 
times. ' +
+'Giving up.').format(self.retry_limit))
+
+attempt_num += 1
+sleep(self.retry_delay)
+
+def _log_request_error(self, attempt_num, error):
+self.log.error(
+'Attempt %s API Request to Databricks failed with reason: %s',
+attempt_num, error
+)
+
+@staticmethod
+def _retriable_error(response):
+try:
+error_code = response.json().get('error_code')
+return error_code == 'TEMPORARILY_UNAVAILABLE'
 
 Review comment:
   Thanks, this is definitely a known issue and can happen sometimes when the 
gateway proxy is not behaving correctly... That being said, this fix is 
definitely necessary and appreciated!


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] Fokko commented on a change in pull request #3393: [AIRFLOW-2499] Dockerised CI pipeline

2018-08-21 Thread GitBox
Fokko commented on a change in pull request #3393: [AIRFLOW-2499] Dockerised CI 
pipeline
URL: https://github.com/apache/incubator-airflow/pull/3393#discussion_r211750311
 
 

 ##
 File path: .travis.yml
 ##
 @@ -19,94 +19,40 @@
 sudo: true
 dist: trusty
 language: python
-jdk:
-  - oraclejdk8
-services:
-  - cassandra
-  - mongodb
-  - mysql
-  - postgresql
-  - rabbitmq
-addons:
-  apt:
-packages:
-  - slapd
-  - ldap-utils
-  - openssh-server
-  - mysql-server-5.6
-  - mysql-client-core-5.6
-  - mysql-client-5.6
-  - krb5-user
-  - krb5-kdc
-  - krb5-admin-server
-  - oracle-java8-installer
-  postgresql: "9.2"
-python:
-  - "2.7"
-  - "3.5"
 env:
   global:
+- DOCKER_COMPOSE_VERSION=1.20.0
 - SLUGIFY_USES_TEXT_UNIDECODE=yes
 - TRAVIS_CACHE=$HOME/.travis_cache/
-- KRB5_CONFIG=/etc/krb5.conf
-- KRB5_KTNAME=/etc/airflow.keytab
-# Travis on google cloud engine has a global /etc/boto.cfg that
-# does not work with python 3
-- BOTO_CONFIG=/tmp/bogusvalue
   matrix:
+- TOX_ENV=flake8
 - TOX_ENV=py27-backend_mysql
 - TOX_ENV=py27-backend_sqlite
 - TOX_ENV=py27-backend_postgres
-- TOX_ENV=py35-backend_mysql
-- TOX_ENV=py35-backend_sqlite
-- TOX_ENV=py35-backend_postgres
-- TOX_ENV=flake8
+- TOX_ENV=py35-backend_mysql PYTHON_VERSION=3
+- TOX_ENV=py35-backend_sqlite PYTHON_VERSION=3
+- TOX_ENV=py35-backend_postgres PYTHON_VERSION=3
 - TOX_ENV=py27-backend_postgres KUBERNETES_VERSION=v1.9.0
-- TOX_ENV=py35-backend_postgres KUBERNETES_VERSION=v1.10.0
-matrix:
-  exclude:
-- python: "3.5"
-  env: TOX_ENV=py27-backend_mysql
-- python: "3.5"
-  env: TOX_ENV=py27-backend_sqlite
-- python: "3.5"
-  env: TOX_ENV=py27-backend_postgres
-- python: "2.7"
-  env: TOX_ENV=py35-backend_mysql
-- python: "2.7"
-  env: TOX_ENV=py35-backend_sqlite
-- python: "2.7"
-  env: TOX_ENV=py35-backend_postgres
-- python: "2.7"
-  env: TOX_ENV=flake8
-- python: "3.5"
-  env: TOX_ENV=py27-backend_postgres KUBERNETES_VERSION=v1.9.0
-- python: "2.7"
-  env: TOX_ENV=py35-backend_postgres KUBERNETES_VERSION=v1.10.0
+- TOX_ENV=py35-backend_postgres KUBERNETES_VERSION=v1.10.0 PYTHON_VERSION=3
 cache:
   directories:
 - $HOME/.wheelhouse/
+- $HOME/.cache/pip
 - $HOME/.travis_cache/
 before_install:
-  - yes | ssh-keygen -t rsa -C your_em...@youremail.com -P '' -f ~/.ssh/id_rsa
-  - cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
-  - ln -s ~/.ssh/authorized_keys ~/.ssh/authorized_keys2
-  - chmod 600 ~/.ssh/*
-  - jdk_switcher use oraclejdk8
+  - sudo ls -lh $HOME/.cache/pip/
+  - sudo rm -rf $HOME/.cache/pip/* $HOME/.wheelhouse/*
+  - sudo chown -R travis.travis $HOME/.cache/pip
 install:
+  # Use recent docker-compose version
+  - sudo rm /usr/local/bin/docker-compose
 
 Review comment:
    Makes sense to me


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] codecov-io commented on issue #3778: [AIRFLOW-XXX] Fix some operator names in the docs

2018-08-21 Thread GitBox
codecov-io commented on issue #3778: [AIRFLOW-XXX] Fix some operator names in 
the docs
URL: 
https://github.com/apache/incubator-airflow/pull/3778#issuecomment-414792827
 
 
  # [Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3778?src=pr&el=h1) Report
  > Merging [#3778](https://codecov.io/gh/apache/incubator-airflow/pull/3778?src=pr&el=desc) into [master](https://codecov.io/gh/apache/incubator-airflow/commit/27309b13f17402eaa61d4e4fede8785effa8bbb7?src=pr&el=desc) will **not change** coverage.
  > The diff coverage is `n/a`.

  [![Impacted file tree graph](https://codecov.io/gh/apache/incubator-airflow/pull/3778/graphs/tree.svg?width=650&height=150&src=pr&token=WdLKlKHOAU)](https://codecov.io/gh/apache/incubator-airflow/pull/3778?src=pr&el=tree)

  ```diff
  @@           Coverage Diff           @@
  ##           master    #3778   +/-   ##
  =======================================
    Coverage   77.67%   77.67%
  =======================================
    Files         204      204
    Lines       15840    15840
  =======================================
    Hits        12304    12304
    Misses       3536     3536
  ```

  --

  [Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3778?src=pr&el=continue).
  > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
  > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
  > Powered by [Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3778?src=pr&el=footer). Last update [27309b1...27814c0](https://codecov.io/gh/apache/incubator-airflow/pull/3778?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] codecov-io edited a comment on issue #3777: Add Flipp to list of Airflow users

2018-08-21 Thread GitBox
codecov-io edited a comment on issue #3777: Add Flipp to list of Airflow users
URL: 
https://github.com/apache/incubator-airflow/pull/3777#issuecomment-414784611
 
 
  # [Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3777?src=pr&el=h1) Report
  > Merging [#3777](https://codecov.io/gh/apache/incubator-airflow/pull/3777?src=pr&el=desc) into [master](https://codecov.io/gh/apache/incubator-airflow/commit/27309b13f17402eaa61d4e4fede8785effa8bbb7?src=pr&el=desc) will **not change** coverage.
  > The diff coverage is `n/a`.

  [![Impacted file tree graph](https://codecov.io/gh/apache/incubator-airflow/pull/3777/graphs/tree.svg?width=650&height=150&src=pr&token=WdLKlKHOAU)](https://codecov.io/gh/apache/incubator-airflow/pull/3777?src=pr&el=tree)

  ```diff
  @@           Coverage Diff           @@
  ##           master    #3777   +/-   ##
  =======================================
    Coverage   77.67%   77.67%
  =======================================
    Files         204      204
    Lines       15840    15840
  =======================================
    Hits        12304    12304
    Misses       3536     3536
  ```

  --

  [Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3777?src=pr&el=continue).
  > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
  > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
  > Powered by [Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3777?src=pr&el=footer). Last update [27309b1...f6326ff](https://codecov.io/gh/apache/incubator-airflow/pull/3777?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services



[GitHub] codecov-io commented on issue #3777: Add Flipp to list of Airflow users

2018-08-21 Thread GitBox
codecov-io commented on issue #3777: Add Flipp to list of Airflow users
URL: 
https://github.com/apache/incubator-airflow/pull/3777#issuecomment-414784611
 
 
  # [Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3777?src=pr&el=h1) Report
  > Merging [#3777](https://codecov.io/gh/apache/incubator-airflow/pull/3777?src=pr&el=desc) into [master](https://codecov.io/gh/apache/incubator-airflow/commit/27309b13f17402eaa61d4e4fede8785effa8bbb7?src=pr&el=desc) will **not change** coverage.
  > The diff coverage is `n/a`.

  [![Impacted file tree graph](https://codecov.io/gh/apache/incubator-airflow/pull/3777/graphs/tree.svg?width=650&height=150&src=pr&token=WdLKlKHOAU)](https://codecov.io/gh/apache/incubator-airflow/pull/3777?src=pr&el=tree)

  ```diff
  @@           Coverage Diff           @@
  ##           master    #3777   +/-   ##
  =======================================
    Coverage   77.67%   77.67%
  =======================================
    Files         204      204
    Lines       15840    15840
  =======================================
    Hits        12304    12304
    Misses       3536     3536
  ```

  --

  [Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3777?src=pr&el=continue).
  > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
  > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
  > Powered by [Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3777?src=pr&el=footer). Last update [27309b1...f6326ff](https://codecov.io/gh/apache/incubator-airflow/pull/3777?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (AIRFLOW-2927) Task.get_task_instances() not setting start date if none provided

2018-08-21 Thread Marek Prussak (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marek Prussak updated AIRFLOW-2927:
---
Description: 
Task.get_task_instances() does not have any logic to set a default start_date. 
This is inconsistent with Dag.get_task_instances() which does.

[The offending 
function|https://github.com/apache/incubator-airflow/blob/b78c7fb8512f7a40f58b46530e9b3d5562fe84ea/airflow/models.py#L2867]

  was:
Task.get_task_instances() does not have any logic to set a default start_date. 
This is inconsistent with Dag.get_task_instances() which does.



> Task.get_task_instances() not setting start date if none provided
> -
>
> Key: AIRFLOW-2927
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2927
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Marek Prussak
>Priority: Minor
>
> Task.get_task_instances() does not have any logic to set a default 
> start_date. This is inconsistent with Dag.get_task_instances() which does.
> [The offending 
> function|https://github.com/apache/incubator-airflow/blob/b78c7fb8512f7a40f58b46530e9b3d5562fe84ea/airflow/models.py#L2867]
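
For illustration only, a sketch of the kind of default being asked for; the 30-day window mirrors what DAG.get_task_instances used around that time, and the names below are assumptions rather than the actual patch:

{code:python}
from datetime import timedelta

from airflow.utils import timezone


def get_task_instances(self, session, start_date=None, end_date=None):
    # Sketch: fall back to a default window instead of passing None through,
    # mirroring DAG.get_task_instances (the 30-day default is an assumption).
    if start_date is None:
        start_date = timezone.utcnow() - timedelta(days=30)
    end_date = end_date or timezone.utcnow()
    ...  # query TaskInstance between start_date and end_date as before
{code}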



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] sethwilsonwishabi opened a new pull request #3777: Add Flipp to list of Airflow users

2018-08-21 Thread GitBox
sethwilsonwishabi opened a new pull request #3777: Add Flipp to list of Airflow 
users
URL: https://github.com/apache/incubator-airflow/pull/3777
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Created] (AIRFLOW-2927) Task.get_task_instances() not setting start date if none provided

2018-08-21 Thread Marek Prussak (JIRA)
Marek Prussak created AIRFLOW-2927:
--

 Summary: Task.get_task_instances() not setting start date if none 
provided
 Key: AIRFLOW-2927
 URL: https://issues.apache.org/jira/browse/AIRFLOW-2927
 Project: Apache Airflow
  Issue Type: Bug
Reporter: Marek Prussak


Task.get_task_instances() does not have any logic to set a default start_date. 
This is inconsistent with Dag.get_task_instances() which does.




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] tedmiston commented on issue #3760: [AIRFLOW-2909] Deprecate airflow.operators.sensors module

2018-08-21 Thread GitBox
tedmiston commented on issue #3760: [AIRFLOW-2909] Deprecate 
airflow.operators.sensors module
URL: 
https://github.com/apache/incubator-airflow/pull/3760#issuecomment-414772164
 
 
   @Fokko Sure, I just checked and I do not have this `build` directory in my 
incubator-airflow.  Is it possible that those files you're seeing locally are 
cached from a setuptools build or something?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] tswast commented on a change in pull request #3733: [AIRFLOW-491] Add cache parameter in BigQuery query method - with 'api_resource_configs'

2018-08-21 Thread GitBox
tswast commented on a change in pull request #3733: [AIRFLOW-491] Add cache 
parameter in BigQuery query method - with 'api_resource_configs'
URL: https://github.com/apache/incubator-airflow/pull/3733#discussion_r211658743
 
 

 ##
 File path: airflow/contrib/hooks/bigquery_hook.py
 ##
 @@ -656,6 +671,10 @@ def run_query(self,
 configuration['query'][
 'schemaUpdateOptions'] = schema_update_options
 
+if 'configuration' in api_resource_configs:
 
 Review comment:
   Good point, `jobReference` could be useful to set, as that is how you can 
specify an explicit `location` for the job to run.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (AIRFLOW-2887) Add to BigQueryBaseCursor methods for creating and updating datasets

2018-08-21 Thread Iuliia Volkova (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Iuliia Volkova updated AIRFLOW-2887:

Description: 
In BigQueryBaseCursor exist only:

def delete_dataset(self, project_id, dataset_id)

 And there are no hooks to 
create([https://cloud.google.com/bigquery/docs/reference/rest/v2/datasets/insert)]
  

  was:
In BigQueryBaseCursor exist only:

def delete_dataset(self, project_id, dataset_id)

 And there are no hooks to 
create([https://cloud.google.com/bigquery/docs/reference/rest/v2/datasets/insert)]
  and update datasets 
([https://cloud.google.com/bigquery/docs/reference/rest/v2/datasets/update])

 


> Add to BigQueryBaseCursor methods for creating and updating datasets
> 
>
> Key: AIRFLOW-2887
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2887
> Project: Apache Airflow
>  Issue Type: New Feature
>Reporter: Iuliia Volkova
>Assignee: Iuliia Volkova
>Priority: Minor
>
> In BigQueryBaseCursor exist only:
> def delete_dataset(self, project_id, dataset_id)
>  And there are no hooks to 
> create([https://cloud.google.com/bigquery/docs/reference/rest/v2/datasets/insert)]
>   



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-2887) Add to BigQueryBaseCursor methods for creating and updating datasets

2018-08-21 Thread Iuliia Volkova (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Iuliia Volkova updated AIRFLOW-2887:

Description: 
In BigQueryBaseCursor exist only:

def delete_dataset(self, project_id, dataset_id)

 And there are no hook to 
create([https://cloud.google.com/bigquery/docs/reference/rest/v2/datasets/insert)]
  

  was:
In BigQueryBaseCursor exist only:

def delete_dataset(self, project_id, dataset_id)

 And there are no hooks to 
create([https://cloud.google.com/bigquery/docs/reference/rest/v2/datasets/insert)]
  


> Add to BigQueryBaseCursor methods for creating and updating datasets
> 
>
> Key: AIRFLOW-2887
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2887
> Project: Apache Airflow
>  Issue Type: New Feature
>Reporter: Iuliia Volkova
>Assignee: Iuliia Volkova
>Priority: Minor
>
> In BigQueryBaseCursor exist only:
> def delete_dataset(self, project_id, dataset_id)
>  And there are no hook to 
> create([https://cloud.google.com/bigquery/docs/reference/rest/v2/datasets/insert)]
>   



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-2887) Add to BigQueryBaseCursor methods for creating and updating datasets

2018-08-21 Thread Iuliia Volkova (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Iuliia Volkova updated AIRFLOW-2887:

Description: 
In BigQueryBaseCursor exist only:

def delete_dataset(self, project_id, dataset_id)

 And there are no hooks to 
create([https://cloud.google.com/bigquery/docs/reference/rest/v2/datasets/insert)]
  and update datasets 
([https://cloud.google.com/bigquery/docs/reference/rest/v2/datasets/update])

 

  was:
In BigQueryBaseCursor exist only:

def delete_dataset(self, project_id, dataset_id)

 And there are no hooks to 
create([https://cloud.google.com/bigquery/docs/reference/rest/v2/datasets/insert)]
  and update datasets 
([https://cloud.google.com/bigquery/docs/reference/rest/v2/datasets/update])

 

If it's so, could I add methods and operators for those actions? 


> Add to BigQueryBaseCursor methods for creating and updating datasets
> 
>
> Key: AIRFLOW-2887
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2887
> Project: Apache Airflow
>  Issue Type: New Feature
>Reporter: Iuliia Volkova
>Assignee: Iuliia Volkova
>Priority: Minor
>
> In BigQueryBaseCursor exist only:
> def delete_dataset(self, project_id, dataset_id)
>  And there are no hooks to 
> create([https://cloud.google.com/bigquery/docs/reference/rest/v2/datasets/insert)]
>   and update datasets 
> ([https://cloud.google.com/bigquery/docs/reference/rest/v2/datasets/update])
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-2887) Add to BigQueryBaseCursor methods for creating and updating datasets

2018-08-21 Thread Iuliia Volkova (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Iuliia Volkova updated AIRFLOW-2887:

Description: 
In BigQueryBaseCursor exist only:

def delete_dataset(self, project_id, dataset_id)

 And there are no hooks to 
create([https://cloud.google.com/bigquery/docs/reference/rest/v2/datasets/insert)]
  and update datasets 
([https://cloud.google.com/bigquery/docs/reference/rest/v2/datasets/update])

 

If it's so, could I add methods and operators for those actions? 

  was:
In BigQueryBaseCursor exist only:

def delete_dataset(self, project_id, dataset_id)

 And there are no hooks to 
create([https://cloud.google.com/bigquery/docs/reference/rest/v2/datasets/insert)]
  and update datasets 
([https://cloud.google.com/bigquery/docs/reference/rest/v2/datasets/update])

[~kaxilnaik], or I'm not right?

If it's so, could I add methods and operators for those actions? 


> Add to BigQueryBaseCursor methods for creating and updating datasets
> 
>
> Key: AIRFLOW-2887
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2887
> Project: Apache Airflow
>  Issue Type: New Feature
>Reporter: Iuliia Volkova
>Assignee: Iuliia Volkova
>Priority: Minor
>
> In BigQueryBaseCursor exist only:
> def delete_dataset(self, project_id, dataset_id)
>  And there are no hooks to 
> create([https://cloud.google.com/bigquery/docs/reference/rest/v2/datasets/insert)]
>   and update datasets 
> ([https://cloud.google.com/bigquery/docs/reference/rest/v2/datasets/update])
>  
> If it's so, could I add methods and operators for those actions? 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] xnuinside commented on a change in pull request #3733: [AIRFLOW-491] Add cache parameter in BigQuery query method - with 'api_resource_configs'

2018-08-21 Thread GitBox
xnuinside commented on a change in pull request #3733: [AIRFLOW-491] Add cache 
parameter in BigQuery query method - with 'api_resource_configs'
URL: https://github.com/apache/incubator-airflow/pull/3733#discussion_r211637030
 
 

 ##
 File path: airflow/contrib/hooks/bigquery_hook.py
 ##
 @@ -656,6 +671,10 @@ def run_query(self,
 configuration['query'][
 'schemaUpdateOptions'] = schema_update_options
 
+if 'configuration' in api_resource_configs:
+for key in api_resource_configs['configuration']:
+configuration[key] = api_resource_configs['configuration'][key]
 
 Review comment:
   @tswast , check it pls


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] xnuinside commented on a change in pull request #3733: [AIRFLOW-491] Add cache parameter in BigQuery query method - with 'api_resource_configs'

2018-08-21 Thread GitBox
xnuinside commented on a change in pull request #3733: [AIRFLOW-491] Add cache 
parameter in BigQuery query method - with 'api_resource_configs'
URL: https://github.com/apache/incubator-airflow/pull/3733#discussion_r211634204
 
 

 ##
 File path: airflow/contrib/hooks/bigquery_hook.py
 ##
 @@ -656,6 +671,10 @@ def run_query(self,
 configuration['query'][
 'schemaUpdateOptions'] = schema_update_options
 
+if 'configuration' in api_resource_configs:
 
 Review comment:
   I thought of this param as a way to set, in one dict, any settings supported 
by the API (by analogy with 'src_fmt_configs'). But if we say it is only a way 
to set 'configuration', and we won't allow sending any other params this way, 
for example 'jobReference' or 'statistics', then it's not needed. What do you 
think? Use it only for 'configuration'?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] xnuinside commented on a change in pull request #3733: [AIRFLOW-491] Add cache parameter in BigQuery query method - with 'api_resource_configs'

2018-08-21 Thread GitBox
xnuinside commented on a change in pull request #3733: [AIRFLOW-491] Add cache 
parameter in BigQuery query method - with 'api_resource_configs'
URL: https://github.com/apache/incubator-airflow/pull/3733#discussion_r211634204
 
 

 ##
 File path: airflow/contrib/hooks/bigquery_hook.py
 ##
 @@ -656,6 +671,10 @@ def run_query(self,
 configuration['query'][
 'schemaUpdateOptions'] = schema_update_options
 
+if 'configuration' in api_resource_configs:
 
 Review comment:
   I thought of this param as a way to set, in one dict, any settings supported 
by the API (by analogy with 'src_fmt_configs'). For example, 'jobReference' or 
'statistics': https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs. 
But if we say it is only a way to set 'configuration', then it's not needed. 
What do you think? Use it only for 'configuration'?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] xnuinside commented on a change in pull request #3733: [AIRFLOW-491] Add cache parameter in BigQuery query method - with 'api_resource_configs'

2018-08-21 Thread GitBox
xnuinside commented on a change in pull request #3733: [AIRFLOW-491] Add cache 
parameter in BigQuery query method - with 'api_resource_configs'
URL: https://github.com/apache/incubator-airflow/pull/3733#discussion_r211634204
 
 

 ##
 File path: airflow/contrib/hooks/bigquery_hook.py
 ##
 @@ -656,6 +671,10 @@ def run_query(self,
 configuration['query'][
 'schemaUpdateOptions'] = schema_update_options
 
+if 'configuration' in api_resource_configs:
 
 Review comment:
   I thought of this param as a way to set, in one dict, any settings supported 
by the API (by analogy with 'src_fmt_configs'). But if we say it is only a way 
to set 'configuration', and other params such as 'jobReference' or 'statistics' 
cannot be sent this way, then I lose the ability to define one dict with all 
the params and send it - in that case the extra key is of course not needed. 
What do you think? Use it only for 'configuration'?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] tswast commented on a change in pull request #3733: [AIRFLOW-491] Add cache parameter in BigQuery query method - with 'api_resource_configs'

2018-08-21 Thread GitBox
tswast commented on a change in pull request #3733: [AIRFLOW-491] Add cache 
parameter in BigQuery query method - with 'api_resource_configs'
URL: https://github.com/apache/incubator-airflow/pull/3733#discussion_r211627543
 
 

 ##
 File path: airflow/contrib/hooks/bigquery_hook.py
 ##
 @@ -656,6 +671,10 @@ def run_query(self,
 configuration['query'][
 'schemaUpdateOptions'] = schema_update_options
 
+if 'configuration' in api_resource_configs:
+for key in api_resource_configs['configuration']:
+configuration[key] = api_resource_configs['configuration'][key]
 
 Review comment:
   There are many sub-keys of `query` that we care about. For example: 
`['query']['schemaUpdateOptions']` would get overridden, as would 
`['query']['query']`.
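
   One way around that, sketched under the assumption that a recursive merge is acceptable (the function name is illustrative, not from the PR):

   ```python
   def merge_api_resource_configs(configuration, api_resource_configs):
       """Recursively merge user-supplied keys into the built configuration so
       that e.g. {'query': {'useQueryCache': False}} does not clobber
       ['query']['query'] or ['query']['schemaUpdateOptions']."""
       for key, value in api_resource_configs.items():
           if isinstance(value, dict) and isinstance(configuration.get(key), dict):
               merge_api_resource_configs(configuration[key], value)
           else:
               configuration[key] = value
       return configuration
   ```

   With a merge like this, user overrides touch only the leaves they name, leaving the already-built query string intact.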


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] xnuinside commented on a change in pull request #3733: [AIRFLOW-491] Add cache parameter in BigQuery query method - with 'api_resource_configs'

2018-08-21 Thread GitBox
xnuinside commented on a change in pull request #3733: [AIRFLOW-491] Add cache 
parameter in BigQuery query method - with 'api_resource_configs'
URL: https://github.com/apache/incubator-airflow/pull/3733#discussion_r211625138
 
 

 ##
 File path: airflow/contrib/hooks/bigquery_hook.py
 ##
 @@ -656,6 +671,10 @@ def run_query(self,
 configuration['query'][
 'schemaUpdateOptions'] = schema_update_options
 
+if 'configuration' in api_resource_configs:
+for key in api_resource_configs['configuration']:
+configuration[key] = api_resource_configs['configuration'][key]
 
 Review comment:
   mm.. no, it will overwrite only the keys that the user adds in 
api_resource_configs - or maybe I didn't catch what you mean.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] tswast commented on a change in pull request #3733: [AIRFLOW-491] Add cache parameter in BigQuery query method - with 'api_resource_configs'

2018-08-21 Thread GitBox
tswast commented on a change in pull request #3733: [AIRFLOW-491] Add cache 
parameter in BigQuery query method - with 'api_resource_configs'
URL: https://github.com/apache/incubator-airflow/pull/3733#discussion_r211623810
 
 

 ##
 File path: tests/contrib/hooks/test_bigquery_hook.py
 ##
 @@ -281,6 +281,18 @@ def test_run_query_sql_dialect_override(self, 
run_with_config):
 args, kwargs = run_with_config.call_args
 self.assertIs(args[0]['query']['useLegacySql'], bool_val)
 
+@mock.patch.object(hook.BigQueryBaseCursor, 'run_with_configuration')
+def test_api_resource_configs(self, run_with_config):
+for bool_val in [True, False]:
+cursor = hook.BigQueryBaseCursor(mock.Mock(), "project_id")
+cursor.run_query('query',
+ api_resource_configs={
+ 'configuration':
+ {'query': {'useQueryCache': bool_val}}})
+
+args, kwargs = run_with_config.call_args
+self.assertIs(args[0]['query']['useQueryCache'], bool_val)
 
 Review comment:
   Double-check that the query string still appears in the resource. I think it 
might be overwritten with the current logic.
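
   A sketch of the extra assertion being requested, assuming it lives in the same TestCase as the test above (names reuse the existing test; not part of the PR as submitted):

   ```python
   @mock.patch.object(hook.BigQueryBaseCursor, 'run_with_configuration')
   def test_api_resource_configs_preserve_query(self, run_with_config):
       cursor = hook.BigQueryBaseCursor(mock.Mock(), "project_id")
       cursor.run_query('select 1', api_resource_configs={
           'configuration': {'query': {'useQueryCache': False}}})
       args, kwargs = run_with_config.call_args
       # The SQL string built by run_query must survive the user-supplied dict.
       self.assertEqual(args[0]['query']['query'], 'select 1')
   ```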


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] tswast commented on a change in pull request #3733: [AIRFLOW-491] Add cache parameter in BigQuery query method - with 'api_resource_configs'

2018-08-21 Thread GitBox
tswast commented on a change in pull request #3733: [AIRFLOW-491] Add cache 
parameter in BigQuery query method - with 'api_resource_configs'
URL: https://github.com/apache/incubator-airflow/pull/3733#discussion_r211623025
 
 

 ##
 File path: airflow/contrib/hooks/bigquery_hook.py
 ##
 @@ -656,6 +671,10 @@ def run_query(self,
 configuration['query'][
 'schemaUpdateOptions'] = schema_update_options
 
+if 'configuration' in api_resource_configs:
 
 Review comment:
   Why require an extra `configuration` key? What other keys would be present?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] tswast commented on a change in pull request #3733: [AIRFLOW-491] Add cache parameter in BigQuery query method - with 'api_resource_configs'

2018-08-21 Thread GitBox
tswast commented on a change in pull request #3733: [AIRFLOW-491] Add cache 
parameter in BigQuery query method - with 'api_resource_configs'
URL: https://github.com/apache/incubator-airflow/pull/3733#discussion_r211622902
 
 

 ##
 File path: airflow/contrib/hooks/bigquery_hook.py
 ##
 @@ -656,6 +671,10 @@ def run_query(self,
 configuration['query'][
 'schemaUpdateOptions'] = schema_update_options
 
+if 'configuration' in api_resource_configs:
+for key in api_resource_configs['configuration']:
+configuration[key] = api_resource_configs['configuration'][key]
 
 Review comment:
   There are pretty few top-level keys: "query" and "labels" are the only two I 
can think of. Wouldn't specifying a `query` key in `api_resource_configs` 
overwrite all the options that were just applied?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] xnuinside removed a comment on issue #3733: [AIRFLOW-491] Add cache parameter in BigQuery query method - with 'api_resource_configs'

2018-08-21 Thread GitBox
xnuinside removed a comment on issue #3733: [AIRFLOW-491] Add cache parameter 
in BigQuery query method - with 'api_resource_configs'
URL: 
https://github.com/apache/incubator-airflow/pull/3733#issuecomment-414666013
 
 
   @ashb , yeap.. thx.. I'm trying to understand how I broke it)


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] xoen commented on a change in pull request #3504: [AIRFLOW-2310]: Add AWS Glue Job Compatibility to Airflow

2018-08-21 Thread GitBox
xoen commented on a change in pull request #3504: [AIRFLOW-2310]: Add AWS Glue 
Job Compatibility to Airflow
URL: https://github.com/apache/incubator-airflow/pull/3504#discussion_r211610465
 
 

 ##
 File path: airflow/contrib/aws_glue_job_hook.py
 ##
 @@ -0,0 +1,212 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+
+from airflow.exceptions import AirflowException
+from airflow.contrib.hooks.aws_hook import AwsHook
+import os.path
+import time
+
+
+class AwsGlueJobHook(AwsHook):
+"""
+Interact with AWS Glue - create job, trigger, crawler
+
+:param job_name: unique job name per AWS account
+:type job_name: str
+:param desc: job description
+:type desc: str
+:param concurrent_run_limit: The maximum number of concurrent runs allowed for a job
+:type concurrent_run_limit: int
+:param script_location: path to etl script either on s3 or local
+:type script_location: str
+:param conns: A list of connections used by the job
+:type conns: list
+:param retry_limit: Maximum number of times to retry this job if it fails
+:type retry_limit: int
+:param num_of_dpus: Number of AWS Glue DPUs to allocate to this Job
+:type num_of_dpus: int
+:param region_name: aws region name (example: us-east-1)
+:type region_name: str
+:param s3_bucket: S3 bucket where logs and local etl script will be uploaded
+:type s3_bucket: str
+:param iam_role_name: AWS IAM Role for Glue Job
+:type iam_role_name: str
+"""
+
+def __init__(self,
+ job_name=None,
+ desc=None,
+ concurrent_run_limit=None,
+ script_location=None,
+ conns=None,
+ retry_limit=None,
+ num_of_dpus=None,
+ aws_conn_id='aws_default',
+ region_name=None,
+ iam_role_name=None,
+ s3_bucket=None, *args, **kwargs):
+self.job_name = job_name
+self.desc = desc
+self.concurrent_run_limit = concurrent_run_limit or 1
+self.script_location = script_location
+self.conns = conns or ["s3"]
+self.retry_limit = retry_limit or 0
+self.num_of_dpus = num_of_dpus or 10
+self.aws_conn_id = aws_conn_id
+self.region_name = region_name
+self.s3_bucket = s3_bucket
+self.role_name = iam_role_name
+self.S3_PROTOCOL = "s3://"
+self.S3_ARTIFACTS_PREFIX = 'artifacts/glue-scripts/'
+self.S3_GLUE_LOGS = 'logs/glue-logs/'
+super(AwsGlueJobHook, self).__init__(*args, **kwargs)
+
+def get_conn(self):
+conn = self.get_client_type('glue', self.region_name)
+return conn
+
+def list_jobs(self):
+conn = self.get_conn()
+return conn.get_jobs()
+
+def get_iam_execution_role(self):
+"""
+:return: iam role for job execution
+"""
+iam_client = self.get_client_type('iam', self.region_name)
+
+try:
+glue_execution_role = iam_client.get_role(RoleName=self.role_name)
+self.log.info("Iam Role Name: {}".format(self.role_name))
+return glue_execution_role
+except Exception as general_error:
+raise AirflowException(
+'Failed to create aws glue job, error: {error}'.format(
+error=str(general_error)
+)
+)
+
+def initialize_job(self, script_arguments=None):
+"""
+Initializes connection with AWS Glue
+to run job
+:return:
+"""
+if self.s3_bucket is None:
+raise AirflowException(
+'Could not initialize glue job, '
+'error: Specify Parameter `s3_bucket`'
+)
+
+glue_client = self.get_conn()
+
+try:
+job_response = self.get_or_create_glue_job()
+job_name = job_response['Name']
+job_run = glue_client.start_job_run(
+JobName=job_name,
+Arguments=script_arguments
+)
+return self.job_completion(job_name, job_run['JobRunId'])
+except Exception as general_error:
+raise AirflowException(
+'Failed to run aws glue job, error: 

[jira] [Closed] (AIRFLOW-2242) Add Gousto to companies list

2018-08-21 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor closed AIRFLOW-2242.
--
Resolution: Won't Fix

Please open a PR to add your company, re-open this Jira and we'll merge it.

> Add Gousto to companies list
> 
>
> Key: AIRFLOW-2242
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2242
> Project: Apache Airflow
>  Issue Type: Task
>Reporter: Dejan
>Assignee: Dejan
>Priority: Minor
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] xnuinside commented on issue #3733: [AIRFLOW-491] Add cache parameter in BigQuery query method - with 'api_resource_configs'

2018-08-21 Thread GitBox
xnuinside commented on issue #3733: [AIRFLOW-491] Add cache parameter in 
BigQuery query method - with 'api_resource_configs'
URL: 
https://github.com/apache/incubator-airflow/pull/3733#issuecomment-414666013
 
 
   @ashb , yeap.. thx.. I'm trying to understand how I broke it)


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Closed] (AIRFLOW-1287) Python ShortCircuitOperator missing DAG filter when setting downstream deps as skipped

2018-08-21 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-1287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor closed AIRFLOW-1287.
--
Resolution: Duplicate

> Python ShortCircuitOperator missing DAG filter when setting downstream deps 
> as skipped
> --
>
> Key: AIRFLOW-1287
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1287
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: operators
>Affects Versions: Airflow 1.8
>Reporter: Luke Rohde
>Assignee: Luke Rohde
>Priority: Major
>
> The ShortCircuitOperator sets status to skipped for task instances by name 
> and execution date, but fails to filter by DAG. So if multiple DAGs have 
> tasks with the same name, this sets downstream tasks to skipped in other 
> DAGs as well.
> https://github.com/apache/incubator-airflow/pull/2351
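
Illustratively, the fix amounts to one extra condition on the skip query; a hedged SQLAlchemy sketch, assuming `session`, `task`, `downstream_task_ids`, and `execution_date` from the surrounding operator code (not the exact patch):

{code:python}
from airflow.models import TaskInstance

tis = session.query(TaskInstance).filter(
    TaskInstance.dag_id == task.dag_id,  # the missing DAG filter
    TaskInstance.task_id.in_(downstream_task_ids),
    TaskInstance.execution_date == execution_date,
)
{code}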



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (AIRFLOW-408) Not enough arguments for format string

2018-08-21 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor resolved AIRFLOW-408.
---
Resolution: Fixed

> Not enough arguments for format string
> --
>
> Key: AIRFLOW-408
> URL: https://issues.apache.org/jira/browse/AIRFLOW-408
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Li Xuanji
>Assignee: Li Xuanji
>Priority: Major
>
> https://landscape.io/github/apache/incubator-airflow/559/modules/airflow/bin/cli.py#L68



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-1395) sql_alchemy_conn_cmd does not work when executor is non SequentialExecutor

2018-08-21 Thread Ash Berlin-Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-1395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587375#comment-16587375
 ] 

Ash Berlin-Taylor commented on AIRFLOW-1395:


There were some issues with {{_cmd}} options not working.

This may have been fixed (by correctly checking for cmd options if the non-cmd 
form isn't in the config) in 1.9.0

> sql_alchemy_conn_cmd does not work when executor is non SequentialExecutor
> --
>
> Key: AIRFLOW-1395
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1395
> Project: Apache Airflow
>  Issue Type: Bug
>Affects Versions: Airflow 1.8
> Environment: Linux
>Reporter: mohamed younuskhan
>Assignee: mohamed younuskhan
>Priority: Major
>
> I was trying to use sql_alchemy_conn_cmd to protect the Postgres password. 
> I am using LocalExecutor as the executor. Airflow initialization fails with 
> the exception below. It appears that it loads the default config along with 
> the overridden environment variables: the executor was already overridden 
> while sql_alchemy_conn was still set to the default SQLite, so the 
> validation check fails and throws the exception below.
>   File "/bin/airflow", line 17, in <module>
> from airflow import configuration
>   File "/lib/python2.7/site-packages/airflow/__init__.py", line 29, in 
> <module>
> from airflow import configuration as conf
>   File "/lib/python2.7/site-packages/airflow/configuration.py", line 
> 784, in <module>
> conf.read(AIRFLOW_CONFIG)
>   File "/lib/python2.7/site-packages/airflow/configuration.py", line 
> 636, in read
> self._validate()
>   File "/lib/python2.7/site-packages/airflow/configuration.py", line 
> 548, in _validate
> self.get('core', 'executor')))
> airflow.exceptions.AirflowConfigException: error: cannot use sqlite with the 
> LocalExecutor



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (AIRFLOW-1328) Daily DAG execute the past day

2018-08-21 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-1328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor closed AIRFLOW-1328.
--
Resolution: Not A Bug

> Daily DAG execute the past day
> --
>
> Key: AIRFLOW-1328
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1328
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: DagRun
>Affects Versions: Airflow 1.8
> Environment: debian jessie
>Reporter: Pierre-Antoine Tible
>Priority: Major
>
> Hello,
> I'm running Airflow 1.8 under debian jessie. I installed it via pip.
> I am using the LocalScheduler with a MySQL backend.
> I made a simple DAG with a BashOperator for a daily task (run twice a day): 
> default_args = {
> 'owner': 'airflow',
> 'depends_on_past': False,
> 'start_date': datetime.now() - timedelta(days=1, seconds=6),
> 'email': ['XXX'],
> 'email_on_failure': True,
> 'email_on_retry': False,
> 'retries': 1,
> 'retry_delay': timedelta(minutes=5),
> 'execution_timeout': None,
> #'catchup': False,
> #'backfill': False,
> # 'queue': 'bash_queue',
> # 'pool': 'backfill',
> # 'priority_weight': 10,
> # 'end_date': datetime(2016, 1, 1),
> }
> dag = DAG('campaign-reminder', default_args=default_args, 
> schedule_interval="0,0 7,15 * * *", concurrency=1, max_active_runs=1)
> dag.catchup = False
> t1 = BashOperator(
> task_id='campaign-reminder',
> bash_command=' ',
> dag=dag)
> I did it today, it works, but the execution date was "06-19T15:00:00", we are 
> the 20th, so it's one day behind the schedule.
> My first thought was a mistake with the start_date, so I put in a datetime() 
> and it did the same ...
> I don't understand why.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (AIRFLOW-1572) Add Carbonite to to companies list

2018-08-21 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-1572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor resolved AIRFLOW-1572.

Resolution: Fixed

> Add Carbonite to to companies list
> --
>
> Key: AIRFLOW-1572
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1572
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Adam Boscarino
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (AIRFLOW-1791) Unexpected "AttributeError: 'unicode' object has no attribute 'val'" from Variable.setdefault

2018-08-21 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-1791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor resolved AIRFLOW-1791.

   Resolution: Fixed
Fix Version/s: 1.9.0

> Unexpected "AttributeError: 'unicode' object has no attribute 'val'" from 
> Variable.setdefault
> -
>
> Key: AIRFLOW-1791
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1791
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: core
>Affects Versions: Airflow 1.8
> Environment: Python 2.7, Airflow 1.8.2
>Reporter: Shawn Wang
>Priority: Major
> Fix For: 1.9.0
>
>
> In Variable.setdefault method,
> {code:python}
> obj = Variable.get(key, default_var=default_sentinel, 
> deserialize_json=False)
> if obj is default_sentinel:
> // ...
> else:
> if deserialize_json:
> return json.loads(obj.val)
> else:
> return obj.val
> {code}
> While obj is retrieved by "get" method which has already fetched the val 
> attribute from obj, so this "obj.val" throws the AttributeError.
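
Since Variable.get already returns the value itself rather than the ORM object, a minimal sketch of the fix inside setdefault (reusing the names from the snippet above; the actual patch may differ) is simply to stop dereferencing .val:

{code:python}
obj = Variable.get(key, default_var=default_sentinel,
                   deserialize_json=deserialize_json)
if obj is default_sentinel:
    Variable.set(key, default, serialize_json=deserialize_json)
    return default
return obj  # already the (optionally deserialized) value; no .val here
{code}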



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (AIRFLOW-1146) izip use in Python 3.4

2018-08-21 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-1146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor resolved AIRFLOW-1146.

   Resolution: Fixed
Fix Version/s: 1.9.0

> izip use in Python 3.4
> --
>
> Key: AIRFLOW-1146
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1146
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: hive_hooks
>Affects Versions: Airflow 1.8
>Reporter: Alexander Panzhin
>Priority: Major
> Fix For: 1.9.0
>
>
> Python 3 no longer has itertools.izip, but it is still used in 
> airflow/hooks/hive_hooks.py
> This causes all kinds of havoc.
> This needs to be fixed if Airflow is to be used on Python 3+.
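
A common Python 2/3 shim for this (a sketch of one option, not necessarily how hive_hooks.py was actually fixed):

{code:python}
try:
    from itertools import izip as zip  # Python 2: lazy pairwise iteration
except ImportError:
    pass  # Python 3: the builtin zip is already lazy

row = dict(zip(['field_a', 'field_b'], [1, 2]))  # works on both interpreters
{code}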



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (AIRFLOW-2182) Configured

2018-08-21 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor closed AIRFLOW-2182.
--
Resolution: Invalid

> Configured
> --
>
> Key: AIRFLOW-2182
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2182
> Project: Apache Airflow
>  Issue Type: New Feature
>  Components: authentication
>Reporter: Richard Ferrer
>Assignee: Richard Ferrer
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] ashb commented on issue #3733: [AIRFLOW-491] Add cache parameter in BigQuery query method - with 'api_resource_configs'

2018-08-21 Thread GitBox
ashb commented on issue #3733: [AIRFLOW-491] Add cache parameter in BigQuery 
query method - with 'api_resource_configs'
URL: 
https://github.com/apache/incubator-airflow/pull/3733#issuecomment-414654266
 
 
   Your error seems to be specific to Py 2.7:
   
   ```
   ==
   11) ERROR: test_bql_deprecation_warning 
(tests.contrib.hooks.test_bigquery_hook.TestBigQueryBaseCursor)
   --
  Traceback (most recent call last):
   .tox/py27-backend_postgres/lib/python2.7/site-packages/mock/mock.py line 
1305 in patched
 return func(*args, **keywargs)
   tests/contrib/hooks/test_bigquery_hook.py line 227 in 
test_bql_deprecation_warning
 w[0].message.args[0])
  IndexError: list index out of range
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-169) Hide expire dags in UI

2018-08-21 Thread jack (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587340#comment-16587340
 ] 

jack commented on AIRFLOW-169:
--

To work around this, simply "turn off" the DAGs from the UI; then you have a 
button to hide them.

However, I can see how it may be a problem with the CLI command. If you have 
many DAGs that ran once or had an end date, the list can be large for no good 
reason.

> Hide expire dags in UI
> --
>
> Key: AIRFLOW-169
> URL: https://issues.apache.org/jira/browse/AIRFLOW-169
> Project: Apache Airflow
>  Issue Type: Wish
>  Components: ui
>Reporter: Sumit Maheshwari
>Priority: Major
>
> It would be great if we've option to hide expired schedules from UI. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-1395) sql_alchemy_conn_cmd does not work when executor is non SequentialExecutor

2018-08-21 Thread jack (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-1395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587331#comment-16587331
 ] 

jack commented on AIRFLOW-1395:
---

Your error says you are trying to work with SQLite & LocalExecutor - which you 
can't.

For LocalExecutor you need to use PostgreSQL or MySQL.

Your title isn't related to the details of the error.

> sql_alchemy_conn_cmd does not work when executor is non SequentialExecutor
> --
>
> Key: AIRFLOW-1395
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1395
> Project: Apache Airflow
>  Issue Type: Bug
>Affects Versions: Airflow 1.8
> Environment: Linux
>Reporter: mohamed younuskhan
>Assignee: mohamed younuskhan
>Priority: Major
>
> I was trying to use sql_alchemy_conn_cmd to protect the Postgres password. 
> I am using LocalExecutor as the executor. Airflow initialization fails with 
> the exception below. It appears that it loads the default config along with 
> the overridden environment variables: the executor was already overridden 
> while sql_alchemy_conn was still set to the default SQLite, so the 
> validation check fails and throws the exception below.
>   File "/bin/airflow", line 17, in <module>
> from airflow import configuration
>   File "/lib/python2.7/site-packages/airflow/__init__.py", line 29, in 
> <module>
> from airflow import configuration as conf
>   File "/lib/python2.7/site-packages/airflow/configuration.py", line 
> 784, in <module>
> conf.read(AIRFLOW_CONFIG)
>   File "/lib/python2.7/site-packages/airflow/configuration.py", line 
> 636, in read
> self._validate()
>   File "/lib/python2.7/site-packages/airflow/configuration.py", line 
> 548, in _validate
> self.get('core', 'executor')))
> airflow.exceptions.AirflowConfigException: error: cannot use sqlite with the 
> LocalExecutor



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] ron819 commented on issue #2638: AIRFLOW-1649 - Adding xcom_push to S3KeySensor wildcard matches.

2018-08-21 Thread GitBox
ron819 commented on issue #2638: AIRFLOW-1649 - Adding xcom_push to S3KeySensor 
wildcard matches.
URL: 
https://github.com/apache/incubator-airflow/pull/2638#issuecomment-414648350
 
 
   @andylockran any chance to revisit this?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] xnuinside commented on issue #3733: [AIRFLOW-491] Add cache parameter in BigQuery query method - with 'api_resource_configs'

2018-08-21 Thread GitBox
xnuinside commented on issue #3733: [AIRFLOW-491] Add cache parameter in 
BigQuery query method - with 'api_resource_configs'
URL: 
https://github.com/apache/incubator-airflow/pull/3733#issuecomment-414645184
 
 
   @kaxil and @tswast , pls check


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] xnuinside edited a comment on issue #3733: [AIRFLOW-491] Add cache parameter in BigQuery query method - with 'api_resource_configs'

2018-08-21 Thread GitBox
xnuinside edited a comment on issue #3733: [AIRFLOW-491] Add cache parameter in 
BigQuery query method - with 'api_resource_configs'
URL: 
https://github.com/apache/incubator-airflow/pull/3733#issuecomment-414644891
 
 
   @ashb , hi Ash, sorry for pulling you in, but I know that only you can 
help me with Travis :) I got a strange error and have already re-triggered 
Travis once (got 2 checks with the same error). Thank you in advance!
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] Fokko closed pull request #3776: [AIRFLOW-XXX] Fix minor typo

2018-08-21 Thread GitBox
Fokko closed pull request #3776: [AIRFLOW-XXX] Fix minor typo
URL: https://github.com/apache/incubator-airflow/pull/3776
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/airflow/executors/base_executor.py 
b/airflow/executors/base_executor.py
index 8baed1a250..a989dc4408 100644
--- a/airflow/executors/base_executor.py
+++ b/airflow/executors/base_executor.py
@@ -193,7 +193,7 @@ def execute_async(self,
 
 def end(self):  # pragma: no cover
 """
-This method is called when the caller is done submitting job and is
+This method is called when the caller is done submitting job and
 wants to wait synchronously for the job submitted previously to be
 all done.
 """


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (AIRFLOW-2923) LatestOnlyOperator cascade skip through all_done and dummy

2018-08-21 Thread Dmytro Kulyk (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmytro Kulyk updated AIRFLOW-2923:
--
Description: 
DAG with a consolidating point (calc_ready : dummy)

as per [https://airflow.apache.org/concepts.html#latest-run-only] the given task 
should run even when catching up DagRuns for previous periods.
However, LatestOnlyOperator cascades its skip through calc_ready even though it 
is a dummy with trigger_rule=all_done.
Same behavior when trigger_rule=all_success
{code}
t_ready = DummyOperator(
  task_id = 'calc_ready',
  trigger_rule = TriggerRule.ALL_DONE,
  dag=dag)
{code}
!screenshot-1.png!

  was:
DAG with a consolidating point (calc_ready : dummy) 

as per https://airflow.apache.org/concepts.html#latest-run-only the given task 
should run even when catching up DagRuns for previous periods.
However, LatestOnlyOperator cascades its skip through `calc_ready` even though 
it is a `dummy` with `trigger_rule`=`all_done`.
Same behavior when `trigger_rule`=`all_success`

{code:python}
t_ready = DummyOperator(
  task_id = 'calc_ready',
  trigger_rule = TriggerRule.ALL_DONE,
  dag=dag)
{code} 

 !screenshot-1.png!


> LatestOnlyOperator cascade skip through all_done and dummy
> --
>
> Key: AIRFLOW-2923
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2923
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: scheduler
>Affects Versions: 1.9.0
> Environment: CeleryExecutor, 2-node cluster, RMQ, PostgreSQL
>Reporter: Dmytro Kulyk
>Priority: Major
>  Labels: cascade, latestonly, skip
> Attachments: cube_update.py, screenshot-1.png
>
>
> DAG with a consolidating point (calc_ready : dummy)
> as per [https://airflow.apache.org/concepts.html#latest-run-only] the given task 
> should run even when catching up DagRuns for previous periods.
> However, LatestOnlyOperator cascades its skip through calc_ready even though it 
> is a dummy with trigger_rule=all_done.
> Same behavior when trigger_rule=all_success
> {code}
> t_ready = DummyOperator(
>   task_id = 'calc_ready',
>   trigger_rule = TriggerRule.ALL_DONE,
>   dag=dag)
> {code}
> !screenshot-1.png!
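
For anyone trying to reproduce this, a minimal self-contained sketch of the DAG 
shape described above (the DAG id, schedule, and upstream task name are 
hypothetical; imports follow the Airflow 1.9-era module layout). On a catch-up 
DagRun, the skip from LatestOnlyOperator cascades into calc_ready even with 
trigger_rule=all_done:

{code:python}
from datetime import datetime

from airflow import DAG
from airflow.operators.dummy_operator import DummyOperator
from airflow.operators.latest_only_operator import LatestOnlyOperator
from airflow.utils.trigger_rule import TriggerRule

dag = DAG(
    dag_id='latest_only_cascade_repro',  # hypothetical DAG id
    start_date=datetime(2018, 1, 1),
    schedule_interval='@daily')

# Skips all downstream tasks for any run that is not the latest one
latest_only = LatestOnlyOperator(task_id='latest_only', dag=dag)

# Hypothetical upstream work gated by latest_only
t_calc = DummyOperator(task_id='calc_step', dag=dag)

# Consolidating point: per the docs, ALL_DONE should let this run
# even when upstream tasks were skipped
t_ready = DummyOperator(
    task_id='calc_ready',
    trigger_rule=TriggerRule.ALL_DONE,
    dag=dag)

latest_only >> t_calc >> t_ready
{code}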



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-2923) LatestOnlyOperator cascade skip through all_done and dummy

2018-08-21 Thread Dmytro Kulyk (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmytro Kulyk updated AIRFLOW-2923:
--
Environment: CeleryExecutor, 2-node cluster, RMQ, PostgreSQL

> LatestOnlyOperator cascade skip through all_done and dummy
> --
>
> Key: AIRFLOW-2923
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2923
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: scheduler
>Affects Versions: 1.9.0
> Environment: CeleryExecutor, 2-node cluster, RMQ, PostgreSQL
>Reporter: Dmytro Kulyk
>Priority: Major
>  Labels: cascade, latestonly, skip
> Attachments: cube_update.py, screenshot-1.png
>
>
> DAG with a consolidating point (calc_ready : dummy) 
> as per https://airflow.apache.org/concepts.html#latest-run-only the given task 
> should run even when catching up DagRuns for previous periods.
> However, LatestOnlyOperator cascades its skip through `calc_ready` even though 
> it is a `dummy` with `trigger_rule`=`all_done`.
> Same behavior when `trigger_rule`=`all_success`
> {code:python}
> t_ready = DummyOperator(
>   task_id = 'calc_ready',
>   trigger_rule = TriggerRule.ALL_DONE,
>   dag=dag)
> {code} 
>  !screenshot-1.png!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-2923) LatestOnlyOperator cascade skip through all_done and dummy

2018-08-21 Thread Dmytro Kulyk (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmytro Kulyk updated AIRFLOW-2923:
--
Labels: cascade latestonly skip  (was: )

> LatestOnlyOperator cascade skip through all_done and dummy
> --
>
> Key: AIRFLOW-2923
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2923
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: scheduler
>Affects Versions: 1.9.0
>Reporter: Dmytro Kulyk
>Priority: Major
>  Labels: cascade, latestonly, skip
> Attachments: cube_update.py, screenshot-1.png
>
>
> DAG with a consolidating point (calc_ready : dummy) 
> as per https://airflow.apache.org/concepts.html#latest-run-only the given task 
> should run even when catching up DagRuns for previous periods.
> However, LatestOnlyOperator cascades its skip through `calc_ready` even though 
> it is a `dummy` with `trigger_rule`=`all_done`.
> Same behavior when `trigger_rule`=`all_success`
> {code:python}
> t_ready = DummyOperator(
>   task_id = 'calc_ready',
>   trigger_rule = TriggerRule.ALL_DONE,
>   dag=dag)
> {code} 
>  !screenshot-1.png!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] xnuinside commented on issue #3733: [AIRFLOW-491] Add cache parameter in BigQuery query method - with 'src_fmt_configs'

2018-08-21 Thread GitBox
xnuinside commented on issue #3733: [AIRFLOW-491] Add cache parameter in 
BigQuery query method - with 'src_fmt_configs'
URL: 
https://github.com/apache/incubator-airflow/pull/3733#issuecomment-414561148
 
 
   @kaxil, no problem :) I will make the changes today


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services