[jira] [Created] (AIRFLOW-3592) Logs cannot be viewed while in rescheduled state

2018-12-29 Thread Stefan Seelmann (JIRA)
Stefan Seelmann created AIRFLOW-3592:


 Summary: Logs cannot be viewed while in rescheduled state
 Key: AIRFLOW-3592
 URL: https://issues.apache.org/jira/browse/AIRFLOW-3592
 Project: Apache Airflow
  Issue Type: Sub-task
  Components: webserver
Affects Versions: 1.10.1
Reporter: Stefan Seelmann
 Fix For: 1.10.2, 2.0.0






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] seelmann commented on issue #3596: [AIRFLOW-2747] Explicit re-schedule of sensors

2018-12-29 Thread GitBox
seelmann commented on issue #3596: [AIRFLOW-2747] Explicit re-schedule of 
sensors
URL: 
https://github.com/apache/incubator-airflow/pull/3596#issuecomment-450479933
 
 
   I created several sub-tasks in 
https://issues.apache.org/jira/browse/AIRFLOW-2747


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Created] (AIRFLOW-3591) Fix start date, end date, duration for rescheduled tasks

2018-12-29 Thread Stefan Seelmann (JIRA)
Stefan Seelmann created AIRFLOW-3591:


 Summary: Fix start date, end date, duration for rescheduled tasks
 Key: AIRFLOW-3591
 URL: https://issues.apache.org/jira/browse/AIRFLOW-3591
 Project: Apache Airflow
  Issue Type: Sub-task
  Components: webserver
Affects Versions: 1.10.1
Reporter: Stefan Seelmann
 Fix For: 1.10.2, 2.0.0






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] odracci commented on issue #3770: [AIRFLOW-3281] Fix Kubernetes operator with git-sync

2018-12-29 Thread GitBox
odracci commented on issue #3770: [AIRFLOW-3281] Fix Kubernetes operator with 
git-sync
URL: 
https://github.com/apache/incubator-airflow/pull/3770#issuecomment-450485058
 
 
   @dimberman I mentioned it in 
https://github.com/apache/incubator-airflow/pull/3770/files#diff-bbf16e7665ac448883f2ceeb40db35cdR624


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Created] (AIRFLOW-3594) Update License Headers in Python Files

2018-12-29 Thread Felix Uellendall (JIRA)
Felix Uellendall created AIRFLOW-3594:
-

 Summary: Update License Headers in Python Files
 Key: AIRFLOW-3594
 URL: https://issues.apache.org/jira/browse/AIRFLOW-3594
 Project: Apache Airflow
  Issue Type: Task
Reporter: Felix Uellendall
Assignee: Felix Uellendall


Some Python Files still have an old version of the Apache License.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-3594) Unify different License Header

2018-12-29 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16730667#comment-16730667
 ] 

ASF GitHub Bot commented on AIRFLOW-3594:
-

feluelle commented on pull request #4399: [AIRFLOW-3594] Unify different 
License Header
URL: https://github.com/apache/incubator-airflow/pull/4399
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [x] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-3594
 - In case you are fixing a typo in the documentation you can prepend your 
commit with \[AIRFLOW-XXX\], code changes always need a Jira issue.
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI 
changes:
   
   Some files have an old version of the Apache License. This PR updates these 
and so unifies the license header for all files.
   
   ### Tests
   
   - [x] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   ### Commits
   
   - [x] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [x] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - When adding new operators/hooks/sensors, the autoclass documentation 
generation needs to be added.
 - All the public functions and the classes in the PR contain docstrings 
that explain what it does
   
   ### Code Quality
   
   - [x] Passes `flake8`
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Unify different License Header
> --
>
> Key: AIRFLOW-3594
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3594
> Project: Apache Airflow
>  Issue Type: Task
>Reporter: Felix Uellendall
>Assignee: Felix Uellendall
>Priority: Trivial
>
> Some Files still have an old version of the Apache License.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (AIRFLOW-3462) Refactor: Move TaskReschedule out of models.py

2018-12-29 Thread Stefan Seelmann (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Seelmann reassigned AIRFLOW-3462:


Assignee: Stefan Seelmann

> Refactor: Move TaskReschedule out of models.py
> --
>
> Key: AIRFLOW-3462
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3462
> Project: Apache Airflow
>  Issue Type: Task
>  Components: models
>Affects Versions: 1.10.1
>Reporter: Fokko Driesprong
>Assignee: Stefan Seelmann
>Priority: Major
> Fix For: 2.0.0
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (AIRFLOW-3590) In case of reschedule executor should not log success

2018-12-29 Thread Stefan Seelmann (JIRA)
Stefan Seelmann created AIRFLOW-3590:


 Summary: In case of reschedule executor should not log success
 Key: AIRFLOW-3590
 URL: https://issues.apache.org/jira/browse/AIRFLOW-3590
 Project: Apache Airflow
  Issue Type: Sub-task
  Components: executor
Reporter: Stefan Seelmann
 Fix For: 1.10.2, 2.0.0






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-3593) Allow '@' in usernames.

2018-12-29 Thread will-beta (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

will-beta updated AIRFLOW-3593:
---
Description: 
The username provided by *Azure Database for PostgreSQL server* contains an 
'@', but *sql_alchemy_conn* and *result_backend* in *airflow.cfg* do not 
allow it.
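
A workaround sketch (my assumption, not part of the report): percent-encode 
the '@' in the username so it is not mistaken for the host separator, as the 
DB line in the log below already shows (admin%40pg-test1).

{code:python}
# Hedged sketch: quote the username before building sql_alchemy_conn.
# On Python 2 (as in the traceback below) use urllib.quote_plus instead.
from urllib.parse import quote_plus

user = quote_plus('admin@pg-test1')  # -> 'admin%40pg-test1'
sql_alchemy_conn = (
    'postgresql+psycopg2://%s:PASSWORD@pg-test1.postgres.database.chinacloudapi.cn/airflow'
    % user)
{code}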

{panel:title="exception info"}
(virtualenv-airflow) AirFlowTest@AirFlowTest:~/airflow$ airflow initdb
[2018-12-29 11:00:40,925] {settings.py:174} INFO - setting.configure_orm(): 
Using pool settings. pool_size=5, pool_recycle=1800
[2018-12-29 11:00:41,418] {__init__.py:51} INFO - Using executor CeleryExecutor
DB: 
postgresql+psycopg2://admin%40pg-test1:***@pg-test1.postgres.database.chinacloudapi.cn/airflow
[2018-12-29 11:00:41,620] {db.py:338} INFO - Creating tables
Traceback (most recent call last):
  File "/home/AirFlowTest/virtualenv-airflow/bin/airflow", line 32, in 
args.func(args)
  File 
"/home/AirFlowTest/virtualenv-airflow/local/lib/python2.7/site-packages/airflow/bin/cli.py",
 line 1011, in initdb
db_utils.initdb(settings.RBAC)
  File 
"/home/AirFlowTest/virtualenv-airflow/local/lib/python2.7/site-packages/airflow/utils/db.py",
 line 92, in initdb
upgradedb()
  File 
"/home/AirFlowTest/virtualenv-airflow/local/lib/python2.7/site-packages/airflow/utils/db.py",
 line 346, in upgradedb
command.upgrade(config, 'heads')
  File 
"/home/AirFlowTest/virtualenv-airflow/local/lib/python2.7/site-packages/alembic/command.py",
 line 174, in upgrade
script.run_env()
  File 
"/home/AirFlowTest/virtualenv-airflow/local/lib/python2.7/site-packages/alembic/script/base.py",
 line 416, in run_env
util.load_python_file(self.dir, 'env.py')
  File 
"/home/AirFlowTest/virtualenv-airflow/local/lib/python2.7/site-packages/alembic/util/pyfiles.py",
 line 93, in load_python_file
module = load_module_py(module_id, path)
  File 
"/home/AirFlowTest/virtualenv-airflow/local/lib/python2.7/site-packages/alembic/util/compat.py",
 line 79, in load_module_py
mod = imp.load_source(module_id, path, fp)
  File 
"/home/AirFlowTest/virtualenv-airflow/local/lib/python2.7/site-packages/airflow/migrations/env.py",
 line 91, in <module>
run_migrations_online()
  File 
"/home/AirFlowTest/virtualenv-airflow/local/lib/python2.7/site-packages/airflow/migrations/env.py",
 line 78, in run_migrations_online
with connectable.connect() as connection:
  File 
"/home/AirFlowTest/virtualenv-airflow/local/lib/python2.7/site-packages/sqlalchemy/engine/base.py",
 line 2091, in connect
return self._connection_cls(self, **kwargs)
  File 
"/home/AirFlowTest/virtualenv-airflow/local/lib/python2.7/site-packages/sqlalchemy/engine/base.py",
 line 90, in __init__
if connection is not None else engine.raw_connection()
  File 
"/home/AirFlowTest/virtualenv-airflow/local/lib/python2.7/site-packages/sqlalchemy/engine/base.py",
 line 2177, in raw_connection
self.pool.unique_connection, _connection)
  File 
"/home/AirFlowTest/virtualenv-airflow/local/lib/python2.7/site-packages/sqlalchemy/engine/base.py",
 line 2151, in _wrap_pool_connect
e, dialect, self)
  File 
"/home/AirFlowTest/virtualenv-airflow/local/lib/python2.7/site-packages/sqlalchemy/engine/base.py",
 line 1465, in _handle_dbapi_exception_noconnection
exc_info
  File 
"/home/AirFlowTest/virtualenv-airflow/local/lib/python2.7/site-packages/sqlalchemy/util/compat.py",
 line 203, in raise_from_cause
reraise(type(exception), exception, tb=exc_tb, cause=cause)
  File 
"/home/AirFlowTest/virtualenv-airflow/local/lib/python2.7/site-packages/sqlalchemy/engine/base.py",
 line 2147, in _wrap_pool_connect
return fn()
  File 
"/home/AirFlowTest/virtualenv-airflow/local/lib/python2.7/site-packages/sqlalchemy/pool.py",
 line 328, in unique_connection
return _ConnectionFairy._checkout(self)
  File 
"/home/AirFlowTest/virtualenv-airflow/local/lib/python2.7/site-packages/sqlalchemy/pool.py",
 line 768, in _checkout
fairy = _ConnectionRecord.checkout(pool)
  File 
"/home/AirFlowTest/virtualenv-airflow/local/lib/python2.7/site-packages/sqlalchemy/pool.py",
 line 516, in checkout
rec = pool._do_get()
  File 
"/home/AirFlowTest/virtualenv-airflow/local/lib/python2.7/site-packages/sqlalchemy/pool.py",
 line 1140, in _do_get
self._dec_overflow()
  File 
"/home/AirFlowTest/virtualenv-airflow/local/lib/python2.7/site-packages/sqlalchemy/util/langhelpers.py",
 line 66, in __exit__
compat.reraise(exc_type, exc_value, exc_tb)
  File 
"/home/AirFlowTest/virtualenv-airflow/local/lib/python2.7/site-packages/sqlalchemy/pool.py",
 line 1137, in _do_get
return self._create_connection()
  File 
"/home/AirFlowTest/virtualenv-airflow/local/lib/python2.7/site-packages/sqlalchemy/pool.py",
 line 333, in _create_connection
return _ConnectionRecord(self)
  File 
"/home/AirFlowTest/virtualenv-airflow/local/lib/python2.7/site-packages/sqlalchemy/pool.py",
 line 461, in __init__

[GitHub] stale[bot] commented on issue #3605: [AIRFLOW-1238] Decode URL-encoded characters.

2018-12-29 Thread GitBox
stale[bot] commented on issue #3605: [AIRFLOW-1238] Decode URL-encoded 
characters.
URL: 
https://github.com/apache/incubator-airflow/pull/3605#issuecomment-450503386
 
 
   This issue has been automatically marked as stale because it has not had 
recent activity. It will be closed if no further activity occurs. Thank you for 
your contributions.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] jmcarp opened a new pull request #4401: [AIRFLOW-3596] Clean up undefined template variables.

2018-12-29 Thread GitBox
jmcarp opened a new pull request #4401: [AIRFLOW-3596] Clean up undefined 
template variables.
URL: https://github.com/apache/incubator-airflow/pull/4401
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [x] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-3596
 - In case you are fixing a typo in the documentation you can prepend your 
commit with \[AIRFLOW-XXX\], code changes always need a Jira issue.
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI 
changes:
   
   ### Tests
   
   - [ ] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   ### Commits
   
   - [x] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [x] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - When adding new operators/hooks/sensors, the autoclass documentation 
generation needs to be added.
 - All the public functions and the classes in the PR contain docstrings 
that explain what it does
   
   ### Code Quality
   
   - [x] Passes `flake8`
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (AIRFLOW-3594) Unify different License Header

2018-12-29 Thread Felix Uellendall (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Felix Uellendall updated AIRFLOW-3594:
--
Description: Some Files still have an old version of the Apache License.  
(was: Some Python Files still have an old version of the Apache License.)

> Unify different License Header
> --
>
> Key: AIRFLOW-3594
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3594
> Project: Apache Airflow
>  Issue Type: Task
>Reporter: Felix Uellendall
>Assignee: Felix Uellendall
>Priority: Trivial
>
> Some Files still have an old version of the Apache License.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] feluelle opened a new pull request #4400: [AIRFLOW-3595] Add tests for Hive2SambaOperator

2018-12-29 Thread GitBox
feluelle opened a new pull request #4400: [AIRFLOW-3595] Add tests for 
Hive2SambaOperator
URL: https://github.com/apache/incubator-airflow/pull/4400
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [x] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-3595
 - In case you are fixing a typo in the documentation you can prepend your 
commit with \[AIRFLOW-XXX\], code changes always need a Jira issue.
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI 
changes:
   
   - adds missing doc parameter destination_filepath
   - adds missing file close for the tmp file (through context-manager usage; 
see the sketch below)
   - refactoring
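   
   A hedged sketch of the context-manager change (the hive/samba hook names 
follow the operator's existing code; exact calls may differ from this PR):
   
   ```python
   # Using NamedTemporaryFile as a context manager guarantees the temp file
   # is closed (and deleted) even if the Samba push raises.
   from tempfile import NamedTemporaryFile
   
   with NamedTemporaryFile() as tmp_file:
       hive.to_csv(hql=self.hql, csv_filepath=tmp_file.name)
       samba.push_from_local(self.destination_filepath, tmp_file.name)
   ```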
   
   ### Tests
   
   - [x] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   ### Commits
   
   - [x] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [x] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - When adding new operators/hooks/sensors, the autoclass documentation 
generation needs to be added.
 - All the public functions and the classes in the PR contain docstrings 
that explain what it does
   
   ### Code Quality
   
   - [x] Passes `flake8`
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] chinhngt commented on issue #4389: [AIRFLOW-3583] Fix AirflowException import

2018-12-29 Thread GitBox
chinhngt commented on issue #4389: [AIRFLOW-3583] Fix AirflowException import
URL: 
https://github.com/apache/incubator-airflow/pull/4389#issuecomment-450495458
 
 
   @jgao54 Thanks for taking a look. I must have missed something then. Below 
is the exception I got when turning on remote logging to wasb:
   
   webserver_1  | Unable to load the config, contains a configuration error.
   webserver_1  | Traceback (most recent call last):
   webserver_1  |   File "/usr/lib/python3.5/logging/config.py", line 382, in 
resolve
   webserver_1  | found = getattr(found, frag)
   webserver_1  | AttributeError: module 'airflow.utils.log' has no attribute 
'wasb_task_handler'
   webserver_1  | 
   webserver_1  | During handling of the above exception, another exception 
occurred:
   webserver_1  | 
   webserver_1  | Traceback (most recent call last):
   webserver_1  |   File "/usr/lib/python3.5/logging/config.py", line 384, in 
resolve
   webserver_1  | self.importer(used)
   webserver_1  |   File 
"/usr/local/lib/python3.5/dist-packages/airflow/utils/log/wasb_task_handler.py",
 line 23, in <module>
   webserver_1  | from airflow.contrib.hooks.wasb_hook import WasbHook
   webserver_1  |   File 
"/usr/local/lib/python3.5/dist-packages/airflow/contrib/hooks/wasb_hook.py", 
line 21, in <module>
   webserver_1  | from airflow import AirflowException
   webserver_1  | ImportError: cannot import name 'AirflowException'
   webserver_1  | 
   webserver_1  | The above exception was the direct cause of the following 
exception:
   webserver_1  | 
   webserver_1  | Traceback (most recent call last):
   webserver_1  |   File "/usr/lib/python3.5/logging/config.py", line 558, in 
configure
   webserver_1  | handler = self.configure_handler(handlers[name])
   webserver_1  |   File "/usr/lib/python3.5/logging/config.py", line 708, in 
configure_handler
   webserver_1  | klass = self.resolve(cname)
   webserver_1  |   File "/usr/lib/python3.5/logging/config.py", line 391, in 
resolve
   webserver_1  | raise v
   webserver_1  |   File "/usr/lib/python3.5/logging/config.py", line 384, in 
resolve
   webserver_1  | self.importer(used)
   webserver_1  |   File 
"/usr/local/lib/python3.5/dist-packages/airflow/utils/log/wasb_task_handler.py",
 line 23, in <module>
   webserver_1  | from airflow.contrib.hooks.wasb_hook import WasbHook
   webserver_1  |   File 
"/usr/local/lib/python3.5/dist-packages/airflow/contrib/hooks/wasb_hook.py", 
line 21, in <module>
   webserver_1  | from airflow import AirflowException
   webserver_1  | ValueError: Cannot resolve 
'airflow.utils.log.wasb_task_handler.WasbTaskHandler': cannot import name 
'AirflowException'
   webserver_1  | 
   webserver_1  | During handling of the above exception, another exception 
occurred:
   webserver_1  | 
   webserver_1  | Traceback (most recent call last):
   webserver_1  |   File "/usr/local/bin/airflow", line 21, in 
   webserver_1  | from airflow import configuration
   webserver_1  |   File 
"/usr/local/lib/python3.5/dist-packages/airflow/__init__.py", line 36, in 

   webserver_1  | from airflow import settings
   webserver_1  |   File 
"/usr/local/lib/python3.5/dist-packages/airflow/settings.py", line 259, in 

   webserver_1  | configure_logging()
   webserver_1  |   File 
"/usr/local/lib/python3.5/dist-packages/airflow/logging_config.py", line 72, in 
configure_logging
   webserver_1  | raise e
   webserver_1  |   File 
"/usr/local/lib/python3.5/dist-packages/airflow/logging_config.py", line 67, in 
configure_logging
   webserver_1  | dictConfig(logging_config)
   webserver_1  |   File "/usr/lib/python3.5/logging/config.py", line 795, in 
dictConfig
   webserver_1  | dictConfigClass(config).configure()
   webserver_1  |   File "/usr/lib/python3.5/logging/config.py", line 566, in 
configure
   webserver_1  | '%r: %s' % (name, e))
   webserver_1  | ValueError: Unable to configure handler 'processor': Cannot 
resolve 'airflow.utils.log.wasb_task_handler.WasbTaskHandler': cannot import 
name 'AirflowException'
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] stale[bot] commented on issue #1964: [AIRFLOW-722] Add Celery queue sensor

2018-12-29 Thread GitBox
stale[bot] commented on issue #1964: [AIRFLOW-722] Add Celery queue sensor
URL: 
https://github.com/apache/incubator-airflow/pull/1964#issuecomment-450503388
 
 
   This issue has been automatically marked as stale because it has not had 
recent activity. It will be closed if no further activity occurs. Thank you for 
your contributions.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] stale[bot] commented on issue #2747: AIRFLOW-1772: Fix bug with handling cron expressions as an schedule i…

2018-12-29 Thread GitBox
stale[bot] commented on issue #2747: AIRFLOW-1772: Fix bug with handling cron 
expressions as an schedule i…
URL: 
https://github.com/apache/incubator-airflow/pull/2747#issuecomment-450503385
 
 
   This issue has been automatically marked as stale because it has not had 
recent activity. It will be closed if no further activity occurs. Thank you for 
your contributions.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (AIRFLOW-3594) Unify different License Header

2018-12-29 Thread Felix Uellendall (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Felix Uellendall updated AIRFLOW-3594:
--
Summary: Unify different License Header  (was: Update License Headers in 
Python Files)

> Unify different License Header
> --
>
> Key: AIRFLOW-3594
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3594
> Project: Apache Airflow
>  Issue Type: Task
>Reporter: Felix Uellendall
>Assignee: Felix Uellendall
>Priority: Trivial
>
> Some Python Files still have an old version of the Apache License.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (AIRFLOW-3595) Add tests for HiveToSambaOperator

2018-12-29 Thread Felix Uellendall (JIRA)
Felix Uellendall created AIRFLOW-3595:
-

 Summary: Add tests for HiveToSambaOperator
 Key: AIRFLOW-3595
 URL: https://issues.apache.org/jira/browse/AIRFLOW-3595
 Project: Apache Airflow
  Issue Type: Test
Reporter: Felix Uellendall
Assignee: Felix Uellendall






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (AIRFLOW-3327) BiqQuery job checking doesn't include location, which api requires outside US/EU

2018-12-29 Thread Kaxil Naik (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kaxil Naik resolved AIRFLOW-3327.
-
   Resolution: Fixed
Fix Version/s: 2.0.0

> BiqQuery job checking doesn't include location, which api requires outside 
> US/EU
> 
>
> Key: AIRFLOW-3327
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3327
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Daniel Swiegers
>Assignee: Kaxil Naik
>Priority: Minor
>  Labels: google-cloud-bigquery
> Fix For: 1.10.2, 2.0.0
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> We use this API but don't set/pass through the geographical location,
> which is required in regions other than the US and EU.
> Can be seen in contrib/hooks/big_query_hook.py poll_job_complete
> [https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/get]
> |The geographic location of the job. Required except for US and EU. See 
> details at 
> https://cloud.google.com/bigquery/docs/locations#specifying_your_location.|
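
A hedged sketch of what such a fix involves, assuming the 
google-api-python-client interface the hook uses (parameter placement is 
illustrative, not the merged patch):

{code:python}
# Pass the job's location through to jobs().get(); the BigQuery REST API
# requires it for jobs outside the US and EU multi-regions.
job = jobs.get(
    projectId=self.project_id,
    jobId=self.running_job_id,
    location=self.location,  # hypothetical attribute, e.g. 'asia-northeast1'
).execute()
{code}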



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] feluelle opened a new pull request #4399: [AIRFLOW-3594] Unify different License Header

2018-12-29 Thread GitBox
feluelle opened a new pull request #4399: [AIRFLOW-3594] Unify different 
License Header
URL: https://github.com/apache/incubator-airflow/pull/4399
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [x] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-3594
 - In case you are fixing a typo in the documentation you can prepend your 
commit with \[AIRFLOW-XXX\], code changes always need a Jira issue.
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI 
changes:
   
   Some files have an old version of the Apache License. This PR updates these 
and so unifies the license header for all files.
   
   ### Tests
   
   - [x] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   ### Commits
   
   - [x] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [x] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - When adding new operators/hooks/sensors, the autoclass documentation 
generation needs to be added.
 - All the public functions and the classes in the PR contain docstrings 
that explain what it does
   
   ### Code Quality
   
   - [x] Passes `flake8`
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-3595) Add tests for HiveToSambaOperator

2018-12-29 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16730694#comment-16730694
 ] 

ASF GitHub Bot commented on AIRFLOW-3595:
-

feluelle commented on pull request #4400: [AIRFLOW-3595] Add tests for 
Hive2SambaOperator
URL: https://github.com/apache/incubator-airflow/pull/4400
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [x] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-3595
 - In case you are fixing a typo in the documentation you can prepend your 
commit with \[AIRFLOW-XXX\], code changes always need a Jira issue.
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI 
changes:
   
   - adds missing doc parameter destination_filepath
   - adds missing file close for the tmp file (through context-manager usage)
   - refactoring
   
   ### Tests
   
   - [x] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   ### Commits
   
   - [x] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [x] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - When adding new operators/hooks/sensors, the autoclass documentation 
generation needs to be added.
 - All the public functions and the classes in the PR contain docstrings 
that explain what it does
   
   ### Code Quality
   
   - [x] Passes `flake8`
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add tests for HiveToSambaOperator
> -
>
> Key: AIRFLOW-3595
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3595
> Project: Apache Airflow
>  Issue Type: Test
>Reporter: Felix Uellendall
>Assignee: Felix Uellendall
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] codecov-io edited a comment on issue #4400: [AIRFLOW-3595] Add tests for Hive2SambaOperator

2018-12-29 Thread GitBox
codecov-io edited a comment on issue #4400: [AIRFLOW-3595] Add tests for 
Hive2SambaOperator
URL: 
https://github.com/apache/incubator-airflow/pull/4400#issuecomment-450495465
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4400?src=pr&el=h1)
 Report
   > Merging 
[#4400](https://codecov.io/gh/apache/incubator-airflow/pull/4400?src=pr&el=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/e3fd7d4d809e18eef85ad24c9c6dbd2ce1c782a1?src=pr&el=desc)
 will **increase** coverage by `0.18%`.
   > The diff coverage is `100%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/4400/graphs/tree.svg?width=650&token=WdLKlKHOAU&height=150&src=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/4400?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #4400      +/-   ##
   ==========================================
   + Coverage   78.17%   78.35%   +0.18%     
   ==========================================
     Files         204      204              
     Lines       16529    16529              
   ==========================================
   + Hits        12921    12951      +30     
   + Misses       3608     3578      -30
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/incubator-airflow/pull/4400?src=pr&el=tree) 
| Coverage Δ | |
   |---|---|---|
   | 
[airflow/operators/hive\_to\_samba\_operator.py](https://codecov.io/gh/apache/incubator-airflow/pull/4400/diff?src=pr&el=tree#diff-YWlyZmxvdy9vcGVyYXRvcnMvaGl2ZV90b19zYW1iYV9vcGVyYXRvci5weQ==)
 | `100% <100%> (+100%)` | :arrow_up: |
   | 
[airflow/models/\_\_init\_\_.py](https://codecov.io/gh/apache/incubator-airflow/pull/4400/diff?src=pr&el=tree#diff-YWlyZmxvdy9tb2RlbHMvX19pbml0X18ucHk=)
 | `92.76% <0%> (-0.05%)` | :arrow_down: |
   | 
[airflow/hooks/samba\_hook.py](https://codecov.io/gh/apache/incubator-airflow/pull/4400/diff?src=pr&el=tree#diff-YWlyZmxvdy9ob29rcy9zYW1iYV9ob29rLnB5)
 | `38.88% <0%> (+38.88%)` | :arrow_up: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4400?src=pr&el=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4400?src=pr&el=footer).
 Last update 
[e3fd7d4...5fe5acc](https://codecov.io/gh/apache/incubator-airflow/pull/4400?src=pr&el=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] codecov-io commented on issue #4400: [AIRFLOW-3595] Add tests for Hive2SambaOperator

2018-12-29 Thread GitBox
codecov-io commented on issue #4400: [AIRFLOW-3595] Add tests for 
Hive2SambaOperator
URL: 
https://github.com/apache/incubator-airflow/pull/4400#issuecomment-450495465
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4400?src=pr&el=h1)
 Report
   > Merging 
[#4400](https://codecov.io/gh/apache/incubator-airflow/pull/4400?src=pr&el=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/e3fd7d4d809e18eef85ad24c9c6dbd2ce1c782a1?src=pr&el=desc)
 will **increase** coverage by `0.18%`.
   > The diff coverage is `100%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/4400/graphs/tree.svg?width=650&token=WdLKlKHOAU&height=150&src=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/4400?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #4400      +/-   ##
   ==========================================
   + Coverage   78.17%   78.35%   +0.18%     
   ==========================================
     Files         204      204              
     Lines       16529    16529              
   ==========================================
   + Hits        12921    12951      +30     
   + Misses       3608     3578      -30
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/incubator-airflow/pull/4400?src=pr&el=tree) 
| Coverage Δ | |
   |---|---|---|
   | 
[airflow/operators/hive\_to\_samba\_operator.py](https://codecov.io/gh/apache/incubator-airflow/pull/4400/diff?src=pr&el=tree#diff-YWlyZmxvdy9vcGVyYXRvcnMvaGl2ZV90b19zYW1iYV9vcGVyYXRvci5weQ==)
 | `100% <100%> (+100%)` | :arrow_up: |
   | 
[airflow/models/\_\_init\_\_.py](https://codecov.io/gh/apache/incubator-airflow/pull/4400/diff?src=pr&el=tree#diff-YWlyZmxvdy9tb2RlbHMvX19pbml0X18ucHk=)
 | `92.76% <0%> (-0.05%)` | :arrow_down: |
   | 
[airflow/hooks/samba\_hook.py](https://codecov.io/gh/apache/incubator-airflow/pull/4400/diff?src=pr&el=tree#diff-YWlyZmxvdy9ob29rcy9zYW1iYV9ob29rLnB5)
 | `38.88% <0%> (+38.88%)` | :arrow_up: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4400?src=pr&el=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4400?src=pr&el=footer).
 Last update 
[e3fd7d4...5fe5acc](https://codecov.io/gh/apache/incubator-airflow/pull/4400?src=pr&el=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Created] (AIRFLOW-3596) Clean up undefined template variables

2018-12-29 Thread Josh Carp (JIRA)
Josh Carp created AIRFLOW-3596:
--

 Summary: Clean up undefined template variables
 Key: AIRFLOW-3596
 URL: https://issues.apache.org/jira/browse/AIRFLOW-3596
 Project: Apache Airflow
  Issue Type: Improvement
Reporter: Josh Carp
Assignee: Josh Carp


Several jinja templates refer to variables that are never defined. We should 
either provide those variables or stop using them in the templates.
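
As an illustration (my sketch, not necessarily the approach taken in the fix), 
Jinja2's StrictUndefined makes such variables fail loudly instead of rendering 
silently as empty strings:

{code:python}
from jinja2 import Environment, StrictUndefined

env = Environment(undefined=StrictUndefined)
template = env.from_string("Hello {{ name }}")
template.render()  # raises UndefinedError: 'name' is undefined
{code}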



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-496) HiveServer2Hook invokes incorrect Auth mechanism when user not specified

2018-12-29 Thread jack (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16730891#comment-16730891
 ] 

jack commented on AIRFLOW-496:
--

The problem is not in Impala, it's in Airflow. Airflow uses Impala as a 
library, so it should send parameters in the required format. 

I assume it's caused here:

https://github.com/apache/incubator-airflow/blob/master/airflow/hooks/hive_hooks.py#L766
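
A minimal sketch of that suggestion (hypothetical, not the actual patch): 
normalize an empty login to None before it reaches impyla, so the PLAIN auth 
path is taken instead of SASL negotiation.

{code:python}
from impala.dbapi import connect

def get_conn(db):
    # db stands in for the Airflow Connection object used in hive_hooks.py
    return connect(host=db.host,
                   port=db.port,
                   user=db.login or None,  # '' triggers the SASL error below
                   auth_mechanism='PLAIN',
                   database=db.schema or 'default')
{code}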

> HiveServer2Hook invokes incorrect Auth mechanism when user not specified
> 
>
> Key: AIRFLOW-496
> URL: https://issues.apache.org/jira/browse/AIRFLOW-496
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: hive_hooks
>Reporter: Shreyas Joshi
>Assignee: Sandish Kumar HN
>Priority: Major
>
> h3. Summary
> {{HiveServer2Hook}} seems to be ignoring the auth_mechanism when the user is 
> not specified. I am not entirely sure whether the solution should change 
> impyla or Airflow.
> h3. Reproducing the problem
> With this connection string for Hive: 
> {{AIRFLOW_CONN_GH_HIVE=hive2://@localhost:1/}} (No user name and no 
> password)
>  I get the following error from {{HiveServer2hook}}:
> {code}
> from airflow.hooks import HiveServer2Hook
> hive_hook = HiveServer2Hook (hiveserver2_conn_id='GH_HIVE')
> {code}
> {noformat}
> [2016-09-08 14:30:52,420] {base_hook.py:53} INFO - Using connection to: 
> localhost
> Traceback (most recent call last):
>   File "", line 1, in 
>   File 
> "/Users/shreyasjoshis/python-envs/default-env/lib/python3.5/site-packages/airflow/hooks/hive_hooks.py",
>  line 464, in get_conn
> database=db.schema or 'default')
>   File 
> "/Users/shreyasjoshis/python-envs/default-env/lib/python3.5/site-packages/impala/dbapi.py",
>  line 147, in connect
> auth_mechanism=auth_mechanism)
>   File 
> "/Users/shreyasjoshis/python-envs/default-env/lib/python3.5/site-packages/impala/hiveserver2.py",
>  line 658, in connect
> transport.open()
>   File 
> "/Users/shreyasjoshis/python-envs/default-env/lib/python3.5/site-packages/thrift_sasl/__init__.py",
>  line 72, in open
> message=("Could not start SASL: %s" % self.sasl.getError()))
> thriftpy.transport.TTransportException: TTransportException(type=1, 
> message="Could not start SASL: b'Error in sasl_client_start (-4) SASL(-4): no 
> mechanism available: No worthy mechs found'")
> {noformat}
> h3. More detail
> [Here|https://github.com/apache/incubator-airflow/blob/master/airflow/hooks/hive_hooks.py#L591]
>  {{db.login}} ends up being an empty string rather than {{None}}. This seems 
> to cause impala to try sasl. Changing {{db.login}} from an empty string to 
> {{None}} seems to fix the issue. 
> So, the following does not work
> {code}
> from impala.dbapi import connect
> connect (host='localhost', port=1, user='', auth_mechanism='PLAIN', 
> database= 'default')
> {code}
> The error is:
> {noformat}
> Traceback (most recent call last):
>   File "", line 1, in 
>   File 
> "/Users/shreyasjoshis/python-envs/default-env/lib/python3.5/site-packages/impala/dbapi.py",
>  line 147, in connect
> auth_mechanism=auth_mechanism)
>   File 
> "/Users/shreyasjoshis/python-envs/default-env/lib/python3.5/site-packages/impala/hiveserver2.py",
>  line 658, in connect
> transport.open()
>   File 
> "/Users/shreyasjoshis/python-envs/default-env/lib/python3.5/site-packages/thrift_sasl/__init__.py",
>  line 72, in open
> message=("Could not start SASL: %s" % self.sasl.getError()))
> thriftpy.transport.TTransportException: TTransportException(type=1, 
> message="Could not start SASL: b'Error in sasl_client_start (-4) SASL(-4): no 
> mechanism available: No worthy mechs found'")
> {noformat}
> But the following does:
> {code}
> from impala.dbapi import connect
> connect (host='localhost', port=1, user=None, auth_mechanism='PLAIN', 
> database= 'default')
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] jgao54 commented on issue #4389: [AIRFLOW-3583] Fix AirflowException import

2018-12-29 Thread GitBox
jgao54 commented on issue #4389: [AIRFLOW-3583] Fix AirflowException import
URL: 
https://github.com/apache/incubator-airflow/pull/4389#issuecomment-450528332
 
 
   @chinhngt actually you are right, I was able to reproduce. I'd expect the 
airflow init module to be imported, but it turns out that's not the case for 
the logging config. 
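   
   For context, a sketch of the import change this PR is about (hedged; see 
the PR diff for the authoritative version):
   
   ```python
   # Importing from airflow.exceptions avoids importing the airflow package
   # __init__, which is not yet importable while logging is being configured.
   from airflow.exceptions import AirflowException  # works
   # from airflow import AirflowException           # fails during logging setup
   ```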


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (AIRFLOW-3327) BiqQuery job checking doesn't include location, which api requires outside US/EU

2018-12-29 Thread Kaxil Naik (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kaxil Naik updated AIRFLOW-3327:

Fix Version/s: (was: 2.0.0)

> BiqQuery job checking doesn't include location, which api requires outside 
> US/EU
> 
>
> Key: AIRFLOW-3327
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3327
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Daniel Swiegers
>Assignee: Kaxil Naik
>Priority: Minor
>  Labels: google-cloud-bigquery
> Fix For: 1.10.2
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> We use this API but don't set/pass through the geographical location,
> which is required in regions other than the US and EU.
> Can be seen in contrib/hooks/big_query_hook.py poll_job_complete
> [https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/get]
> |The geographic location of the job. Required except for US and EU. See 
> details at 
> https://cloud.google.com/bigquery/docs/locations#specifying_your_location.|



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] dimberman commented on issue #3770: [AIRFLOW-3281] Fix Kubernetes operator with git-sync

2018-12-29 Thread GitBox
dimberman commented on issue #3770: [AIRFLOW-3281] Fix Kubernetes operator with 
git-sync
URL: 
https://github.com/apache/incubator-airflow/pull/3770#issuecomment-450533177
 
 
   @odracci yeah that LGTM then. @Fokko good to go!


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Assigned] (AIRFLOW-2790) snakebite syntax error: baseTime = min(time * (1L << retries), cap);

2018-12-29 Thread Yohei Onishi (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yohei Onishi reassigned AIRFLOW-2790:
-

Assignee: Yohei Onishi

> snakebite syntax error: baseTime = min(time * (1L << retries), cap);
> 
>
> Key: AIRFLOW-2790
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2790
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: hooks
>Affects Versions: 1.9.0
> Environment: Amazon Linux
>Reporter: Yohei Onishi
>Assignee: Yohei Onishi
>Priority: Major
>
> Does anybody know how can I fix this issue?
>  * Got the following error when importing 
> airflow.operators.sensors.ExternalTaskSensor.
>  * apache-airflow 1.9.0 depends on snakebite 2.11.0 and it does not work with 
> Python3. https://github.com/spotify/snakebite/issues/250
> [2018-07-23 06:42:51,828] \{models.py:288} ERROR - Failed to import: 
> /home/airflow/airflow/dags/example_task_sensor2.py
> Traceback (most recent call last):
>   File "/usr/local/lib/python3.6/site-packages/airflow/models.py", line 285, 
> in process_file
> m = imp.load_source(mod_name, filepath)
>   File "/usr/lib64/python3.6/imp.py", line 172, in load_source
> module = _load(spec)
>   File "", line 675, in _load
>   File "", line 655, in _load_unlocked
>   File "", line 678, in exec_module
>   File "", line 205, in _call_with_frames_removed
>   File "/home/airflow/airflow/dags/example_task_sensor2.py", line 10, in 
> 
> from airflow.operators.sensors import ExternalTaskSensor
>   File "/usr/local/lib/python3.6/site-packages/airflow/operators/sensors.py", 
> line 34, in <module>
> from airflow.hooks.hdfs_hook import HDFSHook
>   File "/usr/local/lib/python3.6/site-packages/airflow/hooks/hdfs_hook.py", 
> line 20, in <module>
> from snakebite.client import Client, HAClient, Namenode, AutoConfigClient
>   File "/usr/local/lib/python3.6/site-packages/snakebite/client.py", line 1473
> baseTime = min(time * (1L << retries), cap);
> ^
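
For reference, 1L is Python 2's long-integer literal, which Python 3 removed; 
the Python 3 equivalent of the offending snakebite line would be:

{code:python}
# Python 3 integers are unbounded, so the L suffix is simply dropped.
baseTime = min(time * (1 << retries), cap)
{code}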



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-3571) GoogleCloudStorageToBigQueryOperator succeeds to uploading CSV file from GCS to BiqQuery but a task is failed

2018-12-29 Thread Yohei Onishi (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16730906#comment-16730906
 ] 

Yohei Onishi commented on AIRFLOW-3571:
---

OK will do

> GoogleCloudStorageToBigQueryOperator succeeds to uploading CSV file from GCS 
> to BiqQuery but a task is failed
> -
>
> Key: AIRFLOW-3571
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3571
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: contrib
>Affects Versions: 1.10.0
>Reporter: Yohei Onishi
>Assignee: Yohei Onishi
>Priority: Major
>
> I am using the following services in the asia-northeast1-c zone:
>  * GCS: asia-northeast1-c
>  * BigQuery dataset and table: asia-northeast1-c
>  * Composer: asia-northeast1-c
> My task created by GoogleCloudStorageToBigQueryOperator succeeded in 
> uploading a CSV file from a GCS bucket to a BigQuery table, but the task 
> failed due to the following error.
>   
> {code:java}
> [2018-12-26 21:35:47,464] {base_task_runner.py:107} INFO - Job 146: Subtask 
> bq_load_data_into_dest_table_from_gcs [2018-12-26 21:35:47,464] 
> {discovery.py:871} INFO - URL being requested: GET 
> https://www.googleapis.com/bigquery/v2/projects/my-project/jobs/job_abc123?alt=json
> [2018-12-26 21:35:47,931] {models.py:1736} ERROR - ('BigQuery job status 
> check failed. Final error was: %s', 404)
> Traceback (most recent call last):
>   File "/usr/local/lib/airflow/airflow/contrib/hooks/bigquery_hook.py", line 
> 981, in run_with_configuration
> jobId=self.running_job_id).execute()
>   File "/usr/local/lib/python3.6/site-packages/googleapiclient/_helpers.py", 
> line 130, in positional_wrapper
> return wrapped(*args, **kwargs)
>   File "/usr/local/lib/python3.6/site-packages/googleapiclient/http.py", line 
> 851, in execute
> raise HttpError(resp, content, uri=self.uri)
> googleapiclient.errors.HttpError: <HttpError 404 when requesting 
> https://www.googleapis.com/bigquery/v2/projects/my-project/jobs/job_abc123?alt=json
>  returned "Not found: Job my-project:job_abc123">
> During handling of the above exception, another exception occurred:
> Traceback (most recent call last):
>   File "/usr/local/lib/airflow/airflow/models.py", line 1633, in _run_raw_task
> result = task_copy.execute(context=context)
>   File "/usr/local/lib/airflow/airflow/contrib/operators/gcs_to_bq.py", line 
> 237, in execute
> time_partitioning=self.time_partitioning)
>   File "/usr/local/lib/airflow/airflow/contrib/hooks/bigquery_hook.py", line 
> 951, in run_load
> return self.run_with_configuration(configuration)
>   File "/usr/local/lib/airflow/airflow/contrib/hooks/bigquery_hook.py", line 
> 1003, in run_with_configuration
> err.resp.status)
> Exception: ('BigQuery job status check failed. Final error was: %s', 404)
> {code}
> The task failed to find the job {color:#ff0000}my-project:job_abc123{color} 
> but the correct job id is {color:#ff0000}my-project:asia-northeast1:job_abc123{color}. 
> (Note: this is just an example, not an actual id.)
>  I suppose the operator does not handle the zone properly.
>   
> {code:java}
> $ bq show -j my-project:asia-northeast1:job_abc123
> Job my-project:asia-northeast1:job_abc123
> Job Type State Start Time Duration User Email Bytes Processed Bytes Billed 
> Billing Tier Labels
> --------------------------------------------------------------------------
> load SUCCESS 27 Dec 05:35:47 0:00:01 my-service-account-id-email
> {code}
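
The fully qualified id in the bq output above has the form 
project:location:job_id. A hedged sketch of handling it (illustrative only, 
not the eventual patch):

{code:python}
def split_job_id(qualified_id):
    """Split 'project:location:job_id' so the location can be passed on."""
    parts = qualified_id.split(':')
    if len(parts) == 3:
        project, location, job_id = parts
    else:  # US/EU jobs may omit the location segment
        project, job_id = parts
        location = None
    return project, location, job_id
{code}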



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-3316) GCS to BQ operator leaves schema_fields operator unset when autodetect=True

2018-12-29 Thread jack (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16730901#comment-16730901
 ] 

jack commented on AIRFLOW-3316:
---

I'm unable to reproduce this issue.

First, schema_fields is an optional field. You don't need to assign None; if 
there is no schema then don't specify this field.

Second, even if you specify schema_fields = None it doesn't matter, as that is 
the default value of schema_fields.

The block
{code:java}
if not self.schema_fields:{code}
is there for cases where schema_fields needs to be overwritten; after this 
block it will either have a value or be None.

Please provide your DAG for us to test (a minimal example is sketched below).
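
Something like this (my hypothetical reconstruction of the reporter's setup):

{code:python}
# Hypothetical task matching the report: schema_fields left unset and
# autodetect enabled, which exercises the branch quoted below.
from airflow.contrib.operators.gcs_to_bq import (
    GoogleCloudStorageToBigQueryOperator)

load_parquet = GoogleCloudStorageToBigQueryOperator(
    task_id='gcs_to_bq',
    bucket='my-bucket',                      # hypothetical
    source_objects=['data/part-0.parquet'],  # hypothetical
    source_format='PARQUET',
    destination_project_dataset_table='my_dataset.my_table',
    autodetect=True)                         # schema_fields stays None
{code}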

> GCS to BQ operator leaves schema_fields operator unset when autodetect=True
> ---
>
> Key: AIRFLOW-3316
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3316
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: operators
>Affects Versions: 1.10.1
>Reporter: Conrad Lee
>Assignee: Conrad Lee
>Priority: Minor
>
> When I use the GoogleCloudStorageToBigQueryOperator to load data from Parquet 
> into BigQuery, I leave the schema_fields argument set to 'None' and set 
> autodetect=True.
>  
> This causes the following error: 
>  
> {code:java}
> [2018-11-08 09:42:03,690] {models.py:1736} ERROR - local variable 
> 'schema_fields' referenced before assignment
> Traceback (most recent call last):
>   File "/usr/local/lib/airflow/airflow/models.py", line 1633, in _run_raw_task
> result = task_copy.execute(context=context)
>   File "/home/airflow/gcs/plugins/bq_operator_updated.py", line 2018, in 
> execute
> schema_fields=schema_fields)
> UnboundLocalError: local variable 'schema_fields' referenced before assignment
> {code}
>  
> The problem is that this set of checks, in which the schema_fields variable 
> is set, neglects to cover all the cases:
> {code:java}
> if not self.schema_fields:
>   if self.schema_object and self.source_format != 'DATASTORE_BACKUP':
> gcs_hook = GoogleCloudStorageHook(
> google_cloud_storage_conn_id=self.google_cloud_storage_conn_id, 
> delegate_to=self.delegate_to)
> schema_fields = json.loads(gcs_hook.download(
>   self.bucket,
>   self.schema_object).decode("utf-8"))
>   elif self.schema_object is None and self.autodetect is False:
> raise ValueError('At least one of `schema_fields`, `schema_object`, '
> 'or `autodetect` must be passed.')
> else:
> schema_fields = self.schema_fields
> {code}
> After the `elif` we need to handle the case where autodetect is set to True.  
> This can be done by simply adding two lines:
> {code:java}
> if not self.schema_fields:
>   if self.schema_object and self.source_format != 'DATASTORE_BACKUP':
> gcs_hook = GoogleCloudStorageHook(
> google_cloud_storage_conn_id=self.google_cloud_storage_conn_id, 
> delegate_to=self.delegate_to)
> schema_fields = json.loads(gcs_hook.download(
>   self.bucket,
>   self.schema_object).decode("utf-8"))
>   elif self.schema_object is None and self.autodetect is False:
> raise ValueError('At least one of `schema_fields`, `schema_object`, '
> 'or `autodetect` must be passed.')
>   else:
> schema_fields = None
> else:
> schema_fields = self.schema_fields{code}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] kaxil closed pull request #4364: [AIRFLOW-3550] Standardize GKE hook.

2018-12-29 Thread GitBox
kaxil closed pull request #4364: [AIRFLOW-3550] Standardize GKE hook.
URL: https://github.com/apache/incubator-airflow/pull/4364
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/airflow/contrib/hooks/gcp_container_hook.py b/airflow/contrib/hooks/gcp_container_hook.py
index 3934f07a95..4a610e56c9 100644
--- a/airflow/contrib/hooks/gcp_container_hook.py
+++ b/airflow/contrib/hooks/gcp_container_hook.py
@@ -21,7 +21,7 @@
 import time
 
 from airflow import AirflowException, version
-from airflow.hooks.base_hook import BaseHook
+from airflow.contrib.hooks.gcp_api_base_hook import GoogleCloudBaseHook
 
 from google.api_core.exceptions import AlreadyExists, NotFound
 from google.api_core.gapic_v1.method import DEFAULT
@@ -34,15 +34,24 @@
 OPERATIONAL_POLL_INTERVAL = 15
 
 
-class GKEClusterHook(BaseHook):
+class GKEClusterHook(GoogleCloudBaseHook):
 
-    def __init__(self, project_id, location):
-        self.project_id = project_id
+    def __init__(self,
+                 gcp_conn_id='google_cloud_default',
+                 delegate_to=None,
+                 location=None):
+        super(GKEClusterHook, self).__init__(
+            gcp_conn_id=gcp_conn_id, delegate_to=delegate_to)
+        self._client = None
         self.location = location
 
-        # Add client library info for better error tracking
-        client_info = ClientInfo(client_library_version='airflow_v' + version.version)
-        self.client = container_v1.ClusterManagerClient(client_info=client_info)
+    def get_client(self):
+        if self._client is None:
+            credentials = self._get_credentials()
+            # Add client library info for better error tracking
+            client_info = ClientInfo(client_library_version='airflow_v' + version.version)
+            self._client = container_v1.ClusterManagerClient(credentials=credentials,
+                                                             client_info=client_info)
+        return self._client
 
     @staticmethod
     def _dict_to_proto(py_dict, proto):
@@ -60,13 +69,15 @@ def _dict_to_proto(py_dict, proto):
         dict_json_str = json.dumps(py_dict)
         return json_format.Parse(dict_json_str, proto)
 
-    def wait_for_operation(self, operation):
+    def wait_for_operation(self, operation, project_id=None):
         """
         Given an operation, continuously fetches the status from Google Cloud until either
         completion or an error occurring
 
         :param operation: The Operation to wait for
         :type operation: A google.cloud.container_V1.gapic.enums.Operator
+        :param project_id: Google Cloud Platform project ID
+        :type project_id: str
         :return: A new, updated operation fetched from Google Cloud
         """
         self.log.info("Waiting for OPERATION_NAME %s" % operation.name)
@@ -79,20 +90,22 @@ def wait_for_operation(self, operation):
             raise exceptions.GoogleCloudError(
                 "Operation has failed with status: %s" % operation.status)
         # To update status of operation
-        operation = self.get_operation(operation.name)
+        operation = self.get_operation(operation.name,
+                                       project_id=project_id or self.project_id)
         return operation
 
-    def get_operation(self, operation_name):
+    def get_operation(self, operation_name, project_id=None):
         """
         Fetches the operation from Google Cloud
 
         :param operation_name: Name of operation to fetch
         :type operation_name: str
+        :param project_id: Google Cloud Platform project ID
+        :type project_id: str
         :return: The new, updated operation from Google Cloud
         """
-        return self.client.get_operation(project_id=self.project_id,
-                                         zone=self.location,
-                                         operation_id=operation_name)
+        return self.get_client().get_operation(project_id=project_id or self.project_id,
+                                               zone=self.location,
+                                               operation_id=operation_name)
 
     @staticmethod
     def _append_label(cluster_proto, key, val):
@@ -114,7 +127,7 @@ def _append_label(cluster_proto, key, val):
         cluster_proto.resource_labels.update({key: val})
         return cluster_proto
 
-    def delete_cluster(self, name, retry=DEFAULT, timeout=DEFAULT):
+    def delete_cluster(self, name, project_id=None, retry=DEFAULT, timeout=DEFAULT):
         """
         Deletes the cluster, including the Kubernetes endpoint and all
         worker nodes. Firewalls and routes that were configured during
@@ -125,6 +138,8 @@ def delete_cluster(self, name, retry=DEFAULT, timeout=DEFAULT):
 
         :param name: The name of the cluster to delete

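For reference, a minimal usage sketch of the hook after this refactor (the connection id, location, cluster and project names below are placeholder assumptions, not values from the PR):

{code:python}
from airflow.contrib.hooks.gcp_container_hook import GKEClusterHook

# Credentials now come from an Airflow connection instead of the
# machine's default service account, and the API client is built
# lazily on first use via get_client().
hook = GKEClusterHook(gcp_conn_id='google_cloud_default',
                      location='europe-west1-b')

# project_id can be passed explicitly; per the diff above, methods fall
# back to the connection's project when it is omitted.
hook.delete_cluster(name='example-cluster', project_id='example-project')
{code}
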
[jira] [Resolved] (AIRFLOW-3568) S3ToGoogleCloudStorageOperator failed after succeeding in copying files from s3

2018-12-29 Thread Kaxil Naik (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kaxil Naik resolved AIRFLOW-3568.
-
   Resolution: Fixed
Fix Version/s: 1.10.2

https://github.com/apache/incubator-airflow/pull/4371

> S3ToGoogleCloudStorageOperator failed after succeeding in copying files from 
> s3
> ---
>
> Key: AIRFLOW-3568
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3568
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: contrib
>Affects Versions: 1.10.0
>Reporter: Yohei Onishi
>Assignee: Yohei Onishi
>Priority: Major
> Fix For: 1.10.2
>
>
> I tried to copy files from s3 to gcs using 
> S3ToGoogleCloudStorageOperator. The file was successfully uploaded to GCS but 
> the task failed with the following error.
> {code:java}
> [2018-12-26 07:56:33,062] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 [2018-12-26 07:56:33,062] {discovery.py:871} INFO - 
> URL being requested: POST 
> https://www.googleapis.com/upload/storage/v1/b/stg-rfid-etl-tmp/o?name=rfid_wh%2Fuq%2Fjp%2Fno_resp_carton_1D%2F2018%2F12%2F24%2F21%2Fno_resp_carton_20181224210201.csv=json=media
> [2018-12-26 07:56:33,214] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 [2018-12-26 07:56:33,213] {s3_to_gcs_operator.py:177} 
> INFO - All done, uploaded 1 files to Google Cloud Storage
> [2018-12-26 07:56:33,217] {models.py:1736} ERROR - Object of type 'set' is 
> not JSON serializable
> Traceback (most recent call last):
>   File "/usr/local/lib/airflow/airflow/models.py", line 1637, in _run_raw_task
>     self.xcom_push(key=XCOM_RETURN_KEY, value=result)
>   File "/usr/local/lib/airflow/airflow/models.py", line 1983, in xcom_push
>     execution_date=execution_date or self.execution_date)
>   File "/usr/local/lib/airflow/airflow/utils/db.py", line 74, in wrapper
>     return func(*args, **kwargs)
>   File "/usr/local/lib/airflow/airflow/models.py", line 4531, in set
>     value = json.dumps(value).encode('UTF-8')
>   File "/usr/local/lib/python3.6/json/__init__.py", line 231, in dumps
>     return _default_encoder.encode(obj)
>   File "/usr/local/lib/python3.6/json/encoder.py", line 199, in encode
>     chunks = self.iterencode(o, _one_shot=True)
>   File "/usr/local/lib/python3.6/json/encoder.py", line 257, in iterencode
>     return _iterencode(o, 0)
>   File "/usr/local/lib/python3.6/json/encoder.py", line 180, in default
>     o.__class__.__name__)
> TypeError: Object of type 'set' is not JSON serializable
> [2018-12-26 07:56:33,220] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 [2018-12-26 07:56:33,217] {models.py:1736} ERROR - 
> Object of type 'set' is not JSON serializable
> [2018-12-26 07:56:33,220] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 Traceback (most recent call last):
> [2018-12-26 07:56:33,220] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3   File "/usr/local/lib/airflow/airflow/models.py", 
> line 1637, in _run_raw_task
> [2018-12-26 07:56:33,220] {models.py:1756} INFO - Marking task as UP_FOR_RETRY
> [2018-12-26 07:56:33,220] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 self.xcom_push(key=XCOM_RETURN_KEY, value=result)
> [2018-12-26 07:56:33,220] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3   File "/usr/local/lib/airflow/airflow/models.py", 
> line 1983, in xcom_push
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 execution_date=execution_date or 
> self.execution_date)
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3   File "/usr/local/lib/airflow/airflow/utils/db.py", 
> line 74, in wrapper
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 return func(*args, **kwargs)
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3   File "/usr/local/lib/airflow/airflow/models.py", 
> line 4531, in set
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 value = json.dumps(value).encode('UTF-8')
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3   File "/usr/local/lib/python3.6/json/__init__.py", 
> line 231, in dumps
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 return _default_encoder.encode(obj)
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3   File "/usr/local/lib/python3.6/json/encoder.py", 
> line 199, in encode
> 

[jira] [Commented] (AIRFLOW-2939) `set` fails in case of `exisiting_files is None` and in case of `json.dumps`

2018-12-29 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16730853#comment-16730853
 ] 

ASF GitHub Bot commented on AIRFLOW-2939:
-

kaxil commented on pull request #4371: [AIRFLOW-2939][AIRFLOW-3568] fix 
TypeError on GoogleCloudStorageToS3Operator / S3ToGoogleCloudStorageOperator
URL: https://github.com/apache/incubator-airflow/pull/4371
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> `set` fails in case of `exisiting_files is None` and in case of `json.dumps`
> 
>
> Key: AIRFLOW-2939
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2939
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: operators
>Affects Versions: 2.0.0
>Reporter: Kiyoshi Nomo
>Assignee: Yohei Onishi
>Priority: Major
> Fix For: 1.10.2
>
>
> h1. Problems
> h2. TypeError: 'NoneType' object is not iterable
> [https://github.com/apache/incubator-airflow/blob/06b62c42b0b55ea55b86b130317594738d2f36a2/airflow/contrib/operators/gcs_to_s3.py#L91]
>  
> {code:java}
> >>> set(None)
> Traceback (most recent call last):
> File "", line 1, in 
> TypeError: 'NoneType' object is not iterable
> {code}
>  
> h2. TypeError: set(['a']) is not JSON serializable
> [https://github.com/apache/incubator-airflow/blob/b78c7fb8512f7a40f58b46530e9b3d5562fe84ea/airflow/models.py#L4483]
>  
> {code:python}
> >>> json.dumps(set(['a']))
> Traceback (most recent call last):
> File "", line 1, in 
> File "/usr/local/opt/pyenv/versions/2.7.11/lib/python2.7/json/__init__.py", 
> line 244, in dumps
> return _default_encoder.encode(obj)
> File "/usr/local/opt/pyenv/versions/2.7.11/lib/python2.7/json/encoder.py", 
> line 207, in encode
> chunks = self.iterencode(o, _one_shot=True)
> File "/usr/local/opt/pyenv/versions/2.7.11/lib/python2.7/json/encoder.py", 
> line 270, in iterencode
> return _iterencode(o, 0)
> File "/usr/local/opt/pyenv/versions/2.7.11/lib/python2.7/json/encoder.py", 
> line 184, in default
> raise TypeError(repr(o) + " is not JSON serializable")
> TypeError: set(['a']) is not JSON serializable
> {code}
>  
> h1. Solution
>  * Check that the existing files are not None.
>  * Convert both lists to `set`s, take the difference, and convert the result 
> back to a `list`.
> {code:python}
> if existing_files is not None:
> files = list(set(files) - set(existing_files))
> {code}
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-3371) BigQueryHook's Ability to Create View

2018-12-29 Thread Kaxil Naik (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kaxil Naik updated AIRFLOW-3371:

Fix Version/s: (was: 2.0.0)
   1.10.2

> BigQueryHook's Ability to Create View
> -
>
> Key: AIRFLOW-3371
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3371
> Project: Apache Airflow
>  Issue Type: New Feature
>Reporter: Ryan Yuan
>Assignee: Ryan Yuan
>Priority: Major
> Fix For: 1.10.2
>
>
> Modify *BigQueryBaseCursor.create_empty_table()* to take an optional 
> 'view' parameter for creating a view in BigQuery.
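
A hedged sketch of what the resulting call could look like; the 'view' dict mirrors the Table.view resource of the BigQuery v2 REST API, and all identifiers are placeholders:

{code:python}
# Hypothetical call shape; only the 'view' key is new here.
cursor.create_empty_table(
    project_id='example-project',
    dataset_id='example_dataset',
    table_id='example_view',
    view={
        'query': 'SELECT * FROM example_dataset.base_table',
        'useLegacySql': False,
    },
)
{code}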



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-3355) Fix BigQueryCursor.execute to work with Python3

2018-12-29 Thread Kaxil Naik (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kaxil Naik updated AIRFLOW-3355:

Fix Version/s: (was: 2.0.0)
   1.10.2

> Fix BigQueryCursor.execute to work with Python3
> ---
>
> Key: AIRFLOW-3355
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3355
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: gcp, hooks
>Reporter: Kengo Seki
>Assignee: Kengo Seki
>Priority: Major
> Fix For: 1.10.2
>
>
> {{BigQueryCursor.execute}} uses {{dict.iteritems}} internally, so it fails 
> with Python3 if binding parameters are provided.
> {code}
> In [1]: import sys
> In [2]: sys.version
> Out[2]: '3.6.6 (default, Sep 12 2018, 18:26:19) \n[GCC 8.0.1 20180414 
> (experimental) [trunk revision 259383]]'
> In [3]: from airflow.contrib.hooks.bigquery_hook import BigQueryHook
> In [4]: hook = BigQueryHook()
> In [5]: conn = hook.get_conn()
> [2018-11-15 19:01:35,856] {discovery.py:267} INFO - URL being requested: GET 
> https://www.googleapis.com/discovery/v1/apis/bigquery/v2/rest
> In [6]: cur = conn.cursor()
> In [7]: cur.execute("SELECT count(*) FROM ds.t WHERE c = %(v)d", {"v": 0})
> ---
> AttributeError                            Traceback (most recent call last)
> <ipython-input-7-...> in <module>()
> ----> 1 cur.execute("SELECT count(*) FROM ds.t WHERE c = %(v)d", {"v": 0})
> ~/dev/incubator-airflow/airflow/contrib/hooks/bigquery_hook.py in execute(self, operation, parameters)
>    1561         """
>    1562         sql = _bind_parameters(operation,
> -> 1563                                parameters) if parameters else operation
>    1564         self.job_id = self.run_query(sql)
>    1565 
> ~/dev/incubator-airflow/airflow/contrib/hooks/bigquery_hook.py in _bind_parameters(operation, parameters)
>    1684     # inspired by MySQL Python Connector (conversion.py)
>    1685     string_parameters = {}
> -> 1686     for (name, value) in parameters.iteritems():
>    1687         if value is None:
>    1688             string_parameters[name] = 'NULL'
> AttributeError: 'dict' object has no attribute 'iteritems'
> {code}
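
One straightforward fix is to iterate with a construct that exists on both Python 2 and 3; a sketch of the corrected helper (shape taken from the traceback above, the exact escaping rules are assumptions):

{code:python}
def _bind_parameters(operation, parameters):
    # dict.items() exists on both py2 and py3, unlike iteritems().
    string_parameters = {}
    for (name, value) in parameters.items():
        if value is None:
            string_parameters[name] = 'NULL'
        elif isinstance(value, str):
            string_parameters[name] = "'" + value.replace("'", "\\'") + "'"
        else:
            string_parameters[name] = str(value)
    return operation % string_parameters
{code}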



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-3332) Add BigQuery Streaming insert_all to BigQueryHook

2018-12-29 Thread Kaxil Naik (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kaxil Naik updated AIRFLOW-3332:

Fix Version/s: (was: 2.0.0)
   1.10.2

> Add BigQuery Streaming insert_all to BigQueryHook
> -
>
> Key: AIRFLOW-3332
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3332
> Project: Apache Airflow
>  Issue Type: New Feature
>Reporter: Ryan Yuan
>Assignee: Ryan Yuan
>Priority: Major
> Fix For: 1.10.2
>
>
> Add a function to BigQueryHook to allow inserting one or more rows into a 
> BigQuery table.
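
A hedged sketch of the underlying REST call via the discovery-based client the hook already wraps (tabledata.insertAll); the `service` handle and all identifiers are assumptions:

{code:python}
# Streaming insert of two rows; insertId enables best-effort deduplication.
body = {
    'rows': [
        {'insertId': '1', 'json': {'item': 'a', 'qty': 1}},
        {'insertId': '2', 'json': {'item': 'b', 'qty': 2}},
    ],
}
service.tabledata().insertAll(
    projectId='example-project',
    datasetId='example_dataset',
    tableId='example_table',
    body=body,
).execute()
{code}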



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (AIRFLOW-2863) GKEClusterHook catches wrong exception

2018-12-29 Thread Kaxil Naik (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kaxil Naik resolved AIRFLOW-2863.
-
   Resolution: Fixed
Fix Version/s: 1.10.2

> GKEClusterHook catches wrong exception
> --
>
> Key: AIRFLOW-2863
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2863
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Cameron Moberg
>Assignee: Cameron Moberg
>Priority: Minor
> Fix For: 1.10.2
>
>
> Instead of successfully catching the error and reporting success, it reports 
> a failure, since it catches the wrong error.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-3550) GKEClusterHook doesn't use gcp_conn_id

2018-12-29 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16730856#comment-16730856
 ] 

ASF GitHub Bot commented on AIRFLOW-3550:
-

kaxil commented on pull request #4364: [AIRFLOW-3550] Standardize GKE hook.
URL: https://github.com/apache/incubator-airflow/pull/4364
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> GKEClusterHook doesn't use gcp_conn_id
> --
>
> Key: AIRFLOW-3550
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3550
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: contrib
>Affects Versions: 1.10.0, 1.10.1
>Reporter: Wilson Lian
>Priority: Major
>
> The hook doesn't inherit from GoogleCloudBaseHook. API calls are made using 
> the default service account (if present).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-2917) Set AIRFLOW__CORE__SQL_ALCHEMY_CONN only when needed for k8s executor

2018-12-29 Thread Kaxil Naik (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kaxil Naik updated AIRFLOW-2917:

Fix Version/s: (was: 2.0.0)
   1.10.2

> Set AIRFLOW__CORE__SQL_ALCHEMY_CONN only when needed for k8s executor
> -
>
> Key: AIRFLOW-2917
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2917
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: executor
>Affects Versions: 1.10.0
>Reporter: John Cheng
>Assignee: John Cheng
>Priority: Minor
> Fix For: 1.10.2
>
>
> In the Kubernetes executor, `AIRFLOW__CORE__SQL_ALCHEMY_CONN` is set as an 
> environment variable even when it is already specified in a configmap or secret.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Reopened] (AIRFLOW-2997) Support for clustered tables in Bigquery hooks/operators

2018-12-29 Thread Kaxil Naik (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kaxil Naik reopened AIRFLOW-2997:
-

> Support for clustered tables in Bigquery hooks/operators
> 
>
> Key: AIRFLOW-2997
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2997
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: gcp
>Reporter: Gordon Ball
>Priority: Minor
> Fix For: 1.10.2
>
>
> Bigquery support for clustered tables was added (at GCP "Beta" level) on 
> 2018-07-30. This feature allows load or table-creating query operations to 
> request that data be stored sorted by a subset of columns, allowing more 
> efficient (and potentially cheaper) subsequent queries.
>  Support for specifying fields to cluster on should be added to at least the 
> bigquery hook, load-from-GCS operator and query operator.
>  Documentation: https://cloud.google.com/bigquery/docs/clustered-tables
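
In the jobs API, clustering is requested via a `clustering.fields` list next to `timePartitioning` (clustering currently requires a partitioned destination table); a sketch of the relevant load-job configuration fragment, with placeholder identifiers:

{code:python}
configuration = {
    'load': {
        'destinationTable': {
            'projectId': 'example-project',
            'datasetId': 'example_dataset',
            'tableId': 'example_table',
        },
        'timePartitioning': {'type': 'DAY', 'field': 'event_date'},
        # Rows are stored sorted by these columns within each partition.
        'clustering': {'fields': ['customer_id', 'country']},
    },
}
{code}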



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-2640) Add Cassandra table sensor

2018-12-29 Thread Kaxil Naik (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kaxil Naik updated AIRFLOW-2640:

Fix Version/s: (was: 2.0.0)
   1.10.2

> Add Cassandra table sensor
> --
>
> Key: AIRFLOW-2640
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2640
> Project: Apache Airflow
>  Issue Type: New Feature
>Reporter: Kengo Seki
>Assignee: Kengo Seki
>Priority: Major
> Fix For: 1.10.2
>
>
> Just like a partition sensor for Hive, add a sensor to wait for a table to be 
> created in a Cassandra cluster.
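
A minimal sketch of the check such a sensor's poke() could perform; the helper below is a standalone assumption built on the DataStax driver, where connecting populates `cluster.metadata`:

{code:python}
from cassandra.cluster import Cluster

def table_exists(contact_points, keyspace, table):
    """Return True once keyspace.table is visible in the cluster metadata."""
    cluster = Cluster(contact_points)
    try:
        cluster.connect()  # connecting refreshes cluster.metadata
        ks = cluster.metadata.keyspaces.get(keyspace)
        return ks is not None and table in ks.tables
    finally:
        cluster.shutdown()

# A sensor's poke(self, context) would then just return, e.g.,
# table_exists(['cassandra-host'], 'example_ks', 'example_table').
{code}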



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-2916) Add argument `verify` for AwsHook() and S3 related sensors/operators

2018-12-29 Thread Kaxil Naik (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kaxil Naik updated AIRFLOW-2916:

Fix Version/s: (was: 2.0.0)
   1.10.2

> Add argument `verify` for AwsHook() and S3 related sensors/operators
> 
>
> Key: AIRFLOW-2916
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2916
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: hooks, operators
>Reporter: Xiaodong DENG
>Assignee: Xiaodong DENG
>Priority: Minor
> Fix For: 1.10.2
>
>
> The AwsHook() and S3-related operators/sensors depend on the boto3 package.
> In boto3, when we initiate a client or a resource, argument `verify` is 
> provided (https://boto3.readthedocs.io/en/latest/reference/core/session.html 
> ).
> It is useful when
>  # users want to use a different CA cert bundle than the one used by botocore.
>  # users want to have '--no-verify-ssl'. This is especially useful when we're 
> using on-premises S3 or other implementations of object storage, like IBM's 
> Cloud Object Storage.
> However, this feature is not provided in Airflow for S3 yet.
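
For context, this is the boto3 knob the issue proposes to surface; the CA path below is a placeholder:

{code:python}
import boto3

# Use a custom CA bundle, e.g. for an on-premises S3-compatible store.
s3 = boto3.client('s3', verify='/etc/ssl/certs/private-ca.pem')

# Or skip certificate verification, the equivalent of '--no-verify-ssl'.
s3_insecure = boto3.client('s3', verify=False)
{code}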



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-2889) Fix typos detected by github.com/client9/misspell

2018-12-29 Thread Kaxil Naik (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kaxil Naik updated AIRFLOW-2889:

Fix Version/s: (was: 2.0.0)
   1.10.2

> Fix typos detected by github.com/client9/misspell
> -
>
> Key: AIRFLOW-2889
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2889
> Project: Apache Airflow
>  Issue Type: Improvement
>Reporter: Kazuhiro Sera
>Priority: Minor
> Fix For: 1.10.2
>
>
> Fixing typos is sometimes very hard, since they are not easy to catch in 
> visual review. Recently, I discovered a very useful tool for this, 
> [misspell](https://github.com/client9/misspell).
> This pull request fixes minor typos detected by 
> [misspell](https://github.com/client9/misspell) except for the false 
> positives. If you would like me to work on other files as well, let me know.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-491) Add cache parameter in BigQuery query method

2018-12-29 Thread Kaxil Naik (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kaxil Naik updated AIRFLOW-491:
---
Fix Version/s: (was: 2.0.0)
   1.10.2

> Add cache parameter in BigQuery query method
> 
>
> Key: AIRFLOW-491
> URL: https://issues.apache.org/jira/browse/AIRFLOW-491
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: contrib, gcp
>Affects Versions: 1.7.1
>Reporter: Chris Riccomini
>Assignee: Iuliia Volkova
>Priority: Major
> Fix For: 1.10.2
>
>
> The current BigQuery query() method does not expose a use_query_cache 
> parameter; the underlying API's useQueryCache option always defaults to true (see 
> [here|https://cloud.google.com/bigquery/docs/reference/v2/jobs#configuration.query]).
>  I'd like to disable query caching for some data consistency checks.
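
At the REST level this maps to `configuration.query.useQueryCache`; a sketch of a query-job configuration with caching disabled (identifiers are placeholders):

{code:python}
configuration = {
    'query': {
        'query': 'SELECT COUNT(*) FROM example_dataset.example_table',
        # Force BigQuery to read live table state instead of cached results.
        'useQueryCache': False,
    },
}
{code}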



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-2758) Add a sensor for MongoDB

2018-12-29 Thread Kaxil Naik (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kaxil Naik updated AIRFLOW-2758:

Fix Version/s: (was: 2.0.0)
   1.10.2

> Add a sensor for MongoDB
> 
>
> Key: AIRFLOW-2758
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2758
> Project: Apache Airflow
>  Issue Type: New Feature
>Reporter: Kengo Seki
>Assignee: Kengo Seki
>Priority: Major
> Fix For: 1.10.2
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-2755) k8s workers think DAGs are always in `/tmp/dags`

2018-12-29 Thread Kaxil Naik (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kaxil Naik updated AIRFLOW-2755:

Fix Version/s: (was: 2.0.0)
   1.10.2

> k8s workers think DAGs are always in `/tmp/dags`
> 
>
> Key: AIRFLOW-2755
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2755
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: configuration, worker
>Reporter: Aldo
>Assignee: Aldo
>Priority: Minor
> Fix For: 1.10.2
>
>
> We have Airflow configured to use the `KubernetesExecutor` and run tasks in 
> newly created pods.
> I tried to use the `PythonOperator` to import the python callable from a 
> python module located in the DAGs directory as [that should be 
> possible|https://github.com/apache/incubator-airflow/blob/c7a472ed6b0d8a4720f57ba1140c8cf665757167/airflow/__init__.py#L42].
>  Airflow complained that the module was not found.
> After a fair amount of digging we found that the issue was that the workers 
> have the `AIRFLOW__CORE__DAGS_FOLDER` environment variable set to `/tmp/dags` 
> as [you can see from the 
> code|https://github.com/apache/incubator-airflow/blob/master/airflow/contrib/kubernetes/worker_configuration.py#L84].
> Unsetting that environment variable from within the task's pod and running the 
> task manually worked as expected. I think that this path should be 
> configurable (I'll try adding a `kubernetes.worker_dags_folder` 
> configuration).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-2655) Default Kubernetes worker configurations are inconsistent

2018-12-29 Thread Kaxil Naik (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kaxil Naik updated AIRFLOW-2655:

Fix Version/s: (was: 2.0.0)
   1.10.2

> Default Kubernetes worker configurations are inconsistent
> -
>
> Key: AIRFLOW-2655
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2655
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: executor
>Affects Versions: 1.10.0
>Reporter: Shintaro Murakami
>Priority: Minor
> Fix For: 1.10.2
>
>
> If the optional config `airflow_configmap` is not set, the worker is configured 
> with `LocalExecutor` and a sql_alchemy_conn that starts with `sqlite`.
> This combination is not allowed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-3402) Set default kubernetes affinity and toleration settings in airflow.cfg

2018-12-29 Thread Kaxil Naik (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kaxil Naik updated AIRFLOW-3402:

Fix Version/s: 1.10.2

> Set default kubernetes affinity and toleration settings in airflow.cfg
> --
>
> Key: AIRFLOW-3402
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3402
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: kubernetes
>Reporter: Kevin Pullin
>Priority: Major
> Fix For: 1.10.2
>
>
> Currently airflow supports setting kubernetes `affinity` and `toleration` 
> configuration inside dags using either a `KubernetesExecutorConfig` 
> definition or using the `KubernetesPodOperator`.
> To avoid having to set and maintain this configuration in every 
> dag, it'd be useful to be able to set these globally in the 
> airflow.cfg file.  One use case is to force all kubernetes pods to run on a 
> particular set of dedicated airflow nodes, which requires both affinity rules 
> and tolerations.
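
For comparison, the per-task form that would move into airflow.cfg looks roughly like this (a sketch; the dicts follow the Kubernetes PodSpec schema, and the task, callable, and label names are placeholders):

{code:python}
from airflow.operators.python_operator import PythonOperator

task = PythonOperator(
    task_id='pinned_task',
    python_callable=my_callable,  # placeholder callable
    dag=dag,                      # placeholder DAG
    executor_config={
        'KubernetesExecutor': {
            'affinity': {
                'nodeAffinity': {
                    'requiredDuringSchedulingIgnoredDuringExecution': {
                        'nodeSelectorTerms': [{
                            'matchExpressions': [{
                                'key': 'dedicated',
                                'operator': 'In',
                                'values': ['airflow'],
                            }],
                        }],
                    },
                },
            },
            'tolerations': [{
                'key': 'dedicated',
                'operator': 'Equal',
                'value': 'airflow',
                'effect': 'NoSchedule',
            }],
        },
    },
)
{code}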



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] stale[bot] closed pull request #4127: Bug Fix: Secrets object and key separated by ":"

2018-12-29 Thread GitBox
stale[bot] closed pull request #4127: Bug Fix: Secrets object and key separated 
by ":"
URL: https://github.com/apache/incubator-airflow/pull/4127
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/airflow/contrib/kubernetes/worker_configuration.py b/airflow/contrib/kubernetes/worker_configuration.py
index 74658e384a..83fa93e431 100644
--- a/airflow/contrib/kubernetes/worker_configuration.py
+++ b/airflow/contrib/kubernetes/worker_configuration.py
@@ -97,7 +97,7 @@ def _get_secrets(self):
         """Defines any necessary secrets for the pod executor"""
         worker_secrets = []
         for env_var_name, obj_key_pair in six.iteritems(self.kube_config.kube_secrets):
-            k8s_secret_obj, k8s_secret_key = obj_key_pair.split('=')
+            k8s_secret_obj, k8s_secret_key = obj_key_pair.split(':')
             worker_secrets.append(
                 Secret('env', env_var_name, k8s_secret_obj, k8s_secret_key))
         return worker_secrets
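
A small sketch reproducing the parsing this change fixes; the secret names are placeholders, and the mapping mimics what the [kubernetes_secrets] section of airflow.cfg would feed in:

{code:python}
import six

# ENV_VAR_NAME -> '<k8s-secret-object>:<key-within-the-secret>'
kube_secrets = {
    'SQL_ALCHEMY_CONN': 'airflow-secrets:sql_alchemy_conn',
}

for env_var_name, obj_key_pair in six.iteritems(kube_secrets):
    k8s_secret_obj, k8s_secret_key = obj_key_pair.split(':')
    print(env_var_name, '<-', k8s_secret_obj, '/', k8s_secret_key)
{code}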


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Resolved] (AIRFLOW-2939) `set` fails in case of `exisiting_files is None` and in case of `json.dumps`

2018-12-29 Thread Kaxil Naik (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kaxil Naik resolved AIRFLOW-2939.
-
   Resolution: Fixed
Fix Version/s: 1.10.2

Resolved by https://github.com/apache/incubator-airflow/pull/4371

> `set` fails in case of `exisiting_files is None` and in case of `json.dumps`
> 
>
> Key: AIRFLOW-2939
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2939
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: operators
>Affects Versions: 2.0.0
>Reporter: Kiyoshi Nomo
>Assignee: Yohei Onishi
>Priority: Major
> Fix For: 1.10.2
>
>
> h1. Problems
> h2. TypeError: 'NoneType' object is not iterable
> [https://github.com/apache/incubator-airflow/blob/06b62c42b0b55ea55b86b130317594738d2f36a2/airflow/contrib/operators/gcs_to_s3.py#L91]
>  
> {code:java}
> >>> set(None)
> Traceback (most recent call last):
> File "", line 1, in 
> TypeError: 'NoneType' object is not iterable
> {code}
>  
> h2. TypeError: set(['a']) is not JSON serializable
> [https://github.com/apache/incubator-airflow/blob/b78c7fb8512f7a40f58b46530e9b3d5562fe84ea/airflow/models.py#L4483]
>  
> {code:python}
> >>> json.dumps(set(['a']))
> Traceback (most recent call last):
> File "", line 1, in 
> File "/usr/local/opt/pyenv/versions/2.7.11/lib/python2.7/json/__init__.py", 
> line 244, in dumps
> return _default_encoder.encode(obj)
> File "/usr/local/opt/pyenv/versions/2.7.11/lib/python2.7/json/encoder.py", 
> line 207, in encode
> chunks = self.iterencode(o, _one_shot=True)
> File "/usr/local/opt/pyenv/versions/2.7.11/lib/python2.7/json/encoder.py", 
> line 270, in iterencode
> return _iterencode(o, 0)
> File "/usr/local/opt/pyenv/versions/2.7.11/lib/python2.7/json/encoder.py", 
> line 184, in default
> raise TypeError(repr(o) + " is not JSON serializable")
> TypeError: set(['a']) is not JSON serializable
> {code}
>  
> h1. Solution
>  * Check that the existing files are not None.
>  * Convert both lists to `set`s, take the difference, and convert the result 
> back to a `list`.
> {code:python}
> if existing_files is not None:
> files = list(set(files) - set(existing_files))
> {code}
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-2887) Add to BigQueryBaseCursor methods for creating insert dataset

2018-12-29 Thread Kaxil Naik (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kaxil Naik updated AIRFLOW-2887:

Fix Version/s: (was: 2.0.0)
   1.10.2

> Add to BigQueryBaseCursor methods for creating insert dataset
> -
>
> Key: AIRFLOW-2887
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2887
> Project: Apache Airflow
>  Issue Type: New Feature
>Reporter: Iuliia Volkova
>Assignee: Iuliia Volkova
>Priority: Minor
> Fix For: 1.10.2
>
>
> BigQueryBaseCursor currently only has:
> def delete_dataset(self, project_id, dataset_id)
> There is no hook method to create a dataset 
> ([https://cloud.google.com/bigquery/docs/reference/rest/v2/datasets/insert]).
>   
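
A hedged sketch of the missing method on top of the discovery client the cursor already holds; the method name create_empty_dataset and the minimal body are assumptions:

{code:python}
def create_empty_dataset(self, project_id, dataset_id):
    """Wraps datasets.insert from the BigQuery v2 REST API (sketch)."""
    dataset = {
        'datasetReference': {
            'projectId': project_id,
            'datasetId': dataset_id,
        },
    }
    return self.service.datasets().insert(
        projectId=project_id, body=dataset).execute()
{code}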



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] kaxil closed pull request #4371: [AIRFLOW-2939][AIRFLOW-3568] fix TypeError on GoogleCloudStorageToS3Operator / S3ToGoogleCloudStorageOperator

2018-12-29 Thread GitBox
kaxil closed pull request #4371: [AIRFLOW-2939][AIRFLOW-3568] fix TypeError on 
GoogleCloudStorageToS3Operator / S3ToGoogleCloudStorageOperator
URL: https://github.com/apache/incubator-airflow/pull/4371
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/airflow/contrib/operators/gcs_to_s3.py b/airflow/contrib/operators/gcs_to_s3.py
index 23a4e9cec8..6029661f37 100644
--- a/airflow/contrib/operators/gcs_to_s3.py
+++ b/airflow/contrib/operators/gcs_to_s3.py
@@ -101,7 +101,7 @@ def execute(self, context):
         # Google Cloud Storage and not in S3
         bucket_name, _ = S3Hook.parse_s3_url(self.dest_s3_key)
         existing_files = s3_hook.list_keys(bucket_name)
-        files = set(files) - set(existing_files)
+        files = list(set(files) - set(existing_files))
 
         if files:
             hook = GoogleCloudStorageHook(
diff --git a/airflow/contrib/operators/s3_to_gcs_operator.py b/airflow/contrib/operators/s3_to_gcs_operator.py
index 6fbe2c0b83..9008c2da1c 100644
--- a/airflow/contrib/operators/s3_to_gcs_operator.py
+++ b/airflow/contrib/operators/s3_to_gcs_operator.py
@@ -152,7 +152,7 @@ def execute(self, context):
         else:
             existing_files.append(f)
 
-        files = set(files) - set(existing_files)
+        files = list(set(files) - set(existing_files))
         if len(files) > 0:
             self.log.info('{0} files are going to be synced: {1}.'.format(
                 len(files), files))
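
The list() conversion matters because the operator's return value is pushed to XCom through json.dumps, which rejects sets; a minimal demonstration with placeholder file names:

{code:python}
import json

files = ['a.csv', 'b.csv', 'c.csv']
existing_files = ['b.csv']

to_sync = set(files) - set(existing_files)
# json.dumps(to_sync) raises "TypeError: Object of type 'set' is not
# JSON serializable" -- the failure reported in AIRFLOW-3568.
print(json.dumps(list(to_sync)))  # fine once converted back to a list
{code}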


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services