[jira] [Reopened] (AIRFLOW-3562) Remove DagBag dependency inside the webserver

2018-12-26 Thread Fokko Driesprong (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fokko Driesprong reopened AIRFLOW-3562:
---

> Remove DagBag dependency inside the webserver
> -
>
> Key: AIRFLOW-3562
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3562
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: webserver
>Affects Versions: 1.10.0, 1.10.1
>Reporter: Peter van 't Hof
>Assignee: Peter van 't Hof
>Priority: Critical
> Fix For: 2.0.0
>
>
> Currently the webserver depends on the dag directory and reparses the 
> dag files. Because this happens with gunicorn, each process has its own 
> DagBag. This means the webserver is stateful: depending on which process 
> is assigned a request, the information can change. If all information 
> comes from the database, this will not happen. The only process that 
> should add dags/dagruns is the scheduler.
> The webserver should be able to rely on the database alone. Currently not 
> all information is in the database. This is fixable; work on it has 
> already started in AIRFLOW-3561.
> I do notice this might conflict with the brace change that is coming. Any 
> suggestions on how to approach this?
>  
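
As an illustration of the direction proposed here, a minimal sketch of a 
DagBag-free lookup that resolves a DAG's metadata purely from the database 
via the DagModel ORM class (the helper name is hypothetical, not actual 
webserver code):

{code:python}
from airflow import settings
from airflow.models import DagModel

def get_dag_metadata(dag_id):
    # Hypothetical helper: every gunicorn worker reads the same state from
    # the metadata database, and only the scheduler ever parses DAG files.
    session = settings.Session()
    try:
        return session.query(DagModel).filter(
            DagModel.dag_id == dag_id).first()
    finally:
        session.close()
{code}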



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (AIRFLOW-3561) Improve some views

2018-12-26 Thread Peter van 't Hof (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter van 't Hof resolved AIRFLOW-3561.
---
   Resolution: Fixed
Fix Version/s: 2.0.0

Merged

> Improve some views
> --
>
> Key: AIRFLOW-3561
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3561
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: webserver
>Reporter: Peter van 't Hof
>Assignee: Peter van 't Hof
>Priority: Minor
> Fix For: 2.0.0
>
>
> Some views interact with the DagBag even though it is not needed for the 
> query.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (AIRFLOW-3562) Remove DagBag dependency inside the webserver

2018-12-26 Thread Fokko Driesprong (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fokko Driesprong reassigned AIRFLOW-3562:
-

Assignee: Peter van 't Hof

> Remove DagBag dependency inside the webserver
> -
>
> Key: AIRFLOW-3562
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3562
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: webserver
>Affects Versions: 1.10.0, 1.10.1
>Reporter: Peter van 't Hof
>Assignee: Peter van 't Hof
>Priority: Critical
> Fix For: 2.0.0
>
>
> Currently the webserver depends on the dag directory and reparses the 
> dag files. Because this happens with gunicorn, each process has its own 
> DagBag. This means the webserver is stateful: depending on which process 
> is assigned a request, the information can change. If all information 
> comes from the database, this will not happen. The only process that 
> should add dags/dagruns is the scheduler.
> The webserver should be able to rely on the database alone. Currently not 
> all information is in the database. This is fixable; work on it has 
> already started in AIRFLOW-3561.
> I do notice this might conflict with the brace change that is coming. Any 
> suggestions on how to approach this?
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (AIRFLOW-3562) Remove DagBag dependency inside the webserver

2018-12-26 Thread Fokko Driesprong (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fokko Driesprong closed AIRFLOW-3562.
-
   Resolution: Fixed
Fix Version/s: 2.0.0

> Remove DagBag dependency inside the webserver
> -
>
> Key: AIRFLOW-3562
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3562
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: webserver
>Affects Versions: 1.10.0, 1.10.1
>Reporter: Peter van 't Hof
>Assignee: Peter van 't Hof
>Priority: Critical
> Fix For: 2.0.0
>
>
> Currently the webserver depends on the dag directory and reparses the 
> dag files. Because this happens with gunicorn, each process has its own 
> DagBag. This means the webserver is stateful: depending on which process 
> is assigned a request, the information can change. If all information 
> comes from the database, this will not happen. The only process that 
> should add dags/dagruns is the scheduler.
> The webserver should be able to rely on the database alone. Currently not 
> all information is in the database. This is fixable; work on it has 
> already started in AIRFLOW-3561.
> I do notice this might conflict with the brace change that is coming. Any 
> suggestions on how to approach this?
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-3568) S3ToGoogleCloudStorageOperator failed after succeeding in copying files from s3

2018-12-26 Thread Yohei Onishi (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16729402#comment-16729402
 ] 

Yohei Onishi commented on AIRFLOW-3568:
---

OK I will check when I have time.

> S3ToGoogleCloudStorageOperator failed after succeeding in copying files from 
> s3
> ---
>
> Key: AIRFLOW-3568
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3568
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: contrib
>Affects Versions: 1.10.0
>Reporter: Yohei Onishi
>Assignee: Yohei Onishi
>Priority: Major
>
> I tried to copy files from S3 to GCS using 
> S3ToGoogleCloudStorageOperator. The file was successfully uploaded to GCS, but 
> the task failed with the following error.
> {code:java}
> [2018-12-26 07:56:33,062] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 [2018-12-26 07:56:33,062] {discovery.py:871} INFO - 
> URL being requested: POST 
> https://www.googleapis.com/upload/storage/v1/b/stg-rfid-etl-tmp/o?name=rfid_wh%2Fuq%2Fjp%2Fno_resp_carton_1D%2F2018%2F12%2F24%2F21%2Fno_resp_carton_20181224210201.csv=json=media
> [2018-12-26 07:56:33,214] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 [2018-12-26 07:56:33,213] {s3_to_gcs_operator.py:177} 
> INFO - All done, uploaded 1 files to Google Cloud Storage
> [2018-12-26 07:56:33,217] {models.py:1736} ERROR - Object of type 'set' is 
> not JSON serializable
> Traceback (most recent call last):
>   File "/usr/local/lib/airflow/airflow/models.py", line 1637, in _run_raw_task
>     self.xcom_push(key=XCOM_RETURN_KEY, value=result)
>   File "/usr/local/lib/airflow/airflow/models.py", line 1983, in xcom_push
>     execution_date=execution_date or self.execution_date)
>   File "/usr/local/lib/airflow/airflow/utils/db.py", line 74, in wrapper
>     return func(*args, **kwargs)
>   File "/usr/local/lib/airflow/airflow/models.py", line 4531, in set
>     value = json.dumps(value).encode('UTF-8')
>   File "/usr/local/lib/python3.6/json/__init__.py", line 231, in dumps
>     return _default_encoder.encode(obj)
>   File "/usr/local/lib/python3.6/json/encoder.py", line 199, in encode
>     chunks = self.iterencode(o, _one_shot=True)
>   File "/usr/local/lib/python3.6/json/encoder.py", line 257, in iterencode
>     return _iterencode(o, 0)
>   File "/usr/local/lib/python3.6/json/encoder.py", line 180, in default
>     o.__class__.__name__)
> TypeError: Object of type 'set' is not JSON serializable
> [2018-12-26 07:56:33,220] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 [2018-12-26 07:56:33,217] {models.py:1736} ERROR - 
> Object of type 'set' is not JSON serializable
> [2018-12-26 07:56:33,220] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 Traceback (most recent call last):
> [2018-12-26 07:56:33,220] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3   File "/usr/local/lib/airflow/airflow/models.py", 
> line 1637, in _run_raw_task
> [2018-12-26 07:56:33,220] {models.py:1756} INFO - Marking task as UP_FOR_RETRY
> [2018-12-26 07:56:33,220] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 self.xcom_push(key=XCOM_RETURN_KEY, value=result)
> [2018-12-26 07:56:33,220] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3   File "/usr/local/lib/airflow/airflow/models.py", 
> line 1983, in xcom_push
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 execution_date=execution_date or 
> self.execution_date)
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3   File "/usr/local/lib/airflow/airflow/utils/db.py", 
> line 74, in wrapper
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 return func(*args, **kwargs)
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3   File "/usr/local/lib/airflow/airflow/models.py", 
> line 4531, in set
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 value = json.dumps(value).encode('UTF-8')
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3   File "/usr/local/lib/python3.6/json/__init__.py", 
> line 231, in dumps
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 return _default_encoder.encode(obj)
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3   File "/usr/local/lib/python3.6/json/encoder.py", 
> line 199, in encode
> [2018-12-26 07:56:33,222] {base_task_runner.py:107} INFO - Job 39: 

[GitHub] XD-DENG opened a new pull request #4375: [AIRFLOW-XXX] Fix/complete example code in plugins.rst

2018-12-26 Thread GitBox
XD-DENG opened a new pull request #4375: [AIRFLOW-XXX] Fix/complete example 
code in plugins.rst
URL: https://github.com/apache/incubator-airflow/pull/4375
 
 
   ### Jira
   
   Pure doc change
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI 
changes:
   
   How `AppBuilderBaseView` should be imported was not included in the example 
code. This may confuse users: in the whole codebase, the only place where the 
import is shown is `tests/plugins/test_plugin.py`, which normal users of 
Airflow may never look at.
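
   For reference, the import in question, as it appears in 
   `tests/plugins/test_plugin.py` (the view class below is a hypothetical 
   minimal example, not the PR's exact text):

   ```python
   from flask_appbuilder import BaseView as AppBuilderBaseView, expose

   class MyAppBuilderView(AppBuilderBaseView):
       # Minimal illustrative view; the key piece is the import above,
       # which the docs example previously omitted.
       @expose("/")
       def hello(self):
           return "Hello from an AppBuilder view"
   ```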


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-3571) GoogleCloudStorageToBigQueryOperator succeeds to uploading CSV file from GCS to BiqQuery but a task is failed

2018-12-26 Thread Yohei Onishi (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16729399#comment-16729399
 ] 

Yohei Onishi commented on AIRFLOW-3571:
---

Thanks, will do.

> GoogleCloudStorageToBigQueryOperator succeeds to uploading CSV file from GCS 
> to BiqQuery but a task is failed
> -
>
> Key: AIRFLOW-3571
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3571
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: contrib
>Affects Versions: 1.10.0
>Reporter: Yohei Onishi
>Assignee: Yohei Onishi
>Priority: Major
>
> I am using the following services in the asia-northeast1-c zone:
>  * GCS: asia-northeast1-c
>  * BigQuery dataset and table: asia-northeast1-c
>  * Composer: asia-northeast1-c
> My task created by GoogleCloudStorageToBigQueryOperator succeeded in 
> uploading a CSV file from a GCS bucket to a BigQuery table, but the task 
> failed due to the following error.
>   
> {code:java}
> [2018-12-26 21:35:47,464] {base_task_runner.py:107} INFO - Job 146: Subtask 
> bq_load_data_into_dest_table_from_gcs [2018-12-26 21:35:47,464] 
> {discovery.py:871} INFO - URL being requested: GET 
> https://www.googleapis.com/bigquery/v2/projects/my-project/jobs/job_abc123?alt=json
> [2018-12-26 21:35:47,931] {models.py:1736} ERROR - ('BigQuery job status 
> check failed. Final error was: %s', 404)
> Traceback (most recent call last):
>   File "/usr/local/lib/airflow/airflow/contrib/hooks/bigquery_hook.py", line 
> 981, in run_with_configuration
>     jobId=self.running_job_id).execute()
>   File "/usr/local/lib/python3.6/site-packages/googleapiclient/_helpers.py", 
> line 130, in positional_wrapper
>     return wrapped(*args, **kwargs)
>   File "/usr/local/lib/python3.6/site-packages/googleapiclient/http.py", line 
> 851, in execute
>     raise HttpError(resp, content, uri=self.uri)
> googleapiclient.errors.HttpError: <HttpError 404 when requesting 
> https://www.googleapis.com/bigquery/v2/projects/my-project/jobs/job_abc123?alt=json
>  returned "Not found: Job my-project:job_abc123">
> During handling of the above exception, another exception occurred:
> Traceback (most recent call last):
>   File "/usr/local/lib/airflow/airflow/models.py", line 1633, in _run_raw_task
>     result = task_copy.execute(context=context)
>   File "/usr/local/lib/airflow/airflow/contrib/operators/gcs_to_bq.py", line 
> 237, in execute
>     time_partitioning=self.time_partitioning)
>   File "/usr/local/lib/airflow/airflow/contrib/hooks/bigquery_hook.py", line 
> 951, in run_load
>     return self.run_with_configuration(configuration)
>   File "/usr/local/lib/airflow/airflow/contrib/hooks/bigquery_hook.py", line 
> 1003, in run_with_configuration
>     err.resp.status)
> Exception: ('BigQuery job status check failed. Final error was: %s', 404)
> {code}
> The task failed to find the job my-project:job_abc123, but the correct 
> job id is my-project:asia-northeast1:job_abc123. (Note: these are just 
> example ids, not the actual ones.)
>  I suppose the operator does not handle the zone properly.
>   
> {code:java}
> $ bq show -j my-project:asia-northeast1:job_abc123
> Job my-project:asia-northeast1:job_abc123
> Job Type   State     Start Time        Duration   User Email                    Bytes Processed   Bytes Billed   Billing Tier   Labels
> ---------  --------  ----------------  ---------  ----------------------------  ---------------  -------------  -------------  ------
> load       SUCCESS   27 Dec 05:35:47   0:00:01    my-service-account-id-email
> {code}
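
A sketch of the kind of fix being hinted at: the BigQuery jobs.get REST 
method accepts a location parameter, so a location-aware status check could 
look roughly like this (illustrative only, not the actual hook code; 
`service` is assumed to be an authorized BigQuery API client):

{code:python}
# Without `location`, jobs created in regional locations such as
# asia-northeast1 are looked up in the default US/EU multi-regions,
# and the lookup 404s.
job = service.jobs().get(
    projectId='my-project',
    jobId='job_abc123',
    location='asia-northeast1',
).execute()
{code}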



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-3561) Improve some views

2018-12-26 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16729401#comment-16729401
 ] 

ASF GitHub Bot commented on AIRFLOW-3561:
-

Fokko commented on pull request #4368: [AIRFLOW-3561] Improve queries
URL: https://github.com/apache/incubator-airflow/pull/4368
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Improve some views
> --
>
> Key: AIRFLOW-3561
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3561
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: webserver
>Reporter: Peter van 't Hof
>Assignee: Peter van 't Hof
>Priority: Minor
>
> Some views interact with the DagBag even though it is not needed for the 
> query.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] Fokko closed pull request #4368: [AIRFLOW-3561] Improve queries

2018-12-26 Thread GitBox
Fokko closed pull request #4368: [AIRFLOW-3561] Improve queries
URL: https://github.com/apache/incubator-airflow/pull/4368
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/airflow/migrations/versions/c8ffec048a3b_add_fields_to_dag.py 
b/airflow/migrations/versions/c8ffec048a3b_add_fields_to_dag.py
new file mode 100644
index 00..a74e0b50a4
--- /dev/null
+++ b/airflow/migrations/versions/c8ffec048a3b_add_fields_to_dag.py
@@ -0,0 +1,44 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+# 
+#   http://www.apache.org/licenses/LICENSE-2.0
+# 
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""add fields to dag
+
+Revision ID: c8ffec048a3b
+Revises: 41f5f12752f8
+Create Date: 2018-12-23 21:55:46.463634
+
+"""
+
+# revision identifiers, used by Alembic.
+revision = 'c8ffec048a3b'
+down_revision = '41f5f12752f8'
+branch_labels = None
+depends_on = None
+
+from alembic import op
+import sqlalchemy as sa
+
+
+def upgrade():
+    op.add_column('dag', sa.Column('description', sa.Text(), nullable=True))
+    op.add_column('dag', sa.Column('default_view', sa.String(25), nullable=True))
+
+
+def downgrade():
+    op.drop_column('dag', 'description')
+    op.drop_column('dag', 'default_view')
diff --git a/airflow/models/__init__.py b/airflow/models/__init__.py
index aa93f5cb88..4bff05b721 100755
--- a/airflow/models/__init__.py
+++ b/airflow/models/__init__.py
@@ -2999,15 +2999,29 @@ class DagModel(Base):
     fileloc = Column(String(2000))
     # String representing the owners
     owners = Column(String(2000))
+    # Description of the dag
+    description = Column(Text)
+    # Default view of the DAG inside the webserver
+    default_view = Column(String(25))
 
     def __repr__(self):
         return "<DAG: {self.dag_id}>".format(self=self)
 
+    @property
+    def timezone(self):
+        return settings.TIMEZONE
+
     @classmethod
     @provide_session
     def get_current(cls, dag_id, session=None):
         return session.query(cls).filter(cls.dag_id == dag_id).first()
 
+    def get_default_view(self):
+        if self.default_view is None:
+            return configuration.conf.get('webserver', 'dag_default_view').lower()
+        else:
+            return self.default_view
+
 
 @functools.total_ordering
 class DAG(BaseDag, LoggingMixin):
@@ -3109,7 +3123,7 @@ def __init__(
                 'core', 'max_active_runs_per_dag'),
             dagrun_timeout=None,
             sla_miss_callback=None,
-            default_view=configuration.conf.get('webserver', 'dag_default_view').lower(),
+            default_view=None,
             orientation=configuration.conf.get('webserver', 'dag_orientation'),
             catchup=configuration.conf.getboolean('scheduler', 'catchup_by_default'),
             on_success_callback=None, on_failure_callback=None,
@@ -3180,7 +3194,7 @@ def __init__(
         self.max_active_runs = max_active_runs
         self.dagrun_timeout = dagrun_timeout
         self.sla_miss_callback = sla_miss_callback
-        self.default_view = default_view
+        self._default_view = default_view
         self.orientation = orientation
         self.catchup = catchup
         self.is_subdag = False  # DagBag.bag_dag() will set this to True if appropriate
@@ -3249,6 +3263,13 @@ def __exit__(self, _type, _value, _tb):
 
     # /Context Manager --
 
+    def get_default_view(self):
+        """This is only there for backward compatible jinja2 templates"""
+        if self._default_view is None:
+            return configuration.conf.get('webserver', 'dag_default_view').lower()
+        else:
+            return self._default_view
+
     def date_range(self, start_date, num=None, end_date=timezone.utcnow()):
         if num:
             end_date = None
@@ -4201,6 +4222,8 @@ def sync_to_db(self, owner=None, sync_time=None, session=None):
         orm_dag.owners = owner
         orm_dag.is_active = True
         orm_dag.last_scheduler_run = sync_time
+        orm_dag.default_view = self._default_view
+        orm_dag.description = self.description
         session.merge(orm_dag)
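
Taken together, the change stores NULL for DAGs that do not set default_view 
explicitly and resolves the configured default at read time. A rough 
illustration of the resulting behavior (a sketch assuming the code above, 
not an excerpt from the PR):

```python
from airflow import configuration
from airflow.models import DAG

dag = DAG(dag_id='example')  # no explicit default_view
# _default_view is None, so the [webserver] dag_default_view setting wins:
assert dag.get_default_view() == configuration.conf.get(
    'webserver', 'dag_default_view').lower()

dag2 = DAG(dag_id='example2', default_view='graph')
assert dag2.get_default_view() == 'graph'
```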
 

[jira] [Comment Edited] (AIRFLOW-3568) S3ToGoogleCloudStorageOperator failed after succeeding in copying files from s3

2018-12-26 Thread jack (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16729400#comment-16729400
 ] 

jack edited comment on AIRFLOW-3568 at 12/27/18 7:31 AM:
-

[~yohei] No problem. Note that it was approved but not yet merged. It might be 
a few more days, as other committers may have more comments.

You are more than welcome to check Jira for more issues to fix.


was (Author: jackjack10):
[~yohei] No problem. Note that it was approved but not yet merged. It might be 
a few more days, as other committers may have more comments. 

You are more than welcome to search for more issues with the gcp label to fix:

https://issues.apache.org/jira/browse/AIRFLOW-3503?jql=project%20%3D%20AIRFLOW%20AND%20labels%20%3D%20gcp

> S3ToGoogleCloudStorageOperator failed after succeeding in copying files from 
> s3
> ---
>
> Key: AIRFLOW-3568
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3568
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: contrib
>Affects Versions: 1.10.0
>Reporter: Yohei Onishi
>Assignee: Yohei Onishi
>Priority: Major
>
> I tried to copy files from S3 to GCS using 
> S3ToGoogleCloudStorageOperator. The file was successfully uploaded to GCS, but 
> the task failed with the following error.
> {code:java}
> [2018-12-26 07:56:33,062] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 [2018-12-26 07:56:33,062] {discovery.py:871} INFO - 
> URL being requested: POST 
> https://www.googleapis.com/upload/storage/v1/b/stg-rfid-etl-tmp/o?name=rfid_wh%2Fuq%2Fjp%2Fno_resp_carton_1D%2F2018%2F12%2F24%2F21%2Fno_resp_carton_20181224210201.csv=json=media
> [2018-12-26 07:56:33,214] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 [2018-12-26 07:56:33,213] {s3_to_gcs_operator.py:177} 
> INFO - All done, uploaded 1 files to Google Cloud Storage
> [2018-12-26 07:56:33,217] {models.py:1736} ERROR - Object of type 'set' is 
> not JSON serializable
> Traceback (most recent call last):
>   File "/usr/local/lib/airflow/airflow/models.py", line 1637, in _run_raw_task
>     self.xcom_push(key=XCOM_RETURN_KEY, value=result)
>   File "/usr/local/lib/airflow/airflow/models.py", line 1983, in xcom_push
>     execution_date=execution_date or self.execution_date)
>   File "/usr/local/lib/airflow/airflow/utils/db.py", line 74, in wrapper
>     return func(*args, **kwargs)
>   File "/usr/local/lib/airflow/airflow/models.py", line 4531, in set
>     value = json.dumps(value).encode('UTF-8')
>   File "/usr/local/lib/python3.6/json/__init__.py", line 231, in dumps
>     return _default_encoder.encode(obj)
>   File "/usr/local/lib/python3.6/json/encoder.py", line 199, in encode
>     chunks = self.iterencode(o, _one_shot=True)
>   File "/usr/local/lib/python3.6/json/encoder.py", line 257, in iterencode
>     return _iterencode(o, 0)
>   File "/usr/local/lib/python3.6/json/encoder.py", line 180, in default
>     o.__class__.__name__)
> TypeError: Object of type 'set' is not JSON serializable
> [2018-12-26 07:56:33,220] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 [2018-12-26 07:56:33,217] {models.py:1736} ERROR - 
> Object of type 'set' is not JSON serializable
> [2018-12-26 07:56:33,220] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 Traceback (most recent call last):
> [2018-12-26 07:56:33,220] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3   File "/usr/local/lib/airflow/airflow/models.py", 
> line 1637, in _run_raw_task
> [2018-12-26 07:56:33,220] {models.py:1756} INFO - Marking task as UP_FOR_RETRY
> [2018-12-26 07:56:33,220] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 self.xcom_push(key=XCOM_RETURN_KEY, value=result)
> [2018-12-26 07:56:33,220] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3   File "/usr/local/lib/airflow/airflow/models.py", 
> line 1983, in xcom_push
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 execution_date=execution_date or 
> self.execution_date)
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3   File "/usr/local/lib/airflow/airflow/utils/db.py", 
> line 74, in wrapper
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 return func(*args, **kwargs)
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3   File "/usr/local/lib/airflow/airflow/models.py", 
> line 4531, in set
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 value = 

[jira] [Commented] (AIRFLOW-3568) S3ToGoogleCloudStorageOperator failed after succeeding in copying files from s3

2018-12-26 Thread jack (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16729400#comment-16729400
 ] 

jack commented on AIRFLOW-3568:
---

[~yohei] No problem. Note that it was approved but not yet merged. It might be 
a few more days, as other committers may have more comments. 

You are more than welcome to search for more issues with the gcp label to fix:

https://issues.apache.org/jira/browse/AIRFLOW-3503?jql=project%20%3D%20AIRFLOW%20AND%20labels%20%3D%20gcp

> S3ToGoogleCloudStorageOperator failed after succeeding in copying files from 
> s3
> ---
>
> Key: AIRFLOW-3568
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3568
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: contrib
>Affects Versions: 1.10.0
>Reporter: Yohei Onishi
>Assignee: Yohei Onishi
>Priority: Major
>
> I tried to copy files from S3 to GCS using 
> S3ToGoogleCloudStorageOperator. The file was successfully uploaded to GCS, but 
> the task failed with the following error.
> {code:java}
> [2018-12-26 07:56:33,062] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 [2018-12-26 07:56:33,062] {discovery.py:871} INFO - 
> URL being requested: POST 
> https://www.googleapis.com/upload/storage/v1/b/stg-rfid-etl-tmp/o?name=rfid_wh%2Fuq%2Fjp%2Fno_resp_carton_1D%2F2018%2F12%2F24%2F21%2Fno_resp_carton_20181224210201.csv=json=media
> [2018-12-26 07:56:33,214] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 [2018-12-26 07:56:33,213] {s3_to_gcs_operator.py:177} 
> INFO - All done, uploaded 1 files to Google Cloud Storage
> [2018-12-26 07:56:33,217] {models.py:1736} ERROR - Object of type 'set' is 
> not JSON serializable
> Traceback (most recent call last):
>   File "/usr/local/lib/airflow/airflow/models.py", line 1637, in _run_raw_task
>     self.xcom_push(key=XCOM_RETURN_KEY, value=result)
>   File "/usr/local/lib/airflow/airflow/models.py", line 1983, in xcom_push
>     execution_date=execution_date or self.execution_date)
>   File "/usr/local/lib/airflow/airflow/utils/db.py", line 74, in wrapper
>     return func(*args, **kwargs)
>   File "/usr/local/lib/airflow/airflow/models.py", line 4531, in set
>     value = json.dumps(value).encode('UTF-8')
>   File "/usr/local/lib/python3.6/json/__init__.py", line 231, in dumps
>     return _default_encoder.encode(obj)
>   File "/usr/local/lib/python3.6/json/encoder.py", line 199, in encode
>     chunks = self.iterencode(o, _one_shot=True)
>   File "/usr/local/lib/python3.6/json/encoder.py", line 257, in iterencode
>     return _iterencode(o, 0)
>   File "/usr/local/lib/python3.6/json/encoder.py", line 180, in default
>     o.__class__.__name__)
> TypeError: Object of type 'set' is not JSON serializable
> [2018-12-26 07:56:33,220] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 [2018-12-26 07:56:33,217] {models.py:1736} ERROR - 
> Object of type 'set' is not JSON serializable
> [2018-12-26 07:56:33,220] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 Traceback (most recent call last):
> [2018-12-26 07:56:33,220] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3   File "/usr/local/lib/airflow/airflow/models.py", 
> line 1637, in _run_raw_task
> [2018-12-26 07:56:33,220] {models.py:1756} INFO - Marking task as UP_FOR_RETRY
> [2018-12-26 07:56:33,220] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 self.xcom_push(key=XCOM_RETURN_KEY, value=result)
> [2018-12-26 07:56:33,220] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3   File "/usr/local/lib/airflow/airflow/models.py", 
> line 1983, in xcom_push
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 execution_date=execution_date or 
> self.execution_date)
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3   File "/usr/local/lib/airflow/airflow/utils/db.py", 
> line 74, in wrapper
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 return func(*args, **kwargs)
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3   File "/usr/local/lib/airflow/airflow/models.py", 
> line 4531, in set
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 value = json.dumps(value).encode('UTF-8')
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3   File "/usr/local/lib/python3.6/json/__init__.py", 
> line 231, in dumps
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 

[jira] [Assigned] (AIRFLOW-3571) GoogleCloudStorageToBigQueryOperator succeeds to uploading CSV file from GCS to BiqQuery but a task is failed

2018-12-26 Thread Yohei Onishi (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yohei Onishi reassigned AIRFLOW-3571:
-

Assignee: Yohei Onishi

> GoogleCloudStorageToBigQueryOperator succeeds to uploading CSV file from GCS 
> to BiqQuery but a task is failed
> -
>
> Key: AIRFLOW-3571
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3571
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: contrib
>Affects Versions: 1.10.0
>Reporter: Yohei Onishi
>Assignee: Yohei Onishi
>Priority: Major
>
> I am using the following services in the asia-northeast1-c zone:
>  * GCS: asia-northeast1-c
>  * BigQuery dataset and table: asia-northeast1-c
>  * Composer: asia-northeast1-c
> My task created by GoogleCloudStorageToBigQueryOperator succeeded in 
> uploading a CSV file from a GCS bucket to a BigQuery table, but the task 
> failed due to the following error.
>   
> {code:java}
> [2018-12-26 21:35:47,464] {base_task_runner.py:107} INFO - Job 146: Subtask 
> bq_load_data_into_dest_table_from_gcs [2018-12-26 21:35:47,464] 
> {discovery.py:871} INFO - URL being requested: GET 
> https://www.googleapis.com/bigquery/v2/projects/my-project/jobs/job_abc123?alt=json
> [2018-12-26 21:35:47,931] {models.py:1736} ERROR - ('BigQuery job status 
> check failed. Final error was: %s', 404)
> Traceback (most recent call last):
>   File "/usr/local/lib/airflow/airflow/contrib/hooks/bigquery_hook.py", line 
> 981, in run_with_configuration
>     jobId=self.running_job_id).execute()
>   File "/usr/local/lib/python3.6/site-packages/googleapiclient/_helpers.py", 
> line 130, in positional_wrapper
>     return wrapped(*args, **kwargs)
>   File "/usr/local/lib/python3.6/site-packages/googleapiclient/http.py", line 
> 851, in execute
>     raise HttpError(resp, content, uri=self.uri)
> googleapiclient.errors.HttpError: <HttpError 404 when requesting 
> https://www.googleapis.com/bigquery/v2/projects/my-project/jobs/job_abc123?alt=json
>  returned "Not found: Job my-project:job_abc123">
> During handling of the above exception, another exception occurred:
> Traceback (most recent call last):
>   File "/usr/local/lib/airflow/airflow/models.py", line 1633, in _run_raw_task
>     result = task_copy.execute(context=context)
>   File "/usr/local/lib/airflow/airflow/contrib/operators/gcs_to_bq.py", line 
> 237, in execute
>     time_partitioning=self.time_partitioning)
>   File "/usr/local/lib/airflow/airflow/contrib/hooks/bigquery_hook.py", line 
> 951, in run_load
>     return self.run_with_configuration(configuration)
>   File "/usr/local/lib/airflow/airflow/contrib/hooks/bigquery_hook.py", line 
> 1003, in run_with_configuration
>     err.resp.status)
> Exception: ('BigQuery job status check failed. Final error was: %s', 404)
> {code}
> The task failed to find the job my-project:job_abc123, but the correct 
> job id is my-project:asia-northeast1:job_abc123. (Note: these are just 
> example ids, not the actual ones.)
>  I suppose the operator does not handle the zone properly.
>   
> {code:java}
> $ bq show -j my-project:asia-northeast1:job_abc123
> Job my-project:asia-northeast1:job_abc123
> Job Type   State     Start Time        Duration   User Email                    Bytes Processed   Bytes Billed   Billing Tier   Labels
> ---------  --------  ----------------  ---------  ----------------------------  ---------------  -------------  -------------  ------
> load       SUCCESS   27 Dec 05:35:47   0:00:01    my-service-account-id-email
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-3571) GoogleCloudStorageToBigQueryOperator succeeds to uploading CSV file from GCS to BiqQuery but a task is failed

2018-12-26 Thread jack (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16729398#comment-16729398
 ] 

jack commented on AIRFLOW-3571:
---

[~yohei] There is an open PR for adding support for locations to the 
BigQueryHook  https://github.com/apache/incubator-airflow/pull/4324

Once it's merged (I assume it will be soon), you are welcome to open PRs 
extending the operators to support it.

> GoogleCloudStorageToBigQueryOperator succeeds to uploading CSV file from GCS 
> to BiqQuery but a task is failed
> -
>
> Key: AIRFLOW-3571
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3571
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: contrib
>Affects Versions: 1.10.0
>Reporter: Yohei Onishi
>Priority: Major
>
> I am using the following services in the asia-northeast1-c zone:
>  * GCS: asia-northeast1-c
>  * BigQuery dataset and table: asia-northeast1-c
>  * Composer: asia-northeast1-c
> My task created by GoogleCloudStorageToBigQueryOperator succeeded in 
> uploading a CSV file from a GCS bucket to a BigQuery table, but the task 
> failed due to the following error.
>   
> {code:java}
> [2018-12-26 21:35:47,464] {base_task_runner.py:107} INFO - Job 146: Subtask 
> bq_load_data_into_dest_table_from_gcs [2018-12-26 21:35:47,464] 
> {discovery.py:871} INFO - URL being requested: GET 
> https://www.googleapis.com/bigquery/v2/projects/my-project/jobs/job_abc123?alt=json
> [2018-12-26 21:35:47,931] {models.py:1736} ERROR - ('BigQuery job status 
> check failed. Final error was: %s', 404)
> Traceback (most recent call last):
>   File "/usr/local/lib/airflow/airflow/contrib/hooks/bigquery_hook.py", line 
> 981, in run_with_configuration
>     jobId=self.running_job_id).execute()
>   File "/usr/local/lib/python3.6/site-packages/googleapiclient/_helpers.py", 
> line 130, in positional_wrapper
>     return wrapped(*args, **kwargs)
>   File "/usr/local/lib/python3.6/site-packages/googleapiclient/http.py", line 
> 851, in execute
>     raise HttpError(resp, content, uri=self.uri)
> googleapiclient.errors.HttpError: <HttpError 404 when requesting 
> https://www.googleapis.com/bigquery/v2/projects/my-project/jobs/job_abc123?alt=json
>  returned "Not found: Job my-project:job_abc123">
> During handling of the above exception, another exception occurred:
> Traceback (most recent call last):
>   File "/usr/local/lib/airflow/airflow/models.py", line 1633, in _run_raw_task
>     result = task_copy.execute(context=context)
>   File "/usr/local/lib/airflow/airflow/contrib/operators/gcs_to_bq.py", line 
> 237, in execute
>     time_partitioning=self.time_partitioning)
>   File "/usr/local/lib/airflow/airflow/contrib/hooks/bigquery_hook.py", line 
> 951, in run_load
>     return self.run_with_configuration(configuration)
>   File "/usr/local/lib/airflow/airflow/contrib/hooks/bigquery_hook.py", line 
> 1003, in run_with_configuration
>     err.resp.status)
> Exception: ('BigQuery job status check failed. Final error was: %s', 404)
> {code}
> The task failed to find the job my-project:job_abc123, but the correct 
> job id is my-project:asia-northeast1:job_abc123. (Note: these are just 
> example ids, not the actual ones.)
>  I suppose the operator does not handle the zone properly.
>   
> {code:java}
> $ bq show -j my-project:asia-northeast1:job_abc123
> Job my-project:asia-northeast1:job_abc123
> Job Type   State     Start Time        Duration   User Email                    Bytes Processed   Bytes Billed   Billing Tier   Labels
> ---------  --------  ----------------  ---------  ----------------------------  ---------------  -------------  -------------  ------
> load       SUCCESS   27 Dec 05:35:47   0:00:01    my-service-account-id-email
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-3572) Docs implies that providing remote_log_conn_id is mandatory

2018-12-26 Thread JIRA


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Victor Villas Bôas Chaves updated AIRFLOW-3572:
---
Description: 
The docs on setting up remote logging with S3 currently state:

> To enable this feature, airflow.cfg must be configured as in this example:

... example ...

> In the above example, Airflow will try to use {{S3Hook('MyS3Conn')}}.

In fact, you can just leave `remote_log_conn_id =` empty (or set nothing, 
because that's the default) if you have your credentials set in the process 
environment, because Airflow will fall back to boto3's config auto-detection.

This behavior is by design, as the comment hints 
[here|https://github.com/apache/incubator-airflow/blob/master/airflow/contrib/hooks/aws_hook.py#L155-L158],
 so it should be documented.

  was:
The docs on setting up remote logging with S3 currently state:

> To enable this feature, airflow.cfg must be configured as in this example:

... example ...

> In the above example, Airflow will try to use {{S3Hook('MyS3Conn')}}.

In fact, you can just leave `remote_log_conn_id =` empty (or set nothing, 
because that's the default) if you have your credentials set in the process 
environment, because Airflow will fall back to boto3's config auto-detection.


This behavior is on purpose, as the comment hints 
[here|https://github.com/apache/incubator-airflow/blob/master/airflow/contrib/hooks/aws_hook.py#L155-L158],
 so it should be documented.


> Docs implies that providing remote_log_conn_id is mandatory
> ---
>
> Key: AIRFLOW-3572
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3572
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: Documentation
>Affects Versions: 1.10.1
>Reporter: Victor Villas Bôas Chaves
>Priority: Minor
>
> The docs on setting up remote logging with S3 currently state:
> > To enable this feature, airflow.cfg must be configured as in this example:
> ... example ...
> > In the above example, Airflow will try to use {{S3Hook('MyS3Conn')}}.
> In fact, you can just leave `remote_log_conn_id =` empty (or set nothing, 
> because that's the default) if you have your credentials set in the process 
> environment, because Airflow will fall back to boto3's config auto-detection.
> This behavior is by design, as the comment hints 
> [here|https://github.com/apache/incubator-airflow/blob/master/airflow/contrib/hooks/aws_hook.py#L155-L158],
>  so it should be documented.
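
For context, a minimal sketch of the configuration the reporter describes 
(bucket name hypothetical): with credentials available from the process 
environment or an instance profile, the connection id can stay empty and 
boto3's own credential discovery takes over.

{code}
[core]
remote_logging = True
remote_base_log_folder = s3://my-log-bucket/airflow/logs
# Intentionally left empty: the S3Hook falls back to boto3's
# credential auto-detection.
remote_log_conn_id =
{code}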



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (AIRFLOW-3572) Docs implies that providing remote_log_conn_id is mandatory

2018-12-26 Thread JIRA
Victor Villas Bôas Chaves created AIRFLOW-3572:
--

 Summary: Docs implies that providing remote_log_conn_id is 
mandatory
 Key: AIRFLOW-3572
 URL: https://issues.apache.org/jira/browse/AIRFLOW-3572
 Project: Apache Airflow
  Issue Type: Improvement
  Components: Documentation
Affects Versions: 1.10.1
Reporter: Victor Villas Bôas Chaves


The docs on setting up remote logging with S3 currently state:

> To enable this feature, airflow.cfg must be configured as in this example:

... example ...

> In the above example, Airflow will try to use {{S3Hook('MyS3Conn')}}.

In fact, you can just leave `remote_log_conn_id =` empty (or set nothing, 
because that's the default) if you have your credentials set in the process 
environment, because Airflow will fall back to boto3's config auto-detection.


This behavior is on purpose, as the comment hints 
[here|https://github.com/apache/incubator-airflow/blob/master/airflow/contrib/hooks/aws_hook.py#L155-L158],
 so it should be documented.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] jmcarp commented on issue #4331: [AIRFLOW-3531] Add gcs to gcs transfer operator.

2018-12-26 Thread GitBox
jmcarp commented on issue #4331: [AIRFLOW-3531] Add gcs to gcs transfer 
operator.
URL: 
https://github.com/apache/incubator-airflow/pull/4331#issuecomment-450062981
 
 
   Updated!


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-3568) S3ToGoogleCloudStorageOperator failed after succeeding in copying files from s3

2018-12-26 Thread Yohei Onishi (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16729281#comment-16729281
 ] 

Yohei Onishi commented on AIRFLOW-3568:
---

[~jackjack10] My PR is approved. Thank you for your support.

> S3ToGoogleCloudStorageOperator failed after succeeding in copying files from 
> s3
> ---
>
> Key: AIRFLOW-3568
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3568
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: contrib
>Affects Versions: 1.10.0
>Reporter: Yohei Onishi
>Assignee: Yohei Onishi
>Priority: Major
>
> I tried to copy files from S3 to GCS using 
> S3ToGoogleCloudStorageOperator. The file was successfully uploaded to GCS, but 
> the task failed with the following error.
> {code:java}
> [2018-12-26 07:56:33,062] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 [2018-12-26 07:56:33,062] {discovery.py:871} INFO - 
> URL being requested: POST 
> https://www.googleapis.com/upload/storage/v1/b/stg-rfid-etl-tmp/o?name=rfid_wh%2Fuq%2Fjp%2Fno_resp_carton_1D%2F2018%2F12%2F24%2F21%2Fno_resp_carton_20181224210201.csv=json=media
> [2018-12-26 07:56:33,214] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 [2018-12-26 07:56:33,213] {s3_to_gcs_operator.py:177} 
> INFO - All done, uploaded 1 files to Google Cloud Storage
> [2018-12-26 07:56:33,217] {models.py:1736} ERROR - Object of type 'set' is 
> not JSON serializable
> Traceback (most recent call last):
>   File "/usr/local/lib/airflow/airflow/models.py", line 1637, in _run_raw_task
>     self.xcom_push(key=XCOM_RETURN_KEY, value=result)
>   File "/usr/local/lib/airflow/airflow/models.py", line 1983, in xcom_push
>     execution_date=execution_date or self.execution_date)
>   File "/usr/local/lib/airflow/airflow/utils/db.py", line 74, in wrapper
>     return func(*args, **kwargs)
>   File "/usr/local/lib/airflow/airflow/models.py", line 4531, in set
>     value = json.dumps(value).encode('UTF-8')
>   File "/usr/local/lib/python3.6/json/__init__.py", line 231, in dumps
>     return _default_encoder.encode(obj)
>   File "/usr/local/lib/python3.6/json/encoder.py", line 199, in encode
>     chunks = self.iterencode(o, _one_shot=True)
>   File "/usr/local/lib/python3.6/json/encoder.py", line 257, in iterencode
>     return _iterencode(o, 0)
>   File "/usr/local/lib/python3.6/json/encoder.py", line 180, in default
>     o.__class__.__name__)
> TypeError: Object of type 'set' is not JSON serializable
> [2018-12-26 07:56:33,220] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 [2018-12-26 07:56:33,217] {models.py:1736} ERROR - 
> Object of type 'set' is not JSON serializable
> [2018-12-26 07:56:33,220] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 Traceback (most recent call last):
> [2018-12-26 07:56:33,220] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3   File "/usr/local/lib/airflow/airflow/models.py", 
> line 1637, in _run_raw_task
> [2018-12-26 07:56:33,220] {models.py:1756} INFO - Marking task as UP_FOR_RETRY
> [2018-12-26 07:56:33,220] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 self.xcom_push(key=XCOM_RETURN_KEY, value=result)
> [2018-12-26 07:56:33,220] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3   File "/usr/local/lib/airflow/airflow/models.py", 
> line 1983, in xcom_push
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 execution_date=execution_date or 
> self.execution_date)
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3   File "/usr/local/lib/airflow/airflow/utils/db.py", 
> line 74, in wrapper
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 return func(*args, **kwargs)
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3   File "/usr/local/lib/airflow/airflow/models.py", 
> line 4531, in set
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 value = json.dumps(value).encode('UTF-8')
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3   File "/usr/local/lib/python3.6/json/__init__.py", 
> line 231, in dumps
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 return _default_encoder.encode(obj)
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3   File "/usr/local/lib/python3.6/json/encoder.py", 
> line 199, in encode
> [2018-12-26 07:56:33,222] 

[GitHub] kaxil commented on a change in pull request #4371: [AIRFLOW-2939][AIRFLOW-3568] fix TypeError on GoogleCloudStorageToS3Operator / S3ToGoogleCloudStorageOperator

2018-12-26 Thread GitBox
kaxil commented on a change in pull request #4371: [AIRFLOW-2939][AIRFLOW-3568] 
fix TypeError on GoogleCloudStorageToS3Operator / S3ToGoogleCloudStorageOperator
URL: https://github.com/apache/incubator-airflow/pull/4371#discussion_r244072798
 
 

 ##
 File path: airflow/contrib/operators/gcs_to_s3.py
 ##
 @@ -101,7 +101,7 @@ def execute(self, context):
         # Google Cloud Storage and not in S3
         bucket_name, _ = S3Hook.parse_s3_url(self.dest_s3_key)
         existing_files = s3_hook.list_keys(bucket_name)
-        files = set(files) - set(existing_files)
+        files = list(set(files) - set(existing_files))
 
 Review comment:
   Yup, yup. You are right


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] yohei1126 commented on a change in pull request #4371: [AIRFLOW-2939][AIRFLOW-3568] fix TypeError on GoogleCloudStorageToS3Operator / S3ToGoogleCloudStorageOperator

2018-12-26 Thread GitBox
yohei1126 commented on a change in pull request #4371: 
[AIRFLOW-2939][AIRFLOW-3568] fix TypeError on GoogleCloudStorageToS3Operator / 
S3ToGoogleCloudStorageOperator
URL: https://github.com/apache/incubator-airflow/pull/4371#discussion_r244072388
 
 

 ##
 File path: airflow/contrib/operators/gcs_to_s3.py
 ##
 @@ -101,7 +101,7 @@ def execute(self, context):
         # Google Cloud Storage and not in S3
         bucket_name, _ = S3Hook.parse_s3_url(self.dest_s3_key)
         existing_files = s3_hook.list_keys(bucket_name)
-        files = set(files) - set(existing_files)
+        files = list(set(files) - set(existing_files))
 
 Review comment:
   If you think `xcom` should support `set`, this is a bug in `xcom`, since the 
current implementation uses `json.dumps`, which only supports JSON-serializable 
values (`set` is not JSON serializable).
   
   ```
   >>> list = [1,2,3,1]
   >>> json.dumps(list).encode('UTF-8')
   b'[1, 2, 3, 1]'
   >>> s = set(list)
   >>> json.dumps(s).encode('UTF-8')
   Traceback (most recent call last):
 File "", line 1, in 
 File 
"/Users/01087872/.pyenv/versions/3.7.1/lib/python3.7/json/__init__.py", line 
231, in dumps
   return _default_encoder.encode(obj)
 File 
"/Users/01087872/.pyenv/versions/3.7.1/lib/python3.7/json/encoder.py", line 
199, in encode
   chunks = self.iterencode(o, _one_shot=True)
 File 
"/Users/01087872/.pyenv/versions/3.7.1/lib/python3.7/json/encoder.py", line 
257, in iterencode
   return _iterencode(o, 0)
 File 
"/Users/01087872/.pyenv/versions/3.7.1/lib/python3.7/json/encoder.py", line 
179, in default
   raise TypeError(f'Object of type {o.__class__.__name__} '
   TypeError: Object of type set is not JSON serializable
   ```
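
   If `xcom` were ever to accept sets, one possibility would be a fallback 
   encoder along these lines (a sketch only; Airflow's actual `XCom.set()` 
   calls `json.dumps(value)` directly, as shown above):

   ```python
   import json

   def xcom_serialize(value):
       # Hypothetical: convert sets to sorted lists so json can encode them.
       return json.dumps(
           value,
           default=lambda o: sorted(o) if isinstance(o, set) else str(o),
       ).encode('UTF-8')

   print(xcom_serialize({3, 1, 2}))  # b'[1, 2, 3]'
   ```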


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] yohei1126 commented on a change in pull request #4371: [AIRFLOW-2939][AIRFLOW-3568] fix TypeError on GoogleCloudStorageToS3Operator / S3ToGoogleCloudStorageOperator

2018-12-26 Thread GitBox
yohei1126 commented on a change in pull request #4371: 
[AIRFLOW-2939][AIRFLOW-3568] fix TypeError on GoogleCloudStorageToS3Operator / 
S3ToGoogleCloudStorageOperator
URL: https://github.com/apache/incubator-airflow/pull/4371#discussion_r244072388
 
 

 ##
 File path: airflow/contrib/operators/gcs_to_s3.py
 ##
 @@ -101,7 +101,7 @@ def execute(self, context):
         # Google Cloud Storage and not in S3
         bucket_name, _ = S3Hook.parse_s3_url(self.dest_s3_key)
         existing_files = s3_hook.list_keys(bucket_name)
-        files = set(files) - set(existing_files)
+        files = list(set(files) - set(existing_files))
 
 Review comment:
   If you think `xcom` should support `set`, this is a bug in `xcom`, since the 
current implementation only supports JSON-serializable values (`set` is not 
JSON serializable).
   
   ```
   >>> list = [1,2,3,1]
   >>> json.dumps(list).encode('UTF-8')
   b'[1, 2, 3, 1]'
   >>> s = set(list)
   >>> json.dumps(s).encode('UTF-8')
   Traceback (most recent call last):
 File "", line 1, in 
 File 
"/Users/01087872/.pyenv/versions/3.7.1/lib/python3.7/json/__init__.py", line 
231, in dumps
   return _default_encoder.encode(obj)
 File 
"/Users/01087872/.pyenv/versions/3.7.1/lib/python3.7/json/encoder.py", line 
199, in encode
   chunks = self.iterencode(o, _one_shot=True)
 File 
"/Users/01087872/.pyenv/versions/3.7.1/lib/python3.7/json/encoder.py", line 
257, in iterencode
   return _iterencode(o, 0)
 File 
"/Users/01087872/.pyenv/versions/3.7.1/lib/python3.7/json/encoder.py", line 
179, in default
   raise TypeError(f'Object of type {o.__class__.__name__} '
   TypeError: Object of type set is not JSON serializable
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] yohei1126 commented on a change in pull request #4371: [AIRFLOW-2939][AIRFLOW-3568] fix TypeError on GoogleCloudStorageToS3Operator / S3ToGoogleCloudStorageOperator

2018-12-26 Thread GitBox
yohei1126 commented on a change in pull request #4371: 
[AIRFLOW-2939][AIRFLOW-3568] fix TypeError on GoogleCloudStorageToS3Operator / 
S3ToGoogleCloudStorageOperator
URL: https://github.com/apache/incubator-airflow/pull/4371#discussion_r244071926
 
 

 ##
 File path: airflow/contrib/operators/gcs_to_s3.py
 ##
 @@ -101,7 +101,7 @@ def execute(self, context):
         # Google Cloud Storage and not in S3
         bucket_name, _ = S3Hook.parse_s3_url(self.dest_s3_key)
         existing_files = s3_hook.list_keys(bucket_name)
-        files = set(files) - set(existing_files)
+        files = list(set(files) - set(existing_files))
 
 Review comment:
   Added the traceback to the description. According to the traceback, models.py 
internally uses `json.dumps(value).encode`, but it does not support `set`, since 
`set` is not JSON serializable.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] codecov-io edited a comment on issue #4331: [AIRFLOW-3531] Add gcs to gcs transfer operator.

2018-12-26 Thread GitBox
codecov-io edited a comment on issue #4331: [AIRFLOW-3531] Add gcs to gcs 
transfer operator.
URL: 
https://github.com/apache/incubator-airflow/pull/4331#issuecomment-447721611
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4331?src=pr=h1)
 Report
   > Merging 
[#4331](https://codecov.io/gh/apache/incubator-airflow/pull/4331?src=pr=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/319a659dda9d50ca8c1596904315c9664fc74720?src=pr=desc)
 will **increase** coverage by `<.01%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/4331/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/4331?src=pr=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #4331      +/-   ##
   ===========================================
   + Coverage   78.15%   78.15%     +<.01% 
   ===========================================
     Files         204      204
     Lines       16499    16499
   ===========================================
   + Hits        12894    12895         +1 
   + Misses       3605     3604         -1
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/incubator-airflow/pull/4331?src=pr=tree) 
| Coverage Δ | |
   |---|---|---|
   | 
[airflow/models/\_\_init\_\_.py](https://codecov.io/gh/apache/incubator-airflow/pull/4331/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMvX19pbml0X18ucHk=)
 | `92.81% <0%> (+0.04%)` | :arrow_up: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4331?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4331?src=pr=footer).
 Last update 
[319a659...37a6ee5](https://codecov.io/gh/apache/incubator-airflow/pull/4331?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] yohei1126 commented on a change in pull request #4371: [AIRFLOW-2939][AIRFLOW-3568] fix TypeError on GoogleCloudStorageToS3Operator / S3ToGoogleCloudStorageOperator

2018-12-26 Thread GitBox
yohei1126 commented on a change in pull request #4371: 
[AIRFLOW-2939][AIRFLOW-3568] fix TypeError on GoogleCloudStorageToS3Operator / 
S3ToGoogleCloudStorageOperator
URL: https://github.com/apache/incubator-airflow/pull/4371#discussion_r244071926
 
 

 ##
 File path: airflow/contrib/operators/gcs_to_s3.py
 ##
 @@ -101,7 +101,7 @@ def execute(self, context):
         # Google Cloud Storage and not in S3
         bucket_name, _ = S3Hook.parse_s3_url(self.dest_s3_key)
         existing_files = s3_hook.list_keys(bucket_name)
-        files = set(files) - set(existing_files)
+        files = list(set(files) - set(existing_files))
 
 Review comment:
   Added the traceback to the description. According to the traceback, models.py 
internally uses json.dumps(value).encode, but it does not support set, since set 
is not JSON serializable.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] yohei1126 commented on a change in pull request #4371: [AIRFLOW-2939][AIRFLOW-3568] fix TypeError on GoogleCloudStorageToS3Operator / S3ToGoogleCloudStorageOperator

2018-12-26 Thread GitBox
yohei1126 commented on a change in pull request #4371: 
[AIRFLOW-2939][AIRFLOW-3568] fix TypeError on GoogleCloudStorageToS3Operator / 
S3ToGoogleCloudStorageOperator
URL: https://github.com/apache/incubator-airflow/pull/4371#discussion_r244070449
 
 

 ##
 File path: airflow/contrib/operators/gcs_to_s3.py
 ##
 @@ -101,7 +101,7 @@ def execute(self, context):
 # Google Cloud Storage and not in S3
 bucket_name, _ = S3Hook.parse_s3_url(self.dest_s3_key)
 existing_files = s3_hook.list_keys(bucket_name)
-files = set(files) - set(existing_files)
+files = list(set(files) - set(existing_files))
 
 Review comment:
   Actually, the error log shows `TypeError: 'NoneType' object is not 
iterable.` when a set is pushed, but it works with a list.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-3571) GoogleCloudStorageToBigQueryOperator succeeds in uploading a CSV file from GCS to BigQuery but the task fails

2018-12-26 Thread Yohei Onishi (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16729263#comment-16729263
 ] 

Yohei Onishi commented on AIRFLOW-3571:
---

It seems {color:#d04437}GoogleCloudStorageToBigQueryOperator{color} does not 
support regions other than US / EU.
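
For context, the BigQuery REST API takes a `location` parameter on `jobs.get`; 
a hedged sketch (project, job id, and client wiring are illustrative, and 
`location` support depends on the client/discovery version) of the kind of 
call the hook would need for a non-US/EU dataset:

```python
from googleapiclient.discovery import build

# Illustrative only: poll a BigQuery job that runs outside US/EU.
service = build('bigquery', 'v2')  # assumes default credentials
job = service.jobs().get(
    projectId='my-project',
    jobId='job_abc123',
    location='asia-northeast1',  # without this, the lookup defaults to US/EU and 404s
).execute()
```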

> GoogleCloudStorageToBigQueryOperator succeeds in uploading a CSV file from 
> GCS to BigQuery but the task fails
> -
>
> Key: AIRFLOW-3571
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3571
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: contrib
>Affects Versions: 1.10.0
>Reporter: Yohei Onishi
>Priority: Major
>
> I am using the following services in the asia-northeast1-c zone:
>  * GCS: asia-northeast1-c
>  * BigQuery dataset and table: asia-northeast1-c
>  * Composer: asia-northeast1-c
> My task created by GoogleCloudStorageToBigQueryOperator succeeded in 
> uploading a CSV file from a GCS bucket to a BigQuery table, but the task 
> failed due to the following error.
>   
> {code:java}
> [2018-12-26 21:35:47,464] {base_task_runner.py:107} INFO - Job 146: Subtask 
> bq_load_data_into_dest_table_from_gcs [2018-12-26 21:35:47,464] 
> {discovery.py:871} INFO - URL being requested: GET 
> https://www.googleapis.com/bigquery/v2/projects/my-project/jobs/job_abc123?alt=json
> [2018-12-26 21:35:47,931] {models.py:1736} ERROR - ('BigQuery job status 
> check failed. Final error was: %s', 404)
> Traceback (most recent call last):
>   File "/usr/local/lib/airflow/airflow/contrib/hooks/bigquery_hook.py", line 
> 981, in run_with_configuration
>     jobId=self.running_job_id).execute()
>   File "/usr/local/lib/python3.6/site-packages/googleapiclient/_helpers.py", 
> line 130, in positional_wrapper
>     return wrapped(*args, **kwargs)
>   File "/usr/local/lib/python3.6/site-packages/googleapiclient/http.py", line 
> 851, in execute
>     raise HttpError(resp, content, uri=self.uri)
> googleapiclient.errors.HttpError: <HttpError 404 when requesting 
> https://www.googleapis.com/bigquery/v2/projects/my-project/jobs/job_abc123?alt=json
>  returned "Not found: Job my-project:job_abc123">
> During handling of the above exception, another exception occurred:
> Traceback (most recent call last):
>   File "/usr/local/lib/airflow/airflow/models.py", line 1633, in _run_raw_task
>     result = task_copy.execute(context=context)
>   File "/usr/local/lib/airflow/airflow/contrib/operators/gcs_to_bq.py", line 
> 237, in execute
>     time_partitioning=self.time_partitioning)
>   File "/usr/local/lib/airflow/airflow/contrib/hooks/bigquery_hook.py", line 
> 951, in run_load
>     return self.run_with_configuration(configuration)
>   File "/usr/local/lib/airflow/airflow/contrib/hooks/bigquery_hook.py", line 
> 1003, in run_with_configuration
>     err.resp.status)
> Exception: ('BigQuery job status check failed. Final error was: %s', 404)
> {code}
> The task failed to find the job {color:#ff}my-project:job_abc123{color}, 
> but the correct job id is {color:#ff}
> my-project:asia-northeast1:job_abc123{color}. (Note: this is just an 
> example, not the actual id.)
> I suppose the operator does not handle the region properly.
>   
> {code:java}
> $ bq show -j my-project:asia-northeast1:job_abc123
> Job my-project:asia-northeast1:job_abc123
>
>  Job Type   State     Start Time        Duration   User Email                    Bytes Processed   Bytes Billed   Billing Tier   Labels
> ---------- --------- ----------------- ---------- ----------------------------- ----------------- -------------- -------------- --------
>  load       SUCCESS   27 Dec 05:35:47   0:00:01    my-service-account-id-email
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] kaxil commented on a change in pull request #4353: [AIRFLOW-3480] Add Database Deploy/Update/Delete operators

2018-12-26 Thread GitBox
kaxil commented on a change in pull request #4353: [AIRFLOW-3480] Add Database 
Deploy/Update/Delete operators
URL: https://github.com/apache/incubator-airflow/pull/4353#discussion_r244067381
 
 

 ##
 File path: airflow/contrib/hooks/gcp_spanner_hook.py
 ##
 @@ -41,11 +42,12 @@ def __init__(self,
 def get_client(self, project_id):
 # type: (str) -> Client
 """
-Provides a client for interacting with Cloud Spanner API.
+Provides a client for interacting with the Cloud Spanner API.
 
-:param project_id: The ID of the project which owns the instances, 
tables and data.
+:param project_id: TThe ID of the  GCP project that owns the Cloud 
Spanner
 
 Review comment:
   looks like a typo.
   
   ```suggestion
   :param project_id: The ID of the  GCP project that owns the Cloud 
Spanner
   ```
   
   I haven't reviewed the entire PR yet!


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] kaxil commented on a change in pull request #4371: [AIRFLOW-2939][AIRFLOW-3568] fix TypeError on GoogleCloudStorageToS3Operator / S3ToGoogleCloudStorageOperator

2018-12-26 Thread GitBox
kaxil commented on a change in pull request #4371: [AIRFLOW-2939][AIRFLOW-3568] 
fix TypeError on GoogleCloudStorageToS3Operator / S3ToGoogleCloudStorageOperator
URL: https://github.com/apache/incubator-airflow/pull/4371#discussion_r244067118
 
 

 ##
 File path: airflow/contrib/operators/gcs_to_s3.py
 ##
 @@ -101,7 +101,7 @@ def execute(self, context):
 # Google Cloud Storage and not in S3
 bucket_name, _ = S3Hook.parse_s3_url(self.dest_s3_key)
 existing_files = s3_hook.list_keys(bucket_name)
-files = set(files) - set(existing_files)
+files = list(set(files) - set(existing_files))
 
 Review comment:
   Not sure about this: `xcom` push should be able to support any picklable 
object, like a list or a set.
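
   A quick sketch of the distinction: pickle round-trips a set, while the 
JSON path (the default when `enable_xcom_pickling` is off in 1.10) rejects it.

```python
import json
import pickle

value = {"a.csv", "b.csv"}

assert pickle.loads(pickle.dumps(value)) == value  # pickle handles a set

try:
    json.dumps(value)  # the JSON path has no set type
except TypeError as err:
    print(err)
```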


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] jmcarp commented on issue #4256: [AIRFLOW-3417] Use the platformVersion only for the FARGATE launch type

2018-12-26 Thread GitBox
jmcarp commented on issue #4256: [AIRFLOW-3417] Use the platformVersion only 
for the FARGATE launch type
URL: 
https://github.com/apache/incubator-airflow/pull/4256#issuecomment-450049094
 
 
   I left a few minor questions on the diffs, but this looks good to me. cc 
@kaxil or @ashb to take a look when you have time.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] kaxil commented on a change in pull request #4362: [AIRFLOW-3559] Add missing options to DatadogHook.

2018-12-26 Thread GitBox
kaxil commented on a change in pull request #4362: [AIRFLOW-3559] Add missing 
options to DatadogHook.
URL: https://github.com/apache/incubator-airflow/pull/4362#discussion_r244066305
 
 

 ##
 File path: airflow/contrib/hooks/datadog_hook.py
 ##
 @@ -76,12 +65,18 @@ def send_metric(self, metric_name, datapoint, tags=None):
 :type datapoint: int or float
 :param tags: A list of tags associated with the metric
 :type tags: list
+:param type: Type of your metric either: gauge, rate, or count
+:type type: string
 
 Review comment:
   `string` -> `str`


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] kaxil commented on a change in pull request #4362: [AIRFLOW-3559] Add missing options to DatadogHook.

2018-12-26 Thread GitBox
kaxil commented on a change in pull request #4362: [AIRFLOW-3559] Add missing 
options to DatadogHook.
URL: https://github.com/apache/incubator-airflow/pull/4362#discussion_r244066287
 
 

 ##
 File path: airflow/contrib/hooks/datadog_hook.py
 ##
 @@ -47,26 +47,15 @@ def __init__(self, datadog_conn_id='datadog_default'):
 # for all metric submissions.
 self.host = conn.host
 
-if self.api_key is None:
-raise AirflowException("api_key must be specified in the "
-   "Datadog connection details")
-if self.app_key is None:
-raise AirflowException("app_key must be specified in the "
-   "Datadog connection details")
-
 self.log.info("Setting up api keys for Datadog")
-options = {
-'api_key': self.api_key,
-'app_key': self.app_key
-}
-initialize(**options)
+initialize(api_key=self.api_key, app_key=self.app_key)
 
 def validate_response(self, response):
 if response['status'] != 'ok':
 self.log.error("Datadog returned: %s", response)
 raise AirflowException("Error status received from Datadog")
 
-def send_metric(self, metric_name, datapoint, tags=None):
+def send_metric(self, metric_name, datapoint, tags=None, type_=None, 
interval=None):
 
 Review comment:
   typo: `type_` -> `type`


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] jmcarp commented on a change in pull request #4256: [AIRFLOW-3417] Use the platformVersion only for the FARGATE launch type

2018-12-26 Thread GitBox
jmcarp commented on a change in pull request #4256: [AIRFLOW-3417] Use the 
platformVersion only for the FARGATE launch type
URL: https://github.com/apache/incubator-airflow/pull/4256#discussion_r244066490
 
 

 ##
 File path: tests/contrib/operators/test_ecs_operator.py
 ##
 @@ -133,6 +134,38 @@ def test_execute_without_failures(self, check_mock, 
wait_mock):
 self.assertEqual(self.ecs.arn,
  
'arn:aws:ecs:us-east-1:012345678910:task/d8c67b3c-ac87-4ffe-a847-4785bc3a8b55')
 
+@mock.patch.object(ECSOperator, '_wait_for_task_ended')
+@mock.patch.object(ECSOperator, '_check_success_task')
+def test_fargate_run(self, check_mock, wait_mock):
 
 Review comment:
   I think this test is almost the same as `test_execute_without_failures`. 
Could we use parameterized tests instead of writing two very similar tests, or 
would that make the code more confusing?
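
   A hedged sketch of what that could look like with the `parameterized` 
package (class and fixture names are hypothetical, not the actual test file):

```python
import unittest
from unittest import mock

from parameterized import parameterized

from airflow.contrib.operators.ecs_operator import ECSOperator


class TestECSOperatorLaunchTypes(unittest.TestCase):
    # Hypothetical: self.ecs_operator_args built in setUp as shown above.
    @parameterized.expand([
        ('EC2',),
        ('FARGATE',),
    ])
    @mock.patch.object(ECSOperator, '_wait_for_task_ended')
    @mock.patch.object(ECSOperator, '_check_success_task')
    def test_execute_without_failures(self, launch_type, check_mock, wait_mock):
        ecs = ECSOperator(launch_type=launch_type, **self.ecs_operator_args)
        ecs.execute(None)
        # ... shared assertions, plus launch-type-specific expectations
```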


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] kaxil commented on a change in pull request #4362: [AIRFLOW-3559] Add missing options to DatadogHook.

2018-12-26 Thread GitBox
kaxil commented on a change in pull request #4362: [AIRFLOW-3559] Add missing 
options to DatadogHook.
URL: https://github.com/apache/incubator-airflow/pull/4362#discussion_r244066383
 
 

 ##
 File path: airflow/contrib/hooks/datadog_hook.py
 ##
 @@ -121,21 +117,37 @@ def post_event(self, title, text, tags=None, 
alert_type=None, aggregation_key=No
 :type title: str
 :param text: The body of the event (more information)
 :type text: str
-:param tags: List of string tags to apply to the event
-:type tags: list
+:param aggregation_key: Key that can be used to aggregate this event 
in a stream
+:type aggregation_key: str
 :param alert_type: The alert type for the event, one of
 ["error", "warning", "info", "success"]
 :type alert_type: str
-:param aggregation_key: Key that can be used to aggregate this event 
in a stream
-:type aggregation_key: str
+:date_happened: POSIX timestamp of the event; defaults to now
 
 Review comment:
   `:type date_happened:`


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] kaxil commented on a change in pull request #4362: [AIRFLOW-3559] Add missing options to DatadogHook.

2018-12-26 Thread GitBox
kaxil commented on a change in pull request #4362: [AIRFLOW-3559] Add missing 
options to DatadogHook.
URL: https://github.com/apache/incubator-airflow/pull/4362#discussion_r244066443
 
 

 ##
 File path: airflow/contrib/hooks/datadog_hook.py
 ##
 @@ -47,26 +47,15 @@ def __init__(self, datadog_conn_id='datadog_default'):
 # for all metric submissions.
 self.host = conn.host
 
-if self.api_key is None:
-raise AirflowException("api_key must be specified in the "
-   "Datadog connection details")
-if self.app_key is None:
-raise AirflowException("app_key must be specified in the "
-   "Datadog connection details")
-
 self.log.info("Setting up api keys for Datadog")
-options = {
-'api_key': self.api_key,
-'app_key': self.app_key
-}
-initialize(**options)
+initialize(api_key=self.api_key, app_key=self.app_key)
 
 def validate_response(self, response):
 if response['status'] != 'ok':
 self.log.error("Datadog returned: %s", response)
 raise AirflowException("Error status received from Datadog")
 
-def send_metric(self, metric_name, datapoint, tags=None):
+def send_metric(self, metric_name, datapoint, tags=None, type_=None, 
interval=None):
 
 Review comment:
   or maybe change the name to `metric_type`
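
   i.e., a hedged sketch of the renamed signature (illustrative only, not the 
final API):

```python
def send_metric(self, metric_name, datapoint, tags=None,
                metric_type=None, interval=None):
    # Hypothetical rename: avoids shadowing the builtin `type` without
    # resorting to the trailing-underscore spelling.
    ...
```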


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] jmcarp commented on a change in pull request #4256: [AIRFLOW-3417] Use the platformVersion only for the FARGATE launch type

2018-12-26 Thread GitBox
jmcarp commented on a change in pull request #4256: [AIRFLOW-3417] Use the 
platformVersion only for the FARGATE launch type
URL: https://github.com/apache/incubator-airflow/pull/4256#discussion_r244066387
 
 

 ##
 File path: tests/contrib/operators/test_ecs_operator.py
 ##
 @@ -63,26 +65,26 @@ def setUp(self, aws_hook_mock):
 configuration.load_test_config()
 
 self.aws_hook_mock = aws_hook_mock
-self.ecs = ECSOperator(
-task_id='task',
-task_definition='t',
-cluster='c',
-overrides={},
-aws_conn_id=None,
-region_name='eu-west-1',
-group='group',
-placement_constraints=[
-{
-'expression': 'attribute:ecs.instance-type =~ t2.*',
-'type': 'memberOf'
-}
-],
-network_configuration={
+self.ecs_operator_args = {
+'task_id': 'task',
+'task_definition': 't',
+'cluster': 'c',
+'overrides': {},
+'aws_conn_id': None,
+'region_name': 'eu-west-1',
+'group': 'group',
+'placement_constraints': [{
+'expression': 'attribute:ecs.instance-type =~ t2.*',
+'type': 'memberOf'
+}],
+'network_configuration': {
 'awsvpcConfiguration': {
 'securityGroups': ['sg-123abc']
 }
 }
-)
+}
+self.ecs = ECSOperator(**self.ecs_operator_args)
+self.ecs_fargate = ECSOperator(launch_type='FARGATE', 
**self.ecs_operator_args)
 
 Review comment:
   Do we need to instantiate a new fargate operator before each test if only 
one test uses it? What do you think about setting `self.launch_type` in the 
relevant tests instead?
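
   A sketch of that alternative (hypothetical, mirroring the decorators on the 
existing tests):

```python
@mock.patch.object(ECSOperator, '_wait_for_task_ended')
@mock.patch.object(ECSOperator, '_check_success_task')
def test_fargate_run(self, check_mock, wait_mock):
    # Hypothetical: flip the launch type on the operator built in setUp
    # instead of constructing a second Fargate operator for every test.
    self.ecs.launch_type = 'FARGATE'
    # ... then reuse the execute/assert steps of test_execute_without_failures
```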


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Created] (AIRFLOW-3571) GoogleCloudStorageToBigQueryOperator succeeds in uploading a CSV file from GCS to BigQuery but the task fails

2018-12-26 Thread Yohei Onishi (JIRA)
Yohei Onishi created AIRFLOW-3571:
-

 Summary: GoogleCloudStorageToBigQueryOperator succeeds in 
uploading a CSV file from GCS to BigQuery but the task fails
 Key: AIRFLOW-3571
 URL: https://issues.apache.org/jira/browse/AIRFLOW-3571
 Project: Apache Airflow
  Issue Type: Bug
  Components: contrib
Affects Versions: 1.10.0
Reporter: Yohei Onishi


I am using the following services in the asia-northeast1-c zone:
 * GCS: asia-northeast1-c
 * BigQuery dataset and table: asia-northeast1-c
 * Composer: asia-northeast1-c

My task created by GoogleCloudStorageToBigQueryOperator succeeded in uploading 
a CSV file from a GCS bucket to a BigQuery table, but the task failed due to 
the following error.
 
{code:java}
[2018-12-26 21:35:47,464] {base_task_runner.py:107} INFO - Job 146: Subtask 
bq_load_data_into_dest_table_from_gcs [2018-12-26 21:35:47,464] 
{discovery.py:871} INFO - URL being requested: GET 
https://www.googleapis.com/bigquery/v2/projects/fr-stg-datalake/jobs/job_QQE9TDEu88mfdw_fJHHEo9FtjXja?alt=json
[2018-12-26 21:35:47,931] {models.py:1736} ERROR - ('BigQuery job status check 
failed. Final error was: %s', 404)
Traceback (most recent call last):
  File "/usr/local/lib/airflow/airflow/contrib/hooks/bigquery_hook.py", line 
981, in run_with_configuration
    jobId=self.running_job_id).execute()
  File "/usr/local/lib/python3.6/site-packages/googleapiclient/_helpers.py", 
line 130, in positional_wrapper
    return wrapped(*args, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/googleapiclient/http.py", line 
851, in execute
    raise HttpError(resp, content, uri=self.uri)
googleapiclient.errors.HttpError: <HttpError 404 when requesting 
https://www.googleapis.com/bigquery/v2/projects/my-project/jobs/job_abc123?alt=json
 returned "Not found: Job my-project:job_abc123">

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/airflow/airflow/models.py", line 1633, in _run_raw_task
    result = task_copy.execute(context=context)
  File "/usr/local/lib/airflow/airflow/contrib/operators/gcs_to_bq.py", line 
237, in execute
    time_partitioning=self.time_partitioning)
  File "/usr/local/lib/airflow/airflow/contrib/hooks/bigquery_hook.py", line 
951, in run_load
    return self.run_with_configuration(configuration)
  File "/usr/local/lib/airflow/airflow/contrib/hooks/bigquery_hook.py", line 
1003, in run_with_configuration
    err.resp.status)
Exception: ('BigQuery job status check failed. Final error was: %s', 404)
{code}
The task failed to find the job {color:#FF}my-project:job_abc123{color}, but 
the correct job id is {color:#FF}
my-project:asia-northeast1:job_abc123{color}. (Note: this is just an example, 
not the actual id.)
I suppose the operator does not handle the region properly.
 
{code:java}
$ bq show -j my-project:asia-northeast1:job_abc123
Job my-project:asia-northeast1:job_abc123

 Job Type   State     Start Time        Duration   User Email                    Bytes Processed   Bytes Billed   Billing Tier   Labels
---------- --------- ----------------- ---------- ----------------------------- ----------------- -------------- -------------- --------
 load       SUCCESS   27 Dec 05:35:47   0:00:01    my-service-account-id-email
{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] kaxil commented on issue #4331: [AIRFLOW-3531] Add gcs to gcs transfer operator.

2018-12-26 Thread GitBox
kaxil commented on issue #4331: [AIRFLOW-3531] Add gcs to gcs transfer operator.
URL: 
https://github.com/apache/incubator-airflow/pull/4331#issuecomment-450048479
 
 
   Any updates on this PR, @jmcarp?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] kaxil commented on issue #4359: [AIRFLOW-3150] Make execution_date templated in TriggerDagRunOperator

2018-12-26 Thread GitBox
kaxil commented on issue #4359: [AIRFLOW-3150] Make execution_date templated in 
TriggerDagRunOperator
URL: 
https://github.com/apache/incubator-airflow/pull/4359#issuecomment-450048262
 
 
   @Fokko Any opinion?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] kaxil commented on issue #4298: [AIRFLOW-3478] Make sure that the session is closed

2018-12-26 Thread GitBox
kaxil commented on issue #4298: [AIRFLOW-3478] Make sure that the session is 
closed
URL: 
https://github.com/apache/incubator-airflow/pull/4298#issuecomment-450048020
 
 
   It still has some conflicts; can you resolve them? :)


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (AIRFLOW-3571) GoogleCloudStorageToBigQueryOperator succeeds in uploading a CSV file from GCS to BigQuery but the task fails

2018-12-26 Thread Yohei Onishi (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yohei Onishi updated AIRFLOW-3571:
--
Description: 
I am using the following services in the asia-northeast1-c zone:
 * GCS: asia-northeast1-c
 * BigQuery dataset and table: asia-northeast1-c
 * Composer: asia-northeast1-c

My task created by GoogleCloudStorageToBigQueryOperator succeeded in uploading 
a CSV file from a GCS bucket to a BigQuery table, but the task failed due to 
the following error.
  
{code:java}
[2018-12-26 21:35:47,464] {base_task_runner.py:107} INFO - Job 146: Subtask 
bq_load_data_into_dest_table_from_gcs [2018-12-26 21:35:47,464] 
{discovery.py:871} INFO - URL being requested: GET 
https://www.googleapis.com/bigquery/v2/projects/my-project/jobs/job_abc123?alt=json
[2018-12-26 21:35:47,931] {models.py:1736} ERROR - ('BigQuery job status check 
failed. Final error was: %s', 404)
Traceback (most recent call last):
  File "/usr/local/lib/airflow/airflow/contrib/hooks/bigquery_hook.py", line 
981, in run_with_configuration
    jobId=self.running_job_id).execute()
  File "/usr/local/lib/python3.6/site-packages/googleapiclient/_helpers.py", 
line 130, in positional_wrapper
    return wrapped(*args, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/googleapiclient/http.py", line 
851, in execute
    raise HttpError(resp, content, uri=self.uri)
googleapiclient.errors.HttpError: <HttpError 404 when requesting 
https://www.googleapis.com/bigquery/v2/projects/my-project/jobs/job_abc123?alt=json
 returned "Not found: Job my-project:job_abc123">

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/airflow/airflow/models.py", line 1633, in _run_raw_task
    result = task_copy.execute(context=context)
  File "/usr/local/lib/airflow/airflow/contrib/operators/gcs_to_bq.py", line 
237, in execute
    time_partitioning=self.time_partitioning)
  File "/usr/local/lib/airflow/airflow/contrib/hooks/bigquery_hook.py", line 
951, in run_load
    return self.run_with_configuration(configuration)
  File "/usr/local/lib/airflow/airflow/contrib/hooks/bigquery_hook.py", line 
1003, in run_with_configuration
    err.resp.status)
Exception: ('BigQuery job status check failed. Final error was: %s', 404)
{code}
The task failed to find the job {color:#ff}my-project:job_abc123{color}, but 
the correct job id is {color:#ff}
my-project:asia-northeast1:job_abc123{color}. (Note: this is just an example, 
not the actual id.)
I suppose the operator does not handle the region properly.
  
{code:java}
$ bq show -j my-project:asia-northeast1:job_abc123
Job my-project:asia-northeast1:job_abc123

 Job Type   State     Start Time        Duration   User Email                    Bytes Processed   Bytes Billed   Billing Tier   Labels
---------- --------- ----------------- ---------- ----------------------------- ----------------- -------------- -------------- --------
 load       SUCCESS   27 Dec 05:35:47   0:00:01    my-service-account-id-email
{code}

  was:
I am using the following services in the asia-northeast1-c zone:
 * GCS: asia-northeast1-c
 * BigQuery dataset and table: asia-northeast1-c
 * Composer: asia-northeast1-c

My task created by GoogleCloudStorageToBigQueryOperator succeeded in uploading 
a CSV file from a GCS bucket to a BigQuery table, but the task failed due to 
the following error.
 
{code:java}
[2018-12-26 21:35:47,464] {base_task_runner.py:107} INFO - Job 146: Subtask 
bq_load_data_into_dest_table_from_gcs [2018-12-26 21:35:47,464] 
{discovery.py:871} INFO - URL being requested: GET 
https://www.googleapis.com/bigquery/v2/projects/fr-stg-datalake/jobs/job_QQE9TDEu88mfdw_fJHHEo9FtjXja?alt=json
[2018-12-26 21:35:47,931] {models.py:1736} ERROR - ('BigQuery job status check 
failed. Final error was: %s', 404)
Traceback (most recent call last):
  File "/usr/local/lib/airflow/airflow/contrib/hooks/bigquery_hook.py", line 
981, in run_with_configuration
    jobId=self.running_job_id).execute()
  File "/usr/local/lib/python3.6/site-packages/googleapiclient/_helpers.py", 
line 130, in positional_wrapper
    return wrapped(*args, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/googleapiclient/http.py", line 
851, in execute
    raise HttpError(resp, content, uri=self.uri)
googleapiclient.errors.HttpError: <HttpError 404 when requesting 
https://www.googleapis.com/bigquery/v2/projects/my-project/jobs/job_abc123?alt=json
 returned "Not found: Job my-project:job_abc123">

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/airflow/airflow/models.py", line 1633, in _run_raw_task
    result = task_copy.execute(context=context)
  File "/usr/local/lib/airflow/airflow/contrib/operators/gcs_to_bq.py", line 
237, in execute
    time_partitioning=self.time_partitioning)
  File "/usr/local/lib/airflow/airflow/contrib/hooks/bigquery_hook.py", line 
951, in run_load
    return self.run_with_configuration(configuration)
  File "/usr/local/lib/airflow/airflow/contrib/hooks/bigquery_hook.py", line 
1003, in run_with_configuration
    err.resp.status)
Exception: ('BigQuery job status check failed. Final error was: %s', 404)
{code}

[GitHub] r39132 closed pull request #4336: [AIRFLOW-XXX] Fix trivial grammar error in doc

2018-12-26 Thread GitBox
r39132 closed pull request #4336: [AIRFLOW-XXX] Fix trivial grammar error in doc
URL: https://github.com/apache/incubator-airflow/pull/4336
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/docs/tutorial.rst b/docs/tutorial.rst
index 69670d7b02..1b8847c4d6 100644
--- a/docs/tutorial.rst
+++ b/docs/tutorial.rst
@@ -390,10 +390,10 @@ Let's run a few commands to validate this script further.
 # print the list of active DAGs
 airflow list_dags
 
-# prints the list of tasks the "tutorial" dag_id
+# prints the list of tasks in the "tutorial" DAG
 airflow list_tasks tutorial
 
-# prints the hierarchy of tasks in the tutorial DAG
+# prints the hierarchy of tasks in the "tutorial" DAG
 airflow list_tasks tutorial --tree
 
 


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] dimberman commented on issue #3770: [AIRFLOW-3281] Fix Kubernetes operator with git-sync

2018-12-26 Thread GitBox
dimberman commented on issue #3770: [AIRFLOW-3281] Fix Kubernetes operator with 
git-sync
URL: 
https://github.com/apache/incubator-airflow/pull/3770#issuecomment-450040364
 
 
   @odracci I sent you an email to set up a quick hangout to ask you more about 
this (will post call summary here). I want to see if this is a shortcoming of 
how our dev environment is set up. You should be able to just define hostpath 
as a volume mount the way you would any other volume mount.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] odracci commented on issue #3770: [AIRFLOW-3281] Fix Kubernetes operator with git-sync

2018-12-26 Thread GitBox
odracci commented on issue #3770: [AIRFLOW-3281] Fix Kubernetes operator with 
git-sync
URL: 
https://github.com/apache/incubator-airflow/pull/3770#issuecomment-450037129
 
 
   @dimberman I implemented the hostPath support because it is the only way I 
found to develop DAGs when using Kubernetes in a local environment. With 
docker-compose we usually share the `dags` folder; `hostPath` is the 
equivalent when Kubernetes is being used.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] odracci commented on a change in pull request #3770: [AIRFLOW-3281] Fix Kubernetes operator with git-sync

2018-12-26 Thread GitBox
odracci commented on a change in pull request #3770: [AIRFLOW-3281] Fix 
Kubernetes operator with git-sync
URL: https://github.com/apache/incubator-airflow/pull/3770#discussion_r244057766
 
 

 ##
 File path: airflow/contrib/kubernetes/worker_configuration.py
 ##
 @@ -50,13 +55,13 @@ def _get_init_containers(self, volume_mounts):
 'value': self.kube_config.git_branch
 }, {
 'name': 'GIT_SYNC_ROOT',
-'value': os.path.join(
-self.worker_airflow_dags,
-self.kube_config.git_subpath
-)
+'value': self.kube_config.git_sync_root
 }, {
 'name': 'GIT_SYNC_DEST',
-'value': 'dags'
+'value': self.kube_config.git_sync_dest
+}, {
+'name': 'GIT_SYNC_DEPTH',
 
 Review comment:
   I'm sorry, I hadn't seen that comment.
   `GIT_SYNC_DEPTH` just clones the repo at the last commit instead of 
downloading the whole repo history, which speeds up the deployment.
   The revision appearing in the path is how git-sync works; it is not 
possible to remove it.
   I solved the problem with `kube_config.git_sync_root`, which includes the 
dags_folder + revision + git_sub_path.
   /cc @jzucker2 @dimberman 
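
   For reference, a sketch of the resulting init-container environment 
(values are hypothetical examples; the shape matches the 
worker_configuration.py diff above):

```python
# Hypothetical values illustrating the git-sync env produced by the diff:
git_sync_env = [
    {'name': 'GIT_SYNC_REPO', 'value': 'https://github.com/example/dags.git'},
    {'name': 'GIT_SYNC_BRANCH', 'value': 'master'},
    # dags_folder + revision + git_subpath, per kube_config.git_sync_root:
    {'name': 'GIT_SYNC_ROOT', 'value': '/root/airflow/dags'},
    {'name': 'GIT_SYNC_DEST', 'value': 'repo'},
    # shallow clone: fetch only the last commit, not the full history
    {'name': 'GIT_SYNC_DEPTH', 'value': '1'},
]
```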


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] BasPH commented on issue #4370: [AIRFLOW-3459] Move DagPickle to separate file

2018-12-26 Thread GitBox
BasPH commented on issue #4370: [AIRFLOW-3459] Move DagPickle to separate file
URL: 
https://github.com/apache/incubator-airflow/pull/4370#issuecomment-450035982
 
 
   This page describes the exact same thing I experienced: 
https://github.com/isaacs/github/issues/361. Doesn't sound like something that 
can be configured from the repository end.
   
   However, I think squashing commits by the Airflow committers would help 
prevent lots of force pushes.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] dimberman commented on a change in pull request #3770: [AIRFLOW-3281] Fix Kubernetes operator with git-sync

2018-12-26 Thread GitBox
dimberman commented on a change in pull request #3770: [AIRFLOW-3281] Fix 
Kubernetes operator with git-sync
URL: https://github.com/apache/incubator-airflow/pull/3770#discussion_r244055880
 
 

 ##
 File path: airflow/contrib/kubernetes/worker_configuration.py
 ##
 @@ -50,13 +55,13 @@ def _get_init_containers(self, volume_mounts):
 'value': self.kube_config.git_branch
 }, {
 'name': 'GIT_SYNC_ROOT',
-'value': os.path.join(
-self.worker_airflow_dags,
-self.kube_config.git_subpath
-)
+'value': self.kube_config.git_sync_root
 }, {
 'name': 'GIT_SYNC_DEST',
-'value': 'dags'
+'value': self.kube_config.git_sync_dest
+}, {
+'name': 'GIT_SYNC_DEPTH',
 
 Review comment:
   @odracci can you please address @jzucker2's question.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] dimberman commented on a change in pull request #3770: [AIRFLOW-3281] Fix Kubernetes operator with git-sync

2018-12-26 Thread GitBox
dimberman commented on a change in pull request #3770: [AIRFLOW-3281] Fix 
Kubernetes operator with git-sync
URL: https://github.com/apache/incubator-airflow/pull/3770#discussion_r244055880
 
 

 ##
 File path: airflow/contrib/kubernetes/worker_configuration.py
 ##
 @@ -50,13 +55,13 @@ def _get_init_containers(self, volume_mounts):
 'value': self.kube_config.git_branch
 }, {
 'name': 'GIT_SYNC_ROOT',
-'value': os.path.join(
-self.worker_airflow_dags,
-self.kube_config.git_subpath
-)
+'value': self.kube_config.git_sync_root
 }, {
 'name': 'GIT_SYNC_DEST',
-'value': 'dags'
+'value': self.kube_config.git_sync_dest
+}, {
+'name': 'GIT_SYNC_DEPTH',
 
 Review comment:
   @odracci ^^^


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] dimberman commented on issue #3770: [AIRFLOW-3281] Fix Kubernetes operator with git-sync

2018-12-26 Thread GitBox
dimberman commented on issue #3770: [AIRFLOW-3281] Fix Kubernetes operator with 
git-sync
URL: 
https://github.com/apache/incubator-airflow/pull/3770#issuecomment-450035432
 
 
   @odracci @Fokko overall this looks incredible. Thank you so much @odracci; 
this actually addresses multiple things I was currently looking into (i.e. the 
web assets, the git path stuff). One question remains: why do we need special 
features just for hostPath? hostPath should just be a volume mount like any 
other, and frankly I don't really want to actively encourage people to use it 
(it's generally not advised, and I think there has even been talk about 
deprecating it). Other than that LGTM :).


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] dimberman commented on a change in pull request #3770: [AIRFLOW-3281] Fix Kubernetes operator with git-sync

2018-12-26 Thread GitBox
dimberman commented on a change in pull request #3770: [AIRFLOW-3281] Fix 
Kubernetes operator with git-sync
URL: https://github.com/apache/incubator-airflow/pull/3770#discussion_r244055645
 
 

 ##
 File path: scripts/ci/kubernetes/docker/build.sh
 ##
 @@ -34,8 +34,18 @@ fi
 echo "Airflow directory $AIRFLOW_ROOT"
 echo "Airflow Docker directory $DIRNAME"
 
+if [[ ${PYTHON_VERSION} == '3' ]]; then
+  PYTHON_DOCKER_IMAGE=python:3.6-slim
+else
+  PYTHON_DOCKER_IMAGE=python:2.7-slim
+fi
+
 cd $AIRFLOW_ROOT
-python setup.py sdist -q
+docker run -ti --rm -e SLUGIFY_USES_TEXT_UNIDECODE -v ${AIRFLOW_ROOT}:/airflow 
\
+-w /airflow ${PYTHON_DOCKER_IMAGE} 
./scripts/ci/kubernetes/docker/compile.sh
 
 Review comment:
   This makes a lot of sense. Thanks for setting this up.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] codecov-io edited a comment on issue #4353: [AIRFLOW-3480] Add Database Deploy/Update/Delete operators

2018-12-26 Thread GitBox
codecov-io edited a comment on issue #4353: [AIRFLOW-3480] Add Database 
Deploy/Update/Delete operators
URL: 
https://github.com/apache/incubator-airflow/pull/4353#issuecomment-449454398
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4353?src=pr=h1)
 Report
   > Merging 
[#4353](https://codecov.io/gh/apache/incubator-airflow/pull/4353?src=pr=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/fb52d7b7ca7f08ffdaaa31558eb864aed3fdc9da?src=pr=desc)
 will **decrease** coverage by `0.01%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/4353/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/4353?src=pr=tree)
   
    ```diff
    @@            Coverage Diff             @@
    ##           master    #4353      +/-   ##
    ==========================================
    - Coverage   78.15%   78.14%   -0.02%
    ==========================================
      Files         204      202       -2
      Lines       16499    16486      -13
    ==========================================
    - Hits        12895    12883      -12
    + Misses       3604     3603       -1
    ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/incubator-airflow/pull/4353?src=pr=tree) 
| Coverage Δ | |
   |---|---|---|
   | 
[airflow/models/\_\_init\_\_.py](https://codecov.io/gh/apache/incubator-airflow/pull/4353/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMvX19pbml0X18ucHk=)
 | `92.76% <0%> (-0.05%)` | :arrow_down: |
   | 
[airflow/bin/cli.py](https://codecov.io/gh/apache/incubator-airflow/pull/4353/diff?src=pr=tree#diff-YWlyZmxvdy9iaW4vY2xpLnB5)
 | `64.48% <0%> (-0.05%)` | :arrow_down: |
   | 
[airflow/jobs.py](https://codecov.io/gh/apache/incubator-airflow/pull/4353/diff?src=pr=tree#diff-YWlyZmxvdy9qb2JzLnB5)
 | `77.39% <0%> (-0.03%)` | :arrow_down: |
   | 
[airflow/models/connection.py](https://codecov.io/gh/apache/incubator-airflow/pull/4353/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMvY29ubmVjdGlvbi5weQ==)
 | `88.6% <0%> (ø)` | :arrow_up: |
   | 
[airflow/models/dagpickle.py](https://codecov.io/gh/apache/incubator-airflow/pull/4353/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMvZGFncGlja2xlLnB5)
 | | |
   | 
[airflow/models/base.py](https://codecov.io/gh/apache/incubator-airflow/pull/4353/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMvYmFzZS5weQ==)
 | | |
   | 
[airflow/utils/db.py](https://codecov.io/gh/apache/incubator-airflow/pull/4353/diff?src=pr=tree#diff-YWlyZmxvdy91dGlscy9kYi5weQ==)
 | `32.53% <0%> (+0.25%)` | :arrow_up: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4353?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4353?src=pr=footer).
 Last update 
[fb52d7b...7cb58eb](https://codecov.io/gh/apache/incubator-airflow/pull/4353?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] codecov-io commented on issue #4354: [AIRFLOW-3446] Add Google Cloud BigTable operators

2018-12-26 Thread GitBox
codecov-io commented on issue #4354: [AIRFLOW-3446] Add Google Cloud BigTable 
operators
URL: 
https://github.com/apache/incubator-airflow/pull/4354#issuecomment-450030796
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4354?src=pr=h1)
 Report
   > Merging 
[#4354](https://codecov.io/gh/apache/incubator-airflow/pull/4354?src=pr=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/fb52d7b7ca7f08ffdaaa31558eb864aed3fdc9da?src=pr=desc)
 will **not change** coverage.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/4354/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/4354?src=pr=tree)
   
    ```diff
    @@           Coverage Diff            @@
    ##           master    #4354   +/-   ##
    =======================================
      Coverage   78.15%   78.15%
    =======================================
      Files         204      204
      Lines       16499    16499
    =======================================
      Hits        12895    12895
      Misses       3604     3604
    ```
   
   
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4354?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4354?src=pr=footer).
 Last update 
[fb52d7b...8b7891c](https://codecov.io/gh/apache/incubator-airflow/pull/4354?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] codecov-io edited a comment on issue #4368: [AIRFLOW-3561] Improve queries

2018-12-26 Thread GitBox
codecov-io edited a comment on issue #4368: [AIRFLOW-3561] Improve queries
URL: 
https://github.com/apache/incubator-airflow/pull/4368#issuecomment-449655224
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4368?src=pr=h1)
 Report
   > Merging 
[#4368](https://codecov.io/gh/apache/incubator-airflow/pull/4368?src=pr=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/c014324be237523d23ceb7a6250218d28befddfd?src=pr=desc)
 will **increase** coverage by `0.02%`.
   > The diff coverage is `90.9%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/4368/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/4368?src=pr=tree)
   
    ```diff
    @@            Coverage Diff             @@
    ##           master    #4368      +/-   ##
    ==========================================
    + Coverage   78.14%   78.17%   +0.02%
    ==========================================
      Files         202      204       +2
      Lines       16486    16530      +44
    ==========================================
    + Hits        12883    12922      +39
    - Misses       3603     3608       +5
    ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/incubator-airflow/pull/4368?src=pr=tree) 
| Coverage Δ | |
   |---|---|---|
   | 
[airflow/models/\_\_init\_\_.py](https://codecov.io/gh/apache/incubator-airflow/pull/4368/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMvX19pbml0X18ucHk=)
 | `92.81% <100%> (+0.04%)` | :arrow_up: |
   | 
[airflow/www\_rbac/views.py](https://codecov.io/gh/apache/incubator-airflow/pull/4368/diff?src=pr=tree#diff-YWlyZmxvdy93d3dfcmJhYy92aWV3cy5weQ==)
 | `72.59% <100%> (+0.16%)` | :arrow_up: |
   | 
[airflow/www/views.py](https://codecov.io/gh/apache/incubator-airflow/pull/4368/diff?src=pr=tree#diff-YWlyZmxvdy93d3cvdmlld3MucHk=)
 | `69.31% <80%> (-0.08%)` | :arrow_down: |
   | 
[airflow/utils/db.py](https://codecov.io/gh/apache/incubator-airflow/pull/4368/diff?src=pr=tree#diff-YWlyZmxvdy91dGlscy9kYi5weQ==)
 | `32.28% <0%> (-0.26%)` | :arrow_down: |
   | 
[airflow/models/connection.py](https://codecov.io/gh/apache/incubator-airflow/pull/4368/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMvY29ubmVjdGlvbi5weQ==)
 | `88.6% <0%> (ø)` | :arrow_up: |
   | 
[airflow/models/dagpickle.py](https://codecov.io/gh/apache/incubator-airflow/pull/4368/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMvZGFncGlja2xlLnB5)
 | `94.11% <0%> (ø)` | |
   | 
[airflow/models/base.py](https://codecov.io/gh/apache/incubator-airflow/pull/4368/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMvYmFzZS5weQ==)
 | `85.71% <0%> (ø)` | |
   | 
[airflow/jobs.py](https://codecov.io/gh/apache/incubator-airflow/pull/4368/diff?src=pr=tree#diff-YWlyZmxvdy9qb2JzLnB5)
 | `77.41% <0%> (+0.02%)` | :arrow_up: |
   | 
[airflow/bin/cli.py](https://codecov.io/gh/apache/incubator-airflow/pull/4368/diff?src=pr=tree#diff-YWlyZmxvdy9iaW4vY2xpLnB5)
 | `64.52% <0%> (+0.04%)` | :arrow_up: |
   | ... and [1 
more](https://codecov.io/gh/apache/incubator-airflow/pull/4368/diff?src=pr=tree-more)
 | |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4368?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4368?src=pr=footer).
 Last update 
[c014324...4db22b1](https://codecov.io/gh/apache/incubator-airflow/pull/4368?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] potiuk commented on a change in pull request #4354: [AIRFLOW-3446] Add Google Cloud BigTable operators

2018-12-26 Thread GitBox
potiuk commented on a change in pull request #4354: [AIRFLOW-3446] Add Google 
Cloud BigTable operators
URL: https://github.com/apache/incubator-airflow/pull/4354#discussion_r244048635
 
 

 ##
 File path: setup.py
 ##
 @@ -189,6 +189,7 @@ def write_version(filename=os.path.join(*['airflow',
 'google-auth>=1.0.0, <2.0.0dev',
 'google-auth-httplib2>=0.0.1',
 'google-cloud-container>=0.1.1',
+'google-cloud-bigtable==0.31.0',
 
 Review comment:
   The problem is that those libraries are pretty unstable, and relying on 
automated upgrades to future versions is a pretty bad idea - last time, when 
@DariuszAniszewski upgraded 0.30.0 (I think) -> 0.31.0, it actually broke the 
operator and some fixes had to be made (and we cannot really rely on future 
versions not breaking anything). Other dependencies also made breaking changes 
quite recently when they were automatically upgraded (the infamous 
flask-appbuilder, for example). There is a whole discussion about keeping 
fixed vs. upper-open version numbers on the devlist (I am involved but have 
had no time to take a closer look at this yet). For now - until a general 
approach is implemented - I think it's safer to keep it fixed. 
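
   A short sketch of the trade-off in setup.py terms (version numbers 
illustrative):

```python
# Illustrative only: the two dependency styles under discussion.
pinned = [
    'google-cloud-bigtable==0.31.0',  # reproducible, but needs manual bumps
]
upper_open = [
    # picks up fixes automatically, but a breaking 0.32.0 lands unreviewed
    'google-cloud-bigtable>=0.31.0,<2.0.0',
]
```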


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] codecov-io edited a comment on issue #4367: [AIRFLOW-3551] Improve BashOperator Test Coverage

2018-12-26 Thread GitBox
codecov-io edited a comment on issue #4367: [AIRFLOW-3551] Improve BashOperator 
Test Coverage
URL: 
https://github.com/apache/incubator-airflow/pull/4367#issuecomment-449651441
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4367?src=pr=h1)
 Report
   > Merging 
[#4367](https://codecov.io/gh/apache/incubator-airflow/pull/4367?src=pr=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/7a6acbf5b343e4a6895d1cc8af75ecc02b4fd0e8?src=pr=desc)
 will **increase** coverage by `1.29%`.
   > The diff coverage is `100%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/4367/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/4367?src=pr=tree)
   
    ```diff
    @@            Coverage Diff             @@
    ##           master    #4367      +/-   ##
    ==========================================
    + Coverage   78.12%   79.42%   +1.29%
    ==========================================
      Files         202      204       +2
      Lines       16483    18586    +2103
    ==========================================
    + Hits        12878    14762    +1884
    - Misses       3605     3824     +219
    ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/incubator-airflow/pull/4367?src=pr=tree) 
| Coverage Δ | |
   |---|---|---|
   | 
[airflow/operators/bash\_operator.py](https://codecov.io/gh/apache/incubator-airflow/pull/4367/diff?src=pr=tree#diff-YWlyZmxvdy9vcGVyYXRvcnMvYmFzaF9vcGVyYXRvci5weQ==)
 | `92.98% <100%> (+1.6%)` | :arrow_up: |
   | 
[airflow/utils/db.py](https://codecov.io/gh/apache/incubator-airflow/pull/4367/diff?src=pr=tree#diff-YWlyZmxvdy91dGlscy9kYi5weQ==)
 | `28.47% <0%> (-4.07%)` | :arrow_down: |
   | 
[airflow/models/dagpickle.py](https://codecov.io/gh/apache/incubator-airflow/pull/4367/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMvZGFncGlja2xlLnB5)
 | `94.11% <0%> (ø)` | |
   | 
[airflow/models/base.py](https://codecov.io/gh/apache/incubator-airflow/pull/4367/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMvYmFzZS5weQ==)
 | `85.71% <0%> (ø)` | |
   | 
[airflow/models/\_\_init\_\_.py](https://codecov.io/gh/apache/incubator-airflow/pull/4367/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMvX19pbml0X18ucHk=)
 | `94.35% <0%> (+1.67%)` | :arrow_up: |
   | 
[airflow/jobs.py](https://codecov.io/gh/apache/incubator-airflow/pull/4367/diff?src=pr=tree#diff-YWlyZmxvdy9qb2JzLnB5)
 | `79.11% <0%> (+1.72%)` | :arrow_up: |
   | 
[airflow/utils/helpers.py](https://codecov.io/gh/apache/incubator-airflow/pull/4367/diff?src=pr=tree#diff-YWlyZmxvdy91dGlscy9oZWxwZXJzLnB5)
 | `84.35% <0%> (+2.13%)` | :arrow_up: |
   | 
[airflow/bin/cli.py](https://codecov.io/gh/apache/incubator-airflow/pull/4367/diff?src=pr=tree#diff-YWlyZmxvdy9iaW4vY2xpLnB5)
 | `66.72% <0%> (+2.24%)` | :arrow_up: |
   | 
[airflow/models/connection.py](https://codecov.io/gh/apache/incubator-airflow/pull/4367/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMvY29ubmVjdGlvbi5weQ==)
 | `93.44% <0%> (+4.83%)` | :arrow_up: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4367?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4367?src=pr=footer).
 Last update 
[7a6acbf...ba6109f](https://codecov.io/gh/apache/incubator-airflow/pull/4367?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] potiuk commented on a change in pull request #4354: [AIRFLOW-3446] Add Google Cloud BigTable operators

2018-12-26 Thread GitBox
potiuk commented on a change in pull request #4354: [AIRFLOW-3446] Add Google 
Cloud BigTable operators
URL: https://github.com/apache/incubator-airflow/pull/4354#discussion_r244048057
 
 

 ##
 File path: airflow/contrib/hooks/gcp_bigtable_hook.py
 ##
 @@ -0,0 +1,232 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from google.cloud.bigtable import Client
+from google.cloud.bigtable.cluster import Cluster
+from google.cloud.bigtable.instance import Instance
+from google.cloud.bigtable.table import Table
+from google.cloud.bigtable_admin_v2 import enums
+from airflow.contrib.hooks.gcp_api_base_hook import GoogleCloudBaseHook
+
+
+# noinspection PyAbstractClass
+class BigtableHook(GoogleCloudBaseHook):
+    """
+    Hook for Google Cloud Bigtable APIs.
+    """
+
+    _client = None
+
+    def __init__(self,
+                 gcp_conn_id='google_cloud_default',
+                 delegate_to=None):
+        super(BigtableHook, self).__init__(gcp_conn_id, delegate_to)
+
+    def get_client(self, project_id):
+        if not self._client:
+            self._client = Client(project=project_id,
+                                  credentials=self._get_credentials(), admin=True)
+        return self._client
+
+    def get_instance(self, project_id, instance_id):
 
 Review comment:
   That's indeed a good idea in general for all GCP operators. We have not
done it in any of them so far - we've implemented around 30 operators to
date, and all of them require PROJECT_ID. I think it's a good improvement
proposal for all 30 operators - maybe we could implement it as a separate PR,
because it has some consequences: in a number of places we validate that
project_id is present, and we have unit tests and validation rules covering
that. What do you think?
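
   For illustration, one shape this proposal could take - a hypothetical
sketch, not code from this PR. The helper name is invented, and the
`_get_field('project')` accessor for the connection's default project is an
assumption about GoogleCloudBaseHook:

   ```python
   from airflow.exceptions import AirflowException


   def resolve_project_id(hook, project_id=None):
       """Use the explicit project_id if given, otherwise fall back to the
       project configured on the hook's GCP connection."""
       # 'project' is assumed to be the connection extra that stores the
       # default project id for the service account.
       effective = project_id or hook._get_field('project')
       if not effective:
           raise AirflowException(
               'project_id was not passed and no default project is '
               'configured on the GCP connection')
       return effective
   ```

   Each operator could then declare `project_id=None` and resolve it through
a helper like this instead of hard-requiring the argument.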


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] codecov-io edited a comment on issue #4298: [AIRFLOW-3478] Make sure that the session is closed

2018-12-26 Thread GitBox
codecov-io edited a comment on issue #4298: [AIRFLOW-3478] Make sure that the 
session is closed
URL: 
https://github.com/apache/incubator-airflow/pull/4298#issuecomment-445580870
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4298?src=pr=h1)
 Report
   > Merging 
[#4298](https://codecov.io/gh/apache/incubator-airflow/pull/4298?src=pr=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/01880dcb3f19be8cff24eb2b23137e0aef8cdd6c?src=pr=desc)
 will **decrease** coverage by `0.05%`.
   > The diff coverage is `77.88%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/4298/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/4298?src=pr=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #4298      +/-   ##
   ==========================================
   - Coverage   78.13%   78.08%    -0.06%
   ==========================================
     Files         202      201       -1
     Lines       16487    16447      -40
   ==========================================
   - Hits        12882    12842      -40
     Misses       3605     3605
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/incubator-airflow/pull/4298?src=pr=tree) 
| Coverage Δ | |
   |---|---|---|
   | 
[airflow/utils/db.py](https://codecov.io/gh/apache/incubator-airflow/pull/4298/diff?src=pr=tree#diff-YWlyZmxvdy91dGlscy9kYi5weQ==)
 | `32.79% <ø> (+0.51%)` | :arrow_up: |
   | 
[airflow/api/common/experimental/mark\_tasks.py](https://codecov.io/gh/apache/incubator-airflow/pull/4298/diff?src=pr=tree#diff-YWlyZmxvdy9hcGkvY29tbW9uL2V4cGVyaW1lbnRhbC9tYXJrX3Rhc2tzLnB5)
 | `97.6% <100%> (-0.06%)` | :arrow_down: |
   | 
[airflow/models.py](https://codecov.io/gh/apache/incubator-airflow/pull/4298/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMucHk=)
 | `92.34% <100%> (ø)` | |
   | 
[airflow/api/common/experimental/delete\_dag.py](https://codecov.io/gh/apache/incubator-airflow/pull/4298/diff?src=pr=tree#diff-YWlyZmxvdy9hcGkvY29tbW9uL2V4cGVyaW1lbnRhbC9kZWxldGVfZGFnLnB5)
 | `88% <100%> (ø)` | :arrow_up: |
   | 
[airflow/utils/cli\_action\_loggers.py](https://codecov.io/gh/apache/incubator-airflow/pull/4298/diff?src=pr=tree#diff-YWlyZmxvdy91dGlscy9jbGlfYWN0aW9uX2xvZ2dlcnMucHk=)
 | `73.33% <100%> (-0.87%)` | :arrow_down: |
   | 
[airflow/www\_rbac/utils.py](https://codecov.io/gh/apache/incubator-airflow/pull/4298/diff?src=pr=tree#diff-YWlyZmxvdy93d3dfcmJhYy91dGlscy5weQ==)
 | `73.95% <100%> (-0.14%)` | :arrow_down: |
   | 
[airflow/settings.py](https://codecov.io/gh/apache/incubator-airflow/pull/4298/diff?src=pr=tree#diff-YWlyZmxvdy9zZXR0aW5ncy5weQ==)
 | `80.41% <100%> (ø)` | :arrow_up: |
   | 
[airflow/jobs.py](https://codecov.io/gh/apache/incubator-airflow/pull/4298/diff?src=pr=tree#diff-YWlyZmxvdy9qb2JzLnB5)
 | `77.31% <100%> (-0.09%)` | :arrow_down: |
   | 
[airflow/www\_rbac/security.py](https://codecov.io/gh/apache/incubator-airflow/pull/4298/diff?src=pr=tree#diff-YWlyZmxvdy93d3dfcmJhYy9zZWN1cml0eS5weQ==)
 | `92.85% <100%> (+0.09%)` | :arrow_up: |
   | 
[airflow/bin/cli.py](https://codecov.io/gh/apache/incubator-airflow/pull/4298/diff?src=pr=tree#diff-YWlyZmxvdy9iaW4vY2xpLnB5)
 | `64.2% <68.11%> (-0.28%)` | :arrow_down: |
   | ... and [12 
more](https://codecov.io/gh/apache/incubator-airflow/pull/4298/diff?src=pr=tree-more)
 | |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4298?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4298?src=pr=footer).
 Last update 
[01880dc...c7aed41](https://codecov.io/gh/apache/incubator-airflow/pull/4298?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] potiuk commented on a change in pull request #4354: [AIRFLOW-3446] Add Google Cloud BigTable operators

2018-12-26 Thread GitBox
potiuk commented on a change in pull request #4354: [AIRFLOW-3446] Add Google 
Cloud BigTable operators
URL: https://github.com/apache/incubator-airflow/pull/4354#discussion_r244047354
 
 

 ##
 File path: airflow/contrib/hooks/gcp_bigtable_hook.py
 ##
 @@ -0,0 +1,232 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from google.cloud.bigtable import Client
+from google.cloud.bigtable.cluster import Cluster
+from google.cloud.bigtable.instance import Instance
+from google.cloud.bigtable.table import Table
+from google.cloud.bigtable_admin_v2 import enums
+from airflow.contrib.hooks.gcp_api_base_hook import GoogleCloudBaseHook
+
+
+# noinspection PyAbstractClass
+class BigtableHook(GoogleCloudBaseHook):
 
 Review comment:
   @DariuszAniszewski is on holiday, but let me add a comment on that - this
is a very similar story to the Spanner hook: we test the GCP hooks with
system tests (against real GCP) rather than with mocked unit tests. The logic
in those hooks is really simple - usually just a call to the corresponding
library method - so a test for each would be quite redundant; for example, a
test for update_cluster would only assert that the Cluster constructor is
called once and that its update() method is called once. That's still
possible and easy to do (and we can still do it), but I don't think it adds a
lot of value.
   
   Do you think it makes sense to have such tests @jmcarp ?
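
   For concreteness, a minimal sketch of the kind of mocked unit test being
weighed here - it exercises only the `get_client` caching visible in the diff
above, and is an illustration, not part of this PR:

   ```python
   import unittest

   from mock import patch  # `unittest.mock` on Python 3

   from airflow.contrib.hooks.gcp_bigtable_hook import BigtableHook


   class TestBigtableHookClientCaching(unittest.TestCase):

       @patch('airflow.contrib.hooks.gcp_bigtable_hook.Client')
       @patch.object(BigtableHook, '_get_credentials')
       def test_get_client_constructs_client_once(self, get_credentials,
                                                  client_cls):
           hook = BigtableHook()
           first = hook.get_client('example-project')   # constructs the Client
           second = hook.get_client('example-project')  # returns the cached one
           client_cls.assert_called_once_with(
               project='example-project',
               credentials=get_credentials.return_value,
               admin=True)
           self.assertIs(first, second)
   ```

   As noted above, such a test mostly re-asserts the single library call the
hook makes, which is exactly the redundancy in question.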


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] codecov-io commented on issue #4365: [AIRFLOW-1684] - Branching based on XCom variable (Docs)

2018-12-26 Thread GitBox
codecov-io commented on issue #4365: [AIRFLOW-1684] - Branching based on XCom 
variable (Docs)
URL: 
https://github.com/apache/incubator-airflow/pull/4365#issuecomment-450022623
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4365?src=pr=h1)
 Report
   > Merging 
[#4365](https://codecov.io/gh/apache/incubator-airflow/pull/4365?src=pr=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/0e365665957d09c1aa612014b40994e3634ef70e?src=pr=desc)
 will **increase** coverage by `0.04%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/4365/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/4365?src=pr=tree)
   
   ```diff
   @@            Coverage Diff            @@
   ##           master    #4365      +/-   ##
   ==========================================
   + Coverage   78.12%   78.17%    +0.04%
   ==========================================
     Files         202      202
     Lines       16483    16527      +44
   ==========================================
   + Hits        12878    12920      +42
   - Misses       3605     3607       +2
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/incubator-airflow/pull/4365?src=pr=tree) 
| Coverage Δ | |
   |---|---|---|
   | 
[airflow/models/\_\_init\_\_.py](https://codecov.io/gh/apache/incubator-airflow/pull/4365/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMvX19pbml0X18ucHk=)
 | `92.76% <0%> (+0.08%)` | :arrow_up: |
   | 
[airflow/utils/helpers.py](https://codecov.io/gh/apache/incubator-airflow/pull/4365/diff?src=pr=tree#diff-YWlyZmxvdy91dGlscy9oZWxwZXJzLnB5)
 | `84.35% <0%> (+2.13%)` | :arrow_up: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4365?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4365?src=pr=footer).
 Last update 
[0e36566...a405454](https://codecov.io/gh/apache/incubator-airflow/pull/4365?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] Fokko closed pull request #4348: [AIRFLOW-XXX] Add section to Updating.md regarding timezones

2018-12-26 Thread GitBox
Fokko closed pull request #4348: [AIRFLOW-XXX] Add section to Updating.md 
regarding timezones
URL: https://github.com/apache/incubator-airflow/pull/4348
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/UPDATING.md b/UPDATING.md
index 575fc0a3c5..851c299db1 100644
--- a/UPDATING.md
+++ b/UPDATING.md
@@ -202,6 +202,7 @@ There are five roles created for Airflow by default: Admin, 
User, Op, Viewer, an
 - All ModelViews in Flask-AppBuilder follow a different pattern from 
Flask-Admin. The `/admin` part of the URL path will no longer exist. For 
example: `/admin/connection` becomes `/connection/list`, 
`/admin/connection/new` becomes `/connection/add`, `/admin/connection/edit` 
becomes `/connection/edit`, etc.
 - Due to security concerns, the new webserver will no longer support the 
features in the `Data Profiling` menu of old UI, including `Ad Hoc Query`, 
`Charts`, and `Known Events`.
 - HiveServer2Hook.get_results() always returns a list of tuples, even when a 
single column is queried, as per Python API 2.
+- **UTC is now the default timezone**: Either reconfigure your workflows 
scheduling in UTC or set `default_timezone` as explained in 
https://airflow.apache.org/timezone.html#default-time-zone 
 
 ### airflow.contrib.sensors.hdfs_sensors renamed to 
airflow.contrib.sensors.hdfs_sensor
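
For readers applying the note above: `default_timezone` lives in the `[core]`
section of `airflow.cfg`, so overriding the new UTC default looks like this
(the zone name is just an example):

```
[core]
default_timezone = Europe/Amsterdam
```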
 


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Assigned] (AIRFLOW-3459) Refactor: Move DagPickle out of models.py

2018-12-26 Thread Fokko Driesprong (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fokko Driesprong reassigned AIRFLOW-3459:
-

Assignee: Bas Harenslak

> Refactor: Move DagPickle out of models.py
> -
>
> Key: AIRFLOW-3459
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3459
> Project: Apache Airflow
>  Issue Type: Task
>  Components: models
>Affects Versions: 1.10.1
>Reporter: Fokko Driesprong
>Assignee: Bas Harenslak
>Priority: Major
> Fix For: 2.0.0
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (AIRFLOW-3459) Refactor: Move DagPickle out of models.py

2018-12-26 Thread Fokko Driesprong (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fokko Driesprong closed AIRFLOW-3459.
-
Resolution: Fixed

> Refactor: Move DagPickle out of models.py
> -
>
> Key: AIRFLOW-3459
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3459
> Project: Apache Airflow
>  Issue Type: Task
>  Components: models
>Affects Versions: 1.10.1
>Reporter: Fokko Driesprong
>Assignee: Bas Harenslak
>Priority: Major
> Fix For: 2.0.0
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] codecov-io edited a comment on issue #4353: [AIRFLOW-3480] Add Database Deploy/Update/Delete operators

2018-12-26 Thread GitBox
codecov-io edited a comment on issue #4353: [AIRFLOW-3480] Add Database 
Deploy/Update/Delete operators
URL: 
https://github.com/apache/incubator-airflow/pull/4353#issuecomment-449454398
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4353?src=pr=h1)
 Report
   > Merging 
[#4353](https://codecov.io/gh/apache/incubator-airflow/pull/4353?src=pr=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/7a6acbf5b343e4a6895d1cc8af75ecc02b4fd0e8?src=pr=desc)
 will **increase** coverage by `0.01%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/4353/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/4353?src=pr=tree)
   
   ```diff
   @@            Coverage Diff            @@
   ##           master    #4353      +/-   ##
   ==========================================
   + Coverage   78.12%   78.14%    +0.01%
   ==========================================
     Files         202      202
     Lines       16483    16486       +3
   ==========================================
   + Hits        12878    12883       +5
   + Misses       3605     3603       -2
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/incubator-airflow/pull/4353?src=pr=tree) 
| Coverage Δ | |
   |---|---|---|
   | 
[airflow/models/\_\_init\_\_.py](https://codecov.io/gh/apache/incubator-airflow/pull/4353/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMvX19pbml0X18ucHk=)
 | `92.76% <0%> (+0.08%)` | :arrow_up: |
   | 
[airflow/utils/helpers.py](https://codecov.io/gh/apache/incubator-airflow/pull/4353/diff?src=pr=tree#diff-YWlyZmxvdy91dGlscy9oZWxwZXJzLnB5)
 | `82.6% <0%> (+0.38%)` | :arrow_up: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4353?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4353?src=pr=footer).
 Last update 
[7a6acbf...f2eb903](https://codecov.io/gh/apache/incubator-airflow/pull/4353?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] Fokko closed pull request #4374: [AIRFLOW-3459] Move DagPickle to separate file

2018-12-26 Thread GitBox
Fokko closed pull request #4374: [AIRFLOW-3459] Move DagPickle to separate file
URL: https://github.com/apache/incubator-airflow/pull/4374
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/airflow/api/common/experimental/delete_dag.py 
b/airflow/api/common/experimental/delete_dag.py
index e7d15772c9..d04355f940 100644
--- a/airflow/api/common/experimental/delete_dag.py
+++ b/airflow/api/common/experimental/delete_dag.py
@@ -47,7 +47,7 @@ def delete_dag(dag_id, keep_records_in_log=True):
     count = 0
 
     # noinspection PyUnresolvedReferences,PyProtectedMember
-    for m in models.Base._decl_class_registry.values():
+    for m in models.base.Base._decl_class_registry.values():
         if hasattr(m, "dag_id"):
             if keep_records_in_log and m.__name__ == 'Log':
                 continue
diff --git a/airflow/bin/cli.py b/airflow/bin/cli.py
index 143e2b34aa..c8840011e3 100644
--- a/airflow/bin/cli.py
+++ b/airflow/bin/cli.py
@@ -34,7 +34,6 @@
 from builtins import input
 from collections import namedtuple
 
-from airflow.models.connection import Connection
 from airflow.utils.timezone import parse as parsedate
 import json
 from tabulate import tabulate
@@ -56,9 +55,9 @@
 from airflow import configuration as conf
 from airflow.exceptions import AirflowException, AirflowWebServerTimeout
 from airflow.executors import GetDefaultExecutor
-from airflow.models import (DagModel, DagBag, TaskInstance,
-                            DagPickle, DagRun, Variable, DagStat, DAG)
-
+from airflow.models import DagModel, DagBag, TaskInstance, DagRun, Variable, DagStat, DAG
+from airflow.models.connection import Connection
+from airflow.models.dagpickle import DagPickle
 from airflow.ti_deps.dep_context import (DepContext, SCHEDULER_DEPS)
 from airflow.utils import cli as cli_utils
 from airflow.utils import db as db_utils
diff --git a/airflow/jobs.py b/airflow/jobs.py
index 8472ecd383..a82190fc9d 100644
--- a/airflow/jobs.py
+++ b/airflow/jobs.py
@@ -43,6 +43,7 @@
 from airflow import executors, models, settings
 from airflow.exceptions import AirflowException
 from airflow.models import DAG, DagRun
+from airflow.models.dagpickle import DagPickle
 from airflow.settings import Stats
 from airflow.task.task_runner import get_task_runner
 from airflow.ti_deps.dep_context import DepContext, QUEUE_DEPS, RUN_DEPS
@@ -61,7 +62,7 @@
 from airflow.utils.sqlalchemy import UtcDateTime
 from airflow.utils.state import State
 
-Base = models.Base
+Base = models.base.Base
 ID_LEN = models.ID_LEN
 
 
@@ -2428,7 +2429,7 @@ def _execute(self, session=None):
         pickle_id = None
         if not self.donot_pickle and self.executor.__class__ not in (
                 executors.LocalExecutor, executors.SequentialExecutor):
-            pickle = models.DagPickle(self.dag)
+            pickle = DagPickle(self.dag)
             session.add(pickle)
             session.commit()
             pickle_id = pickle.id
diff --git a/airflow/migrations/env.py b/airflow/migrations/env.py
index 4e1977635a..76c2eb329b 100644
--- a/airflow/migrations/env.py
+++ b/airflow/migrations/env.py
@@ -36,7 +36,7 @@
 # for 'autogenerate' support
 # from myapp import mymodel
 # target_metadata = mymodel.Base.metadata
-target_metadata = models.Base.metadata
+target_metadata = models.base.Base.metadata
 
 # other values from the config, defined by the needs of env.py,
 # can be acquired:
diff --git a/airflow/models/__init__.py b/airflow/models/__init__.py
index aa93f5cb88..fcd23fe1fb 100755
--- a/airflow/models/__init__.py
+++ b/airflow/models/__init__.py
@@ -28,6 +28,8 @@
 from builtins import ImportError as BuiltinImportError, bytes, object, str
 from future.standard_library import install_aliases
 
+from airflow.models.base import Base
+
 try:
     # Fix Python > 3.7 deprecation
     from collections.abc import Hashable
@@ -64,10 +66,10 @@
 
 from sqlalchemy import (
     Boolean, Column, DateTime, Float, ForeignKey, ForeignKeyConstraint, Index,
-    Integer, LargeBinary, PickleType, String, Text, UniqueConstraint, MetaData,
-    and_, asc, func, or_, true as sqltrue
+    Integer, LargeBinary, PickleType, String, Text, UniqueConstraint, and_, asc,
+    func, or_, true as sqltrue
 )
-from sqlalchemy.ext.declarative import declarative_base, declared_attr
+from sqlalchemy.ext.declarative import declared_attr
 from sqlalchemy.orm import reconstructor, relationship, synonym
 
 from croniter import (
@@ -84,6 +86,7 @@
 )
 from airflow.dag.base_dag import BaseDag, BaseDagBag
 from airflow.lineage import apply_lineage, prepare_lineage
+from airflow.models.dagpickle import DagPickle
 from airflow.ti_deps.deps.not_in_retry_period_dep import NotInRetryPeriodDep
 from airflow.ti_deps.deps.prev_dagrun_dep import PrevDagrunDep
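
(The archive truncates the rest of this diff. For downstream code, the
practical effect of the refactor is the model's new import location; a small
illustration:)

```python
# Before this PR, DagPickle lived in the monolithic models module:
from airflow.models import DagPickle

# After it, the model lives in its own module - and the old import still
# works, because airflow/models/__init__.py re-imports it, as the hunk
# above shows:
from airflow.models.dagpickle import DagPickle
```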
 

[GitHub] Fokko commented on issue #4370: [AIRFLOW-3459] Move DagPickle to separate file

2018-12-26 Thread GitBox
Fokko commented on issue #4370: [AIRFLOW-3459] Move DagPickle to separate file
URL: 
https://github.com/apache/incubator-airflow/pull/4370#issuecomment-450018959
 
 
   I didn't know about this restriction, happy to change it :-)


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] Fokko commented on issue #4298: [AIRFLOW-3478] Make sure that the session is closed

2018-12-26 Thread GitBox
Fokko commented on issue #4298: [AIRFLOW-3478] Make sure that the session is 
closed
URL: 
https://github.com/apache/incubator-airflow/pull/4298#issuecomment-450018792
 
 
   Rebased :-)


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] BasPH commented on issue #4370: [AIRFLOW-3459] Move DagPickle to separate file

2018-12-26 Thread GitBox
BasPH commented on issue #4370: [AIRFLOW-3459] Move DagPickle to separate file
URL: 
https://github.com/apache/incubator-airflow/pull/4370#issuecomment-450018524
 
 
   Spotted a mistake in my code, closed the PR to fix it first, then
discovered I couldn't re-open it because I'd done a force push... when will
the Airflow project change its contributing policy? ;)


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Closed] (AIRFLOW-2568) Implement a Azure Container Instances operator

2018-12-26 Thread Fokko Driesprong (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fokko Driesprong closed AIRFLOW-2568.
-
   Resolution: Fixed
Fix Version/s: 2.0.0

> Implement a Azure Container Instances operator
> --
>
> Key: AIRFLOW-2568
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2568
> Project: Apache Airflow
>  Issue Type: New Feature
>Reporter: Niels Zeilemaker
>Assignee: Niels Zeilemaker
>Priority: Major
> Fix For: 2.0.0
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] Fokko closed pull request #4121: [AIRFLOW-2568] Azure Container Instances operator

2018-12-26 Thread GitBox
Fokko closed pull request #4121: [AIRFLOW-2568] Azure Container Instances 
operator
URL: https://github.com/apache/incubator-airflow/pull/4121
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git 
a/airflow/contrib/example_dags/example_azure_container_instances_operator.py 
b/airflow/contrib/example_dags/example_azure_container_instances_operator.py
new file mode 100644
index 00..181a30b50e
--- /dev/null
+++ b/airflow/contrib/example_dags/example_azure_container_instances_operator.py
@@ -0,0 +1,54 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from airflow import DAG
+from airflow.contrib.operators.azure_container_instances_operator import AzureContainerInstancesOperator
+from datetime import datetime, timedelta
+
+default_args = {
+    'owner': 'airflow',
+    'depends_on_past': False,
+    'start_date': datetime(2018, 11, 1),
+    'email': ['airf...@example.com'],
+    'email_on_failure': False,
+    'email_on_retry': False,
+    'retries': 1,
+    'retry_delay': timedelta(minutes=5),
+}
+
+dag = DAG(
+    'aci_example',
+    default_args=default_args,
+    schedule_interval=timedelta(1)
+)
+
+t1 = AzureContainerInstancesOperator(
+    ci_conn_id='azure_container_instances_default',
+    registry_conn_id=None,
+    resource_group='resource-group',
+    name='aci-test-{{ ds }}',
+    image='hello-world',
+    region='WestUS2',
+    environment_variables={},
+    volumes=[],
+    memory_in_gb=4.0,
+    cpu=1.0,
+    task_id='start_container',
+    dag=dag
+)
diff --git a/airflow/contrib/hooks/azure_container_instance_hook.py 
b/airflow/contrib/hooks/azure_container_instance_hook.py
new file mode 100644
index 00..5ad64de6d7
--- /dev/null
+++ b/airflow/contrib/hooks/azure_container_instance_hook.py
@@ -0,0 +1,167 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+import os
+
+from airflow.hooks.base_hook import BaseHook
+from airflow.exceptions import AirflowException
+
+from azure.common.client_factory import get_client_from_auth_file
+from azure.common.credentials import ServicePrincipalCredentials
+
+from azure.mgmt.containerinstance import ContainerInstanceManagementClient
+
+
+class AzureContainerInstanceHook(BaseHook):
+    """
+    A hook to communicate with Azure Container Instances.
+
+    This hook requires a service principal in order to work.
+    After creating this service principal
+    (Azure Active Directory/App Registrations), you need to fill in the
+    client_id (Application ID) as login, the generated password as password,
+    and tenantId and subscriptionId in the extra's field as a json.
+
+    :param conn_id: connection id of a service principal which will be used
+        to start the container instance
+    :type conn_id: str
+    """
+
+    def __init__(self, conn_id='azure_default'):
+        self.conn_id = conn_id
+        self.connection = self.get_conn()
+
+    def get_conn(self):
+        conn = self.get_connection(self.conn_id)
+        key_path = conn.extra_dejson.get('key_path', False)
+        if key_path:
+            if key_path.endswith('.json'):
+                self.log.info('Getting connection using a JSON key file.')
+                return get_client_from_auth_file(ContainerInstanceManagementClient,
+                                                 key_path)
+            else:
+                raise AirflowException('Unrecognised extension for key file.')
+
+        if os.environ.get('AZURE_AUTH_LOCATION'):
+   
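
(The hook is truncated here by the archive. As a usage note, not part of the
PR: per the docstring above, the service-principal connection this hook
expects could be created along these lines - the extra keys come from the
docstring, everything else is illustrative:)

```python
import json

from airflow import settings
from airflow.models.connection import Connection

conn = Connection(
    conn_id='azure_container_instances_default',
    login='<application (client) id>',      # service principal login
    password='<generated password>',        # service principal secret
    extra=json.dumps({'tenantId': '<tenant id>',
                      'subscriptionId': '<subscription id>'}))

session = settings.Session()
session.add(conn)
session.commit()
```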

[jira] [Resolved] (AIRFLOW-1413) FTPSensor fails when 550 error message text differs from the expected

2018-12-26 Thread Fokko Driesprong (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-1413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fokko Driesprong resolved AIRFLOW-1413.
---
   Resolution: Fixed
Fix Version/s: 2.0.0

> FTPSensor fails when 550 error message text differs from the expected
> -
>
> Key: AIRFLOW-1413
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1413
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Michal Dziemianko
>Assignee: Michal Dziemianko
>Priority: Minor
> Fix For: 2.0.0
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> The FTPSensor relies on an error message returned by the FTPHook. The 
> expected message is "Can't check for file existence" and is used to check 
> for errors in the following way:
> {code:none}
> try:
>     hook.get_mod_time(self.path)
> except ftplib.error_perm as e:
>     error = str(e).split(None, 1)
>     if error[1] != "Can't check for file existence":
>         raise e
> {code}
> However, on my system (Linux/Antergos) this message text is different, 
> leading to inevitable dag termination. Moreover, the actual format of the 
> exception on my system is different, and the split returns '-' instead of 
> the actual message.
> Testing for the error message text is not a good idea - this text is not 
> consistent across platforms and locales (I tried changing the locale and I 
> can get it to return localised messages).
> Instead, as a quick fix, I suggest testing for the error code in the 
> following way:
> {code:none}
> error = str(e).split(None, 1)
> if error[0] != "550":
>     raise e
> {code}
> This is more reliable, although still not perfect: per the FTP 
> specification, the codes are just indicatory rather than mandatory. 
> Moreover, certain codes (the 4xx series) are transient and arguably should 
> not cause an exception if recovery is possible within the time limit (for 
> example, "host unavailable" can be a temporary network issue).
> I am going to provide a PR with a series of improvements.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] Fokko closed pull request #2450: [Airflow-1413] Fix FTPSensor failing on error message with unexpected text.

2018-12-26 Thread GitBox
Fokko closed pull request #2450: [Airflow-1413] Fix FTPSensor failing on error 
message with unexpected text.
URL: https://github.com/apache/incubator-airflow/pull/2450
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/airflow/contrib/sensors/ftp_sensor.py 
b/airflow/contrib/sensors/ftp_sensor.py
index efdedd7a62..69900d5205 100644
--- a/airflow/contrib/sensors/ftp_sensor.py
+++ b/airflow/contrib/sensors/ftp_sensor.py
@@ -17,6 +17,7 @@
 # specific language governing permissions and limitations
 # under the License.
 import ftplib
+import re
 
 from airflow.contrib.hooks.ftp_hook import FTPHook, FTPSHook
 from airflow.sensors.base_sensor_operator import BaseSensorOperator
@@ -26,33 +27,65 @@
 class FTPSensor(BaseSensorOperator):
     """
     Waits for a file or directory to be present on FTP.
-
-    :param path: Remote file or directory path
-    :type path: str
-    :param ftp_conn_id: The connection to run the sensor against
-    :type ftp_conn_id: str
     """
+
     template_fields = ('path',)
 
+    """Errors that are transient in nature, and where action can be retried"""
+    transient_errors = [421, 425, 426, 434, 450, 451, 452]
+
+    error_code_pattern = re.compile("([\d]+)")
+
     @apply_defaults
-    def __init__(self, path, ftp_conn_id='ftp_default', *args, **kwargs):
+    def __init__(
+            self,
+            path,
+            ftp_conn_id='ftp_default',
+            fail_on_transient_errors=True,
+            *args,
+            **kwargs):
+        """
+        Create a new FTP sensor
+
+        :param path: Remote file or directory path
+        :type path: str
+        :param fail_on_transient_errors: Fail on all errors,
+            including 4xx transient errors. Default True.
+        :type fail_on_transient_errors: bool
+        :param ftp_conn_id: The connection to run the sensor against
+        :type ftp_conn_id: str
+        """
+
         super(FTPSensor, self).__init__(*args, **kwargs)
 
         self.path = path
         self.ftp_conn_id = ftp_conn_id
+        self.fail_on_transient_errors = fail_on_transient_errors
 
     def _create_hook(self):
         """Return connection hook."""
         return FTPHook(ftp_conn_id=self.ftp_conn_id)
 
+    def _get_error_code(self, e):
+        """Extract error code from ftp exception"""
+        try:
+            matches = self.error_code_pattern.match(str(e))
+            code = int(matches.group(0))
+            return code
+        except ValueError:
+            return e
+
     def poke(self, context):
         with self._create_hook() as hook:
             self.log.info('Poking for %s', self.path)
             try:
                 hook.get_mod_time(self.path)
             except ftplib.error_perm as e:
-                error = str(e).split(None, 1)
-                if error[1] != "Can't check for file existence":
+                self.log.info('Ftp error encountered: %s', str(e))
+                error_code = self._get_error_code(e)
+                if ((error_code != 550) and
+                        (self.fail_on_transient_errors or
+                         (error_code not in self.transient_errors))):
                     raise e
 
         return False
diff --git a/tests/contrib/sensors/test_ftp_sensor.py 
b/tests/contrib/sensors/test_ftp_sensor.py
index cf0fdb4918..8996dc08e0 100644
--- a/tests/contrib/sensors/test_ftp_sensor.py
+++ b/tests/contrib/sensors/test_ftp_sensor.py
@@ -49,8 +49,13 @@ def test_poke(self):
                        task_id="test_task")
 
         self.hook_mock.get_mod_time.side_effect = \
-            [error_perm("550: Can't check for file existence"), None]
+            [error_perm("550: Can't check for file existence"),
+             error_perm("550: Directory or file does not exist"),
+             error_perm("550 - Directory or file does not exist"),
+             None]
 
+        self.assertFalse(op.poke(None))
+        self.assertFalse(op.poke(None))
         self.assertFalse(op.poke(None))
         self.assertTrue(op.poke(None))
 
@@ -66,6 +71,28 @@ def test_poke_fails_due_error(self):
 
 self.assertTrue("530" in str(context.exception))
 
+def test_poke_fail_on_transient_error(self):
+op = FTPSensor(path="foobar.json", ftp_conn_id="bob_ftp",
+   task_id="test_task")
+
+self.hook_mock.get_mod_time.side_effect = \
+error_perm("434: Host unavailable")
+
+with self.assertRaises(error_perm) as context:
+op.execute(None)
+
+self.assertTrue("434" in str(context.exception))
+
+def test_poke_ignore_transient_error(self):
+op = FTPSensor(path="foobar.json", ftp_conn_id="bob_ftp",
+   task_id="test_task", 

[jira] [Closed] (AIRFLOW-1684) Branching based on XCOM variable

2018-12-26 Thread Fokko Driesprong (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-1684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fokko Driesprong closed AIRFLOW-1684.
-
   Resolution: Fixed
Fix Version/s: 2.0.0

> Branching based on XCOM variable
> 
>
> Key: AIRFLOW-1684
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1684
> Project: Apache Airflow
>  Issue Type: New Feature
>  Components: xcom
>Affects Versions: 1.7.0
> Environment: Centos 7, Airflow1.7
>Reporter: Virendhar Sivaraman
>Assignee: Elad
>Priority: Major
> Fix For: 2.0.0
>
>
> I would like to branch my dag based on an XCOM variable.
> Steps:
> 1. Populate XCOM in bash
> 2. pull the XCOM variable in a BranchPythonOperator and branch it out based 
> on the XCOM variable
> I've tried the documentation and researched on the internet - haven't been 
> successful.
> This feature will be helpful if its not available yet.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] Fokko closed pull request #4365: [AIRFLOW-1684] - Branching based on XCom variable (Docs)

2018-12-26 Thread GitBox
Fokko closed pull request #4365: [AIRFLOW-1684] - Branching based on XCom 
variable (Docs)
URL: https://github.com/apache/incubator-airflow/pull/4365
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/docs/concepts.rst b/docs/concepts.rst
index eac7a8a7f1..766b69b783 100644
--- a/docs/concepts.rst
+++ b/docs/concepts.rst
@@ -522,6 +522,37 @@ Not like this, where the join task is skipped
 
 .. image:: img/branch_bad.png
 
+The ``BranchPythonOperator`` can also be used with XComs allowing branching
+context to dynamically decide what branch to follow based on previous tasks.
+For example:
+
+.. code:: python
+
+  def branch_func(**kwargs):
+      ti = kwargs['ti']
+      xcom_value = int(ti.xcom_pull(task_ids='start_task'))
+      if xcom_value >= 5:
+          return 'continue_task'
+      else:
+          return 'stop_task'
+
+  start_op = BashOperator(
+      task_id='start_task',
+      bash_command="echo 5",
+      xcom_push=True,
+      dag=dag)
+
+  branch_op = BranchPythonOperator(
+      task_id='branch_task',
+      provide_context=True,
+      python_callable=branch_func,
+      dag=dag)
+
+  continue_op = DummyOperator(task_id='continue_task', dag=dag)
+  stop_op = DummyOperator(task_id='stop_task', dag=dag)
+
+  start_op >> branch_op >> [continue_op, stop_op]
+
 SubDAGs
 ===
 


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-1684) Branching based on XCOM variable

2018-12-26 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-1684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16729168#comment-16729168
 ] 

ASF GitHub Bot commented on AIRFLOW-1684:
-

Fokko commented on pull request #4365: [AIRFLOW-1684] - Branching based on XCom 
variable (Docs)
URL: https://github.com/apache/incubator-airflow/pull/4365
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Branching based on XCOM variable
> 
>
> Key: AIRFLOW-1684
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1684
> Project: Apache Airflow
>  Issue Type: New Feature
>  Components: xcom
>Affects Versions: 1.7.0
> Environment: Centos 7, Airflow1.7
>Reporter: Virendhar Sivaraman
>Assignee: Elad
>Priority: Major
>
> I would like to branch my dag based on an XCOM variable.
> Steps:
> 1. Populate XCOM in bash
> 2. pull the XCOM variable in a BranchPythonOperator and branch it out based 
> on the XCOM variable
> I've tried the documentation and researched on the internet - haven't been 
> successful.
> This feature will be helpful if its not available yet.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] Fokko commented on issue #3770: [AIRFLOW-3281] Fix Kubernetes operator with git-sync

2018-12-26 Thread GitBox
Fokko commented on issue #3770: [AIRFLOW-3281] Fix Kubernetes operator with 
git-sync
URL: 
https://github.com/apache/incubator-airflow/pull/3770#issuecomment-450016715
 
 
   @dimberman final thoughts? I'm not into the k8s part right now.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] Fokko commented on issue #4339: [AIRFLOW-3303] Deprecate old UI in favor of FAB

2018-12-26 Thread GitBox
Fokko commented on issue #4339: [AIRFLOW-3303] Deprecate old UI in favor of FAB
URL: 
https://github.com/apache/incubator-airflow/pull/4339#issuecomment-450016352
 
 
   @verdan Can you rebase? Would be good to get this merged, since we're now 
doing duplicate work :-)


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] Fokko commented on a change in pull request #4368: [AIRFLOW-3561] Improve queries

2018-12-26 Thread GitBox
Fokko commented on a change in pull request #4368: [AIRFLOW-3561] Improve 
queries
URL: https://github.com/apache/incubator-airflow/pull/4368#discussion_r244041406
 
 

 ##
 File path: airflow/models/__init__.py
 ##
 @@ -3249,6 +3259,17 @@ def __exit__(self, _type, _value, _tb):
 
 # /Context Manager --
 
+@property
+def default_view(self):
 
 Review comment:
   Awesome, thanks!


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] Fokko commented on issue #4370: [AIRFLOW-3459] Move DagPickle to separate file

2018-12-26 Thread GitBox
Fokko commented on issue #4370: [AIRFLOW-3459] Move DagPickle to separate file
URL: 
https://github.com/apache/incubator-airflow/pull/4370#issuecomment-450015414
 
 
   @BasPH Why did you close this? :-)


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] potiuk commented on a change in pull request #4353: [AIRFLOW-3480] Add Database Deploy/Update/Delete operators

2018-12-26 Thread GitBox
potiuk commented on a change in pull request #4353: [AIRFLOW-3480] Add Database 
Deploy/Update/Delete operators
URL: https://github.com/apache/incubator-airflow/pull/4353#discussion_r244038642
 
 

 ##
 File path: airflow/contrib/hooks/gcp_spanner_hook.py
 ##
 @@ -168,39 +168,171 @@ def delete_instance(self, project_id, instance_id):
 """
 Deletes an existing Cloud Spanner instance.
 
-:param project_id: The ID of the project which owns the instances, 
tables and data.
+:param project_id: The ID of the GCP project that owns the Cloud 
Spanner database.
 :type project_id: str
-:param instance_id: The ID of the instance.
+:param instance_id:  The ID of the Cloud Spanner instance.
 :type instance_id: str
 """
-client = self.get_client(project_id)
-instance = client.instance(instance_id)
+instance = self.get_client(project_id).instance(instance_id)
 try:
 instance.delete()
 return True
 except GoogleAPICallError as e:
-self.log.error('An error occurred: %s. Aborting.', e.message)
+self.log.error('An error occurred: %s. Exiting.', e.message)
+raise e
+
+def get_database(self, project_id, instance_id, database_id):
+# type: (str, str, str) -> Optional[Database]
+"""
+Retrieves a database in Cloud Spanner. If the database does not exist
+in the specified instance, it returns None.
+
+:param project_id: The ID of the GCP project that owns the Cloud 
Spanner database.
+:type project_id: str
+:param instance_id: The ID of the Cloud Spanner instance.
+:type instance_id: str
+:param database_id: The ID of the database in Cloud Spanner.
+:type database_id: str
+:return:
 
 Review comment:
   Fixed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] potiuk commented on a change in pull request #4353: [AIRFLOW-3480] Add Database Deploy/Update/Delete operators

2018-12-26 Thread GitBox
potiuk commented on a change in pull request #4353: [AIRFLOW-3480] Add Database 
Deploy/Update/Delete operators
URL: https://github.com/apache/incubator-airflow/pull/4353#discussion_r244038274
 
 

 ##
 File path: airflow/contrib/hooks/gcp_spanner_hook.py
 ##
 @@ -147,15 +149,13 @@ def _apply_to_instance(self, project_id, instance_id, configuration_name, node_c
         :param func: Method of the instance to be called.
         :type func: Callable
         """
-        client = self.get_client(project_id)
-        instance = client.instance(instance_id,
-                                   configuration_name=configuration_name,
-                                   node_count=node_count,
-                                   display_name=display_name)
+        instance = self.get_client(project_id).instance(
 
 Review comment:
   I'd rather defer hook tests to subsequent PRs (and maybe we will solve it
in a slightly different way). I think it is rather difficult to test the GCP
hooks automatically using mocking/classic Airflow unit tests. Please let me
know what you think about the approach we use - maybe we are missing
something, and some unit tests for the hooks could add value :)?
   
   We chose a slightly different strategy for testing GCP operators (and hooks):
   
   * First of all, we try to make the hooks as straightforward as possible -
basically a 1-1 mapping onto an existing library method, except that we make
the call synchronous (waiting for the operation to complete). We then put
pretty much all the logic into the operators.
   
   * What we realised is that all we can really test with mocking and unit
tests in this case is whether a particular library method has been called,
which is a bit redundant - that's why we do not have those hook tests.
   
   * Instead, we run automated system tests ("we" means the team from my
company Polidea - 3 people who work on those operators). We use the example
DAGs (`example_gcp_spanner.py` in this case) to run them automatically
against a real GCP_PROJECT. We even have a way to run them as
skipped-by-default unit tests (see for example CloudSpannerExampleDagsTest).
We have a separate and quite sophisticated environment for that - we run it
automatically in GCP Cloud Build on every push, and this way we test
end-to-end whether the operators (and thus the hooks) work with real GCP.
   
   I am going to try to contribute it to the community very soon (it's
already open-sourced and I am polishing it) so that others will be able to do
the same with their own GCP projects. You can see it here:
https://github.com/PolideaInternal/airflow-breeze - and I will be happy to
involve you for comments/review when we are ready to share it (I think in a
couple of days).
   
   What do you think?
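
   For readers unfamiliar with the pattern, a rough sketch of what such a
skipped example-DAG system test can look like - the environment variable,
folder path, and DAG id below are illustrative assumptions, not the actual
Polidea harness:

   ```python
   import os
   import unittest
   from datetime import datetime

   from airflow.models import DagBag


   @unittest.skipUnless(os.environ.get('GCP_PROJECT_ID'),
                        'system test: set GCP_PROJECT_ID to run against GCP')
   class ExampleGcpSpannerDagSystemTest(unittest.TestCase):

       def test_run_example_dag(self):
           # Load the example DAG shipped with the operators (path assumed).
           dagbag = DagBag(dag_folder='airflow/contrib/example_dags')
           dag = dagbag.get_dag('example_gcp_spanner')
           self.assertIsNotNone(dag)
           dag.clear()
           # Runs every task end-to-end against the real GCP project.
           dag.run(start_date=datetime(2018, 12, 1),
                   end_date=datetime(2018, 12, 1))
   ```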
   
   
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (AIRFLOW-3570) Scheduler stalled when statsd is enabled

2018-12-26 Thread Chinh Nguyen (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chinh Nguyen updated AIRFLOW-3570:
--
Description: 
Web servers and workers appear to work fine with statsd_on = True and we can 
see metrics from them.

But the scheduler is stuck; its Python processes went into zombie mode. 
Unfortunately, we don't have root access (it runs inside a Docker container), 
so we can't profile the processes easily. Is anyone else seeing a similar 
issue with the scheduler and statsd integration?

  was:
Web servers and workers appear to work fine with statsd_on = True and we can 
see metrics from them.

But the scheduler is stuck; its Python processes went into zombie mode, and 
since we don't have root access (it runs inside a Docker container), we can't 
profile the processes easily. Is anyone else seeing a similar issue with the 
scheduler and statsd integration?


> Scheduler stalled when statsd is enabled
> 
>
> Key: AIRFLOW-3570
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3570
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: scheduler
>Affects Versions: 1.10.1
>Reporter: Chinh Nguyen
>Priority: Minor
>
> Web servers and workers appear to work fine with statsd_on = True and we can 
> see metrics from them.
> But the scheduler is stuck; its Python processes went into zombie mode. 
> Unfortunately, we don't have root access (it runs inside a Docker container), 
> so we can't profile the processes easily. Is anyone else seeing a similar 
> issue with the scheduler and statsd integration?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (AIRFLOW-3570) Scheduler stalled when statsd is enabled

2018-12-26 Thread Chinh Nguyen (JIRA)
Chinh Nguyen created AIRFLOW-3570:
-

 Summary: Scheduler stalled when statsd is enabled
 Key: AIRFLOW-3570
 URL: https://issues.apache.org/jira/browse/AIRFLOW-3570
 Project: Apache Airflow
  Issue Type: Bug
  Components: scheduler
Affects Versions: 1.10.1
Reporter: Chinh Nguyen


Web servers and workers appear to work fine with statsd_on = True and we can 
see metrics from them.

But the scheduler seems to be stuck; its Python processes went into zombie 
mode, and since we don't have root access (it runs inside a Docker container), 
we can't profile the processes easily. Is anyone else seeing a similar issue 
with the scheduler and statsd integration?
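
For reference, a sketch of the relevant airflow.cfg keys in the 1.10 line (the 
host, port, and prefix values are illustrative defaults, not the reporter's 
actual setup):

```ini
[scheduler]
# Enables StatsD metric emission from Airflow components.
statsd_on = True
statsd_host = localhost
statsd_port = 8125
statsd_prefix = airflow
```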



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-3570) Scheduler stalled when statsd is enabled

2018-12-26 Thread Chinh Nguyen (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chinh Nguyen updated AIRFLOW-3570:
--
Description: 
Web servers and workers appear to work fine with statsd_on = True and we can 
see metrics from them.

But the scheduler is stuck; its Python processes went into zombie mode, and 
since we don't have root access (it runs inside a Docker container), we can't 
profile the processes easily. Is anyone else seeing a similar issue with the 
scheduler and statsd integration?

  was:
Web servers and workers appear to work fine with statsd_on = True and we can 
see metrics from them.

But the scheduler seems to be stuck; its Python processes went into zombie 
mode, and since we don't have root access (it runs inside a Docker container), 
we can't profile the processes easily. Is anyone else seeing a similar issue 
with the scheduler and statsd integration?


> Scheduler stalled when statsd is enabled
> 
>
> Key: AIRFLOW-3570
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3570
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: scheduler
>Affects Versions: 1.10.1
>Reporter: Chinh Nguyen
>Priority: Minor
>
> Web servers and workers appear to work fine with statsd_on = True and we can 
> see metrics from them.
> But the scheduler is stuck; its Python processes went into zombie mode, and 
> since we don't have root access (it runs inside a Docker container), we 
> can't profile the processes easily. Is anyone else seeing a similar issue 
> with the scheduler and statsd integration?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] potiuk commented on a change in pull request #4353: [AIRFLOW-3480] Add Database Deploy/Update/Delete operators

2018-12-26 Thread GitBox
potiuk commented on a change in pull request #4353: [AIRFLOW-3480] Add Database 
Deploy/Update/Delete operators
URL: https://github.com/apache/incubator-airflow/pull/4353#discussion_r244035912
 
 

 ##
 File path: setup.py
 ##
 @@ -245,6 +245,7 @@ def write_version(filename=os.path.join(*['airflow',
 devel = [
 'click==6.7',
 'freezegun',
+'freezegun',
 
 Review comment:
   Indeed. Fixed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] jmcarp commented on a change in pull request #4353: [AIRFLOW-3480] Add Database Deploy/Update/Delete operators

2018-12-26 Thread GitBox
jmcarp commented on a change in pull request #4353: [AIRFLOW-3480] Add Database 
Deploy/Update/Delete operators
URL: https://github.com/apache/incubator-airflow/pull/4353#discussion_r244021232
 
 

 ##
 File path: airflow/contrib/hooks/gcp_spanner_hook.py
 ##
 @@ -168,39 +168,171 @@ def delete_instance(self, project_id, instance_id):
 """
 Deletes an existing Cloud Spanner instance.
 
-:param project_id: The ID of the project which owns the instances, 
tables and data.
+:param project_id: The ID of the GCP project that owns the Cloud 
Spanner database.
 :type project_id: str
-:param instance_id: The ID of the instance.
+:param instance_id:  The ID of the Cloud Spanner instance.
 :type instance_id: str
 """
-client = self.get_client(project_id)
-instance = client.instance(instance_id)
+instance = self.get_client(project_id).instance(instance_id)
 try:
 instance.delete()
 return True
 except GoogleAPICallError as e:
-self.log.error('An error occurred: %s. Aborting.', e.message)
+self.log.error('An error occurred: %s. Exiting.', e.message)
+raise e
+
+def get_database(self, project_id, instance_id, database_id):
+# type: (str, str, str) -> Optional[Database]
+"""
+Retrieves a database in Cloud Spanner. If the database does not exist
+in the specified instance, it returns None.
+
+:param project_id: The ID of the GCP project that owns the Cloud 
Spanner database.
+:type project_id: str
+:param instance_id: The ID of the Cloud Spanner instance.
+:type instance_id: str
+:param database_id: The ID of the database in Cloud Spanner.
+:type database_id: str
+:return:
+"""
+
+instance = self.get_client(project_id=project_id).instance(
+instance_id=instance_id)
+if not instance.exists():
+raise AirflowException("The instance {} does not exist in project 
{} !".
+   format(instance_id, project_id))
+database = instance.database(database_id=database_id)
+if not database.exists():
+return None
+else:
+return database
+
+def create_database(self, project_id, instance_id, database_id, 
ddl_statements):
+# type: (str, str, str, [str]) -> bool
+"""
+Creates a new database in Cloud Spanner.
+
+:param project_id: The ID of the GCP project that owns the Cloud 
Spanner database.
+:type project_id: str
+:param instance_id: The ID of the Cloud Spanner instance.
+:type instance_id: str
+:param database_id: The ID of the database to create in Cloud Spanner.
+:type database_id: str
+:param ddl_statements: The string list containing DDL for the new 
database.
+:type ddl_statements: [str]
+:return:
+"""
+
+instance = self.get_client(project_id=project_id).instance(
+instance_id=instance_id)
+if not instance.exists():
+raise AirflowException("The instance {} does not exist in project 
{} !".
+   format(instance_id, project_id))
+database = instance.database(database_id=database_id,
+ ddl_statements=ddl_statements)
+try:
+operation = database.create()  # type: Operation
+except GoogleAPICallError as e:
+self.log.error('An error occurred: %s. Exiting.', e.message)
+raise e
+
+if operation:
+result = operation.result()
+self.log.info(result)
+return True
 
 Review comment:
   Is this return value useful? It looks like this method either returns `True` 
or raises an exception, so I'm not sure when it would be useful to check the 
return value.
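   
   As a hedged sketch of the simplification hinted at here (a standalone 
rewrite for illustration, not the PR's final code), the method could drop the 
boolean and rely purely on exceptions:
   
   ```python
   from airflow.exceptions import AirflowException


   def create_database(hook, project_id, instance_id, database_id,
                       ddl_statements):
       # `hook` stands in for the Spanner hook instance; only get_client()
       # is used from it.
       instance = hook.get_client(project_id=project_id).instance(
           instance_id=instance_id)
       if not instance.exists():
           raise AirflowException(
               "The instance {} does not exist in project {}!".format(
                   instance_id, project_id))
       database = instance.database(database_id=database_id,
                                    ddl_statements=ddl_statements)
       # Block until the long-running operation completes. GoogleAPICallError
       # propagates to the caller, so "no exception" already means success
       # and a `return True` adds no information.
       database.create().result()
   ```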


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] jmcarp commented on a change in pull request #4353: [AIRFLOW-3480] Add Database Deploy/Update/Delete operators

2018-12-26 Thread GitBox
jmcarp commented on a change in pull request #4353: [AIRFLOW-3480] Add Database 
Deploy/Update/Delete operators
URL: https://github.com/apache/incubator-airflow/pull/4353#discussion_r244021125
 
 

 ##
 File path: airflow/contrib/hooks/gcp_spanner_hook.py
 ##
 @@ -168,39 +168,171 @@ def delete_instance(self, project_id, instance_id):
 """
 Deletes an existing Cloud Spanner instance.
 
-:param project_id: The ID of the project which owns the instances, 
tables and data.
+:param project_id: The ID of the GCP project that owns the Cloud 
Spanner database.
 :type project_id: str
-:param instance_id: The ID of the instance.
+:param instance_id:  The ID of the Cloud Spanner instance.
 :type instance_id: str
 """
-client = self.get_client(project_id)
-instance = client.instance(instance_id)
+instance = self.get_client(project_id).instance(instance_id)
 try:
 instance.delete()
 return True
 except GoogleAPICallError as e:
-self.log.error('An error occurred: %s. Aborting.', e.message)
+self.log.error('An error occurred: %s. Exiting.', e.message)
+raise e
+
+def get_database(self, project_id, instance_id, database_id):
+# type: (str, str, str) -> Optional[Database]
+"""
+Retrieves a database in Cloud Spanner. If the database does not exist
+in the specified instance, it returns None.
+
+:param project_id: The ID of the GCP project that owns the Cloud 
Spanner database.
+:type project_id: str
+:param instance_id: The ID of the Cloud Spanner instance.
+:type instance_id: str
+:param database_id: The ID of the database in Cloud Spanner.
+:type database_id: str
+:return:
 
 Review comment:
   Can you fill out the empty `:return:` values?
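   
   For instance (hypothetical wording, not the PR's final text), the docstring 
could end with:
   
   ```python
   def get_database(self, project_id, instance_id, database_id):
       """
       Retrieves a database in Cloud Spanner.

       :return: a Database object if it exists in the specified instance,
           None otherwise.
       :rtype: google.cloud.spanner_v1.database.Database
       """
   ```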


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] jmcarp commented on a change in pull request #4353: [AIRFLOW-3480] Add Database Deploy/Update/Delete operators

2018-12-26 Thread GitBox
jmcarp commented on a change in pull request #4353: [AIRFLOW-3480] Add Database 
Deploy/Update/Delete operators
URL: https://github.com/apache/incubator-airflow/pull/4353#discussion_r244021035
 
 

 ##
 File path: airflow/contrib/hooks/gcp_spanner_hook.py
 ##
 @@ -147,15 +149,13 @@ def _apply_to_instance(self, project_id, instance_id, 
configuration_name, node_c
 :param func: Method of the instance to be called.
 :type func: Callable
 """
-client = self.get_client(project_id)
-instance = client.instance(instance_id,
-   configuration_name=configuration_name,
-   node_count=node_count,
-   display_name=display_name)
+instance = self.get_client(project_id).instance(
 
 Review comment:
   I was going to ask for tests of changes to the hook, but I see that we don't 
test this hook at all at the moment. Do you have time to add tests in this PR, 
or should we add them in a separate PR?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] jmcarp commented on a change in pull request #4354: [AIRFLOW-3446] Add Google Cloud BigTable operators

2018-12-26 Thread GitBox
jmcarp commented on a change in pull request #4354: [AIRFLOW-3446] Add Google 
Cloud BigTable operators
URL: https://github.com/apache/incubator-airflow/pull/4354#discussion_r244018595
 
 

 ##
 File path: setup.py
 ##
 @@ -189,6 +189,7 @@ def write_version(filename=os.path.join(*['airflow',
 'google-auth>=1.0.0, <2.0.0dev',
 'google-auth-httplib2>=0.0.1',
 'google-cloud-container>=0.1.1',
+'google-cloud-bigtable==0.31.0',
 
 Review comment:
   Can we use a version range instead of pinning to an exact version?
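   
   For example (the exact upper bound is a judgment call left to the PR, not 
something decided here), the pin could become a range:
   
   ```python
   'google-cloud-bigtable>=0.31.0,<0.32.0',
   ```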


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] jmcarp commented on a change in pull request #4353: [AIRFLOW-3480] Add Database Deploy/Update/Delete operators

2018-12-26 Thread GitBox
jmcarp commented on a change in pull request #4353: [AIRFLOW-3480] Add Database 
Deploy/Update/Delete operators
URL: https://github.com/apache/incubator-airflow/pull/4353#discussion_r244018499
 
 

 ##
 File path: setup.py
 ##
 @@ -245,6 +245,7 @@ def write_version(filename=os.path.join(*['airflow',
 devel = [
 'click==6.7',
 'freezegun',
+'freezegun',
 
 Review comment:
   Looks like a typo.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] codecov-io edited a comment on issue #4363: [AIRFLOW-3560] Add WeekEnd & DayOfWeek Sensors

2018-12-26 Thread GitBox
codecov-io edited a comment on issue #4363: [AIRFLOW-3560] Add WeekEnd & 
DayOfWeek Sensors
URL: 
https://github.com/apache/incubator-airflow/pull/4363#issuecomment-449730551
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4363?src=pr=h1)
 Report
   > Merging 
[#4363](https://codecov.io/gh/apache/incubator-airflow/pull/4363?src=pr=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/c014324be237523d23ceb7a6250218d28befddfd?src=pr=desc)
 will **not change** coverage.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/4363/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/4363?src=pr=tree)
   
   ```diff
   @@   Coverage Diff   @@
   ##   master#4363   +/-   ##
   ===
 Coverage   78.14%   78.14%   
   ===
 Files 202  202   
 Lines   1648616486   
   ===
 Hits1288312883   
 Misses   3603 3603
   ```
   
   
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4363?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4363?src=pr=footer).
 Last update 
[c014324...ecb62bf](https://codecov.io/gh/apache/incubator-airflow/pull/4363?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-3459) Refactor: Move DagPickle out of models.py

2018-12-26 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16729084#comment-16729084
 ] 

ASF GitHub Bot commented on AIRFLOW-3459:
-

BasPH commented on pull request #4374: [AIRFLOW-3459] Move DagPickle to 
separate file
URL: https://github.com/apache/incubator-airflow/pull/4374
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [x] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-3459
 - In case you are fixing a typo in the documentation you can prepend your 
commit with \[AIRFLOW-XXX\], code changes always need a Jira issue.
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI 
changes:
   
   Move the DagPickle class to a separate file as part of [the big models.py 
refactor](https://issues.apache.org/jira/issues/?jql=text%20~%20%22Refactor%3A%20Move%20out%20of%20models.py%22%20AND%20reporter%20in%20(Fokko))
 (/airflow/models/dagpickle.py).
   
   As a result of the refactoring, I also split off models.Base into a separate 
file.
   
   ### Tests
   
   - [x] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   Only moved code around, nothing new.
   
   ### Commits
   
   - [x] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [x] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - When adding new operators/hooks/sensors, the autoclass documentation 
generation needs to be added.
 - All the public functions and the classes in the PR contain docstrings 
that explain what it does
   
   ### Code Quality
   
   - [x] Passes `flake8`
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Refactor: Move DagPickle out of models.py
> -
>
> Key: AIRFLOW-3459
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3459
> Project: Apache Airflow
>  Issue Type: Task
>  Components: models
>Affects Versions: 1.10.1
>Reporter: Fokko Driesprong
>Priority: Major
> Fix For: 2.0.0
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] jmcarp commented on a change in pull request #4354: [AIRFLOW-3446] Add Google Cloud BigTable operators

2018-12-26 Thread GitBox
jmcarp commented on a change in pull request #4354: [AIRFLOW-3446] Add Google 
Cloud BigTable operators
URL: https://github.com/apache/incubator-airflow/pull/4354#discussion_r244013373
 
 

 ##
 File path: airflow/contrib/hooks/gcp_bigtable_hook.py
 ##
 @@ -0,0 +1,232 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from google.cloud.bigtable import Client
+from google.cloud.bigtable.cluster import Cluster
+from google.cloud.bigtable.instance import Instance
+from google.cloud.bigtable.table import Table
+from google.cloud.bigtable_admin_v2 import enums
+from airflow.contrib.hooks.gcp_api_base_hook import GoogleCloudBaseHook
+
+
+# noinspection PyAbstractClass
+class BigtableHook(GoogleCloudBaseHook):
+"""
+Hook for Google Cloud Bigtable APIs.
+"""
+
+_client = None
+
+def __init__(self,
+ gcp_conn_id='google_cloud_default',
+ delegate_to=None):
+super(BigtableHook, self).__init__(gcp_conn_id, delegate_to)
+
+def get_client(self, project_id):
+if not self._client:
+self._client = Client(project=project_id, 
credentials=self._get_credentials(), admin=True)
+return self._client
+
+def get_instance(self, project_id, instance_id):
 
 Review comment:
   Since this hook subclasses `GoogleCloudBaseHook`, you can get the project id 
optionally set on the connection with `self.project_id`. Can you make 
`project_id` optional and default to the connection field, like in 
`GoogleCloudStorageHook.create_bucket`?
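   
   A minimal sketch of that pattern (assuming, per this comment, that 
`GoogleCloudBaseHook` exposes the connection's project as `self.project_id`; 
the body is illustrative, not the PR's final code):
   
   ```python
   def get_instance(self, instance_id, project_id=None):
       # Fall back to the project id configured on the GCP connection.
       project_id = project_id or self.project_id
       instance = self.get_client(project_id).instance(instance_id)
       if not instance.exists():
           return None
       return instance
   ```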


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] kaxil commented on a change in pull request #4363: [AIRFLOW-3560] Add WeekEnd & DayOfWeek Sensors

2018-12-26 Thread GitBox
kaxil commented on a change in pull request #4363: [AIRFLOW-3560] Add WeekEnd & 
DayOfWeek Sensors
URL: https://github.com/apache/incubator-airflow/pull/4363#discussion_r244010554
 
 

 ##
 File path: airflow/contrib/sensors/weekday_sensor.py
 ##
 @@ -0,0 +1,87 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+import calendar
+from airflow.sensors.base_sensor_operator import BaseSensorOperator
+from airflow.utils import timezone
+from airflow.utils.decorators import apply_defaults
+
+
+class DayOfWeekSensor(BaseSensorOperator):
+"""
+Waits until the first specified day of the week. For example, if the 
execution
+day of the task is '2018-12-22' (Saturday) and you pass '5', the task will 
wait
+until next Friday (Week Day: 5)
+
+:param week_day_number: Day of the week as an integer, where Monday is 1 
and
+Sunday is 7 (ISO Week Numbering)
+:type week_day_number: int
+:param use_task_execution_day: If ``True``, uses task's execution day to 
compare
+with week_day_number. Execution Date is Useful for backfilling.
+If ``False``, uses system's day of the week. Useful when you
+don't want to run anything on weekdays on the system.
+:type use_task_execution_day: bool
+"""
+
+@apply_defaults
+def __init__(self, week_day_number,
 
 Review comment:
   Updated


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] kaxil commented on a change in pull request #4363: [AIRFLOW-3560] Add WeekEnd & DayOfWeek Sensors

2018-12-26 Thread GitBox
kaxil commented on a change in pull request #4363: [AIRFLOW-3560] Add WeekEnd & 
DayOfWeek Sensors
URL: https://github.com/apache/incubator-airflow/pull/4363#discussion_r243996293
 
 

 ##
 File path: airflow/contrib/sensors/weekday_sensor.py
 ##
 @@ -0,0 +1,87 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+import calendar
+from airflow.sensors.base_sensor_operator import BaseSensorOperator
+from airflow.utils import timezone
+from airflow.utils.decorators import apply_defaults
+
+
+class DayOfWeekSensor(BaseSensorOperator):
+"""
+Waits until the first specified day of the week. For example, if the 
execution
+day of the task is '2018-12-22' (Saturday) and you pass '5', the task will 
wait
+until next Friday (Week Day: 5)
+
+:param week_day_number: Day of the week as an integer, where Monday is 1 
and
+Sunday is 7 (ISO Week Numbering)
+:type week_day_number: int
+:param use_task_execution_day: If ``True``, uses task's execution day to 
compare
+with week_day_number. Execution Date is Useful for backfilling.
+If ``False``, uses system's day of the week. Useful when you
+don't want to run anything on weekdays on the system.
+:type use_task_execution_day: bool
+"""
+
+@apply_defaults
+def __init__(self, week_day_number,
+ use_task_execution_day=False,
+ *args, **kwargs):
+super(DayOfWeekSensor, self).__init__(*args, **kwargs)
+self.week_day_number = week_day_number
+self.use_task_execution_day = use_task_execution_day
+
+def poke(self, context):
+if self.week_day_number > 7:
+raise ValueError(
+'Invalid value ({}) for week_day_number! '
+'Valid value: 1 <= week_day_number <= 
7'.format(self.week_day_number))
+self.log.info('Poking until weekday is %s, Today is %s',
+  calendar.day_name[self.week_day_number - 1],
+  calendar.day_name[timezone.utcnow().isoweekday() - 1])
+if self.use_task_execution_day:
+return context['execution_date'].isoweekday() == 
self.week_day_number
+else:
+return timezone.utcnow().isoweekday() == self.week_day_number
+
+
+class WeekEndSensor(BaseSensorOperator):
 
 Review comment:
   My reasoning for keeping them in the same file is that they are related. 
But I can move it into a separate file if you think that would be better.
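   
   For context, a hedged usage sketch of the sensor under discussion (import 
path and parameter names taken from this diff; the DAG boilerplate is 
illustrative):
   
   ```python
   from airflow import DAG
   from airflow.contrib.sensors.weekday_sensor import DayOfWeekSensor
   from airflow.utils.dates import days_ago

   dag = DAG('example_weekday_wait', schedule_interval='@daily',
             start_date=days_ago(7))

   # Block downstream tasks until the day of the week is Friday (ISO: 5),
   # using the real (UTC) clock rather than the execution date.
   wait_for_friday = DayOfWeekSensor(
       task_id='wait_for_friday',
       week_day_number=5,
       use_task_execution_day=False,
       dag=dag)
   ```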


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] codecov-io edited a comment on issue #4373: [AIRFLOW-3569] Add "Trigger DAG" button in DAG page

2018-12-26 Thread GitBox
codecov-io edited a comment on issue #4373: [AIRFLOW-3569] Add "Trigger DAG" 
button in DAG page
URL: 
https://github.com/apache/incubator-airflow/pull/4373#issuecomment-449962906
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4373?src=pr=h1)
 Report
   > Merging 
[#4373](https://codecov.io/gh/apache/incubator-airflow/pull/4373?src=pr=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/c014324be237523d23ceb7a6250218d28befddfd?src=pr=desc)
 will **not change** coverage.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/4373/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/4373?src=pr=tree)
   
   ```diff
   @@   Coverage Diff   @@
   ##   master#4373   +/-   ##
   ===
 Coverage   78.14%   78.14%   
   ===
 Files 202  202   
 Lines   1648616486   
   ===
 Hits1288312883   
 Misses   3603 3603
   ```
   
   
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4373?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4373?src=pr=footer).
 Last update 
[c014324...53806ee](https://codecov.io/gh/apache/incubator-airflow/pull/4373?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] XD-DENG commented on a change in pull request #4372: AIRFLOW-3567 Show an error in case logs can't be fetched from S3

2018-12-26 Thread GitBox
XD-DENG commented on a change in pull request #4372: AIRFLOW-3567 Show an error 
in case logs can't be fetched from S3
URL: https://github.com/apache/incubator-airflow/pull/4372#discussion_r243989664
 
 

 ##
 File path: airflow/utils/log/s3_task_handler.py
 ##
 @@ -124,7 +124,8 @@ def s3_log_exists(self, remote_log_location):
 try:
 return self.hook.get_key(remote_log_location) is not None
 except Exception:
-pass
+msg = 'Could not list logs at {}'.format(remote_log_location)
+self.log.exception(msg)
 
 Review comment:
   Maybe it's not necessary to have the intermediate variable `msg` here?
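   
   I.e. the suggestion amounts to a one-liner (lazy %-formatting keeps the 
traceback via `exception` and skips building the string when the level is 
disabled):
   
   ```python
   self.log.exception('Could not list logs at %s', remote_log_location)
   ```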


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

