[GitHub] feng-tao commented on issue #3709: [AIRFLOW-2861][Improvement] Added index on log table, will make delete dag optimized

2018-08-06 Thread GitBox
feng-tao commented on issue #3709: [AIRFLOW-2861][Improvement] Added index on 
log table, will make delete dag optimized
URL: 
https://github.com/apache/incubator-airflow/pull/3709#issuecomment-410932006
 
 
   @vardancse, you need to add an Alembic migration script for the index change (e.g.
https://github.com/apache/incubator-airflow/blob/master/airflow/migrations/versions/211e584da130_add_ti_state_index.py).
   You also need to rebase your PR on master, which will fix the CI issue.
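   The Alembic script linked above shows the migration pattern; the index itself is what makes per-DAG deletes against the `log` table fast. As a rough, self-contained illustration of why (using an in-memory sqlite3 database and made-up table/index names, not Airflow's actual metadata schema):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE log (id INTEGER PRIMARY KEY, dag_id TEXT, event TEXT)")
conn.executemany(
    "INSERT INTO log (dag_id, event) VALUES (?, ?)",
    [("dag_%d" % (i % 100), "cli_run") for i in range(1000)],
)

# Without an index, DELETE ... WHERE dag_id = ? scans the whole table.
# The migration adds an index so the rows to delete are located directly.
conn.execute("CREATE INDEX idx_log_dag_id ON log (dag_id)")

plan = conn.execute(
    "EXPLAIN QUERY PLAN DELETE FROM log WHERE dag_id = 'dag_7'"
).fetchall()
print(plan)  # the plan should mention idx_log_dag_id

conn.execute("DELETE FROM log WHERE dag_id = 'dag_7'")
remaining = conn.execute("SELECT COUNT(*) FROM log").fetchone()[0]
print(remaining)  # 990 of the original 1000 rows remain
```

   In a real deployment the same `CREATE INDEX` is expressed as an Alembic `upgrade()`/`downgrade()` pair so that existing metadata databases pick it up on `airflow upgradedb`.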


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] vardancse commented on issue #3709: [AIRFLOW-2861][Improvement] Added index on log table, will make delete dag optimized

2018-08-06 Thread GitBox
vardancse commented on issue #3709: [AIRFLOW-2861][Improvement] Added index on 
log table, will make delete dag optimized
URL: 
https://github.com/apache/incubator-airflow/pull/3709#issuecomment-410930822
 
 
   @Fokko I think I might be missing something: could you explain why we need a
database migration? If you mean an existing installation, then the index statement
needs to be applied directly to the metadata database; if the installation is new,
it will pick up the index defined in models.py.
   
   Additionally, could you help me understand why the checks failed? They seem to
have stalled. Can I re-trigger them?




[GitHub] yrqls21 commented on issue #3698: [AIRFLOW-2855] Check Cron Expression Validity in DagBag.process_file()

2018-08-06 Thread GitBox
yrqls21 commented on issue #3698: [AIRFLOW-2855] Check Cron Expression Validity 
in DagBag.process_file()
URL: 
https://github.com/apache/incubator-airflow/pull/3698#issuecomment-410911196
 
 
   Definitely, looking forward to more of your good work too!
   




[GitHub] XD-DENG commented on a change in pull request #3698: [AIRFLOW-2855] Check Cron Expression Validity in DagBag.process_file()

2018-08-06 Thread GitBox
XD-DENG commented on a change in pull request #3698: [AIRFLOW-2855] Check Cron 
Expression Validity in DagBag.process_file()
URL: https://github.com/apache/incubator-airflow/pull/3698#discussion_r208080167
 
 

 ##
 File path: tests/models.py
 ##
 @@ -1038,6 +1038,23 @@ def test_zip(self):
         dagbag.process_file(os.path.join(TEST_DAGS_FOLDER, "test_zip.zip"))
         self.assertTrue(dagbag.get_dag("test_zip_dag"))
 
+    def test_process_file_cron_validity_check(self):
+        """
+        test if an invalid cron expression
+        as schedule interval can be identified
+        """
+        invalid_dag_files = ["test_invalid_cron.py",
+                             "test_zip_invalid_cron.zip"]
+        dagbag = models.DagBag()
+
+        for d in invalid_dag_files:
+            dagbag.process_file(os.path.join(TEST_DAGS_FOLDER, d))
+
+        files_with_cron_error = [os.path.split(k)[1]
+                                 for k, v in dagbag.import_errors.items()
+                                 if "Invalid Cron expression" in v]
+        for d in invalid_dag_files:
+            self.assertTrue(d in files_with_cron_error)
 
 Review comment:
   :-| Crazy now LOL
   
   Thanks again! And I look forward to your reviews on any of my future commits!




[GitHub] yrqls21 commented on a change in pull request #3698: [AIRFLOW-2855] Check Cron Expression Validity in DagBag.process_file()

2018-08-06 Thread GitBox
yrqls21 commented on a change in pull request #3698: [AIRFLOW-2855] Check Cron 
Expression Validity in DagBag.process_file()
URL: https://github.com/apache/incubator-airflow/pull/3698#discussion_r208079735
 
 

 ##
 File path: tests/models.py
 ##
 @@ -1038,6 +1038,23 @@ def test_zip(self):
         dagbag.process_file(os.path.join(TEST_DAGS_FOLDER, "test_zip.zip"))
         self.assertTrue(dagbag.get_dag("test_zip_dag"))
 
+    def test_process_file_cron_validity_check(self):
+        """
+        test if an invalid cron expression
+        as schedule interval can be identified
+        """
+        invalid_dag_files = ["test_invalid_cron.py",
+                             "test_zip_invalid_cron.zip"]
+        dagbag = models.DagBag()
+
+        for d in invalid_dag_files:
+            dagbag.process_file(os.path.join(TEST_DAGS_FOLDER, d))
+
+        files_with_cron_error = [os.path.split(k)[1]
+                                 for k, v in dagbag.import_errors.items()
+                                 if "Invalid Cron expression" in v]
+        for d in invalid_dag_files:
+            self.assertTrue(d in files_with_cron_error)
 
 Review comment:
   LOL, I was just messing around in the spirit of holy perfectionism. We could even
be more assertive if we know those two import errors come from the cron exception.
Good job making this change to improve Airflow! Thank you!
   
   (it's still `O(n)` though ;))




[GitHub] Cplo commented on issue #3697: [AIRFLOW-2854] kubernetes_pod_operator add more configuration items

2018-08-06 Thread GitBox
Cplo commented on issue #3697: [AIRFLOW-2854] kubernetes_pod_operator add more 
configuration items
URL: 
https://github.com/apache/incubator-airflow/pull/3697#issuecomment-410906442
 
 
   PTAL?




[GitHub] codecov-io commented on issue #3711: [AIRFLOW-2863] Fix GKEClusterHook catching wrong exception

2018-08-06 Thread GitBox
codecov-io commented on issue #3711: [AIRFLOW-2863] Fix GKEClusterHook catching 
wrong exception
URL: 
https://github.com/apache/incubator-airflow/pull/3711#issuecomment-410885905
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3711?src=pr=h1)
 Report
   > Merging 
[#3711](https://codecov.io/gh/apache/incubator-airflow/pull/3711?src=pr=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/27b436ed93f3b13134088a6a6cc13d6012c435f3?src=pr=desc)
 will **not change** coverage.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/3711/graphs/tree.svg?width=650=150=pr=WdLKlKHOAU)](https://codecov.io/gh/apache/incubator-airflow/pull/3711?src=pr=tree)
   
   ```diff
    @@           Coverage Diff           @@
    ##           master    #3711   +/-   ##
    =======================================
      Coverage   77.56%   77.56%
    =======================================
      Files         204      204
      Lines       15766    15766
    =======================================
      Hits        12229    12229
      Misses       3537     3537
   ```
   
   
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3711?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3711?src=pr=footer).
 Last update 
[27b436e...820c581](https://codecov.io/gh/apache/incubator-airflow/pull/3711?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   




[jira] [Commented] (AIRFLOW-2787) Airflow scheduler dies on DAGs with NULL DagRun run_id

2018-08-06 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570879#comment-16570879
 ] 

ASF subversion and git services commented on AIRFLOW-2787:
--

Commit 27b436ed93f3b13134088a6a6cc13d6012c435f3 in incubator-airflow's branch 
refs/heads/master from [~gwax]
[ https://gitbox.apache.org/repos/asf?p=incubator-airflow.git;h=27b436e ]

AIRFLOW-2787 Allow is_backfill to handle NULL DagRun.run_id (#3629)



> Airflow scheduler dies on DAGs with NULL DagRun run_id
> --
>
> Key: AIRFLOW-2787
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2787
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: George Leslie-Waksman
>Priority: Critical
>
> When a DagRun is created with NULL run_id, the scheduler subprocess will 
> crash when checking `is_backfill`:
> {noformat}
> Got an exception! Propagating...
> Traceback (most recent call last):
>   File "/usr/local/lib/python3.6/site-packages/airflow/jobs.py", line 347, in 
> helper
> pickle_dags)
>   File "/usr/local/lib/python3.6/site-packages/airflow/utils/db.py", line 53, 
> in wrapper
> result = func(*args, **kwargs)
>   File "/usr/local/lib/python3.6/site-packages/airflow/jobs.py", line 1583, 
> in process_file
> self._process_dags(dagbag, dags, ti_keys_to_schedule)
>   File "/usr/local/lib/python3.6/site-packages/airflow/jobs.py", line 1175, 
> in _process_dags
> self._process_task_instances(dag, tis_out)
>   File "/usr/local/lib/python3.6/site-packages/airflow/jobs.py", line 873, in 
> _process_task_instances
> if run.is_backfill:
>   File "/usr/local/lib/python3.6/site-packages/airflow/models.py", line 4257, 
> in is_backfill
> if "backfill" in self.run_id:
> TypeError: argument of type 'NoneType' is not iterable
> {noformat}
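The traceback above reduces to Python's `in` operator being applied to `None`; a minimal sketch of the failure and the kind of None-guard the fix applies (the prefix value mirrors `BackfillJob.ID_PREFIX` but is written out here for illustration):

```python
run_id = None  # a DagRun created with a NULL run_id

# The old check raises, because `in` needs an iterable on the right-hand side:
try:
    "backfill" in run_id
except TypeError as exc:
    print(exc)  # argument of type 'NoneType' is not iterable

# Checking for None first short-circuits safely:
def is_backfill(run_id, prefix="backfill_"):
    return run_id is not None and run_id.startswith(prefix)

print(is_backfill(None))             # False
print(is_backfill("backfill_2018"))  # True
print(is_backfill("scheduled_1"))    # False
```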



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-2787) Airflow scheduler dies on DAGs with NULL DagRun run_id

2018-08-06 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570878#comment-16570878
 ] 

ASF GitHub Bot commented on AIRFLOW-2787:
-

feng-tao closed pull request #3629: AIRFLOW-2787 Allow is_backfill to handle 
NULL DagRun.run_id
URL: https://github.com/apache/incubator-airflow/pull/3629
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/airflow/models.py b/airflow/models.py
index cf7eb0a64f..288bd4c937 100755
--- a/airflow/models.py
+++ b/airflow/models.py
@@ -5141,7 +5141,10 @@ def get_run(session, dag_id, execution_date):
     @property
     def is_backfill(self):
         from airflow.jobs import BackfillJob
-        return self.run_id.startswith(BackfillJob.ID_PREFIX)
+        return (
+            self.run_id is not None and
+            self.run_id.startswith(BackfillJob.ID_PREFIX)
+        )
 
     @classmethod
     @provide_session
diff --git a/tests/models.py b/tests/models.py
index 1c88ea47f7..914b8bc6c4 100644
--- a/tests/models.py
+++ b/tests/models.py
@@ -946,11 +946,20 @@ def test_get_latest_runs(self):
 
     def test_is_backfill(self):
         dag = DAG(dag_id='test_is_backfill', start_date=DEFAULT_DATE)
+
         dagrun = self.create_dag_run(dag, execution_date=DEFAULT_DATE)
         dagrun.run_id = BackfillJob.ID_PREFIX + '_sfddsffds'
-        dagrun2 = self.create_dag_run(dag, execution_date=DEFAULT_DATE + datetime.timedelta(days=1))
+
+        dagrun2 = self.create_dag_run(
+            dag, execution_date=DEFAULT_DATE + datetime.timedelta(days=1))
+
+        dagrun3 = self.create_dag_run(
+            dag, execution_date=DEFAULT_DATE + datetime.timedelta(days=2))
+        dagrun3.run_id = None
+
         self.assertTrue(dagrun.is_backfill)
         self.assertFalse(dagrun2.is_backfill)
+        self.assertFalse(dagrun3.is_backfill)
 
     def test_removed_task_instances_can_be_restored(self):
         def with_all_tasks_removed(dag):


 






[GitHub] feng-tao closed pull request #3629: AIRFLOW-2787 Allow is_backfill to handle NULL DagRun.run_id

2018-08-06 Thread GitBox
feng-tao closed pull request #3629: AIRFLOW-2787 Allow is_backfill to handle 
NULL DagRun.run_id
URL: https://github.com/apache/incubator-airflow/pull/3629
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/airflow/models.py b/airflow/models.py
index cf7eb0a64f..288bd4c937 100755
--- a/airflow/models.py
+++ b/airflow/models.py
@@ -5141,7 +5141,10 @@ def get_run(session, dag_id, execution_date):
     @property
     def is_backfill(self):
         from airflow.jobs import BackfillJob
-        return self.run_id.startswith(BackfillJob.ID_PREFIX)
+        return (
+            self.run_id is not None and
+            self.run_id.startswith(BackfillJob.ID_PREFIX)
+        )
 
     @classmethod
     @provide_session
diff --git a/tests/models.py b/tests/models.py
index 1c88ea47f7..914b8bc6c4 100644
--- a/tests/models.py
+++ b/tests/models.py
@@ -946,11 +946,20 @@ def test_get_latest_runs(self):
 
     def test_is_backfill(self):
         dag = DAG(dag_id='test_is_backfill', start_date=DEFAULT_DATE)
+
         dagrun = self.create_dag_run(dag, execution_date=DEFAULT_DATE)
         dagrun.run_id = BackfillJob.ID_PREFIX + '_sfddsffds'
-        dagrun2 = self.create_dag_run(dag, execution_date=DEFAULT_DATE + datetime.timedelta(days=1))
+
+        dagrun2 = self.create_dag_run(
+            dag, execution_date=DEFAULT_DATE + datetime.timedelta(days=1))
+
+        dagrun3 = self.create_dag_run(
+            dag, execution_date=DEFAULT_DATE + datetime.timedelta(days=2))
+        dagrun3.run_id = None
+
         self.assertTrue(dagrun.is_backfill)
         self.assertFalse(dagrun2.is_backfill)
+        self.assertFalse(dagrun3.is_backfill)
 
     def test_removed_task_instances_can_be_restored(self):
         def with_all_tasks_removed(dag):


 




[GitHub] feng-tao commented on issue #3629: AIRFLOW-2787 Allow is_backfill to handle NULL DagRun.run_id

2018-08-06 Thread GitBox
feng-tao commented on issue #3629: AIRFLOW-2787 Allow is_backfill to handle 
NULL DagRun.run_id
URL: 
https://github.com/apache/incubator-airflow/pull/3629#issuecomment-410876498
 
 
   thanks @gwax . LGTM.




[GitHub] yrqls21 commented on a change in pull request #3698: [AIRFLOW-2855] Check Cron Expression Validity in DagBag.process_file()

2018-08-06 Thread GitBox
yrqls21 commented on a change in pull request #3698: [AIRFLOW-2855] Check Cron 
Expression Validity in DagBag.process_file()
URL: https://github.com/apache/incubator-airflow/pull/3698#discussion_r208049867
 
 

 ##
 File path: tests/models.py
 ##
 @@ -1038,6 +1038,23 @@ def test_zip(self):
         dagbag.process_file(os.path.join(TEST_DAGS_FOLDER, "test_zip.zip"))
         self.assertTrue(dagbag.get_dag("test_zip_dag"))
 
+    def test_process_file_cron_validity_check(self):
+        """
+        test if an invalid cron expression
+        as schedule interval can be identified
+        """
+        invalid_dag_files = ["test_invalid_cron.py",
+                             "test_zip_invalid_cron.zip"]
+        dagbag = models.DagBag()
 
 Review comment:
   NIT: I understand that you want your test to show clearly that you are processing
those two test files, so it makes sense to keep the `dagbag.process_file` line. In
that case, I guess we could point the `DagBag` here at something like an empty or
made-up directory: the test DAG files in the default DAG_HOME may grow, and then we
get 2x parsing-time growth for each additional file, and our CI is already painfully
lengthy ;)




[GitHub] yrqls21 commented on a change in pull request #3698: [AIRFLOW-2855] Check Cron Expression Validity in DagBag.process_file()

2018-08-06 Thread GitBox
yrqls21 commented on a change in pull request #3698: [AIRFLOW-2855] Check Cron 
Expression Validity in DagBag.process_file()
URL: https://github.com/apache/incubator-airflow/pull/3698#discussion_r208050009
 
 

 ##
 File path: tests/models.py
 ##
 @@ -1038,6 +1038,23 @@ def test_zip(self):
         dagbag.process_file(os.path.join(TEST_DAGS_FOLDER, "test_zip.zip"))
         self.assertTrue(dagbag.get_dag("test_zip_dag"))
 
+    def test_process_file_cron_validity_check(self):
+        """
+        test if an invalid cron expression
+        as schedule interval can be identified
+        """
+        invalid_dag_files = ["test_invalid_cron.py",
+                             "test_zip_invalid_cron.zip"]
+        dagbag = models.DagBag()
+
+        for d in invalid_dag_files:
+            dagbag.process_file(os.path.join(TEST_DAGS_FOLDER, d))
+
+        files_with_cron_error = [os.path.split(k)[1]
+                                 for k, v in dagbag.import_errors.items()
+                                 if "Invalid Cron expression" in v]
+        for d in invalid_dag_files:
+            self.assertTrue(d in files_with_cron_error)
 
 Review comment:
   Sorry for being picky here, just a NIT: it is still `O(n^2)` here, since
`files_with_cron_error` is a list :P
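For illustration, the quadratic lookup goes away if the comprehension builds a `set` instead of a list. A standalone sketch of the same pattern (the paths and error messages are made up, standing in for what `DagBag.import_errors` would contain):

```python
import os

# Import errors as DagBag would collect them: file path -> error message.
# Paths and messages are invented for this example.
import_errors = {
    "/dags/test_invalid_cron.py": "Invalid Cron expression: ...",
    "/dags/test_zip_invalid_cron.zip": "Invalid Cron expression: ...",
    "/dags/test_other.py": "ImportError: No module named foo",
}
invalid_dag_files = ["test_invalid_cron.py", "test_zip_invalid_cron.zip"]

# A set comprehension makes each membership test O(1) on average,
# so the assertion loop below is O(n) overall instead of O(n^2).
files_with_cron_error = {os.path.split(k)[1]
                         for k, v in import_errors.items()
                         if "Invalid Cron expression" in v}

for d in invalid_dag_files:
    assert d in files_with_cron_error
```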




[GitHub] gwax commented on issue #3629: AIRFLOW-2787 Allow is_backfill to handle NULL DagRun.run_id

2018-08-06 Thread GitBox
gwax commented on issue #3629: AIRFLOW-2787 Allow is_backfill to handle NULL 
DagRun.run_id
URL: 
https://github.com/apache/incubator-airflow/pull/3629#issuecomment-410866094
 
 
   @feng-tao rebased




[GitHub] abdul-stripe commented on issue #3657: [AIRFLOW-2145] fix deadlock on clearing running task instance

2018-08-06 Thread GitBox
abdul-stripe commented on issue #3657: [AIRFLOW-2145] fix deadlock on clearing 
running task instance
URL: 
https://github.com/apache/incubator-airflow/pull/3657#issuecomment-410862356
 
 
   friendly bump! cc @bolkedebruin @Fokko 




[GitHub] feng-tao commented on issue #3629: AIRFLOW-2787 Allow is_backfill to handle NULL DagRun.run_id

2018-08-06 Thread GitBox
feng-tao commented on issue #3629: AIRFLOW-2787 Allow is_backfill to handle 
NULL DagRun.run_id
URL: 
https://github.com/apache/incubator-airflow/pull/3629#issuecomment-410859827
 
 
   feel free to ping me once you rebase on master, and we are good to go.




[GitHub] feng-tao commented on issue #3629: AIRFLOW-2787 Allow is_backfill to handle NULL DagRun.run_id

2018-08-06 Thread GitBox
feng-tao commented on issue #3629: AIRFLOW-2787 Allow is_backfill to handle 
NULL DagRun.run_id
URL: 
https://github.com/apache/incubator-airflow/pull/3629#issuecomment-410859702
 
 
   thanks @gwax . Could you rebase the PR on master? There was an issue on master
that was failing the CI; it has just been reverted.




[GitHub] OElesin commented on issue #3504: [AIRFLOW-2310]: Add AWS Glue Job Compatibility to Airflow

2018-08-06 Thread GitBox
OElesin commented on issue #3504: [AIRFLOW-2310]: Add AWS Glue Job 
Compatibility to Airflow
URL: 
https://github.com/apache/incubator-airflow/pull/3504#issuecomment-410857891
 
 
   @suma-ps and @cnidus,
   
   I have plans to work on fixing this, but I sometimes get mixed up when squashing
commits. I will try to resolve this by the weekend and resubmit the PR.
   
   Thanks @all




[GitHub] tedmiston commented on issue #3703: [AIRFLOW-2857] Fix verify_gpl_dependency for Read the Docs

2018-08-06 Thread GitBox
tedmiston commented on issue #3703: [AIRFLOW-2857] Fix verify_gpl_dependency 
for Read the Docs
URL: 
https://github.com/apache/incubator-airflow/pull/3703#issuecomment-410855271
 
 
   @feng-tao That PR makes a related change to the docker run command for local 
development, but doesn't cover the Read the Docs use case since RTD doesn't use 
those Docker commands.  The main reason for the different workaround used here 
is that one cannot pass a custom env var into the RTD build environment as the 
Docker command does, so instead we set it based on an env var that they do 
provide.
   
   Here's the mailing list thread from Kaxil with a bit more context on this PR 
- 
https://lists.apache.org/thread.html/3b7867b900bee8bdf013e6932812faa3ea891849c2b2eba000f0b093@%3Cdev.airflow.apache.org%3E.
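The workaround pattern being described (detect RTD via the environment variable it does provide, then derive the flag Airflow's setup checks) can be sketched roughly as follows; treat the function body as illustrative of AIRFLOW-2857's approach, not the exact setup.py code:

```python
import os

def verify_gpl_dependency():
    """Sketch of the RTD workaround: custom env vars cannot be injected into
    Read the Docs builds, but RTD itself sets READTHEDOCS=True, so the
    non-GPL flag is derived from that instead of requiring the user to set it."""
    if os.environ.get("READTHEDOCS") == "True":
        os.environ["AIRFLOW_GPL_UNIDECODE"] = "yes"

    if (os.environ.get("AIRFLOW_GPL_UNIDECODE") is None and
            os.environ.get("SLUGIFY_USES_TEXT_UNIDECODE") != "yes"):
        raise RuntimeError("Set AIRFLOW_GPL_UNIDECODE=yes or "
                           "SLUGIFY_USES_TEXT_UNIDECODE=yes to install.")

# Simulating an RTD build environment:
os.environ.pop("AIRFLOW_GPL_UNIDECODE", None)
os.environ.pop("SLUGIFY_USES_TEXT_UNIDECODE", None)
os.environ["READTHEDOCS"] = "True"
verify_gpl_dependency()  # no error: the flag was derived from READTHEDOCS
```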




[GitHub] codecov-io edited a comment on issue #3703: [AIRFLOW-2857] Fix verify_gpl_dependency for Read the Docs

2018-08-06 Thread GitBox
codecov-io edited a comment on issue #3703: [AIRFLOW-2857] Fix 
verify_gpl_dependency for Read the Docs
URL: 
https://github.com/apache/incubator-airflow/pull/3703#issuecomment-410578804
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3703?src=pr=h1)
 Report
   > Merging 
[#3703](https://codecov.io/gh/apache/incubator-airflow/pull/3703?src=pr=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/aa7b25b6afc069077c3e42ab13ef2de3efa7d4fc?src=pr=desc)
 will **not change** coverage.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/3703/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/3703?src=pr=tree)
   
   ```diff
    @@           Coverage Diff           @@
    ##           master    #3703   +/-   ##
    =======================================
      Coverage   77.56%   77.56%
    =======================================
      Files         204      204
      Lines       15766    15766
    =======================================
      Hits        12229    12229
      Misses       3537     3537
   ```
   
   
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3703?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3703?src=pr=footer).
 Last update 
[aa7b25b...7007e95](https://codecov.io/gh/apache/incubator-airflow/pull/3703?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   




[GitHub] feng-tao commented on issue #3703: [AIRFLOW-2857] Fix verify_gpl_dependency for Read the Docs

2018-08-06 Thread GitBox
feng-tao commented on issue #3703: [AIRFLOW-2857] Fix verify_gpl_dependency for 
Read the Docs
URL: 
https://github.com/apache/incubator-airflow/pull/3703#issuecomment-410851724
 
 
   I saw a different PR (https://github.com/apache/incubator-airflow/pull/3701).
Will that one solve the same issue this PR tries to address? @tedmiston @ashb @kaxil




[jira] [Commented] (AIRFLOW-2755) k8s workers think DAGs are always in `/tmp/dags`

2018-08-06 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570751#comment-16570751
 ] 

ASF GitHub Bot commented on AIRFLOW-2755:
-

Fokko closed pull request #3612: [AIRFLOW-2755] Added 
`kubernetes.worker_dags_folder` configuration
URL: https://github.com/apache/incubator-airflow/pull/3612
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/airflow/config_templates/default_airflow.cfg 
b/airflow/config_templates/default_airflow.cfg
index be286ea3dc..d4a7242118 100644
--- a/airflow/config_templates/default_airflow.cfg
+++ b/airflow/config_templates/default_airflow.cfg
@@ -562,6 +562,7 @@ elasticsearch_end_of_log_mark = end_of_log
 worker_container_repository =
 worker_container_tag =
 worker_container_image_pull_policy = IfNotPresent
+worker_dags_folder =
 
 # If True (default), worker pods will be deleted upon termination
 delete_worker_pods = True
diff --git a/airflow/contrib/executors/kubernetes_executor.py 
b/airflow/contrib/executors/kubernetes_executor.py
index 788d925c38..66c600ba65 100644
--- a/airflow/contrib/executors/kubernetes_executor.py
+++ b/airflow/contrib/executors/kubernetes_executor.py
@@ -115,6 +115,8 @@ def __init__(self):
             self.kubernetes_section, 'worker_container_repository')
         self.worker_container_tag = configuration.get(
             self.kubernetes_section, 'worker_container_tag')
+        self.worker_dags_folder = configuration.get(
+            self.kubernetes_section, 'worker_dags_folder')
         self.kube_image = '{}:{}'.format(
             self.worker_container_repository, self.worker_container_tag)
         self.kube_image_pull_policy = configuration.get(
diff --git a/airflow/contrib/kubernetes/worker_configuration.py 
b/airflow/contrib/kubernetes/worker_configuration.py
index c9f86b047a..88a5cf0a40 100644
--- a/airflow/contrib/kubernetes/worker_configuration.py
+++ b/airflow/contrib/kubernetes/worker_configuration.py
@@ -81,12 +81,14 @@ def _get_init_containers(self, volume_mounts):
     def _get_environment(self):
         """Defines any necessary environment variables for the pod executor"""
         env = {
-            'AIRFLOW__CORE__DAGS_FOLDER': '/tmp/dags',
-            'AIRFLOW__CORE__EXECUTOR': 'LocalExecutor',
-            'AIRFLOW__CORE__SQL_ALCHEMY_CONN': conf.get('core', 'SQL_ALCHEMY_CONN')
+            "AIRFLOW__CORE__EXECUTOR": "LocalExecutor",
+            "AIRFLOW__CORE__SQL_ALCHEMY_CONN": conf.get("core", "SQL_ALCHEMY_CONN"),
         }
+
         if self.kube_config.airflow_configmap:
             env['AIRFLOW__CORE__AIRFLOW_HOME'] = self.worker_airflow_home
+        if self.kube_config.worker_dags_folder:
+            env['AIRFLOW__CORE__DAGS_FOLDER'] = self.kube_config.worker_dags_folder
         return env
 
     def _get_secrets(self):
diff --git a/scripts/ci/kubernetes/kube/configmaps.yaml 
b/scripts/ci/kubernetes/kube/configmaps.yaml
index 97556bf840..3e64ae4e47 100644
--- a/scripts/ci/kubernetes/kube/configmaps.yaml
+++ b/scripts/ci/kubernetes/kube/configmaps.yaml
@@ -179,6 +179,7 @@ data:
     worker_container_repository = airflow
     worker_container_tag = latest
     worker_container_image_pull_policy = IfNotPresent
+    worker_dags_folder = /tmp/dags
     delete_worker_pods = True
     git_repo = https://github.com/apache/incubator-airflow.git
     git_branch = master
diff --git a/tests/contrib/executors/test_kubernetes_executor.py 
b/tests/contrib/executors/test_kubernetes_executor.py
index 963efcb03b..d9da48ce3b 100644
--- a/tests/contrib/executors/test_kubernetes_executor.py
+++ b/tests/contrib/executors/test_kubernetes_executor.py
@@ -133,6 +133,22 @@ def test_worker_with_subpaths(self):
 "subPath should've been passed to volumeMount 
configuration"
 )
 
+def test_worker_environment_no_dags_folder(self):
+self.kube_config.worker_dags_folder = ''
+worker_config = WorkerConfiguration(self.kube_config)
+env = worker_config._get_environment()
+
+self.assertNotIn('AIRFLOW__CORE__DAGS_FOLDER', env)
+
+def test_worker_environment_when_dags_folder_specified(self):
+dags_folder = '/workers/path/to/dags'
+self.kube_config.worker_dags_folder = dags_folder
+
+worker_config = WorkerConfiguration(self.kube_config)
+env = worker_config._get_environment()
+
+self.assertEqual(dags_folder, env['AIRFLOW__CORE__DAGS_FOLDER'])
+
 
 if __name__ == '__main__':
 unittest.main()


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above 

[jira] [Commented] (AIRFLOW-2755) k8s workers think DAGs are always in `/tmp/dags`

2018-08-06 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570752#comment-16570752
 ] 

ASF subversion and git services commented on AIRFLOW-2755:
--

Commit 90e88dfe82d781236db39ce3b6dfd92d5f48f6e7 in incubator-airflow's branch 
refs/heads/master from Aldo Giambelluca
[ https://gitbox.apache.org/repos/asf?p=incubator-airflow.git;h=90e88df ]

[AIRFLOW-2755] Added `kubernetes.worker_dags_folder` configuration (#3612)

It was previously hardcoded to `/tmp/dags`.
This causes problems with python import of modules in the DAGs folder.

> k8s workers think DAGs are always in `/tmp/dags`
> 
>
> Key: AIRFLOW-2755
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2755
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: configuration, worker
>Reporter: Aldo
>Assignee: Aldo
>Priority: Minor
> Fix For: 2.0.0
>
>
> We have Airflow configured to use the `KubernetesExecutor` and run tasks in 
> newly created pods.
> I tried to use the `PythonOperator` to import the python callable from a 
> python module located in the DAGs directory as [that should be 
> possible|https://github.com/apache/incubator-airflow/blob/c7a472ed6b0d8a4720f57ba1140c8cf665757167/airflow/__init__.py#L42].
>  Airflow complained that the module was not found.
> After a fair amount of digging we found that the issue was that the workers 
> have the `AIRFLOW__CORE__DAGS_FOLDER` environment variable set to `/tmp/dags` 
> as [you can see from the 
> code|https://github.com/apache/incubator-airflow/blob/master/airflow/contrib/kubernetes/worker_configuration.py#L84].
> Unset that environment variable from within the task's pod and running the 
> task manually worked as expected. I think that this path should be 
> configurable (I'll give it a try to add a `kubernetes.worker_dags_folder` 
> configuration).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] Fokko closed pull request #3612: [AIRFLOW-2755] Added `kubernetes.worker_dags_folder` configuration

2018-08-06 Thread GitBox
Fokko closed pull request #3612: [AIRFLOW-2755] Added 
`kubernetes.worker_dags_folder` configuration
URL: https://github.com/apache/incubator-airflow/pull/3612
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/airflow/config_templates/default_airflow.cfg b/airflow/config_templates/default_airflow.cfg
index be286ea3dc..d4a7242118 100644
--- a/airflow/config_templates/default_airflow.cfg
+++ b/airflow/config_templates/default_airflow.cfg
@@ -562,6 +562,7 @@ elasticsearch_end_of_log_mark = end_of_log
 worker_container_repository =
 worker_container_tag =
 worker_container_image_pull_policy = IfNotPresent
+worker_dags_folder =
 
 # If True (default), worker pods will be deleted upon termination
 delete_worker_pods = True
diff --git a/airflow/contrib/executors/kubernetes_executor.py b/airflow/contrib/executors/kubernetes_executor.py
index 788d925c38..66c600ba65 100644
--- a/airflow/contrib/executors/kubernetes_executor.py
+++ b/airflow/contrib/executors/kubernetes_executor.py
@@ -115,6 +115,8 @@ def __init__(self):
             self.kubernetes_section, 'worker_container_repository')
         self.worker_container_tag = configuration.get(
             self.kubernetes_section, 'worker_container_tag')
+        self.worker_dags_folder = configuration.get(
+            self.kubernetes_section, 'worker_dags_folder')
         self.kube_image = '{}:{}'.format(
             self.worker_container_repository, self.worker_container_tag)
         self.kube_image_pull_policy = configuration.get(
diff --git a/airflow/contrib/kubernetes/worker_configuration.py b/airflow/contrib/kubernetes/worker_configuration.py
index c9f86b047a..88a5cf0a40 100644
--- a/airflow/contrib/kubernetes/worker_configuration.py
+++ b/airflow/contrib/kubernetes/worker_configuration.py
@@ -81,12 +81,14 @@ def _get_init_containers(self, volume_mounts):
     def _get_environment(self):
         """Defines any necessary environment variables for the pod executor"""
         env = {
-            'AIRFLOW__CORE__DAGS_FOLDER': '/tmp/dags',
-            'AIRFLOW__CORE__EXECUTOR': 'LocalExecutor',
-            'AIRFLOW__CORE__SQL_ALCHEMY_CONN': conf.get('core', 'SQL_ALCHEMY_CONN')
+            "AIRFLOW__CORE__EXECUTOR": "LocalExecutor",
+            "AIRFLOW__CORE__SQL_ALCHEMY_CONN": conf.get("core", "SQL_ALCHEMY_CONN"),
         }
+
         if self.kube_config.airflow_configmap:
             env['AIRFLOW__CORE__AIRFLOW_HOME'] = self.worker_airflow_home
+        if self.kube_config.worker_dags_folder:
+            env['AIRFLOW__CORE__DAGS_FOLDER'] = self.kube_config.worker_dags_folder
         return env
 
     def _get_secrets(self):
diff --git a/scripts/ci/kubernetes/kube/configmaps.yaml b/scripts/ci/kubernetes/kube/configmaps.yaml
index 97556bf840..3e64ae4e47 100644
--- a/scripts/ci/kubernetes/kube/configmaps.yaml
+++ b/scripts/ci/kubernetes/kube/configmaps.yaml
@@ -179,6 +179,7 @@ data:
     worker_container_repository = airflow
     worker_container_tag = latest
     worker_container_image_pull_policy = IfNotPresent
+    worker_dags_folder = /tmp/dags
     delete_worker_pods = True
    git_repo = https://github.com/apache/incubator-airflow.git
    git_branch = master
diff --git a/tests/contrib/executors/test_kubernetes_executor.py b/tests/contrib/executors/test_kubernetes_executor.py
index 963efcb03b..d9da48ce3b 100644
--- a/tests/contrib/executors/test_kubernetes_executor.py
+++ b/tests/contrib/executors/test_kubernetes_executor.py
@@ -133,6 +133,22 @@ def test_worker_with_subpaths(self):
                 "subPath should've been passed to volumeMount configuration"
             )
 
+    def test_worker_environment_no_dags_folder(self):
+        self.kube_config.worker_dags_folder = ''
+        worker_config = WorkerConfiguration(self.kube_config)
+        env = worker_config._get_environment()
+
+        self.assertNotIn('AIRFLOW__CORE__DAGS_FOLDER', env)
+
+    def test_worker_environment_when_dags_folder_specified(self):
+        dags_folder = '/workers/path/to/dags'
+        self.kube_config.worker_dags_folder = dags_folder
+
+        worker_config = WorkerConfiguration(self.kube_config)
+        env = worker_config._get_environment()
+
+        self.assertEqual(dags_folder, env['AIRFLOW__CORE__DAGS_FOLDER'])
+
 
 if __name__ == '__main__':
     unittest.main()


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services
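A hedged sketch (not Airflow's actual code) of the behaviour the PR above introduces: the worker only exports `AIRFLOW__CORE__DAGS_FOLDER` when `kubernetes.worker_dags_folder` is configured, instead of hardcoding `/tmp/dags`. The helper name is illustrative.

```python
# Sketch of the conditional env-var logic from the PR above; an empty
# worker_dags_folder setting leaves AIRFLOW__CORE__DAGS_FOLDER unset.
def build_worker_env(worker_dags_folder, sql_alchemy_conn):
    env = {
        "AIRFLOW__CORE__EXECUTOR": "LocalExecutor",
        "AIRFLOW__CORE__SQL_ALCHEMY_CONN": sql_alchemy_conn,
    }
    if worker_dags_folder:  # falsy (empty string) -> variable left unset
        env["AIRFLOW__CORE__DAGS_FOLDER"] = worker_dags_folder
    return env
```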


[jira] [Resolved] (AIRFLOW-2755) k8s workers think DAGs are always in `/tmp/dags`

2018-08-06 Thread Fokko Driesprong (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fokko Driesprong resolved AIRFLOW-2755.
---
   Resolution: Fixed
 Assignee: Aldo
Fix Version/s: 2.0.0

> k8s workers think DAGs are always in `/tmp/dags`
> 
>
> Key: AIRFLOW-2755
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2755
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: configuration, worker
>Reporter: Aldo
>Assignee: Aldo
>Priority: Minor
> Fix For: 2.0.0
>
>
> We have Airflow configured to use the `KubernetesExecutor` and run tasks in 
> newly created pods.
> I tried to use the `PythonOperator` to import the python callable from a 
> python module located in the DAGs directory as [that should be 
> possible|https://github.com/apache/incubator-airflow/blob/c7a472ed6b0d8a4720f57ba1140c8cf665757167/airflow/__init__.py#L42].
>  Airflow complained that the module was not found.
> After a fair amount of digging we found that the issue was that the workers 
> have the `AIRFLOW__CORE__DAGS_FOLDER` environment variable set to `/tmp/dags` 
> as [you can see from the 
> code|https://github.com/apache/incubator-airflow/blob/master/airflow/contrib/kubernetes/worker_configuration.py#L84].
> Unset that environment variable from within the task's pod and running the 
> task manually worked as expected. I think that this path should be 
> configurable (I'll give it a try to add a `kubernetes.worker_dags_folder` 
> configuration).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] Fokko commented on issue #3709: [AIRFLOW-2861][Improvement] Added index on log table, will make delete dag optimized

2018-08-06 Thread GitBox
Fokko commented on issue #3709: [AIRFLOW-2861][Improvement] Added index on log 
table, will make delete dag optimized
URL: 
https://github.com/apache/incubator-airflow/pull/3709#issuecomment-410840187
 
 
   I think this also requires a database migration.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] Fokko commented on a change in pull request #3711: [AIRFLOW-2863] Fix GKEClusterHook catching wrong exception

2018-08-06 Thread GitBox
Fokko commented on a change in pull request #3711: [AIRFLOW-2863] Fix 
GKEClusterHook catching wrong exception
URL: https://github.com/apache/incubator-airflow/pull/3711#discussion_r208014056
 
 

 ##
 File path: airflow/contrib/hooks/gcp_container_hook.py
 ##
 @@ -24,6 +24,7 @@
 from airflow.hooks.base_hook import BaseHook
 
 from google.api_core.exceptions import AlreadyExists
+from google.api_core.exceptions import NotFound
 
 Review comment:
   Please combine the imports


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services
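A minimal illustration of the review comment above: the two `google.api_core` exception imports can be combined into one statement, and the delete path catches `NotFound`. Stand-in exception classes keep this sketch self-contained (the real import is shown as a comment), and `delete_cluster` is a hypothetical helper, not the hook's actual method.

```python
# from google.api_core.exceptions import AlreadyExists, NotFound  # combined

class AlreadyExists(Exception):
    """Stand-in for google.api_core.exceptions.AlreadyExists."""

class NotFound(Exception):
    """Stand-in for google.api_core.exceptions.NotFound."""

def delete_cluster(client, name):
    """Treat deleting a missing cluster as success (hypothetical helper)."""
    try:
        return client.delete(name)
    except NotFound:
        return None  # already gone: report success, not failure
```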


[GitHub] cnidus commented on issue #3504: [AIRFLOW-2310]: Add AWS Glue Job Compatibility to Airflow

2018-08-06 Thread GitBox
cnidus commented on issue #3504: [AIRFLOW-2310]: Add AWS Glue Job Compatibility 
to Airflow
URL: 
https://github.com/apache/incubator-airflow/pull/3504#issuecomment-410836839
 
 
   @OElesin I'm also really interested in this submission. 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] tedmiston commented on issue #3703: [AIRFLOW-2857] Fix verify_gpl_dependency for Read the Docs

2018-08-06 Thread GitBox
tedmiston commented on issue #3703: [AIRFLOW-2857] Fix verify_gpl_dependency 
for Read the Docs
URL: 
https://github.com/apache/incubator-airflow/pull/3703#issuecomment-410836135
 
 
   @ashb @kaxil Replying to all comments here since some are now hidden from 
lines changing in my force push.
   
   I added a clarifying comment for the env var setting as requested.
   
   So the mock dependency surprised me too... I sorta conflated 2 issues here.  
Here's what's happening.
   
   1\. When building the docs site locally, if I do a fresh install and try to build without `mock` in the `docs` extra, the build fails.
   
   ```
   $ SLUGIFY_USES_TEXT_UNIDECODE=yes pip install -e .[doc]
   $ cd docs/
   $ ./build.sh
   sphinx-build -b html -d _build/doctrees   . _build/html
   Running Sphinx v1.7.6
   
   Configuration error:
   There is a programable error in your configuration file:
   
   Traceback (most recent call last):
 File 
"/Users/taylor/.local/share/virtualenvs/tmp-afb06f74a17a352/lib/python3.6/site-packages/sphinx/config.py",
 line 161, in __init__
   execfile_(filename, config)
 File 
"/Users/taylor/.local/share/virtualenvs/tmp-afb06f74a17a352/lib/python3.6/site-packages/sphinx/util/pycompat.py",
 line 150, in execfile_
   exec_(code, _globals)
  File "conf.py", line 16, in <module>
   import mock
   ModuleNotFoundError: No module named 'mock'
   
   make: *** [html] Error 2
   ```
   
   This happens because conf.py for Sphinx [imports mock on line 
16](https://github.com/apache/incubator-airflow/blob/1b4961a0c5ebf5e314130a3fc97fb06fcb7c0c1a/docs/conf.py#L16)
 which it uses to build `MOCK_MODULES`.  That appears to be [a Sphinx pattern 
for a build system not having some module 
dependencies](http://docs.readthedocs.io/en/latest/faq.html#i-get-import-errors-on-libraries-that-depend-on-c-modules),
 which given the modules in that list, makes sense to me in a limited 
environment where I'm not doing a big install with more extras.  Adding the 
mock change that's in this PR resolves that issue.
   
   2\. The CI issue fix... so if I try to replicate the 4 extras that I believe 
RTD is installing per .readthedocs.yml, with `mock` removed from the `docs` 
extra, then run the build script which effectively runs `make html` which 
effectively runs `sphinx-build -b html . _build/html`, I get the same error.
   
   ```
   $ READTHEDOCS=True pip install -e .[doc,docker,gcp_api,emr]
   $ cd docs/
   $ ./build.sh
   ...same error as above...
   ```
   
   This makes me think there might be something wonky with RTD.  Perhaps it's 
running a different command than `make html`, or somehow is getting other 
extras installed that include mock (e.g., any of the extras that build on 
`devel` in setup.py).  As far as I can tell [from the RTD 
docs](http://docs.readthedocs.io/en/latest/builds.html#understanding-what-s-going-on)
 they are running the same command that errors out for me outside of their 
environment when I don't supply the mock dependency.
   
   > When we build your documentation, we run sphinx-build -b html . 
_build/html, where html would be replaced with the correct backend. 
   
   I'm all for getting to the bottom of the mock issue, but I also know that 
getting the docs site updating again is probably more urgent.  If not having 
mock doesn't seem to affect RTD, I'm happy to remove it here.
   
   **So the mock issue is really a completely separate issue from the GPL RTD 
issue.  Should I separate it from this PR?  Or keep it here and add a comment?  
What do you think?**
   
   P.S. I also fixed some trailing whitespace in .readthedocs.yml.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services
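A sketch of the Sphinx `conf.py` pattern referenced in the comment above: modules unavailable in the docs build environment are replaced with mocks so autodoc imports do not fail. The module names below are illustrative, not Airflow's actual `MOCK_MODULES` list, and `unittest.mock` stands in for the third-party `mock` package the PR discusses.

```python
import sys
from unittest import mock

# Hypothetical stand-ins for heavy or C-extension dependencies that the
# docs build host does not have installed.
MOCK_MODULES = ['some_c_extension', 'another_heavy_dep']

for module_name in MOCK_MODULES:
    sys.modules[module_name] = mock.MagicMock()

import some_c_extension  # resolves to the MagicMock placeholder, not a real lib
```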


[jira] [Commented] (AIRFLOW-2860) DruidHook: time variable is not updated correctly when checking for timeout

2018-08-06 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570696#comment-16570696
 ] 

ASF subversion and git services commented on AIRFLOW-2860:
--

Commit aa7b25b6afc069077c3e42ab13ef2de3efa7d4fc in incubator-airflow's branch 
refs/heads/master from [~Fokko]
[ https://gitbox.apache.org/repos/asf?p=incubator-airflow.git;h=aa7b25b ]

Revert "[AIRFLOW-2860] DruidHook: time variable is not updated correctly when 
checking for timeout (#3707)"

This reverts commit d12aacd552878308f9b1c3663414bb7c00c0632b.


> DruidHook: time variable is not updated correctly when checking for timeout 
> 
>
> Key: AIRFLOW-2860
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2860
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: hooks
>Reporter: Adam Welsh
>Assignee: Adam Welsh
>Priority: Trivial
> Fix For: 2.0.0
>
>
> The variable that is used in the condition to check if the Druid ingestion 
> task has exceeded the max_ingestion_time is not updated correctly. It gets 
> incremented by one (which is the default value for timeout) but should be 
> incremented by timeout.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
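A hedged sketch of the bug described in AIRFLOW-2860 (not the actual DruidHook code): the elapsed-time counter must advance by the poll interval `timeout`, not by the constant 1, or `max_ingestion_time` is enforced far too late. `task_finished` is a hypothetical stand-in for the Druid status check.

```python
def wait_for_ingestion(timeout, max_ingestion_time, task_finished):
    """Simulated polling loop: poll every `timeout` seconds until
    task_finished(elapsed) is true or max_ingestion_time is exceeded."""
    elapsed = 0
    while not task_finished(elapsed):
        if max_ingestion_time and elapsed > max_ingestion_time:
            raise RuntimeError("Druid ingestion exceeded max_ingestion_time")
        elapsed += timeout  # the fix: previously incremented by 1
    return elapsed
```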


[GitHub] Fokko commented on a change in pull request #3710: [AIRFLOW-2860] Update tests for druid hook

2018-08-06 Thread GitBox
Fokko commented on a change in pull request #3710: [AIRFLOW-2860] Update tests 
for druid hook
URL: https://github.com/apache/incubator-airflow/pull/3710#discussion_r208008753
 
 

 ##
 File path: airflow/hooks/druid_hook.py
 ##
 @@ -53,6 +54,8 @@ def __init__(
 self.max_ingestion_time = max_ingestion_time
 self.header = {'content-type': 'application/json'}
 
+assert self.timeout >= 1
 
 Review comment:
   @awelsh93 
   Can you change this line to:
   ```
    if self.timeout < 1:
        raise ValueError("Druid timeout should be equal or greater than 1")
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services
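A sketch of the validation suggested in the review above: raise a `ValueError` for a sub-second timeout at construction time rather than using `assert`, since asserts are stripped under `python -O`. `DruidIngestHook` is an illustrative stand-in, not Airflow's `DruidHook`.

```python
class DruidIngestHook:
    """Illustrative hook that validates its poll timeout up front."""

    def __init__(self, timeout=1, max_ingestion_time=None):
        if timeout < 1:
            raise ValueError("Druid timeout should be equal or greater than 1")
        self.timeout = timeout
        self.max_ingestion_time = max_ingestion_time
        self.header = {'content-type': 'application/json'}
```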


[GitHub] bolkedebruin commented on a change in pull request #3710: [AIRFLOW-2860] Update tests for druid hook

2018-08-06 Thread GitBox
bolkedebruin commented on a change in pull request #3710: [AIRFLOW-2860] Update 
tests for druid hook
URL: https://github.com/apache/incubator-airflow/pull/3710#discussion_r208007062
 
 

 ##
 File path: airflow/hooks/druid_hook.py
 ##
 @@ -53,6 +54,8 @@ def __init__(
 self.max_ingestion_time = max_ingestion_time
 self.header = {'content-type': 'application/json'}
 
+assert self.timeout >= 1
 
 Review comment:
   Strike that, missed the location. I would handle this as a ValueError though


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-2863) GKEClusterHook catches wrong exception

2018-08-06 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570683#comment-16570683
 ] 

ASF GitHub Bot commented on AIRFLOW-2863:
-

Noremac201 opened a new pull request #3711: [AIRFLOW-2863] Fix GKEClusterHook 
catching wrong exception
URL: https://github.com/apache/incubator-airflow/pull/3711
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [x] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title.
   
 - https://issues.apache.org/jira/browse/AIRFLOW-2863
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI 
changes:
   
   GKEClusterHook was catching a different exception than the one it expected, and therefore would fail instead of succeeding. This fixes that.
   
   ### Tests
   - `test_create_cluster_already_exists`
   - `test_cluter_delete_not_found`
   ### Commits
   
   - [x] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Code Quality
   
   - [x] Passes `git diff upstream/master -u -- "*.py" | flake8 --diff`
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> GKEClusterHook catches wrong exception
> --
>
> Key: AIRFLOW-2863
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2863
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Cameron Moberg
>Assignee: Cameron Moberg
>Priority: Minor
>
> Instead of successfully catching the error and reporting success, it reports 
> a failure, since it catches the wrong error.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] Noremac201 opened a new pull request #3711: [AIRFLOW-2863] Fix GKEClusterHook catching wrong exception

2018-08-06 Thread GitBox
Noremac201 opened a new pull request #3711: [AIRFLOW-2863] Fix GKEClusterHook 
catching wrong exception
URL: https://github.com/apache/incubator-airflow/pull/3711
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [x] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title.
   
 - https://issues.apache.org/jira/browse/AIRFLOW-2863
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI 
changes:
   
   GKEClusterHook was catching a different exception than the one it expected, and therefore would fail instead of succeeding. This fixes that.
   
   ### Tests
   - `test_create_cluster_already_exists`
   - `test_cluter_delete_not_found`
   ### Commits
   
   - [x] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Code Quality
   
   - [x] Passes `git diff upstream/master -u -- "*.py" | flake8 --diff`
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Created] (AIRFLOW-2863) GKEClusterHook catches wrong exception

2018-08-06 Thread Cameron Moberg (JIRA)
Cameron Moberg created AIRFLOW-2863:
---

 Summary: GKEClusterHook catches wrong exception
 Key: AIRFLOW-2863
 URL: https://issues.apache.org/jira/browse/AIRFLOW-2863
 Project: Apache Airflow
  Issue Type: Bug
Reporter: Cameron Moberg
Assignee: Cameron Moberg


Instead of successfully catching the error and reporting success, it reports a 
failure, since it catches the wrong error.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (AIRFLOW-351) Failed to clear downstream tasks

2018-08-06 Thread Cameron Moberg (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cameron Moberg reassigned AIRFLOW-351:
--

Assignee: (was: Cameron Moberg)

> Failed to clear downstream tasks
> 
>
> Key: AIRFLOW-351
> URL: https://issues.apache.org/jira/browse/AIRFLOW-351
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: models, subdag, webserver
>Affects Versions: Airflow 1.7.1.3
>Reporter: Adinata
>Priority: Major
> Attachments: dag_error.py, error.log, error_on_clear_dag.txt, 
> ubuntu-14-packages.log, ubuntu-16-oops.log, ubuntu-16-packages.log
>
>
> {code}
>   / (  ()   )  \___
>  /( (  (  )   _))  )   )\
>(( (   )()  )   (   )  )
>  ((/  ( _(   )   (   _) ) (  () )  )
> ( (  ( (_)   (((   )  .((_ ) .  )_
>( (  )(  (  ))   ) . ) (   )
>   (  (   (  (   ) (  _  ( _) ).  ) . ) ) ( )
>   ( (  (   ) (  )   (  )) ) _)(   )  )  )
>  ( (  ( \ ) ((_  ( ) ( )  )   ) )  )) ( )
>   (  (   (  (   (_ ( ) ( _)  ) (  )  )   )
>  ( (  ( (  (  ) (_  )  ) )  _)   ) _( ( )
>   ((  (   )(( _)   _) _(_ (  (_ )
>(_((__(_(__(( ( ( |  ) ) ) )_))__))_)___)
>((__)\\||lll|l||///  \_))
> (   /(/ (  )  ) )\   )
>   (( ( ( | | ) ) )\   )
>(   /(| / ( )) ) ) )) )
>  ( ( _(|)_) )
>   (  ||\(|(|)|/|| )
> (|(||(||))
>   ( //|/l|||)|\\ \ )
> (/ / //  /|//\\  \ \  \ _)
> ---
> Node: 9889a7c79e9b
> ---
> Traceback (most recent call last):
>   File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1817, in 
> wsgi_app
> response = self.full_dispatch_request()
>   File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1477, in 
> full_dispatch_request
> rv = self.handle_user_exception(e)
>   File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1381, in 
> handle_user_exception
> reraise(exc_type, exc_value, tb)
>   File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1475, in 
> full_dispatch_request
> rv = self.dispatch_request()
>   File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1461, in 
> dispatch_request
> return self.view_functions[rule.endpoint](**req.view_args)
>   File "/usr/local/lib/python2.7/dist-packages/flask_admin/base.py", line 68, 
> in inner
> return self._run_view(f, *args, **kwargs)
>   File "/usr/local/lib/python2.7/dist-packages/flask_admin/base.py", line 
> 367, in _run_view
> return fn(self, *args, **kwargs)
>   File "/usr/local/lib/python2.7/dist-packages/flask_login.py", line 755, in 
> decorated_view
> return func(*args, **kwargs)
>   File "/usr/local/lib/python2.7/dist-packages/airflow/www/utils.py", line 
> 118, in wrapper
> return f(*args, **kwargs)
>   File "/usr/local/lib/python2.7/dist-packages/airflow/www/utils.py", line 
> 167, in wrapper
> return f(*args, **kwargs)
>   File "/usr/local/lib/python2.7/dist-packages/airflow/www/views.py", line 
> 1017, in clear
> include_upstream=upstream)
>   File "/usr/local/lib/python2.7/dist-packages/airflow/models.py", line 2870, 
> in sub_dag
> dag = copy.deepcopy(self)
>   File "/usr/lib/python2.7/copy.py", line 174, in deepcopy
> y = copier(memo)
>   File "/usr/local/lib/python2.7/dist-packages/airflow/models.py", line 2856, 
> in __deepcopy__
> setattr(result, k, copy.deepcopy(v, memo))
>   File "/usr/lib/python2.7/copy.py", line 163, in deepcopy
> y = copier(x, memo)
>   File "/usr/lib/python2.7/copy.py", line 257, in _deepcopy_dict
> y[deepcopy(key, memo)] = deepcopy(value, memo)
>   File "/usr/lib/python2.7/copy.py", line 174, in deepcopy
> y = copier(memo)
>   File "/usr/local/lib/python2.7/dist-packages/airflow/models.py", line 1974, 
> in __deepcopy__
> setattr(result, k, copy.deepcopy(v, memo))
>   File "/usr/lib/python2.7/copy.py", line 190, in deepcopy
> y = _reconstruct(x, rv, 1, memo)
>   File "/usr/lib/python2.7/copy.py", line 334, in _reconstruct
> state = deepcopy(state, memo)
>   File "/usr/lib/python2.7/copy.py", line 163, in deepcopy
> y = copier(x, memo)
>   File "/usr/lib/python2.7/copy.py", line 257, in _deepcopy_dict
> 

[GitHub] gwax edited a comment on issue #3629: AIRFLOW-2787 Allow is_backfill to handle NULL DagRun.run_id

2018-08-06 Thread GitBox
gwax edited a comment on issue #3629: AIRFLOW-2787 Allow is_backfill to handle 
NULL DagRun.run_id
URL: 
https://github.com/apache/incubator-airflow/pull/3629#issuecomment-410786820
 
 
   I have not seen a full set of passing tests yet but it is different CI 
targets that fail each time. The tests seem to be failing due to some sort of 
problem ~with~ after kerberos setup before anything even runs.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] bolkedebruin commented on issue #3710: [AIRFLOW-2860] Update tests for druid hook

2018-08-06 Thread GitBox
bolkedebruin commented on issue #3710: [AIRFLOW-2860] Update tests for druid 
hook
URL: 
https://github.com/apache/incubator-airflow/pull/3710#issuecomment-410813753
 
 
   I think this PR is blocking (as in it is required for) https://github.com/apache/incubator-airflow/pull/3708. I cannot get beyond the submit timeout test.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] bolkedebruin commented on a change in pull request #3710: [AIRFLOW-2860] Update tests for druid hook

2018-08-06 Thread GitBox
bolkedebruin commented on a change in pull request #3710: [AIRFLOW-2860] Update 
tests for druid hook
URL: https://github.com/apache/incubator-airflow/pull/3710#discussion_r207992729
 
 

 ##
 File path: airflow/hooks/druid_hook.py
 ##
 @@ -53,6 +54,8 @@ def __init__(
 self.max_ingestion_time = max_ingestion_time
 self.header = {'content-type': 'application/json'}
 
+assert self.timeout >= 1
 
 Review comment:
   Also please use self.assertXXX, otherwise we won't see the causes.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-2813) `pip install apache-airflow` fails

2018-08-06 Thread Frank Cash (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570613#comment-16570613
 ] 

Frank Cash commented on AIRFLOW-2813:
-

Is there an ETA for when this will be released?

> `pip install apache-airflow` fails
> --
>
> Key: AIRFLOW-2813
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2813
> Project: Apache Airflow
>  Issue Type: Bug
>Affects Versions: 1.9.0
> Environment: Mac OS, Linux, Windows
>Reporter: Jeff Schwab
>Priority: Major
>
> `pip install apache-airflow` fails with a SyntaxError on Mac OS, and with a 
> different (extremely verbose) error on Linux.  This happens both on my 
> MacBook and on a fresh Alpine Linux Docker image, and with both pip2 and 
> pip3; a friend just tried `pip install apache-airflow` for me on his Windows 
> box, and it died with yet another error.  Googling quickly found someone else 
> seeing the same issue over a week ago: 
> https://gitter.im/apache/incubator-airflow?at=5b5130bac86c4f0b47201af0
> Please let me know what further information you would like, and/or what I am 
> doing wrong.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] gwax commented on issue #3629: AIRFLOW-2787 Allow is_backfill to handle NULL DagRun.run_id

2018-08-06 Thread GitBox
gwax commented on issue #3629: AIRFLOW-2787 Allow is_backfill to handle NULL 
DagRun.run_id
URL: 
https://github.com/apache/incubator-airflow/pull/3629#issuecomment-410786820
 
 
   I have not seen a full set of passing tests yet but it is different CI 
targets that fail each time. The tests seem to be failing due to some sort of 
problem with kerberos setup before anything even runs.




[GitHub] dimberman commented on issue #3612: [AIRFLOW-2755] Added `kubernetes.worker_dags_folder` configuration

2018-08-06 Thread GitBox
dimberman commented on issue #3612: [AIRFLOW-2755] Added 
`kubernetes.worker_dags_folder` configuration
URL: 
https://github.com/apache/incubator-airflow/pull/3612#issuecomment-410775855
 
 
   @r4vi this looks great. I don't think there's any strong reliance on 
/tmp/dags and anyone who wants that directory can set it in the cfg.




[GitHub] dimberman commented on issue #3612: [AIRFLOW-2755] Added `kubernetes.worker_dags_folder` configuration

2018-08-06 Thread GitBox
dimberman commented on issue #3612: [AIRFLOW-2755] Added 
`kubernetes.worker_dags_folder` configuration
URL: 
https://github.com/apache/incubator-airflow/pull/3612#issuecomment-410775253
 
 
   @r4vi Looking now




[jira] [Commented] (AIRFLOW-2811) Fix scheduler_ops_metrics.py to work

2018-08-06 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570457#comment-16570457
 ] 

ASF subversion and git services commented on AIRFLOW-2811:
--

Commit 84a55f3e546cfbfd5f47302537444c8c8c4d2753 in incubator-airflow's branch 
refs/heads/master from [~sekikn]
[ https://gitbox.apache.org/repos/asf?p=incubator-airflow.git;h=84a55f3 ]

[AIRFLOW-2811] Fix scheduler_ops_metrics.py to work (#3653)

This PR fixes timezone problem in
scheduler_ops_metrics.py and makes
its timeout configurable.

> Fix scheduler_ops_metrics.py to work
> 
>
> Key: AIRFLOW-2811
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2811
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Kengo Seki
>Assignee: Kengo Seki
>Priority: Major
>
> I tried to run {{scripts/perf/scheduler_ops_metrics.py}} but it failed with 
> the following error:
> {code}
> $ python scripts/perf/scheduler_ops_metrics.py 
> (snip)
> Traceback (most recent call last):
>   File "scripts/perf/scheduler_ops_metrics.py", line 192, in 
> main()
>   File "scripts/perf/scheduler_ops_metrics.py", line 188, in main
> job.run()
>   File "/home/sekikn/dev/incubator-airflow/airflow/jobs.py", line 202, in run
> self._execute()
>   File "/home/sekikn/dev/incubator-airflow/airflow/jobs.py", line 1584, in 
> _execute
> self._execute_helper(processor_manager)
>   File "/home/sekikn/dev/incubator-airflow/airflow/jobs.py", line 1714, in 
> _execute_helper
> self.heartbeat()
>   File "scripts/perf/scheduler_ops_metrics.py", line 121, in heartbeat
> for dag in dags for task in dag.tasks])
> TypeError: can't subtract offset-naive and offset-aware datetimes
> {code}
> Also, it'd be nice if {{MAX_RUNTIME_SECS}} were configurable, since the 
> default value (6 seconds) is too short for all TaskInstances to finish in my 
> environment.
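
The root cause is that Python refuses to subtract a timezone-naive datetime from a timezone-aware one; a minimal reproduction using only the standard library:

```python
from datetime import datetime, timezone

naive = datetime(2018, 8, 6)                       # no tzinfo attached
aware = datetime(2018, 8, 6, tzinfo=timezone.utc)  # offset-aware

# Mixing naive and aware datetimes raises exactly the TypeError above,
# which is why the fix switches everything to timezone-aware utcnow().
try:
    aware - naive
except TypeError as err:
    print(err)  # can't subtract offset-naive and offset-aware datetimes
```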



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-2811) Fix scheduler_ops_metrics.py to work

2018-08-06 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570456#comment-16570456
 ] 

ASF GitHub Bot commented on AIRFLOW-2811:
-

feng-tao closed pull request #3653: [AIRFLOW-2811] Fix scheduler_ops_metrics.py 
to work
URL: https://github.com/apache/incubator-airflow/pull/3653
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/scripts/perf/scheduler_ops_metrics.py 
b/scripts/perf/scheduler_ops_metrics.py
index d4e472d34f..7928649977 100644
--- a/scripts/perf/scheduler_ops_metrics.py
+++ b/scripts/perf/scheduler_ops_metrics.py
@@ -17,7 +17,6 @@
 # specific language governing permissions and limitations
 # under the License.
 
-from datetime import datetime
 import logging
 import pandas as pd
 import sys
@@ -25,6 +24,7 @@
 from airflow import configuration, settings
 from airflow.jobs import SchedulerJob
 from airflow.models import DagBag, DagModel, DagRun, TaskInstance
+from airflow.utils import timezone
 from airflow.utils.state import State
 
 SUBDIR = 'scripts/perf/dags'
@@ -53,7 +53,10 @@ class SchedulerMetricsJob(SchedulerJob):
     run on remote systems and spend the majority of their time on I/O wait.
 
     To Run:
-    $ python scripts/perf/scheduler_ops_metrics.py
+    $ python scripts/perf/scheduler_ops_metrics.py [timeout]
+
+    You can specify timeout in seconds as an optional parameter.
+    Its default value is 6 seconds.
     """
     __mapper_args__ = {
         'polymorphic_identity': 'SchedulerMetricsJob'
@@ -71,7 +74,7 @@ def print_stats(self):
             .filter(TI.dag_id.in_(DAG_IDS))
             .all()
         )
-        successful_tis = filter(lambda x: x.state == State.SUCCESS, tis)
+        successful_tis = [x for x in tis if x.state == State.SUCCESS]
         ti_perf = [(ti.dag_id, ti.task_id, ti.execution_date,
                     (ti.queued_dttm - self.start_date).total_seconds(),
                     (ti.start_date - self.start_date).total_seconds(),
@@ -117,11 +120,11 @@ def heartbeat(self):
         dagbag = DagBag(SUBDIR)
         dags = [dagbag.dags[dag_id] for dag_id in DAG_IDS]
         # the tasks in perf_dag_1 and per_dag_2 have a daily schedule interval.
-        num_task_instances = sum([(datetime.today() - task.start_date).days
+        num_task_instances = sum([(timezone.utcnow() - task.start_date).days
                                   for dag in dags for task in dag.tasks])
 
         if (len(successful_tis) == num_task_instances or
-            (datetime.now()-self.start_date).total_seconds() >
+                (timezone.utcnow() - self.start_date).total_seconds() >
                 MAX_RUNTIME_SECS):
             if (len(successful_tis) == num_task_instances):
                 self.log.info("All tasks processed! Printing stats.")
@@ -178,6 +181,17 @@ def set_dags_paused_state(is_paused):
 
 
 def main():
+    global MAX_RUNTIME_SECS
+    if len(sys.argv) > 1:
+        try:
+            max_runtime_secs = int(sys.argv[1])
+            if max_runtime_secs < 1:
+                raise ValueError
+            MAX_RUNTIME_SECS = max_runtime_secs
+        except ValueError:
+            logging.error('Specify a positive integer for timeout.')
+            sys.exit(1)
+
     configuration.load_test_config()
 
     set_dags_paused_state(False)


 




> Fix scheduler_ops_metrics.py to work
> 
>
> Key: AIRFLOW-2811
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2811
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Kengo Seki
>Assignee: Kengo Seki
>Priority: Major
>
> I tried to run {{scripts/perf/scheduler_ops_metrics.py}} but it failed with 
> the following error:
> {code}
> $ python scripts/perf/scheduler_ops_metrics.py 
> (snip)
> Traceback (most recent call last):
>   File "scripts/perf/scheduler_ops_metrics.py", line 192, in 
> main()
>   File "scripts/perf/scheduler_ops_metrics.py", line 188, in main
> job.run()
>   File "/home/sekikn/dev/incubator-airflow/airflow/jobs.py", line 202, in run
> self._execute()
>   File "/home/sekikn/dev/incubator-airflow/airflow/jobs.py", line 1584, in 
> _execute
> self._execute_helper(processor_manager)
>   File "/home/sekikn/dev/incubator-airflow/airflow/jobs.py", line 1714, in 
> _execute_helper
> 



[GitHub] feng-tao commented on issue #3653: [AIRFLOW-2811] Fix scheduler_ops_metrics.py to work

2018-08-06 Thread GitBox
feng-tao commented on issue #3653: [AIRFLOW-2811] Fix scheduler_ops_metrics.py 
to work
URL: 
https://github.com/apache/incubator-airflow/pull/3653#issuecomment-410770258
 
 
   lgtm @sekikn 




[GitHub] ashb commented on a change in pull request #3653: [AIRFLOW-2811] Fix scheduler_ops_metrics.py to work

2018-08-06 Thread GitBox
ashb commented on a change in pull request #3653: [AIRFLOW-2811] Fix 
scheduler_ops_metrics.py to work
URL: https://github.com/apache/incubator-airflow/pull/3653#discussion_r207954239
 
 

 ##
 File path: scripts/perf/scheduler_ops_metrics.py
 ##
 @@ -71,7 +74,7 @@ def print_stats(self):
             .filter(TI.dag_id.in_(DAG_IDS))
             .all()
         )
-        successful_tis = filter(lambda x: x.state == State.SUCCESS, tis)
+        successful_tis = [x for x in tis if x.state == State.SUCCESS]
 
 Review comment:
   `list(filter(...))` perhaps then?
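   The distinction matters because in Python 3 `filter()` returns a lazy, single-use iterator; a small illustration (standard library only) of why the result has to be materialized before calling `len()` or iterating twice:

   ```python
   states = ['success', 'failed', 'success']

   lazy = filter(lambda s: s == 'success', states)
   # len(lazy) would raise TypeError in Python 3: iterators have no length.

   materialized = list(filter(lambda s: s == 'success', states))  # reviewer's suggestion
   comprehension = [s for s in states if s == 'success']          # what the PR uses

   assert materialized == comprehension == ['success', 'success']

   # A lazy filter is also single-use: the first pass drains it.
   assert list(lazy) == ['success', 'success']
   assert list(lazy) == []
   ```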




[GitHub] ashb commented on issue #3700: [AIRFLOW-2140] Don't require kubernetes for the SparkSubmit hook

2018-08-06 Thread GitBox
ashb commented on issue #3700: [AIRFLOW-2140] Don't require kubernetes for the 
SparkSubmit hook
URL: 
https://github.com/apache/incubator-airflow/pull/3700#issuecomment-410767604
 
 
   I thought about a local import, but figured that getting told about the 
error as soon as possible would be better, rather than delaying it to _just_ 
when the on_kill handler is run.




[GitHub] feng-tao commented on a change in pull request #3710: [AIRFLOW-2860] Update tests for druid hook

2018-08-06 Thread GitBox
feng-tao commented on a change in pull request #3710: [AIRFLOW-2860] Update 
tests for druid hook
URL: https://github.com/apache/incubator-airflow/pull/3710#discussion_r207952035
 
 

 ##
 File path: airflow/hooks/druid_hook.py
 ##
 @@ -53,6 +54,8 @@ def __init__(
         self.max_ingestion_time = max_ingestion_time
         self.header = {'content-type': 'application/json'}
 
+        assert self.timeout >= 1
 
 Review comment:
   I would prefer not to put assert outside of test code. Could you change it 
to throw exception when timeout is less than 1?
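   A minimal sketch of that suggestion (the class and parameter names here are illustrative stand-ins, not the actual DruidHook code): validate in `__init__` and raise, since `assert` statements are stripped when Python runs with `-O`:

   ```python
   class DruidIngestSketch:
       """Illustrative stand-in for constructor validation in a hook."""

       def __init__(self, timeout=1, max_ingestion_time=None):
           # Raise instead of assert: the check survives `python -O`
           # and gives callers a catchable, descriptive error.
           if timeout < 1:
               raise ValueError("timeout must be at least 1 second")
           self.timeout = timeout
           self.max_ingestion_time = max_ingestion_time
   ```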




[GitHub] codecov-io commented on issue #3710: [AIRFLOW-2860] Update tests for druid hook

2018-08-06 Thread GitBox
codecov-io commented on issue #3710: [AIRFLOW-2860] Update tests for druid hook
URL: 
https://github.com/apache/incubator-airflow/pull/3710#issuecomment-410760084
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3710?src=pr=h1)
 Report
   > Merging 
[#3710](https://codecov.io/gh/apache/incubator-airflow/pull/3710?src=pr=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/d12aacd552878308f9b1c3663414bb7c00c0632b?src=pr=desc)
 will **increase** coverage by `59.86%`.
   > The diff coverage is `100%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/3710/graphs/tree.svg?height=150=650=WdLKlKHOAU=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/3710?src=pr=tree)
   
    ```diff
    @@            Coverage Diff             @@
    ##           master    #3710       +/-   ##
    ===========================================
    + Coverage   17.69%   77.56%   +59.86%
    ===========================================
      Files         204      204
      Lines       15766    15767        +1
    ===========================================
    + Hits         2790    12229     +9439
    + Misses      12976     3538     -9438
    ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/incubator-airflow/pull/3710?src=pr=tree) 
| Coverage Δ | |
   |---|---|---|
   | 
[airflow/hooks/druid\_hook.py](https://codecov.io/gh/apache/incubator-airflow/pull/3710/diff?src=pr=tree#diff-YWlyZmxvdy9ob29rcy9kcnVpZF9ob29rLnB5)
 | `88.88% <100%> (+88.88%)` | :arrow_up: |
   | 
[airflow/utils/operator\_resources.py](https://codecov.io/gh/apache/incubator-airflow/pull/3710/diff?src=pr=tree#diff-YWlyZmxvdy91dGlscy9vcGVyYXRvcl9yZXNvdXJjZXMucHk=)
 | `86.95% <0%> (+4.34%)` | :arrow_up: |
   | 
[airflow/settings.py](https://codecov.io/gh/apache/incubator-airflow/pull/3710/diff?src=pr=tree#diff-YWlyZmxvdy9zZXR0aW5ncy5weQ==)
 | `79.33% <0%> (+4.95%)` | :arrow_up: |
   | 
[airflow/executors/\_\_init\_\_.py](https://codecov.io/gh/apache/incubator-airflow/pull/3710/diff?src=pr=tree#diff-YWlyZmxvdy9leGVjdXRvcnMvX19pbml0X18ucHk=)
 | `63.46% <0%> (+5.76%)` | :arrow_up: |
   | 
[airflow/utils/sqlalchemy.py](https://codecov.io/gh/apache/incubator-airflow/pull/3710/diff?src=pr=tree#diff-YWlyZmxvdy91dGlscy9zcWxhbGNoZW15LnB5)
 | `73.91% <0%> (+10.86%)` | :arrow_up: |
   | 
[airflow/utils/decorators.py](https://codecov.io/gh/apache/incubator-airflow/pull/3710/diff?src=pr=tree#diff-YWlyZmxvdy91dGlscy9kZWNvcmF0b3JzLnB5)
 | `91.66% <0%> (+14.58%)` | :arrow_up: |
   | 
[airflow/\_\_init\_\_.py](https://codecov.io/gh/apache/incubator-airflow/pull/3710/diff?src=pr=tree#diff-YWlyZmxvdy9fX2luaXRfXy5weQ==)
 | `80.43% <0%> (+15.21%)` | :arrow_up: |
   | 
[airflow/hooks/oracle\_hook.py](https://codecov.io/gh/apache/incubator-airflow/pull/3710/diff?src=pr=tree#diff-YWlyZmxvdy9ob29rcy9vcmFjbGVfaG9vay5weQ==)
 | `15.47% <0%> (+15.47%)` | :arrow_up: |
   | 
[airflow/task/task\_runner/\_\_init\_\_.py](https://codecov.io/gh/apache/incubator-airflow/pull/3710/diff?src=pr=tree#diff-YWlyZmxvdy90YXNrL3Rhc2tfcnVubmVyL19faW5pdF9fLnB5)
 | `63.63% <0%> (+18.18%)` | :arrow_up: |
   | 
[airflow/utils/db.py](https://codecov.io/gh/apache/incubator-airflow/pull/3710/diff?src=pr=tree#diff-YWlyZmxvdy91dGlscy9kYi5weQ==)
 | `33.33% <0%> (+18.25%)` | :arrow_up: |
   | ... and [162 
more](https://codecov.io/gh/apache/incubator-airflow/pull/3710/diff?src=pr=tree-more)
 | |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3710?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3710?src=pr=footer).
 Last update 
[d12aacd...0d1fd50](https://codecov.io/gh/apache/incubator-airflow/pull/3710?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   






[GitHub] awelsh93 commented on issue #3707: [AIRFLOW-2860] DruidHook: time variable is not updated correctly when…

2018-08-06 Thread GitBox
awelsh93 commented on issue #3707: [AIRFLOW-2860] DruidHook: time variable is 
not updated correctly when…
URL: 
https://github.com/apache/incubator-airflow/pull/3707#issuecomment-410748321
 
 
   @Fokko Thanks for your comments, I've created a new pull request (#3710) 




[jira] [Commented] (AIRFLOW-2860) DruidHook: time variable is not updated correctly when checking for timeout

2018-08-06 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570355#comment-16570355
 ] 

ASF GitHub Bot commented on AIRFLOW-2860:
-

awelsh93 opened a new pull request #3710: [AIRFLOW-2860] Update tests for druid 
hook
URL: https://github.com/apache/incubator-airflow/pull/3710
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [x] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-XXX
 - In case you are fixing a typo in the documentation you can prepend your 
commit with \[AIRFLOW-XXX\], code changes always need a Jira issue.
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI 
changes:
   
   This pull request updates the tests for `airflow/hooks/druid_hook.py` and 
also adds an assert so that `timeout` must be greater than or equal to 1 
second. This should fix the failing build in #3707.
   
   ### Tests
   
   - [x] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   I ran `nosetests /tests/hooks/test_druid_hook.py`
   
   ### Commits
   
   - [x] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [ ] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - When adding new operators/hooks/sensors, the autoclass documentation 
generation needs to be added.
   
   ### Code Quality
   
   - [x] Passes `git diff upstream/master -u -- "*.py" | flake8 --diff`
   




> DruidHook: time variable is not updated correctly when checking for timeout 
> 
>
> Key: AIRFLOW-2860
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2860
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: hooks
>Reporter: Adam Welsh
>Assignee: Adam Welsh
>Priority: Trivial
> Fix For: 2.0.0
>
>
> The variable that is used in the condition to check if the Druid ingestion 
> task has exceeded the max_ingestion_time is not updated correctly. It gets 
> incremented by one (which is the default value for timeout) but should be 
> incremented by timeout.







[GitHub] Fokko commented on issue #3707: [AIRFLOW-2860] DruidHook: time variable is not updated correctly when…

2018-08-06 Thread GitBox
Fokko commented on issue #3707: [AIRFLOW-2860] DruidHook: time variable is not 
updated correctly when…
URL: 
https://github.com/apache/incubator-airflow/pull/3707#issuecomment-410734736
 
 
   @awelsh93 I've reverted the commit for now since it fails the build. Could 
you create a new PR with the fixed test?




[jira] [Created] (AIRFLOW-2862) S3ToRedshiftTransfer Copy Command Flexibility

2018-08-06 Thread Micheal Ascah (JIRA)
Micheal Ascah created AIRFLOW-2862:
--

 Summary: S3ToRedshiftTransfer Copy Command Flexibility
 Key: AIRFLOW-2862
 URL: https://issues.apache.org/jira/browse/AIRFLOW-2862
 Project: Apache Airflow
  Issue Type: Improvement
  Components: operators
Reporter: Micheal Ascah
Assignee: Micheal Ascah


Currently, the S3ToRedshiftTransfer class requires that the name of the target 
table be suffixed to the end of the provided S3 key.

It doesn't seem justifiable that the operator should require the file be named 
by any convention. The S3 bucket + S3 key should be all that is needed. This 
makes it possible to load any S3 Key into a Redshift table, rather than only 
files that have the table name at the end of the S3 key.

The S3 key parameter should also be template-able so that files created in S3 
using timestamps from macros in other tasks in the current DAG run can be used 
to identify files when loading from S3 to Redshift.

The command template should change from 
{code:java}
COPY {schema}.{table}
 FROM 's3://{s3_bucket}/{s3_key}/{table}'
 with credentials
 'aws_access_key_id={access_key};aws_secret_access_key={secret_key}'
 {copy_options};{code}

 To
 
{code:java}
COPY {schema}.{table}
 FROM 's3://{s3_bucket}/{s3_key}'
 with credentials
 'aws_access_key_id={access_key};aws_secret_access_key={secret_key}'
 {copy_options};
{code}
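The proposed form can be sketched as a plain string template (a hypothetical helper for illustration, not the operator's actual code; parameter names mirror the operator's):

```python
def build_copy_query(schema, table, s3_bucket, s3_key,
                     access_key, secret_key, copy_options=""):
    # Sketch of the proposed COPY statement: the table name is no longer
    # appended to the S3 key, so any key can be loaded into any table.
    return (
        "COPY {schema}.{table}\n"
        "FROM 's3://{s3_bucket}/{s3_key}'\n"
        "with credentials\n"
        "'aws_access_key_id={access_key};aws_secret_access_key={secret_key}'\n"
        "{copy_options};"
    ).format(schema=schema, table=table, s3_bucket=s3_bucket, s3_key=s3_key,
             access_key=access_key, secret_key=secret_key,
             copy_options=copy_options)
```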
 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] Fokko commented on issue #3707: [AIRFLOW-2860] DruidHook: time variable is not updated correctly when…

2018-08-06 Thread GitBox
Fokko commented on issue #3707: [AIRFLOW-2860] DruidHook: time variable is not 
updated correctly when…
URL: 
https://github.com/apache/incubator-airflow/pull/3707#issuecomment-410725007
 
 
   @awelsh93 Thanks for the quick response. Maybe add an assert that the 
`timeout` has to be positive. Feel free to change the test as well.




[GitHub] awelsh93 edited a comment on issue #3707: [AIRFLOW-2860] DruidHook: time variable is not updated correctly when…

2018-08-06 Thread GitBox
awelsh93 edited a comment on issue #3707: [AIRFLOW-2860] DruidHook: time 
variable is not updated correctly when…
URL: 
https://github.com/apache/incubator-airflow/pull/3707#issuecomment-410703524
 
 
   Hey @Fokko, thanks for your quick response. I've noticed some of the Travis 
jobs have failed - specifically for test_submit_timeout in 
`tests.hooks.test_druid_hook.TestDruidHook`. It seems like a timeout of 0 means 
this change gets stuck in a loop where `sec` gets incremented by 0 and hence 
will never be greater than `max_ingestion_time`.
   
   What course of action do you recommend?
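   A condensed sketch of the fixed loop (names simplified from the real hook; the positive-timeout guard is an assumption, illustrating why a timeout of 0 must be rejected):

```python
import time

def wait_for_ingestion(check_status, timeout=1, max_ingestion_time=None):
    # Sketch only: `sec` advances by `timeout` each iteration, so the
    # comparison against max_ingestion_time tracks real elapsed seconds.
    if timeout <= 0:
        # With timeout == 0, sec would never grow and the loop never ends.
        raise ValueError("timeout must be positive")
    sec = 0
    while True:
        if max_ingestion_time and sec > max_ingestion_time:
            raise RuntimeError("ingestion exceeded max_ingestion_time")
        time.sleep(timeout)
        sec += timeout
        status = check_status()
        if status == 'SUCCESS':
            return sec
        if status != 'RUNNING':
            raise RuntimeError("ingestion failed: %s" % status)
```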




[GitHub] xnuinside commented on issue #3690: [AIRFLOW-2845] Remove asserts from the contrib package

2018-08-06 Thread GitBox
xnuinside commented on issue #3690: [AIRFLOW-2845] Remove asserts from the 
contrib package
URL: 
https://github.com/apache/incubator-airflow/pull/3690#issuecomment-410711411
 
 
   thanks all for the review :) 




[GitHub] awelsh93 commented on issue #3707: [AIRFLOW-2860] DruidHook: time variable is not updated correctly when…

2018-08-06 Thread GitBox
awelsh93 commented on issue #3707: [AIRFLOW-2860] DruidHook: time variable is 
not updated correctly when…
URL: 
https://github.com/apache/incubator-airflow/pull/3707#issuecomment-410703524
 
 
   Hey @Fokko, thanks for your quick response. I've noticed some of the Travis 
jobs have failed - specifically for this test 
`tests.hooks.test_druid_hook.TestDruidHook`. It seems like a timeout of 0 means 
this change gets stuck in a loop where `sec` gets incremented by 0 and hence 
will never be greater than `max_ingestion_time`.
   
   What course of action do you recommend?




[GitHub] XD-DENG commented on issue #3662: [AIRFLOW-2818] Handle DAG File with Upper Case Extension

2018-08-06 Thread GitBox
XD-DENG commented on issue #3662: [AIRFLOW-2818] Handle DAG File with Upper 
Case Extension
URL: 
https://github.com/apache/incubator-airflow/pull/3662#issuecomment-410697110
 
 
   Hi @ashb, could you take another look at this and let me know your thoughts? 
   Thanks!




[GitHub] ashb commented on issue #3708: [AIRFLOW-2859] Implement own UtcDateTime

2018-08-06 Thread GitBox
ashb commented on issue #3708: [AIRFLOW-2859] Implement own UtcDateTime
URL: 
https://github.com/apache/incubator-airflow/pull/3708#issuecomment-410689175
 
 
   Code looks okay at first glance. Not in front of a laptop today to test 
further.




[jira] [Work started] (AIRFLOW-2861) Need index on log table

2018-08-06 Thread Vardan Gupta (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on AIRFLOW-2861 started by Vardan Gupta.
-
> Need index on log table
> ---
>
> Key: AIRFLOW-2861
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2861
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: database
>Affects Versions: 1.10.0
>Reporter: Vardan Gupta
>Assignee: Vardan Gupta
>Priority: Major
>
> Delete DAG functionality was added in v1-10-stable. During the metadata cleanup 
> [part|https://github.com/apache/incubator-airflow/blob/dc78b9196723ca6724185231ccd6f5bbe8edcaf3/airflow/api/common/experimental/delete_dag.py#L48],
> the implementation looks for model classes that have a dag_id attribute, 
> formulates a delete query against each matching model, and removes the rows 
> from the metadata database. We have numbers showing this is slow, especially 
> on the log table, because it has no single- or multiple-column index. Creating 
> an index would boost delete performance, though insertion will be slightly 
> slower. Since deletion is a synchronous call, creating the index is worthwhile.





[jira] [Commented] (AIRFLOW-2861) Need index on log table

2018-08-06 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570098#comment-16570098
 ] 

ASF GitHub Bot commented on AIRFLOW-2861:
-

vardancse opened a new pull request #3709: [AIRFLOW-2861][Improvement] Added 
index on log table, will make delete dag optimized
URL: https://github.com/apache/incubator-airflow/pull/3709
 
 
   
   
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [ ] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-2861
 - In case you are fixing a typo in the documentation you can prepend your 
commit with \[AIRFLOW-XXX\], code changes always need a Jira issue.
   
   ### Description
   
   - [ ] Here are some details about my PR, including screenshots of any UI 
changes:
   Delete DAG functionality was added in v1-10-stable. During the metadata 
cleanup part, the implementation looks for model classes that have a dag_id 
attribute, formulates a delete query against each matching model, and removes 
the rows from the metadata database. We have numbers showing this is slow, 
especially on the log table, because it has no single- or multiple-column 
index. Creating an index would boost delete performance, though insertion will 
be slightly slower. Since deletion is a synchronous call, creating the index 
is worthwhile.
   
   ### Tests
   
   - [ ] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   ### Commits
   
   - [ ] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [ ] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - When adding new operators/hooks/sensors, the autoclass documentation 
generation needs to be added.
   
   ### Code Quality
   
   - [ ] Passes `git diff upstream/master -u -- "*.py" | flake8 --diff`
   




> Need index on log table
> ---
>
> Key: AIRFLOW-2861
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2861
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: database
>Affects Versions: 1.10.0
>Reporter: Vardan Gupta
>Assignee: Vardan Gupta
>Priority: Major
>





[GitHub] vardancse opened a new pull request #3709: [AIRFLOW-2861][Improvement] Added index on log table, will make delete dag optimized

2018-08-06 Thread GitBox
vardancse opened a new pull request #3709: [AIRFLOW-2861][Improvement] Added 
index on log table, will make delete dag optimized
URL: https://github.com/apache/incubator-airflow/pull/3709
 
 
   
   
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [ ] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-2861
 - In case you are fixing a typo in the documentation you can prepend your 
commit with \[AIRFLOW-XXX\], code changes always need a Jira issue.
   
   ### Description
   
   - [ ] Here are some details about my PR, including screenshots of any UI 
changes:
   Delete DAG functionality was added in v1-10-stable. During the metadata 
cleanup part, the implementation looks for model classes that have a dag_id 
attribute, formulates a delete query against each matching model, and removes 
the rows from the metadata database. We have numbers showing this is slow, 
especially on the log table, because it has no single- or multiple-column 
index. Creating an index would boost delete performance, though insertion will 
be slightly slower. Since deletion is a synchronous call, creating the index 
is worthwhile.
   
   ### Tests
   
   - [ ] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   ### Commits
   
   - [ ] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [ ] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - When adding new operators/hooks/sensors, the autoclass documentation 
generation needs to be added.
   
   ### Code Quality
   
   - [ ] Passes `git diff upstream/master -u -- "*.py" | flake8 --diff`
   




[jira] [Assigned] (AIRFLOW-2861) Need index on log table

2018-08-06 Thread Vardan Gupta (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vardan Gupta reassigned AIRFLOW-2861:
-

Assignee: Vardan Gupta

> Need index on log table
> ---
>
> Key: AIRFLOW-2861
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2861
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: database
>Affects Versions: 1.10.0
>Reporter: Vardan Gupta
>Assignee: Vardan Gupta
>Priority: Major
>





[jira] [Created] (AIRFLOW-2861) Need index on log table

2018-08-06 Thread Vardan Gupta (JIRA)
Vardan Gupta created AIRFLOW-2861:
-

 Summary: Need index on log table
 Key: AIRFLOW-2861
 URL: https://issues.apache.org/jira/browse/AIRFLOW-2861
 Project: Apache Airflow
  Issue Type: Improvement
  Components: database
Affects Versions: 1.10.0
Reporter: Vardan Gupta


Delete DAG functionality was added in v1-10-stable. During the metadata cleanup 
[part|https://github.com/apache/incubator-airflow/blob/dc78b9196723ca6724185231ccd6f5bbe8edcaf3/airflow/api/common/experimental/delete_dag.py#L48],
the implementation looks for model classes that have a dag_id attribute, 
formulates a delete query against each matching model, and removes the rows 
from the metadata database. We have numbers showing this is slow, especially in 
the log table, because it has no single- or multiple-column index. Creating an 
index would boost delete performance, though insertion will be slightly slower. 
Since deletion is a synchronous call, creating the index is worthwhile.
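The effect of such an index can be sketched with SQLite (a simplified stand-in for the log table; the real column set and index name may differ):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE log (id INTEGER PRIMARY KEY, dag_id TEXT, event TEXT)")
# The index that lets `DELETE FROM log WHERE dag_id = ?` avoid a full scan:
conn.execute("CREATE INDEX idx_log_dag_id ON log (dag_id)")

# With the index, the delete-by-dag_id becomes an index search:
plan = conn.execute(
    "EXPLAIN QUERY PLAN DELETE FROM log WHERE dag_id = 'example_dag'"
).fetchall()
```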





[GitHub] XD-DENG commented on issue #3698: [AIRFLOW-2855] Check Cron Expression Validity in DagBag.process_file()

2018-08-06 Thread GitBox
XD-DENG commented on issue #3698: [AIRFLOW-2855] Check Cron Expression Validity 
in DagBag.process_file()
URL: 
https://github.com/apache/incubator-airflow/pull/3698#issuecomment-410672510
 
 
   Thanks @yrqls21 again for the review inputs. Very helpful for refining this 
PR!
   
   Have updated accordingly and ensured all tests passed.




[jira] [Commented] (AIRFLOW-2859) DateTimes returned from the database are not converted to UTC

2018-08-06 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570048#comment-16570048
 ] 

ASF GitHub Bot commented on AIRFLOW-2859:
-

bolkedebruin opened a new pull request #3708: [AIRFLOW-2859] Implement own 
UtcDateTime
URL: https://github.com/apache/incubator-airflow/pull/3708
 
 
   
   
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [X] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-2859
 - In case you are fixing a typo in the documentation you can prepend your 
commit with \[AIRFLOW-XXX\], code changes always need a Jira issue.
   
   ### Description
   
   - [X] Here are some details about my PR, including screenshots of any UI 
changes:
   
   The different UtcDateTime implementations all have issues.
   Either they replace tzinfo directly without converting
   or they do not convert to UTC at all.
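   The distinction can be illustrated with the standard library (a sketch of the two behaviors, not the PR's actual TypeDecorator):

```python
from datetime import datetime, timedelta, timezone

def to_utc(dt):
    # Correct: astimezone() converts the wall-clock time to UTC.
    # A buggy variant does dt.replace(tzinfo=timezone.utc), which keeps the
    # wall-clock time and silently changes the instant it refers to.
    if dt.tzinfo is None:
        raise ValueError("expected an aware datetime")  # illustrative policy
    return dt.astimezone(timezone.utc)
```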
   
   ### Tests
   
   - [X] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   Added
   
   ### Commits
   
   - [X] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [X] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - When adding new operators/hooks/sensors, the autoclass documentation 
generation needs to be added.
   
   ### Code Quality
   
   - [X] Passes `git diff upstream/master -u -- "*.py" | flake8 --diff`
   
   cc @Fokko @ashb the new tests pass locally with the same MySQL version as 
Travis. If the tests fail for MySQL I might need some help ;-)




> DateTimes returned from the database are not converted to UTC
> -
>
> Key: AIRFLOW-2859
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2859
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: database
>Reporter: Bolke de Bruin
>Priority: Blocker
> Fix For: 1.10.0
>
>
> This is due to the fact that sqlalchemy-utcdatetime does not convert to UTC 
> when the database returns datetimes with tzinfo.





[GitHub] bolkedebruin opened a new pull request #3708: [AIRFLOW-2859] Implement own UtcDateTime

2018-08-06 Thread GitBox
bolkedebruin opened a new pull request #3708: [AIRFLOW-2859] Implement own 
UtcDateTime
URL: https://github.com/apache/incubator-airflow/pull/3708
 
 
   
   
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [X] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-2859
 - In case you are fixing a typo in the documentation you can prepend your 
commit with \[AIRFLOW-XXX\], code changes always need a Jira issue.
   
   ### Description
   
   - [X] Here are some details about my PR, including screenshots of any UI 
changes:
   
   The different UtcDateTime implementations all have issues.
   Either they replace tzinfo directly without converting
   or they do not convert to UTC at all.
   
   ### Tests
   
   - [X] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   Added
   
   ### Commits
   
   - [X] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [X] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - When adding new operators/hooks/sensors, the autoclass documentation 
generation needs to be added.
   
   ### Code Quality
   
   - [X] Passes `git diff upstream/master -u -- "*.py" | flake8 --diff`
   
   cc @Fokko @ashb the new tests pass locally with the same MySQL version as 
Travis. If the tests fail for MySQL I might need some help ;-)




[GitHub] codecov-io commented on issue #3612: [AIRFLOW-2755] Added `kubernetes.worker_dags_folder` configuration

2018-08-06 Thread GitBox
codecov-io commented on issue #3612: [AIRFLOW-2755] Added 
`kubernetes.worker_dags_folder` configuration
URL: 
https://github.com/apache/incubator-airflow/pull/3612#issuecomment-410666351
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3612?src=pr=h1)
 Report
   > Merging 
[#3612](https://codecov.io/gh/apache/incubator-airflow/pull/3612?src=pr=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/690df46eb872f8c1579aef286699761917431b1f?src=pr=desc)
 will **not change** coverage.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/3612/graphs/tree.svg?width=650=150=WdLKlKHOAU=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/3612?src=pr=tree)
   
   ```diff
    @@           Coverage Diff           @@
    ##           master    #3612   +/-   ##
    =======================================
      Coverage   77.56%   77.56%
    =======================================
      Files         204      204
      Lines       15766    15766
    =======================================
      Hits        12229    12229
      Misses       3537     3537
   ```
   
   
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3612?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
    > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3612?src=pr=footer).
 Last update 
[690df46...bad08e2](https://codecov.io/gh/apache/incubator-airflow/pull/3612?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   




[GitHub] Fokko closed pull request #3707: [AIRFLOW-2860] DruidHook: time variable is not updated correctly when…

2018-08-06 Thread GitBox
Fokko closed pull request #3707: [AIRFLOW-2860] DruidHook: time variable is not 
updated correctly when…
URL: https://github.com/apache/incubator-airflow/pull/3707
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/airflow/hooks/druid_hook.py b/airflow/hooks/druid_hook.py
index 53607f0e7a..e5fb6a7a26 100644
--- a/airflow/hooks/druid_hook.py
+++ b/airflow/hooks/druid_hook.py
@@ -81,8 +81,6 @@ def submit_indexing_job(self, json_index_spec):
 
 self.log.info("Job still running for %s seconds...", sec)
 
-sec = sec + 1
-
 if self.max_ingestion_time and sec > self.max_ingestion_time:
 # ensure that the job gets killed if the max ingestion time is 
exceeded
 requests.post("{0}/{1}/shutdown".format(url, druid_task_id))
@@ -91,6 +89,8 @@ def submit_indexing_job(self, json_index_spec):
 
 time.sleep(self.timeout)
 
+sec = sec + self.timeout
+
 status = req_status.json()['status']['status']
 if status == 'RUNNING':
 running = True


 




[jira] [Commented] (AIRFLOW-2860) DruidHook: time variable is not updated correctly when checking for timeout

2018-08-06 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570019#comment-16570019
 ] 

ASF subversion and git services commented on AIRFLOW-2860:
--

Commit d12aacd552878308f9b1c3663414bb7c00c0632b in incubator-airflow's branch 
refs/heads/master from awelsh93
[ https://gitbox.apache.org/repos/asf?p=incubator-airflow.git;h=d12aacd ]

[AIRFLOW-2860] DruidHook: time variable is not updated correctly when checking 
for timeout (#3707)



> DruidHook: time variable is not updated correctly when checking for timeout 
> 
>
> Key: AIRFLOW-2860
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2860
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: hooks
>Reporter: Adam Welsh
>Assignee: Adam Welsh
>Priority: Trivial
> Fix For: 2.0.0
>
>
> The variable that is used in the condition to check if the Druid ingestion 
> task has exceeded the max_ingestion_time is not updated correctly. It gets 
> incremented by one (which is the default value for timeout) but should be 
> incremented by timeout.





[jira] [Commented] (AIRFLOW-2860) DruidHook: time variable is not updated correctly when checking for timeout

2018-08-06 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570018#comment-16570018
 ] 

ASF GitHub Bot commented on AIRFLOW-2860:
-

Fokko closed pull request #3707: [AIRFLOW-2860] DruidHook: time variable is not 
updated correctly when…
URL: https://github.com/apache/incubator-airflow/pull/3707
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/airflow/hooks/druid_hook.py b/airflow/hooks/druid_hook.py
index 53607f0e7a..e5fb6a7a26 100644
--- a/airflow/hooks/druid_hook.py
+++ b/airflow/hooks/druid_hook.py
@@ -81,8 +81,6 @@ def submit_indexing_job(self, json_index_spec):
 
 self.log.info("Job still running for %s seconds...", sec)
 
-sec = sec + 1
-
 if self.max_ingestion_time and sec > self.max_ingestion_time:
 # ensure that the job gets killed if the max ingestion time is 
exceeded
 requests.post("{0}/{1}/shutdown".format(url, druid_task_id))
@@ -91,6 +89,8 @@ def submit_indexing_job(self, json_index_spec):
 
 time.sleep(self.timeout)
 
+sec = sec + self.timeout
+
 status = req_status.json()['status']['status']
 if status == 'RUNNING':
 running = True


 




> DruidHook: time variable is not updated correctly when checking for timeout 
> 
>
> Key: AIRFLOW-2860
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2860
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: hooks
>Reporter: Adam Welsh
>Assignee: Adam Welsh
>Priority: Trivial
> Fix For: 2.0.0
>
>





[jira] [Resolved] (AIRFLOW-2860) DruidHook: time variable is not updated correctly when checking for timeout

2018-08-06 Thread Fokko Driesprong (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fokko Driesprong resolved AIRFLOW-2860.
---
   Resolution: Fixed
 Assignee: Adam Welsh
Fix Version/s: 2.0.0

> DruidHook: time variable is not updated correctly when checking for timeout 
> 
>
> Key: AIRFLOW-2860
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2860
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: hooks
>Reporter: Adam Welsh
>Assignee: Adam Welsh
>Priority: Trivial
> Fix For: 2.0.0
>
>





[jira] [Commented] (AIRFLOW-2860) DruidHook: time variable is not updated correctly when checking for timeout

2018-08-06 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570017#comment-16570017
 ] 

ASF GitHub Bot commented on AIRFLOW-2860:
-

awelsh93 opened a new pull request #3707: [AIRFLOW-2860] DruidHook: time 
variable is not updated correctly when…
URL: https://github.com/apache/incubator-airflow/pull/3707
 
 
   … checking for timeout
   
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [x] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-XXX
 - In case you are fixing a typo in the documentation you can prepend your 
commit with \[AIRFLOW-XXX\], code changes always need a Jira issue.
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI 
changes:
   
   ### Tests
   
   - [x] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   I ran `nosetests /tests/hooks/test_druid_hook.py`
   
   ### Commits
   
   - [x] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [ ] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - When adding new operators/hooks/sensors, the autoclass documentation 
generation needs to be added.
   
   ### Code Quality
   
   - [x] Passes `git diff upstream/master -u -- "*.py" | flake8 --diff`
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> DruidHook: time variable is not updated correctly when checking for timeout 
> 
>
> Key: AIRFLOW-2860
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2860
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: hooks
>Reporter: Adam Welsh
>Priority: Trivial
>
> The variable that is used in the condition to check if the Druid ingestion 
> task has exceeded the max_ingestion_time is not updated correctly. It gets 
> incremented by one (which is the default value for timeout) but should be 
> incremented by timeout.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] awelsh93 opened a new pull request #3707: [AIRFLOW-2860] DruidHook: time variable is not updated correctly when…

2018-08-06 Thread GitBox
awelsh93 opened a new pull request #3707: [AIRFLOW-2860] DruidHook: time 
variable is not updated correctly when…
URL: https://github.com/apache/incubator-airflow/pull/3707
 
 
   … checking for timeout
   
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [x] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-XXX
 - In case you are fixing a typo in the documentation you can prepend your 
commit with \[AIRFLOW-XXX\], code changes always need a Jira issue.
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI 
changes:
   
   ### Tests
   
   - [x] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   I ran `nosetests /tests/hooks/test_druid_hook.py`
   
   ### Commits
   
   - [x] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [ ] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - When adding new operators/hooks/sensors, the autoclass documentation 
generation needs to be added.
   
   ### Code Quality
   
   - [x] Passes `git diff upstream/master -u -- "*.py" | flake8 --diff`
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Created] (AIRFLOW-2860) DruidHook: time variable is not updated correctly when checking for timeout

2018-08-06 Thread Adam Welsh (JIRA)
Adam Welsh created AIRFLOW-2860:
---

 Summary: DruidHook: time variable is not updated correctly when 
checking for timeout 
 Key: AIRFLOW-2860
 URL: https://issues.apache.org/jira/browse/AIRFLOW-2860
 Project: Apache Airflow
  Issue Type: Bug
  Components: hooks
Reporter: Adam Welsh


The variable that is used in the condition to check if the Druid ingestion task 
has exceeded the max_ingestion_time is not updated correctly. It gets 
incremented by one (which is the default value for timeout) but should be 
incremented by timeout.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] r4vi commented on issue #3612: [AIRFLOW-2755] Added `kubernetes.worker_dags_folder` configuration

2018-08-06 Thread GitBox
r4vi commented on issue #3612: [AIRFLOW-2755] Added 
`kubernetes.worker_dags_folder` configuration
URL: 
https://github.com/apache/incubator-airflow/pull/3612#issuecomment-410653357
 
 
   @Fokko rebased


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] kaxil closed pull request #3704: [AIRFLOW-XXX] Add how to run the docs server to docs

2018-08-06 Thread GitBox
kaxil closed pull request #3704: [AIRFLOW-XXX] Add how to run the docs server 
to docs
URL: https://github.com/apache/incubator-airflow/pull/3704
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index 2cf8e0218e..06b9279917 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -69,10 +69,12 @@ install the `doc` extra.
 pip install -e .[doc]
 ```
 
-Generate the documentation by running:
+Generate and serve the documentation by running:
 
 ```
-cd docs && ./build.sh
+cd docs
+./build.sh
+./start_doc_server.sh
 ```
 
 Only a subset of the API reference documentation builds. Install additional


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] kaxil commented on issue #3704: [AIRFLOW-XXX] Add how to run the docs server to docs

2018-08-06 Thread GitBox
kaxil commented on issue #3704: [AIRFLOW-XXX] Add how to run the docs server to 
docs
URL: 
https://github.com/apache/incubator-airflow/pull/3704#issuecomment-410652503
 
 
   Thanks @tedmiston . Appreciate all the contributions for the docs. They are 
really important.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] codecov-io edited a comment on issue #3698: [AIRFLOW-2855] Check Cron Expression Validity in DagBag.process_file()

2018-08-06 Thread GitBox
codecov-io edited a comment on issue #3698: [AIRFLOW-2855] Check Cron 
Expression Validity in DagBag.process_file()
URL: 
https://github.com/apache/incubator-airflow/pull/3698#issuecomment-410575101
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3698?src=pr=h1)
 Report
   > Merging 
[#3698](https://codecov.io/gh/apache/incubator-airflow/pull/3698?src=pr=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/6824c86e97d568304a02df51ecd806949a49723a?src=pr=desc)
 will **increase** coverage by `<.01%`.
   > The diff coverage is `100%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/3698/graphs/tree.svg?height=150=650=WdLKlKHOAU=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/3698?src=pr=tree)
   
   ```diff
   @@Coverage Diff @@
   ##   master#3698  +/-   ##
   ==
   + Coverage   77.56%   77.56%   +<.01% 
   ==
 Files 204  204  
 Lines   1576615772   +6 
   ==
   + Hits1222912234   +5 
   - Misses   3537 3538   +1
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/incubator-airflow/pull/3698?src=pr=tree) 
| Coverage Δ | |
   |---|---|---|
   | 
[airflow/models.py](https://codecov.io/gh/apache/incubator-airflow/pull/3698/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMucHk=)
 | `88.57% <100%> (-0.02%)` | :arrow_down: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3698?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3698?src=pr=footer).
 Last update 
[6824c86...e9c96aa](https://codecov.io/gh/apache/incubator-airflow/pull/3698?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] codecov-io edited a comment on issue #3698: [AIRFLOW-2855] Check Cron Expression Validity in DagBag.process_file()

2018-08-06 Thread GitBox
codecov-io edited a comment on issue #3698: [AIRFLOW-2855] Check Cron 
Expression Validity in DagBag.process_file()
URL: 
https://github.com/apache/incubator-airflow/pull/3698#issuecomment-410575101
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3698?src=pr=h1)
 Report
   > Merging 
[#3698](https://codecov.io/gh/apache/incubator-airflow/pull/3698?src=pr=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/6824c86e97d568304a02df51ecd806949a49723a?src=pr=desc)
 will **increase** coverage by `<.01%`.
   > The diff coverage is `100%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/3698/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/3698?src=pr=tree)
   
   ```diff
   @@Coverage Diff @@
   ##   master#3698  +/-   ##
   ==
   + Coverage   77.56%   77.56%   +<.01% 
   ==
 Files 204  204  
 Lines   1576615772   +6 
   ==
   + Hits1222912234   +5 
   - Misses   3537 3538   +1
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/incubator-airflow/pull/3698?src=pr=tree) 
| Coverage Δ | |
   |---|---|---|
   | 
[airflow/models.py](https://codecov.io/gh/apache/incubator-airflow/pull/3698/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMucHk=)
 | `88.57% <100%> (-0.02%)` | :arrow_down: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3698?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3698?src=pr=footer).
 Last update 
[6824c86...e9c96aa](https://codecov.io/gh/apache/incubator-airflow/pull/3698?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] Fokko closed pull request #3706: [AIRFLOW-XXX] Add docs for running Travis CI on fork

2018-08-06 Thread GitBox
Fokko closed pull request #3706: [AIRFLOW-XXX] Add docs for running Travis CI 
on fork
URL: https://github.com/apache/incubator-airflow/pull/3706
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index 2cf8e0218e..ee142fe916 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -16,6 +16,7 @@ little bit helps, and credit will always be given.
 - [Development and Testing](#development-and-testing)
   - [Setting up a development 
environment](#setting-up-a-development-environment)
   - [Pull requests guidelines](#pull-request-guidelines)
+  - [Testing on Travis CI](#testing-on-travis-ci)
   - [Testing Locally](#testing-locally)
 - [Changing the Metadata Database](#changing-the-metadata-database)
 
@@ -138,7 +139,7 @@ Feel free to customize based on the extras available in 
[setup.py](./setup.py)
 Before you submit a pull request from your forked repo, check that it
 meets these guidelines:
 
-1. The pull request should include tests, either as doctests, unit tests, or 
both. The airflow repo uses [Travis 
CI](https://travis-ci.org/apache/incubator-airflow) to run the tests and 
[codecov](https://codecov.io/gh/apache/incubator-airflow) to track coverage. 
You can set up both for free on your fork. It will help you making sure you do 
not break the build with your PR and that you help increase coverage.
+1. The pull request should include tests, either as doctests, unit tests, or 
both. The airflow repo uses [Travis 
CI](https://travis-ci.org/apache/incubator-airflow) to run the tests and 
[codecov](https://codecov.io/gh/apache/incubator-airflow) to track coverage. 
You can set up both for free on your fork (see the "Testing on Travis CI" 
section below). It will help you making sure you do not break the build with 
your PR and that you help increase coverage.
 1. Please [rebase your fork](http://stackoverflow.com/a/7244456/1110993), 
squash commits, and resolve all conflicts.
 1. Every pull request should have an associated 
[JIRA](https://issues.apache.org/jira/browse/AIRFLOW/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel).
 The JIRA link should also be contained in the PR description.
 1. Preface your commit's subject & PR's title with **[AIRFLOW-XXX]** where 
*XXX* is the JIRA number. We compose release notes (i.e. for Airflow releases) 
from all commit titles in a release. By placing the JIRA number in the commit 
title and hence in the release notes, Airflow users can look into JIRA and 
Github PRs for more details about a particular change.
@@ -148,6 +149,71 @@ meets these guidelines:
 1. As Airflow grows as a project, we try to enforce a more consistent style 
and try to follow the Python community guidelines. We track this using 
[landscape.io](https://landscape.io/github/apache/incubator-airflow/), which 
you can setup on your fork as well to check before you submit your PR. We 
currently enforce most [PEP8](https://www.python.org/dev/peps/pep-0008/) and a 
few other linting rules. It is usually a good idea to lint locally as well 
using [flake8](https://flake8.readthedocs.org/en/latest/) using `flake8 airflow 
tests`. `git diff upstream/master -u -- "*.py" | flake8 --diff` will return any 
changed files in your branch that require linting.
 1. Please read this excellent 
[article](http://chris.beams.io/posts/git-commit/) on commit messages and 
adhere to them. It makes the lives of those who come after you a lot easier.
 
+### Testing on Travis CI
+
+We currently rely heavily on Travis CI for running the full Airflow test suite
+as running all of the tests locally requires significant setup.  You can setup
+Travis CI in your fork of Airflow by following the
+[Travis CI Getting Started guide][travis-ci-getting-started].
+
+There are two different options available for running Travis CI which are
+setup as separate components on GitHub:
+
+1. **Travis CI GitHub App** (new version)
+1. **Travis CI GitHub Services** (legacy version)
+
+ Travis CI GitHub App (new version)
+
+1. Once installed, you can configure the Travis CI GitHub App at
+https://github.com/settings/installations/169040.
+
+1. For the Travis CI GitHub App, you can set repository access to either "all
+repositories" for convenience, or "only select repositories" and choose
+`<username>/incubator-airflow` in the dropdown.
+
+1. You can access Travis CI for your fork at
+`https://travis-ci.com/<username>/incubator-airflow`.
+
+ Travis CI GitHub Services (legacy version)
+
+The Travis CI GitHub Services version uses an Authorized OAuth App.  Note
+that `apache/incubator-airflow` is currently still using the legacy version.
+
+1. Once installed, you can configure the Travis CI Authorized OAuth App at

[jira] [Commented] (AIRFLOW-2845) Remove asserts from the contrib code (change to legal exceptions)

2018-08-06 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16569929#comment-16569929
 ] 

ASF GitHub Bot commented on AIRFLOW-2845:
-

Fokko closed pull request #3690: [AIRFLOW-2845] Remove asserts from the contrib 
package
URL: https://github.com/apache/incubator-airflow/pull/3690
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/airflow/contrib/hooks/bigquery_hook.py 
b/airflow/contrib/hooks/bigquery_hook.py
index f4c1a3b520..aa8fc382a6 100644
--- a/airflow/contrib/hooks/bigquery_hook.py
+++ b/airflow/contrib/hooks/bigquery_hook.py
@@ -592,9 +592,11 @@ def run_query(self,
 }
 
 if destination_dataset_table:
-assert '.' in destination_dataset_table, (
-'Expected destination_dataset_table in the format of '
-'<dataset>.<table>. Got: {}').format(destination_dataset_table)
+if '.' not in destination_dataset_table:
+raise ValueError(
+'Expected destination_dataset_table name in the format of '
+'<dataset>.<table>. Got: {}'.format(
+destination_dataset_table))
 destination_project, destination_dataset, destination_table = \
 _split_tablename(table_input=destination_dataset_table,
  default_project_id=self.project_id)
@@ -610,7 +612,9 @@ def run_query(self,
 }
 })
 if udf_config:
-assert isinstance(udf_config, list)
+if not isinstance(udf_config, list):
+raise TypeError("udf_config argument must have a type 'list'"
+" not {}".format(type(udf_config)))
 configuration['query'].update({
 'userDefinedFunctionResources': udf_config
 })
@@ -1153,10 +1157,10 @@ def run_table_delete(self, deletion_dataset_table,
 :type ignore_if_missing: boolean
 :return:
 """
-
-assert '.' in deletion_dataset_table, (
-'Expected deletion_dataset_table in the format of '
-'<dataset>.<table>. Got: {}').format(deletion_dataset_table)
+if '.' not in deletion_dataset_table:
+raise ValueError(
+'Expected deletion_dataset_table name in the format of '
+'<dataset>.<table>. Got: {}'.format(deletion_dataset_table))
 deletion_project, deletion_dataset, deletion_table = \
 _split_tablename(table_input=deletion_dataset_table,
  default_project_id=self.project_id)
@@ -1284,14 +1288,10 @@ def run_grant_dataset_view_access(self,
 # if view is already in access, do nothing.
 self.log.info(
 'Table %s:%s.%s already has authorized view access to %s:%s 
dataset.',
-view_project, view_dataset, view_table, source_project,
-source_dataset)
+view_project, view_dataset, view_table, source_project, 
source_dataset)
 return source_dataset_resource
 
-def delete_dataset(self,
-   project_id,
-   dataset_id
-   ):
+def delete_dataset(self, project_id, dataset_id):
 """
 Delete a dataset of Big query in your project.
 :param project_id: The name of the project where we have the dataset .
@@ -1308,9 +1308,8 @@ def delete_dataset(self,
 self.service.datasets().delete(
 projectId=project_id,
 datasetId=dataset_id).execute()
-
-self.log.info('Dataset deleted successfully: In project %s Dataset 
%s',
-  project_id, dataset_id)
+self.log.info('Dataset deleted successfully: In project %s '
+  'Dataset %s', project_id, dataset_id)
 
 except HttpError as err:
 raise AirflowException(
@@ -1518,14 +1517,17 @@ def _bq_cast(string_field, bq_type):
 elif bq_type == 'FLOAT' or bq_type == 'TIMESTAMP':
 return float(string_field)
 elif bq_type == 'BOOLEAN':
-assert string_field in set(['true', 'false'])
+if string_field not in ['true', 'false']:
+raise ValueError("{} must have value 'true' or 'false'".format(
+string_field))
 return string_field == 'true'
 else:
 return string_field
 
 
 def _split_tablename(table_input, default_project_id, var_name=None):
-assert default_project_id is not None, "INTERNAL: No default project is 
specified"
+if not default_project_id:
+raise ValueError("INTERNAL: No default project is specified")
 
 def var_print(var_name):
 if 

[jira] [Resolved] (AIRFLOW-2845) Remove asserts from the contrib code (change to legal exceptions)

2018-08-06 Thread Fokko Driesprong (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fokko Driesprong resolved AIRFLOW-2845.
---
   Resolution: Fixed
Fix Version/s: 2.0.0

> Remove asserts from the contrib code (change to legal exceptions) 
> --
>
> Key: AIRFLOW-2845
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2845
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: contrib
>Affects Versions: 1.9.0
>Reporter: Iuliia Volkova
>Assignee: Iuliia Volkova
>Priority: Minor
>  Labels: easyfix
> Fix For: 2.0.0
>
>
> Hi guys.  `assert` is used in the Airflow contrib package code, and that is 
> not what asserts are for.
> The documentation describes `assert` as a debugging tool: 
> [https://docs.python.org/3/reference/simple_stmts.html#the-assert-statement] 
> and assertions can also be disabled globally (e.g. when Python runs in 
> optimized mode).
> If you do not mind, I will be happy to prepare a PR that removes asserts from 
> the contrib module, replacing them with proper exceptions and clear messages 
> instead of a bare "AssertionError".
> I am talking only about src (not about asserts in tests). 
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
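The concern in the issue above — that assertions are a debugging aid and disappear when Python runs in optimized mode (`-O` / `PYTHONOPTIMIZE`) — can be shown with a small before/after sketch. The function names and messages here are illustrative, not the actual `bigquery_hook` code:

```python
def split_table_assert(table_input):
    # Stripped out entirely under ``python -O``, so a malformed table
    # name would slip through with no error at all.
    assert '.' in table_input, \
        'Expected <dataset>.<table>. Got: {}'.format(table_input)
    return table_input.split('.', 1)

def split_table_checked(table_input):
    # The replacement style: an explicit exception that survives -O and
    # raises a precise, typed error instead of a bare AssertionError.
    if '.' not in table_input:
        raise ValueError(
            'Expected <dataset>.<table>. Got: {}'.format(table_input))
    return table_input.split('.', 1)
```

Callers can now catch `ValueError` specifically, and the validation holds in every interpreter mode.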


[jira] [Commented] (AIRFLOW-2845) Remove asserts from the contrib code (change to legal exceptions)

2018-08-06 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16569930#comment-16569930
 ] 

ASF subversion and git services commented on AIRFLOW-2845:
--

Commit 5b7a28c7e0dade62bb4d087f25687821e7f11083 in incubator-airflow's branch 
refs/heads/master from Yuliya Volkova
[ https://gitbox.apache.org/repos/asf?p=incubator-airflow.git;h=5b7a28c ]

[AIRFLOW-2845] Asserts in contrib package code are changed on raise ValueError 
and TypeError (#3690)



> Remove asserts from the contrib code (change to legal exceptions) 
> --
>
> Key: AIRFLOW-2845
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2845
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: contrib
>Affects Versions: 1.9.0
>Reporter: Iuliia Volkova
>Assignee: Iuliia Volkova
>Priority: Minor
>  Labels: easyfix
> Fix For: 2.0.0
>
>
> Hi guys.  `assert` is used in the Airflow contrib package code, and that is 
> not what asserts are for.
> The documentation describes `assert` as a debugging tool: 
> [https://docs.python.org/3/reference/simple_stmts.html#the-assert-statement] 
> and assertions can also be disabled globally (e.g. when Python runs in 
> optimized mode).
> If you do not mind, I will be happy to prepare a PR that removes asserts from 
> the contrib module, replacing them with proper exceptions and clear messages 
> instead of a bare "AssertionError".
> I am talking only about src (not about asserts in tests). 
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] Fokko commented on issue #3690: [AIRFLOW-2845] Remove asserts from the contrib package

2018-08-06 Thread GitBox
Fokko commented on issue #3690: [AIRFLOW-2845] Remove asserts from the contrib 
package
URL: 
https://github.com/apache/incubator-airflow/pull/3690#issuecomment-410645124
 
 
   Thanks @xnuinside 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (AIRFLOW-2858) Make tox.ini indentation and whitespace consistent

2018-08-06 Thread Kaxil Naik (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kaxil Naik updated AIRFLOW-2858:

Fix Version/s: 2.0.0

> Make tox.ini indentation and whitespace consistent
> --
>
> Key: AIRFLOW-2858
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2858
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: ci, tests
>Reporter: Taylor Edmiston
>Assignee: Taylor Edmiston
>Priority: Minor
> Fix For: 2.0.0
>
>
> The tox.ini file currently mixes multiple indentation and whitespace styles.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-2858) Make tox.ini indentation and whitespace consistent

2018-08-06 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16569878#comment-16569878
 ] 

ASF GitHub Bot commented on AIRFLOW-2858:
-

kaxil closed pull request #3705: [AIRFLOW-2858] Make tox.ini indentation and 
whitespace consistent
URL: https://github.com/apache/incubator-airflow/pull/3705
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/tox.ini b/tox.ini
index a20d2c5546..b9c386afb1 100644
--- a/tox.ini
+++ b/tox.ini
@@ -18,7 +18,7 @@
 
 [tox]
 envlist = flake8,{py27,py35}-backend_{mysql,sqlite,postgres}
-skipsdist=True
+skipsdist = True
 
 [global]
 wheel_dir = {homedir}/.wheelhouse
@@ -36,19 +36,19 @@ deps =
 coveralls
 
 basepython =
-  py27: python2.7
-  py35: python3.5
+py27: python2.7
+py35: python3.5
 
 setenv =
-  HADOOP_DISTRO=cdh
-  HADOOP_HOME=/tmp/hadoop-cdh
-  HADOOP_OPTS=-D/tmp/krb5.conf
-  MINICLUSTER_HOME=/tmp/minicluster-1.1-SNAPSHOT
-  HIVE_HOME=/tmp/hive
-  backend_mysql: AIRFLOW__CORE__SQL_ALCHEMY_CONN=mysql://root@localhost/airflow
-  backend_postgres: 
AIRFLOW__CORE__SQL_ALCHEMY_CONN=postgresql+psycopg2://postgres@localhost/airflow
-  backend_sqlite: 
AIRFLOW__CORE__SQL_ALCHEMY_CONN=sqlite:///{homedir}/airflow/airflow.db
-  backend_sqlite: AIRFLOW__CORE__EXECUTOR=SequentialExecutor
+HADOOP_DISTRO = cdh
+HADOOP_HOME = /tmp/hadoop-cdh
+HADOOP_OPTS = -D/tmp/krb5.conf
+MINICLUSTER_HOME = /tmp/minicluster-1.1-SNAPSHOT
+HIVE_HOME = /tmp/hive
+backend_mysql: AIRFLOW__CORE__SQL_ALCHEMY_CONN = 
mysql://root@localhost/airflow
+backend_postgres: AIRFLOW__CORE__SQL_ALCHEMY_CONN = 
postgresql+psycopg2://postgres@localhost/airflow
+backend_sqlite: AIRFLOW__CORE__SQL_ALCHEMY_CONN = 
sqlite:///{homedir}/airflow/airflow.db
+backend_sqlite: AIRFLOW__CORE__EXECUTOR = SequentialExecutor
 
 passenv =
 HOME
@@ -67,15 +67,15 @@ passenv =
 SLUGIFY_USES_TEXT_UNIDECODE
 
 commands =
-  pip wheel -w {homedir}/.wheelhouse -f {homedir}/.wheelhouse -e .[devel_ci]
-  pip install --find-links={homedir}/.wheelhouse --no-index -e .[devel_ci]
-  sudo {toxinidir}/scripts/ci/setup_kdc.sh
-  {toxinidir}/scripts/ci/setup_env.sh
-  {toxinidir}/scripts/ci/ldap.sh
-  {toxinidir}/scripts/ci/load_fixtures.sh
-  {toxinidir}/scripts/ci/load_data.sh
-  {toxinidir}/scripts/ci/run_tests.sh []
-  {toxinidir}/scripts/ci/check-license.sh
+pip wheel -w {homedir}/.wheelhouse -f {homedir}/.wheelhouse -e .[devel_ci]
+pip install --find-links={homedir}/.wheelhouse --no-index -e .[devel_ci]
+sudo {toxinidir}/scripts/ci/setup_kdc.sh
+{toxinidir}/scripts/ci/setup_env.sh
+{toxinidir}/scripts/ci/ldap.sh
+{toxinidir}/scripts/ci/load_fixtures.sh
+{toxinidir}/scripts/ci/load_data.sh
+{toxinidir}/scripts/ci/run_tests.sh []
+{toxinidir}/scripts/ci/check-license.sh
 
 [testenv:flake8]
 basepython = python3
@@ -85,4 +85,3 @@ deps =
 
 commands =
 {toxinidir}/scripts/ci/flake8_diff.sh
-


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Make tox.ini indentation and whitespace consistent
> --
>
> Key: AIRFLOW-2858
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2858
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: ci, tests
>Reporter: Taylor Edmiston
>Assignee: Taylor Edmiston
>Priority: Minor
> Fix For: 2.0.0
>
>
> The tox.ini file currently mixes multiple indentation and whitespace styles.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (AIRFLOW-2858) Make tox.ini indentation and whitespace consistent

2018-08-06 Thread Kaxil Naik (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kaxil Naik resolved AIRFLOW-2858.
-
Resolution: Fixed

Resolved by https://github.com/apache/incubator-airflow/pull/3705

> Make tox.ini indentation and whitespace consistent
> --
>
> Key: AIRFLOW-2858
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2858
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: ci, tests
>Reporter: Taylor Edmiston
>Assignee: Taylor Edmiston
>Priority: Minor
>
> The tox.ini file currently mixes multiple indentation and whitespace styles.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-2858) Make tox.ini indentation and whitespace consistent

2018-08-06 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16569879#comment-16569879
 ] 

ASF subversion and git services commented on AIRFLOW-2858:
--

Commit 6824c86e97d568304a02df51ecd806949a49723a in incubator-airflow's branch 
refs/heads/master from Taylor D. Edmiston
[ https://gitbox.apache.org/repos/asf?p=incubator-airflow.git;h=6824c86 ]

[AIRFLOW-2858] Make tox.ini indentation and whitespace consistent (#3705)



> Make tox.ini indentation and whitespace consistent
> --
>
> Key: AIRFLOW-2858
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2858
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: ci, tests
>Reporter: Taylor Edmiston
>Assignee: Taylor Edmiston
>Priority: Minor
> Fix For: 2.0.0
>
>
> The tox.ini file currently mixes multiple indentation and whitespace styles.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] kaxil closed pull request #3705: [AIRFLOW-2858] Make tox.ini indentation and whitespace consistent

2018-08-06 Thread GitBox
kaxil closed pull request #3705: [AIRFLOW-2858] Make tox.ini indentation and 
whitespace consistent
URL: https://github.com/apache/incubator-airflow/pull/3705
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:


diff --git a/tox.ini b/tox.ini
index a20d2c5546..b9c386afb1 100644
--- a/tox.ini
+++ b/tox.ini
@@ -18,7 +18,7 @@
 
 [tox]
 envlist = flake8,{py27,py35}-backend_{mysql,sqlite,postgres}
-skipsdist=True
+skipsdist = True
 
 [global]
 wheel_dir = {homedir}/.wheelhouse
@@ -36,19 +36,19 @@ deps =
 coveralls
 
 basepython =
-  py27: python2.7
-  py35: python3.5
+py27: python2.7
+py35: python3.5
 
 setenv =
-  HADOOP_DISTRO=cdh
-  HADOOP_HOME=/tmp/hadoop-cdh
-  HADOOP_OPTS=-D/tmp/krb5.conf
-  MINICLUSTER_HOME=/tmp/minicluster-1.1-SNAPSHOT
-  HIVE_HOME=/tmp/hive
-  backend_mysql: AIRFLOW__CORE__SQL_ALCHEMY_CONN=mysql://root@localhost/airflow
-  backend_postgres: AIRFLOW__CORE__SQL_ALCHEMY_CONN=postgresql+psycopg2://postgres@localhost/airflow
-  backend_sqlite: AIRFLOW__CORE__SQL_ALCHEMY_CONN=sqlite:///{homedir}/airflow/airflow.db
-  backend_sqlite: AIRFLOW__CORE__EXECUTOR=SequentialExecutor
+HADOOP_DISTRO = cdh
+HADOOP_HOME = /tmp/hadoop-cdh
+HADOOP_OPTS = -D/tmp/krb5.conf
+MINICLUSTER_HOME = /tmp/minicluster-1.1-SNAPSHOT
+HIVE_HOME = /tmp/hive
+backend_mysql: AIRFLOW__CORE__SQL_ALCHEMY_CONN = mysql://root@localhost/airflow
+backend_postgres: AIRFLOW__CORE__SQL_ALCHEMY_CONN = postgresql+psycopg2://postgres@localhost/airflow
+backend_sqlite: AIRFLOW__CORE__SQL_ALCHEMY_CONN = sqlite:///{homedir}/airflow/airflow.db
+backend_sqlite: AIRFLOW__CORE__EXECUTOR = SequentialExecutor
 
 passenv =
 HOME
@@ -67,15 +67,15 @@ passenv =
 SLUGIFY_USES_TEXT_UNIDECODE
 
 commands =
-  pip wheel -w {homedir}/.wheelhouse -f {homedir}/.wheelhouse -e .[devel_ci]
-  pip install --find-links={homedir}/.wheelhouse --no-index -e .[devel_ci]
-  sudo {toxinidir}/scripts/ci/setup_kdc.sh
-  {toxinidir}/scripts/ci/setup_env.sh
-  {toxinidir}/scripts/ci/ldap.sh
-  {toxinidir}/scripts/ci/load_fixtures.sh
-  {toxinidir}/scripts/ci/load_data.sh
-  {toxinidir}/scripts/ci/run_tests.sh []
-  {toxinidir}/scripts/ci/check-license.sh
+pip wheel -w {homedir}/.wheelhouse -f {homedir}/.wheelhouse -e .[devel_ci]
+pip install --find-links={homedir}/.wheelhouse --no-index -e .[devel_ci]
+sudo {toxinidir}/scripts/ci/setup_kdc.sh
+{toxinidir}/scripts/ci/setup_env.sh
+{toxinidir}/scripts/ci/ldap.sh
+{toxinidir}/scripts/ci/load_fixtures.sh
+{toxinidir}/scripts/ci/load_data.sh
+{toxinidir}/scripts/ci/run_tests.sh []
+{toxinidir}/scripts/ci/check-license.sh
 
 [testenv:flake8]
 basepython = python3
@@ -85,4 +85,3 @@ deps =
 
 commands =
 {toxinidir}/scripts/ci/flake8_diff.sh
-


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] codecov-io edited a comment on issue #3698: [AIRFLOW-2855] Check Cron Expression Validity in DagBag.process_file()

2018-08-06 Thread GitBox
codecov-io edited a comment on issue #3698: [AIRFLOW-2855] Check Cron 
Expression Validity in DagBag.process_file()
URL: 
https://github.com/apache/incubator-airflow/pull/3698#issuecomment-410575101
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3698?src=pr=h1)
 Report
   > Merging 
[#3698](https://codecov.io/gh/apache/incubator-airflow/pull/3698?src=pr=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/dc64649d98f9c6ddb508b49b52a61fc7799e4343?src=pr=desc)
 will **increase** coverage by `<.01%`.
   > The diff coverage is `100%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/3698/graphs/tree.svg?width=650=150=WdLKlKHOAU=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/3698?src=pr=tree)
   
   ```diff
    @@            Coverage Diff            @@
    ##           master    #3698      +/-   ##
    ==========================================
    + Coverage   77.56%   77.57%   +<.01%
    ==========================================
      Files         204      204
      Lines       15766    15772       +6
    ==========================================
    + Hits        12229    12235       +6
      Misses       3537     3537
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/incubator-airflow/pull/3698?src=pr=tree) 
| Coverage Δ | |
   |---|---|---|
   | 
[airflow/models.py](https://codecov.io/gh/apache/incubator-airflow/pull/3698/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMucHk=)
 | `88.61% <100%> (+0.02%)` | :arrow_up: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3698?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3698?src=pr=footer).
 Last update 
[dc64649...e57953d](https://codecov.io/gh/apache/incubator-airflow/pull/3698?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   






[GitHub] ashb commented on a change in pull request #3703: [AIRFLOW-2857] Fix verify_gpl_dependency for Read the Docs

2018-08-06 Thread GitBox
ashb commented on a change in pull request #3703: [AIRFLOW-2857] Fix 
verify_gpl_dependency for Read the Docs
URL: https://github.com/apache/incubator-airflow/pull/3703#discussion_r207799954
 
 

 ##
 File path: setup.py
 ##
 @@ -161,6 +164,7 @@ def write_version(filename=os.path.join(*['airflow',
 databricks = ['requests>=2.5.1, <3']
 datadog = ['datadog>=0.14.0']
 doc = [
+'mock',
 
 Review comment:
   ?? This is needed for docs? I would have thought mock is just used in tests/, and I'm surprised it's needed here?




[GitHub] ashb commented on a change in pull request #3703: [AIRFLOW-2857] Fix verify_gpl_dependency for Read the Docs

2018-08-06 Thread GitBox
ashb commented on a change in pull request #3703: [AIRFLOW-2857] Fix 
verify_gpl_dependency for Read the Docs
URL: https://github.com/apache/incubator-airflow/pull/3703#discussion_r207799614
 
 

 ##
 File path: setup.py
 ##
 @@ -37,6 +37,9 @@
 
 # See LEGAL-362
 def verify_gpl_dependency():
+if os.getenv("READTHEDOCS") == "True":
 
 Review comment:
   Could you add a comment saying why this is needed please?
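For context, the guard under review can be sketched standalone. This is a minimal illustration, not the PR's exact code: the `RuntimeError` message is paraphrased, and the check assumes Read the Docs exports the `READTHEDOCS` environment variable as the string `"True"` during builds.

```python
import os


def verify_gpl_dependency():
    # Read the Docs builds cannot export environment variables before
    # pip runs setup.py, so the GPL dependency check is bypassed when
    # RTD's own READTHEDOCS variable is present (set to the string "True").
    if os.getenv("READTHEDOCS") == "True":
        return
    # Outside RTD, require an explicit opt-in before pulling in the
    # GPL-licensed unidecode dependency (message paraphrased here).
    if os.getenv("SLUGIFY_USES_TEXT_UNIDECODE") != "yes":
        raise RuntimeError(
            "Set SLUGIFY_USES_TEXT_UNIDECODE=yes to install Airflow "
            "without the GPL-licensed unidecode dependency check failing."
        )
```

The early `return` is why the inline comment ashb asks for matters: without it, a future reader could easily mistake the RTD bypass for a dead branch and delete it.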




[GitHub] yrqls21 commented on a change in pull request #3698: [AIRFLOW-2855] Check Cron Expression Validity in DagBag.process_file()

2018-08-06 Thread GitBox
yrqls21 commented on a change in pull request #3698: [AIRFLOW-2855] Check Cron 
Expression Validity in DagBag.process_file()
URL: https://github.com/apache/incubator-airflow/pull/3698#discussion_r207797889
 
 

 ##
 File path: airflow/models.py
 ##
 @@ -414,6 +416,16 @@ def process_file(self, filepath, only_if_updated=True, safe_mode=True):
 self.bag_dag(dag, parent_dag=dag, root_dag=dag)
 found_dags.append(dag)
 found_dags += dag.subdags
+if isinstance(dag._schedule_interval, six.string_types):
 
 Review comment:
   maybe put the check before we append the `dag` to `found_dags`? (if that's what we do for other dags with import errors)
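The check being reviewed can be sketched in isolation. Airflow itself validates the string with `croniter`; the hand-rolled `is_valid_cron` below is only a stand-in so the example runs without third-party packages, and `check_schedule`/the `import_errors` dict are hypothetical names used for illustration:

```python
def is_valid_cron(expr):
    """Rough stand-in for croniter validation: accept only five
    whitespace-separated fields of cron-like characters."""
    fields = expr.split()
    if len(fields) != 5:
        return False
    allowed = set("0123456789*/,-")
    return all(f and set(f) <= allowed for f in fields)


def check_schedule(import_errors, filepath, schedule_interval):
    # Mirrors the PR's approach: only string schedules are treated as
    # cron expressions; timedelta and preset schedules are skipped.
    if isinstance(schedule_interval, str) and not is_valid_cron(schedule_interval):
        import_errors[filepath] = "Invalid Cron expression: %s" % schedule_interval
```

Recording the failure in `import_errors` (rather than raising) matches how `DagBag.process_file()` surfaces other import problems in the UI.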




[GitHub] XD-DENG commented on a change in pull request #3698: [AIRFLOW-2855] Check Cron Expression Validity in DagBag.process_file()

2018-08-06 Thread GitBox
XD-DENG commented on a change in pull request #3698: [AIRFLOW-2855] Check Cron 
Expression Validity in DagBag.process_file()
URL: https://github.com/apache/incubator-airflow/pull/3698#discussion_r207796998
 
 

 ##
 File path: tests/models.py
 ##
 @@ -1038,6 +1038,21 @@ def test_zip(self):
 dagbag.process_file(os.path.join(TEST_DAGS_FOLDER, "test_zip.zip"))
 self.assertTrue(dagbag.get_dag("test_zip_dag"))
 
+def test_process_file_cron_validity_check(self):
+"""
+test if an invalid cron expression
+as schedule interval can be identified
+"""
+invalid_dag_files = ["test_invalid_cron.py", "test_zip_invalid_cron.zip"]
+dagbag = models.DagBag()
+
+for d in invalid_dag_files:
+dagbag.process_file(os.path.join(TEST_DAGS_FOLDER, d))
+
+for d in invalid_dag_files:
+self.assertTrue(any([d in k and "Invalid Cron expression" in v
 
 Review comment:
   Hi @yrqls21 , great points!
   
   - For your **1st point**, actually you're right. It's not really necessary to re-parse these two files, i.e. the code below could be removed.
   
   ```
   for d in invalid_dag_files:
   dagbag.process_file(os.path.join(TEST_DAGS_FOLDER, d))
   ``` 
   But I purposely keep them here to make it clear what I'm testing here.
   
   - Your **2nd point** is great. I agree it makes no practical difference in this case, given `n` and `m` would never be too big here. But I will change this part as well; it's always good to keep performance in mind.
   
   Thanks for your review!
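The assertion pattern and the reviewer's performance point can be shown together. The `import_errors` contents below are made up for the example; only the lookup pattern reflects the test under review:

```python
# Hypothetical DagBag-style import_errors mapping (file path -> message).
import_errors = {
    "/dags/test_invalid_cron.py": "Invalid Cron expression: ...",
    "/dags/test_zip_invalid_cron.zip": "Invalid Cron expression: ...",
    "/dags/other_dag.py": "some unrelated import error",
}

invalid_dag_files = ["test_invalid_cron.py", "test_zip_invalid_cron.zip"]

for d in invalid_dag_files:
    # Passing a generator expression (no square brackets) lets any()
    # short-circuit on the first match instead of materialising the
    # whole list first -- the reviewer's second point.
    assert any(d in k and "Invalid Cron expression" in v
               for k, v in import_errors.items())
```

With `n` files and `m` recorded errors this is the same O(n*m) worst case, but the generator avoids building the intermediate list and stops at the first hit.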



