[GitHub] [airflow] aneesh-joseph opened a new pull request #9628: fix static checks

2020-07-02 Thread GitBox


aneesh-joseph opened a new pull request #9628:
URL: https://github.com/apache/airflow/pull/9628


   ---
   Make sure to mark the boxes below before creating PR: [x]
   
   - [ ] Description above provides context of the change
   - [ ] Unit tests coverage for changes (not needed for documentation changes)
   - [ ] Target Github ISSUE in description if exists
   - [ ] Commits follow "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)"
   - [ ] Relevant documentation is updated including usage instructions.
   - [ ] I will engage committers as explained in [Contribution Workflow Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example).
   
   ---
   In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
   Read the [Pull Request Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines) for more information.
   
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] turbaszek commented on pull request #9593: Improve handling Dataproc cluster creation with ERROR state

2020-07-02 Thread GitBox


turbaszek commented on pull request #9593:
URL: https://github.com/apache/airflow/pull/9593#issuecomment-652954399


   @dossett I've added handling of this case







[GitHub] [airflow] potiuk commented on issue #9627: MySQL support in Helm airflow database config

2020-07-02 Thread GitBox


potiuk commented on issue #9627:
URL: https://github.com/apache/airflow/issues/9627#issuecomment-652966722


   Ach sorry. CC: @schnie (you are both just one line apart when typing "Greg").







[GitHub] [airflow] albertocalderari commented on a change in pull request #9590: Improve idempotency of BigQueryInsertJobOperator

2020-07-02 Thread GitBox


albertocalderari commented on a change in pull request #9590:
URL: https://github.com/apache/airflow/pull/9590#discussion_r449065871



##
File path: airflow/providers/google/cloud/operators/bigquery.py
##
@@ -1692,32 +1692,52 @@ def prepare_template(self) -> None:
         with open(self.configuration, 'r') as file:
             self.configuration = json.loads(file.read())
 
+    def _submit_job(self, hook: BigQueryHook, job_id: str):
+        # Submit a new job
+        job = hook.insert_job(
+            configuration=self.configuration,
+            project_id=self.project_id,
+            location=self.location,
+            job_id=job_id,
+        )
+        # Start the job and wait for it to complete and get the result.
+        job.result()
+        return job
+
     def execute(self, context: Any):
         hook = BigQueryHook(
             gcp_conn_id=self.gcp_conn_id,
             delegate_to=self.delegate_to,
         )
 
-        job_id = self.job_id or f"airflow_{self.task_id}_{int(time())}"
+        exec_date = context['execution_date'].isoformat()
+        job_id = self.job_id or f"airflow_{self.dag_id}_{self.task_id}_{exec_date}"
+
         try:
-            job = hook.insert_job(
-                configuration=self.configuration,
-                project_id=self.project_id,
-                location=self.location,
-                job_id=job_id,
-            )
-            # Start the job and wait for it to complete and get the result.
-            job.result()
+            # Submit a new job
+            job = self._submit_job(hook, job_id)
         except Conflict:
+            # If the job already exists retrieve it
             job = hook.get_job(
                 project_id=self.project_id,
                 location=self.location,
                 job_id=job_id,
             )
-            # Get existing job and wait for it to be ready
-            for time_to_wait in exponential_sleep_generator(initial=10, maximum=120):
-                sleep(time_to_wait)
-                job.reload()
-                if job.done():
-                    break
+
+            if job.done() and job.error_result:
+                # The job exists and finished with an error and we are probably rerunning it
+                # So we have to make a new job_id because it has to be unique
+                job_id = f"{self.job_id}_{int(time())}"
+                job = self._submit_job(hook, job_id)
+            elif not job.done():
+                # The job is still running so wait for it to be ready
+                for time_to_wait in exponential_sleep_generator(initial=10, maximum=120):

Review comment:
   When I hit `result()` it polls too.
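
   The deterministic job-ID scheme in the diff above can be sketched in isolation. Everything below is an illustrative stand-in, not the operator's actual code; the sanitisation step reflects BigQuery's documented job-ID charset (letters, digits, underscores, dashes):

   ```python
   from datetime import datetime
   from typing import Optional


   def build_job_id(dag_id: str, task_id: str, execution_date: datetime,
                    explicit_id: Optional[str] = None) -> str:
       """Build a job ID that is stable across retries of the same task
       instance, so re-running the task hits a 409 Conflict on the existing
       job instead of silently starting a duplicate one."""
       if explicit_id:
           return explicit_id
       # BigQuery job IDs only allow letters, digits, '_' and '-', so the
       # ':' and '+' in an ISO-8601 timestamp must be replaced.
       exec_date = execution_date.isoformat().replace(":", "-").replace("+", "-")
       return f"airflow_{dag_id}_{task_id}_{exec_date}"
   ```

   Because the ID contains no wall-clock component, a retry of the same task instance recomputes exactly the same string.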









[GitHub] [airflow] potiuk commented on pull request #9623: Move ElasticsearchTaskHandler to the provider package

2020-07-02 Thread GitBox


potiuk commented on pull request #9623:
URL: https://github.com/apache/airflow/pull/9623#issuecomment-653059073


   If we want to support ExternalLoggingMixin we should also backport it to 1.10, but this might be a bit more complex - using Bowler refactoring (@turbaszek WDYT?)
   







[GitHub] [airflow] albertocalderari commented on a change in pull request #9590: Improve idempotency of BigQueryInsertJobOperator

2020-07-02 Thread GitBox


albertocalderari commented on a change in pull request #9590:
URL: https://github.com/apache/airflow/pull/9590#discussion_r449065662



##
File path: airflow/providers/google/cloud/operators/bigquery.py
##
@@ -1692,32 +1692,52 @@ def prepare_template(self) -> None:
         with open(self.configuration, 'r') as file:
             self.configuration = json.loads(file.read())
 
+    def _submit_job(self, hook: BigQueryHook, job_id: str):
+        # Submit a new job
+        job = hook.insert_job(
+            configuration=self.configuration,
+            project_id=self.project_id,
+            location=self.location,
+            job_id=job_id,
+        )
+        # Start the job and wait for it to complete and get the result.
+        job.result()
+        return job
+
     def execute(self, context: Any):
         hook = BigQueryHook(
             gcp_conn_id=self.gcp_conn_id,
             delegate_to=self.delegate_to,
         )
 
-        job_id = self.job_id or f"airflow_{self.task_id}_{int(time())}"
+        exec_date = context['execution_date'].isoformat()
+        job_id = self.job_id or f"airflow_{self.dag_id}_{self.task_id}_{exec_date}"
+
         try:
-            job = hook.insert_job(
-                configuration=self.configuration,
-                project_id=self.project_id,
-                location=self.location,
-                job_id=job_id,
-            )
-            # Start the job and wait for it to complete and get the result.
-            job.result()
+            # Submit a new job
+            job = self._submit_job(hook, job_id)
         except Conflict:
+            # If the job already exists retrieve it
             job = hook.get_job(
                 project_id=self.project_id,
                 location=self.location,
                 job_id=job_id,
             )
-            # Get existing job and wait for it to be ready
-            for time_to_wait in exponential_sleep_generator(initial=10, maximum=120):
-                sleep(time_to_wait)
-                job.reload()
-                if job.done():
-                    break
+
+            if job.done() and job.error_result:
+                # The job exists and finished with an error and we are probably rerunning it
+                # So we have to make a new job_id because it has to be unique
+                job_id = f"{self.job_id}_{int(time())}"
+                job = self._submit_job(hook, job_id)
+            elif not job.done():
+                # The job is still running so wait for it to be ready
+                for time_to_wait in exponential_sleep_generator(initial=10, maximum=120):

Review comment:
   ```python
   def bq_load(config: Config, client: bigquery.Client, event: Event, source_uri: str) -> JobResultEvent:
       logger.info("Loading GSheet to BQ")
       job_config = LoadJobConfig(
           autodetect=True,
           create_disposition="CREATE_IF_NEEDED",
           write_disposition="WRITE_TRUNCATE",
           source_format="NEWLINE_DELIMITED_JSON"
       )
       logger.info("loading_table")
       table_name = f"{config.project_id}.{config.dataset_name}.{event.destination_table}"
       job: LoadJob = client.load_table_from_uri("gs://" + source_uri, table_name, job_config=job_config)
       try:
           job.result()
           result_event = JobResultEvent.from_job_result_and_event(job, event)
       except (GoogleCloudError, TimeoutError, ClientError):
           result_event = JobResultEvent.from_job_result_and_event(job, event)
           logger.error("BQ Job Failed")
       return result_event
   ```
   That's one example of how I do it in an app.









[GitHub] [airflow] albertocalderari commented on a change in pull request #9590: Improve idempotency of BigQueryInsertJobOperator

2020-07-02 Thread GitBox


albertocalderari commented on a change in pull request #9590:
URL: https://github.com/apache/airflow/pull/9590#discussion_r449076100



##
File path: airflow/providers/google/cloud/operators/bigquery.py
##
@@ -1692,32 +1692,52 @@ def prepare_template(self) -> None:
         with open(self.configuration, 'r') as file:
             self.configuration = json.loads(file.read())
 
+    def _submit_job(self, hook: BigQueryHook, job_id: str):
+        # Submit a new job
+        job = hook.insert_job(
+            configuration=self.configuration,
+            project_id=self.project_id,
+            location=self.location,
+            job_id=job_id,
+        )
+        # Start the job and wait for it to complete and get the result.
+        job.result()
+        return job
+
     def execute(self, context: Any):
         hook = BigQueryHook(
             gcp_conn_id=self.gcp_conn_id,
             delegate_to=self.delegate_to,
         )
 
-        job_id = self.job_id or f"airflow_{self.task_id}_{int(time())}"
+        exec_date = context['execution_date'].isoformat()
+        job_id = self.job_id or f"airflow_{self.dag_id}_{self.task_id}_{exec_date}"
+
         try:
-            job = hook.insert_job(
-                configuration=self.configuration,
-                project_id=self.project_id,
-                location=self.location,
-                job_id=job_id,
-            )
-            # Start the job and wait for it to complete and get the result.
-            job.result()
+            # Submit a new job
+            job = self._submit_job(hook, job_id)
         except Conflict:
+            # If the job already exists retrieve it
             job = hook.get_job(
                 project_id=self.project_id,
                 location=self.location,
                 job_id=job_id,
             )
-            # Get existing job and wait for it to be ready
-            for time_to_wait in exponential_sleep_generator(initial=10, maximum=120):
-                sleep(time_to_wait)
-                job.reload()
-                if job.done():
-                    break
+
+            if job.done() and job.error_result:
+                # The job exists and finished with an error and we are probably rerunning it
+                # So we have to make a new job_id because it has to be unique
+                job_id = f"{self.job_id}_{int(time())}"

Review comment:
   I see - yet you won't be able to re-poll for this job, since it uses the current time, which is not reproducible on an "eventual" next run.
   Still, it is much better than what is there now.
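
   The objection can be demonstrated in a few lines: a wall-clock suffix is unique but not re-derivable, while, say, a try-number suffix (a hypothetical alternative, not what this PR does) can be recomputed on the next poll attempt:

   ```python
   import time


   def timestamped_job_id(base: str) -> str:
       # Unique on every call, but a later run cannot reconstruct it
       # to poll for the job it belongs to.
       return f"{base}_{int(time.time() * 1000)}"


   def attempted_job_id(base: str, try_number: int) -> str:
       # The same (base, try_number) pair always yields the same ID,
       # so a rescheduled task can find its earlier submission.
       return f"{base}_attempt_{try_number}"
   ```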









[GitHub] [airflow] Dewsmen commented on issue #8557: Count SKIPPED as SUCCESS if wait_for_downstream=True

2020-07-02 Thread GitBox


Dewsmen commented on issue #8557:
URL: https://github.com/apache/airflow/issues/8557#issuecomment-652951751


   The issue was fixed and merged in [#7735](https://github.com/apache/airflow/pull/7735).







[GitHub] [airflow] Dewsmen closed issue #8557: Count SKIPPED as SUCCESS if wait_for_downstream=True

2020-07-02 Thread GitBox


Dewsmen closed issue #8557:
URL: https://github.com/apache/airflow/issues/8557


   







[GitHub] [airflow] boring-cyborg[bot] commented on pull request #9628: fix PR checks

2020-07-02 Thread GitBox


boring-cyborg[bot] commented on pull request #9628:
URL: https://github.com/apache/airflow/pull/9628#issuecomment-653020698


   Awesome work, congrats on your first merged pull request!
   







[airflow] branch master updated (ee20086 -> 611d449)

2020-07-02 Thread turbaszek
This is an automated email from the ASF dual-hosted git repository.

turbaszek pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/airflow.git.


from ee20086  Move S3TaskHandler to the AWS provider package (#9602)
 add 611d449  Use supports_read instead of is_supported in log endpoint (#9628)

No new revisions were added by this update.

Summary of changes:
 airflow/api_connexion/endpoints/log_endpoint.py| 2 +-
 tests/api_connexion/endpoints/test_log_endpoint.py | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)



svn commit: r40270 - /dev/airflow/1.10.11rc1/

2020-07-02 Thread kaxilnaik
Author: kaxilnaik
Date: Thu Jul  2 14:43:49 2020
New Revision: 40270

Log:
Add artifacts for Airflow 1.10.11rc1

Added:
dev/airflow/1.10.11rc1/
dev/airflow/1.10.11rc1/apache-airflow-1.10.11rc1-bin.tar.gz   (with props)
dev/airflow/1.10.11rc1/apache-airflow-1.10.11rc1-bin.tar.gz.asc
dev/airflow/1.10.11rc1/apache-airflow-1.10.11rc1-bin.tar.gz.sha512
dev/airflow/1.10.11rc1/apache-airflow-1.10.11rc1-source.tar.gz   (with props)
dev/airflow/1.10.11rc1/apache-airflow-1.10.11rc1-source.tar.gz.asc
dev/airflow/1.10.11rc1/apache-airflow-1.10.11rc1-source.tar.gz.sha512
dev/airflow/1.10.11rc1/apache_airflow-1.10.11rc1-py2.py3-none-any.whl   (with props)
dev/airflow/1.10.11rc1/apache_airflow-1.10.11rc1-py2.py3-none-any.whl.asc
dev/airflow/1.10.11rc1/apache_airflow-1.10.11rc1-py2.py3-none-any.whl.sha512

Added: dev/airflow/1.10.11rc1/apache-airflow-1.10.11rc1-bin.tar.gz
==
Binary file - no diff available.

Propchange: dev/airflow/1.10.11rc1/apache-airflow-1.10.11rc1-bin.tar.gz
--
svn:mime-type = application/octet-stream

Added: dev/airflow/1.10.11rc1/apache-airflow-1.10.11rc1-bin.tar.gz.asc
==
--- dev/airflow/1.10.11rc1/apache-airflow-1.10.11rc1-bin.tar.gz.asc (added)
+++ dev/airflow/1.10.11rc1/apache-airflow-1.10.11rc1-bin.tar.gz.asc Thu Jul  2 14:43:49 2020
@@ -0,0 +1,11 @@
+-BEGIN PGP SIGNATURE-
+
+iQEzBAABCAAdFiEEEnF1VgQO7y7q8bnCdfzNCiX6DksFAl798HMACgkQdfzNCiX6
+Dktsagf/Tq6SH4KyouwL9EuNMKQirkFg1HUDRubZS4xFKB4AmXtw/fuqTmwFun0E
+b00tmfIwVRaVRyC6sYx4OUp8MPrMklf7xugwpr0phd0244jcZcwsclm+W0oRkFen
+q8I0f+51Gjt1+NIUOrwS+HQxPQmUwdU8xvEXXTLN9hdijrixUBlEM7iV7zv2OFZy
+C+2/IDzxO0cP/YSwwOiqtynm03WI7skmBtGjBeEi7YU6+FiFnpKj+I+GZea8pmCw
+QToyrji07b1OH+6XE0xsLp1X1kwZYvWGCQMK1HWS2CMyQ3jMbp+lDAVLrpJ8BWPf
+IY3jW6w70aSzFaNXonQFSx9CSGbLdQ==
+=Dy08
+-END PGP SIGNATURE-

Added: dev/airflow/1.10.11rc1/apache-airflow-1.10.11rc1-bin.tar.gz.sha512
==
--- dev/airflow/1.10.11rc1/apache-airflow-1.10.11rc1-bin.tar.gz.sha512 (added)
+++ dev/airflow/1.10.11rc1/apache-airflow-1.10.11rc1-bin.tar.gz.sha512 Thu Jul  2 14:43:49 2020
@@ -0,0 +1,4 @@
+apache-airflow-1.10.11rc1-bin.tar.gz: DDE50E9D 01513C8F D033F2C5 4A54AB83
+  94DB2E00 673F83CE F154F0D3 01284256
+  4E04ACDD 249F5F1D 505D3676 56B3EB74
+  001C2B5C B429431F 2D298DC8 DD9C18F9

Added: dev/airflow/1.10.11rc1/apache-airflow-1.10.11rc1-source.tar.gz
==
Binary file - no diff available.

Propchange: dev/airflow/1.10.11rc1/apache-airflow-1.10.11rc1-source.tar.gz
--
svn:mime-type = application/octet-stream

Added: dev/airflow/1.10.11rc1/apache-airflow-1.10.11rc1-source.tar.gz.asc
==
--- dev/airflow/1.10.11rc1/apache-airflow-1.10.11rc1-source.tar.gz.asc (added)
+++ dev/airflow/1.10.11rc1/apache-airflow-1.10.11rc1-source.tar.gz.asc Thu Jul  2 14:43:49 2020
@@ -0,0 +1,11 @@
+-BEGIN PGP SIGNATURE-
+
+iQEzBAABCAAdFiEEEnF1VgQO7y7q8bnCdfzNCiX6DksFAl798G0ACgkQdfzNCiX6
+Dkvd6Af/QgtMu+Alj/aysZhLJHsZBKJ7EJJ40cmotrbRLuKtmpx0MXw37Vyb9R9b
+f9fiCVvRVNhDqBE+B9FdkXlGrmmC3z1veB+i3HCcTmC3MT1IxtZqAb020DGbMG02
+OueXILrkisAHLo+FH0anHSARzqoW1UaN/H1fPWSzQVz0yfkeRL11bsgeqLzYzORX
+IXzI9y6M85mdTyeTxGhn0CuzXctOUgXKgFkJ+fdIIhRZ2PWs/Q1+vsvngI8cu9QJ
+kEtAL9Y27keRbLmCC+w7Ps3ns3DeBkjEXQ6/cE72+ce/DZrg7sApfnXAd+SWLtX2
+w2wCswqYW1mUCFQRZUngMpkYLQFgoQ==
+=ytP8
+-END PGP SIGNATURE-

Added: dev/airflow/1.10.11rc1/apache-airflow-1.10.11rc1-source.tar.gz.sha512
==
--- dev/airflow/1.10.11rc1/apache-airflow-1.10.11rc1-source.tar.gz.sha512 (added)
+++ dev/airflow/1.10.11rc1/apache-airflow-1.10.11rc1-source.tar.gz.sha512 Thu Jul  2 14:43:49 2020
@@ -0,0 +1,3 @@
+apache-airflow-1.10.11rc1-source.tar.gz: 
+877F7562 7EA81ABF 17C97958 5A272147 BB756FBE 83518A81 14353207 42082370 6504A88E
+ F07DCE3C 61FBAAFB B1C6C747 E69D04F2 DA092F99 4C020BCC 9083FEDE

Added: dev/airflow/1.10.11rc1/apache_airflow-1.10.11rc1-py2.py3-none-any.whl
==
Binary file - no diff available.

Propchange: dev/airflow/1.10.11rc1/apache_airflow-1.10.11rc1-py2.py3-none-any.whl
--
svn:mime-type = application/octet-stream

Added: dev/airflow/1.10.11rc1/apache_airflow-1.10.11rc1-py2.py3-none-any.whl.asc

[GitHub] [airflow] dossett commented on a change in pull request #9593: Improve handling Dataproc cluster creation with ERROR state

2020-07-02 Thread GitBox


dossett commented on a change in pull request #9593:
URL: https://github.com/apache/airflow/pull/9593#discussion_r449067489



##
File path: airflow/providers/google/cloud/operators/dataproc.py
##
@@ -502,32 +506,79 @@ def __init__(self,
         self.timeout = timeout
         self.metadata = metadata
         self.gcp_conn_id = gcp_conn_id
+        self.delete_on_error = delete_on_error
+
+    def _create_cluster(self, hook):
+        operation = hook.create_cluster(
+            project_id=self.project_id,
+            region=self.region,
+            cluster=self.cluster,
+            request_id=self.request_id,
+            retry=self.retry,
+            timeout=self.timeout,
+            metadata=self.metadata,
+        )
+        cluster = operation.result()
+        self.log.info("Cluster created.")
+        return cluster
+
+    def _delete_cluster(self, hook):
+        self.log.info("Deleting the cluster")
+        hook.delete_cluster(
+            region=self.region,
+            cluster_name=self.cluster_name,
+            project_id=self.project_id,
+        )
+        self.log.info("Cluster %s deleted", self.cluster_name)

Review comment:
   The ERROR cluster will eventually be lifecycled by Dataproc, but in my experience that often takes longer (20-30 minutes) than typical retry counts / retry waits will cover.









[GitHub] [airflow] ephraimbuddy commented on pull request #9623: Move ElasticsearchTaskHandler to the provider package

2020-07-02 Thread GitBox


ephraimbuddy commented on pull request #9623:
URL: https://github.com/apache/airflow/pull/9623#issuecomment-653060319


   @potiuk Thanks!







[GitHub] [airflow] olchas commented on a change in pull request #9593: Improve handling Dataproc cluster creation with ERROR state

2020-07-02 Thread GitBox


olchas commented on a change in pull request #9593:
URL: https://github.com/apache/airflow/pull/9593#discussion_r449082822



##
File path: airflow/providers/google/cloud/operators/dataproc.py
##
@@ -502,32 +506,79 @@ def __init__(self,
         self.timeout = timeout
         self.metadata = metadata
         self.gcp_conn_id = gcp_conn_id
+        self.delete_on_error = delete_on_error
+
+    def _create_cluster(self, hook):
+        operation = hook.create_cluster(
+            project_id=self.project_id,
+            region=self.region,
+            cluster=self.cluster,
+            request_id=self.request_id,
+            retry=self.retry,
+            timeout=self.timeout,
+            metadata=self.metadata,
+        )
+        cluster = operation.result()
+        self.log.info("Cluster created.")
+        return cluster
+
+    def _delete_cluster(self, hook):
+        self.log.info("Deleting the cluster")
+        hook.delete_cluster(
+            region=self.region,
+            cluster_name=self.cluster_name,
+            project_id=self.project_id,
+        )
+        self.log.info("Cluster %s deleted", self.cluster_name)

Review comment:
   But I meant to raise an exception **after** the cluster is deleted. Then, as far as I understand, on retry we would follow the logic of the cluster existing but being in 'DELETING' state, so we would get another chance to create it.
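
   A rough sketch of that suggestion, with a stub hook and a made-up exception class (the real operator would raise Airflow's own exception types): delete the broken cluster, then raise so the task fails and its retry goes through the normal creation path again.

   ```python
   class ClusterCreateError(Exception):
       """Raised after cleaning up a cluster that came up in ERROR state."""


   def handle_error_state(hook, cluster_name, delete_on_error=True):
       # Grab diagnostics first, so the failure message points somewhere useful.
       gcs_uri = hook.diagnose_cluster(cluster_name)
       if delete_on_error:
           hook.delete_cluster(cluster_name)
           # Raising here fails the task; on retry the cluster is either gone
           # or still DELETING, and either branch allows a fresh creation.
           raise ClusterCreateError(
               f"Cluster {cluster_name} was in ERROR state, diagnostics at {gcs_uri}"
           )
   ```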









[GitHub] [airflow] turbaszek commented on pull request #9624: Move StackdriverTaskHandler to the provider package

2020-07-02 Thread GitBox


turbaszek commented on pull request #9624:
URL: https://github.com/apache/airflow/pull/9624#issuecomment-653009381


   @ephraimbuddy can you please take a look at the CI errors?







[GitHub] [airflow] turbaszek commented on pull request #9623: Move ElasticsearchTaskHandler to the provider package

2020-07-02 Thread GitBox


turbaszek commented on pull request #9623:
URL: https://github.com/apache/airflow/pull/9623#issuecomment-653009423


   @ephraimbuddy can you please take a look at the CI errors?







[GitHub] [airflow] turbaszek merged pull request #9628: fix PR checks

2020-07-02 Thread GitBox


turbaszek merged pull request #9628:
URL: https://github.com/apache/airflow/pull/9628


   







[GitHub] [airflow] olchas commented on a change in pull request #9593: Improve handling Dataproc cluster creation with ERROR state

2020-07-02 Thread GitBox


olchas commented on a change in pull request #9593:
URL: https://github.com/apache/airflow/pull/9593#discussion_r449056748



##
File path: airflow/providers/google/cloud/operators/dataproc.py
##
@@ -502,32 +506,79 @@ def __init__(self,
         self.timeout = timeout
         self.metadata = metadata
         self.gcp_conn_id = gcp_conn_id
+        self.delete_on_error = delete_on_error
+
+    def _create_cluster(self, hook):
+        operation = hook.create_cluster(
+            project_id=self.project_id,
+            region=self.region,
+            cluster=self.cluster,
+            request_id=self.request_id,
+            retry=self.retry,
+            timeout=self.timeout,
+            metadata=self.metadata,
+        )
+        cluster = operation.result()
+        self.log.info("Cluster created.")
+        return cluster
+
+    def _delete_cluster(self, hook):
+        self.log.info("Deleting the cluster")
+        hook.delete_cluster(
+            region=self.region,
+            cluster_name=self.cluster_name,
+            project_id=self.project_id,
+        )
+        self.log.info("Cluster %s deleted", self.cluster_name)
+
+    def _get_cluster(self, hook):
+        return hook.get_cluster(
+            project_id=self.project_id,
+            region=self.region,
+            cluster_name=self.cluster_name,
+            retry=self.retry,
+            timeout=self.timeout,
+            metadata=self.metadata,
+        )
+
+    def _handle_error_state(self, hook):
+        self.log.info("Cluster is in ERROR state")
+        gcs_uri = hook.diagnose_cluster(
+            region=self.region,
+            cluster_name=self.cluster_name,
+            project_id=self.project_id,
+        )
+        self.log.info(
+            'Diagnostic information for cluster %s available at: %s',
+            self.cluster_name, gcs_uri
+        )
+        if self.delete_on_error:
+            self._delete_cluster(hook)
 
     def execute(self, context):
         self.log.info('Creating cluster: %s', self.cluster_name)
         hook = DataprocHook(gcp_conn_id=self.gcp_conn_id)
         try:
-            operation = hook.create_cluster(
-                project_id=self.project_id,
-                region=self.region,
-                cluster=self.cluster,
-                request_id=self.request_id,
-                retry=self.retry,
-                timeout=self.timeout,
-                metadata=self.metadata,
-            )
-            cluster = operation.result()
-            self.log.info("Cluster created.")
+            cluster = self._create_cluster(hook)
         except AlreadyExists:
-            cluster = hook.get_cluster(
-                project_id=self.project_id,
-                region=self.region,
-                cluster_name=self.cluster_name,
-                retry=self.retry,
-                timeout=self.timeout,
-                metadata=self.metadata,
-            )
             self.log.info("Cluster already exists.")
+            cluster = self._get_cluster(hook)
+
+        if cluster.status.state == cluster.status.ERROR:
+            self._handle_error_state(hook)
+        elif cluster.status.state == cluster.status.DELETING:
+            # Wait for cluster to delete
+            for time_to_sleep in exponential_sleep_generator(initial=10, maximum=120):

Review comment:
   I know that we are waiting here for an operation that is supposed to finish within a finite amount of time, but what do you think about adding a maximum amount of total sleep before raising an exception?
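
   One way to put a ceiling on the total wait, sketched independently of Airflow's `exponential_sleep_generator` (names and defaults here are illustrative):

   ```python
   import time


   def wait_until(predicate, initial=1, maximum=8, total_timeout=60,
                  clock=time.monotonic, sleeper=time.sleep):
       """Poll `predicate` with capped exponential backoff, raising
       TimeoutError once the total wait budget is exhausted."""
       deadline = clock() + total_timeout
       delay = initial
       while True:
           if predicate():
               return
           remaining = deadline - clock()
           if remaining <= 0:
               raise TimeoutError("operation did not finish in time")
           # Never sleep past the deadline; double the delay up to `maximum`.
           sleeper(min(delay, remaining))
           delay = min(delay * 2, maximum)
   ```

   The injectable `clock` and `sleeper` keep the sketch testable without real waiting.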









[GitHub] [airflow] potiuk commented on pull request #9623: Move ElasticsearchTaskHandler to the provider package

2020-07-02 Thread GitBox


potiuk commented on pull request #9623:
URL: https://github.com/apache/airflow/pull/9623#issuecomment-653056290


   Look at the logs @ephraimbuddy -> the "ExternalLoggingMixin" is a bit strange, as I cannot see it in the logging_mixin module. Where is it from?
   
   
   Traceback (most recent call last):
     File "/import_all_provider_classes.py", line 61, in import_all_provider_classes
       _module = importlib.import_module(module_name)
     File "/usr/local/lib/python3.6/importlib/__init__.py", line 126, in import_module
       return _bootstrap._gcd_import(name[level:], package, level)
     File "<frozen importlib._bootstrap>", line 994, in _gcd_import
     File "<frozen importlib._bootstrap>", line 971, in _find_and_load
     File "<frozen importlib._bootstrap>", line 955, in _find_and_load_unlocked
     File "<frozen importlib._bootstrap>", line 665, in _load_unlocked
     File "<frozen importlib._bootstrap_external>", line 678, in exec_module
     File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
     File "/usr/local/lib/python3.6/site-packages/airflow/providers/elasticsearch/log/es_task_handler.py", line 34, in <module>
       from airflow.utils.log.logging_mixin import ExternalLoggingMixin, LoggingMixin
   ImportError: cannot import name 'ExternalLoggingMixin'
   
   
 ERROR ENCOUNTERED!
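
   When a symbol only exists on newer versions of a dependency, one common guard is a fallback import, sketched here with a placeholder module name (this is the generic pattern, not a claim about where ExternalLoggingMixin should live):

   ```python
   try:
       # Hypothetical module that may or may not provide the newer symbol.
       from optional_dependency import ExtraMixin  # type: ignore
   except ImportError:
       class ExtraMixin:
           """No-op stand-in so importing this module never hard-fails
           on versions that predate the real mixin."""
   ```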
   
   







[GitHub] [airflow] turbaszek commented on a change in pull request #9277: [WIP] Health endpoint spec

2020-07-02 Thread GitBox


turbaszek commented on a change in pull request #9277:
URL: https://github.com/apache/airflow/pull/9277#discussion_r448975124



##
File path: airflow/api_connexion/endpoints/health_endpoint.py
##
@@ -14,13 +14,35 @@
 # KIND, either express or implied.  See the License for the
 # specific language governing permissions and limitations
 # under the License.
-
-# TODO(mik-laj): We have to implement it.
-# Do you want to help? Please look at: https://github.com/apache/airflow/issues/8144
+from airflow.api_connexion.schemas.health_schema import health_schema
+from airflow.jobs.scheduler_job import SchedulerJob
 
 
 def get_health():
     """
-    Checks if the API works
+    Return the health of the airflow scheduler and metadatabase
     """
-    return "OK"
+    HEALTHY = "healthy"  # pylint: disable=invalid-name
+    UNHEALTHY = "unhealthy"  # pylint: disable=invalid-name

Review comment:
   If you move those out of the function, there will be no need to disable pylint :)
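
   The point being made: constants at module scope already match pylint's UPPER_CASE naming convention, so the disables become unnecessary. A simplified sketch (the real endpoint checks the latest SchedulerJob heartbeat against the metadatabase):

   ```python
   HEALTHY = "healthy"
   UNHEALTHY = "unhealthy"


   def get_health(scheduler_alive: bool) -> dict:
       # Stand-in for the endpoint logic: the metadatabase is considered
       # healthy if this code runs at all, the scheduler only if its most
       # recent heartbeat is fresh enough.
       scheduler_status = HEALTHY if scheduler_alive else UNHEALTHY
       return {
           "metadatabase": {"status": HEALTHY},
           "scheduler": {"status": scheduler_status},
       }
   ```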









[GitHub] [airflow] turbaszek commented on pull request #9616: local job heartbeat callback should use session from provide_session

2020-07-02 Thread GitBox


turbaszek commented on pull request #9616:
URL: https://github.com/apache/airflow/pull/9616#issuecomment-653012832


   @pingzh can you please take a look at the failing tests? 







[GitHub] [airflow] kaxil commented on pull request #8992: [AIRFLOW-5391] Do not re-run skipped tasks when they are cleared (#7276)

2020-07-02 Thread GitBox


kaxil commented on pull request #8992:
URL: https://github.com/apache/airflow/pull/8992#issuecomment-653022789


   Hi @yuqian90, apologies, this won't make it into 1.10.11 as the LatestOnlyOperator change alters behaviour. Is it possible for you to achieve this without changing the behaviour (referring to the note in UPDATING.md), or by adding a flag to choose between the old and new behaviour (defaulting to the old)?
   
   
   
   







[GitHub] [airflow] turbaszek commented on pull request #9604: Move CloudwatchTaskHandler to the provider package

2020-07-02 Thread GitBox


turbaszek commented on pull request #9604:
URL: https://github.com/apache/airflow/pull/9604#issuecomment-653022472


   @ephraimbuddy please rebase, I've just merged this fix in #9628 







[GitHub] [airflow] boring-cyborg[bot] commented on pull request #9629: Updated link to official documentation

2020-07-02 Thread GitBox


boring-cyborg[bot] commented on pull request #9629:
URL: https://github.com/apache/airflow/pull/9629#issuecomment-653036293


   Awesome work, congrats on your first merged pull request!
   







[GitHub] [airflow] turbaszek merged pull request #9629: Updated link to official documentation

2020-07-02 Thread GitBox


turbaszek merged pull request #9629:
URL: https://github.com/apache/airflow/pull/9629


   







[airflow] branch master updated (611d449 -> 37ca8ad)

2020-07-02 Thread turbaszek
This is an automated email from the ASF dual-hosted git repository.

turbaszek pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/airflow.git.


from 611d449  Use supports_read instead of is_supported in log endpoint 
(#9628)
 add 37ca8ad  Updated link to official documentation (#9629)

No new revisions were added by this update.

Summary of changes:
 docs/project.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)



[GitHub] [airflow] ephraimbuddy commented on pull request #9623: Move ElasticsearchTaskHandler to the provider package

2020-07-02 Thread GitBox


ephraimbuddy commented on pull request #9623:
URL: https://github.com/apache/airflow/pull/9623#issuecomment-653044347


   Hi @turbaszek, what could be causing the backport packages CI build 
error? I can't seem to figure it out







[GitHub] [airflow] turbaszek opened a new pull request #9631: Add function to get current context

2020-07-02 Thread GitBox


turbaszek opened a new pull request #9631:
URL: https://github.com/apache/airflow/pull/9631


   Support for getting the current context at any code location that runs under 
the scope of the BaseOperator.execute function. This functionality is part of AIP-31.
   
   closes: #8058 
   
   Work based on @jonathanshir PR #8651 
   
   ---
   Make sure to mark the boxes below before creating PR: [x]
   
   - [x] Description above provides context of the change
   - [ ] Unit tests coverage for changes (not needed for documentation changes)
   - [x] Target Github ISSUE in description if exists
   - [x] Commits follow "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)"
   - [ ] Relevant documentation is updated including usage instructions.
   - [x] I will engage committers as explained in [Contribution Workflow 
Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example).
   
   ---
   In case of fundamental code change, Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals))
 is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party 
License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in 
[UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
   Read the [Pull Request 
Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines)
 for more information.
   







[GitHub] [airflow] olchas commented on a change in pull request #9593: Improve handling Dataproc cluster creation with ERROR state

2020-07-02 Thread GitBox


olchas commented on a change in pull request #9593:
URL: https://github.com/apache/airflow/pull/9593#discussion_r449062517



##
File path: airflow/providers/google/cloud/operators/dataproc.py
##
@@ -437,6 +437,9 @@ class DataprocCreateClusterOperator(BaseOperator):
 :type project_id: str
 :param region: leave as 'global', might become relevant in the future. 
(templated)
 :type region: str
+:parm delete_on_error: If true the claster will be deleted if created with 
ERROR state. Default

Review comment:
   ```suggestion
   :param delete_on_error: If true the cluster will be deleted if created 
with ERROR state. Default
   ```









[GitHub] [airflow] olchas commented on a change in pull request #9593: Improve handling Dataproc cluster creation with ERROR state

2020-07-02 Thread GitBox


olchas commented on a change in pull request #9593:
URL: https://github.com/apache/airflow/pull/9593#discussion_r449062072



##
File path: airflow/providers/google/cloud/operators/dataproc.py
##
@@ -502,32 +506,79 @@ def __init__(self,
 self.timeout = timeout
 self.metadata = metadata
 self.gcp_conn_id = gcp_conn_id
+self.delete_on_error = delete_on_error
+
+def _create_cluster(self, hook):
+operation = hook.create_cluster(
+project_id=self.project_id,
+region=self.region,
+cluster=self.cluster,
+request_id=self.request_id,
+retry=self.retry,
+timeout=self.timeout,
+metadata=self.metadata,
+)
+cluster = operation.result()
+self.log.info("Cluster created.")
+return cluster
+
+def _delete_cluster(self, hook):
+self.log.info("Deleting the cluster")
+hook.delete_cluster(
+region=self.region,
+cluster_name=self.cluster_name,
+project_id=self.project_id,
+)
+self.log.info("Cluster %s deleted", self.cluster_name)

Review comment:
   I wonder if it would not be a good idea to raise an exception here. It 
just seems weird to me that an operator that is supposed to create a cluster 
ends up deleting one instead and even returns a reference to no-longer-existing 
cluster. Raising an exception would also allow a second attempt to create the 
cluster on retry (if retries apply, of course).









[GitHub] [airflow] potiuk commented on pull request #9628: fix PR checks

2020-07-02 Thread GitBox


potiuk commented on pull request #9628:
URL: https://github.com/apache/airflow/pull/9628#issuecomment-652965696


   I think this was a transient error - rerunning it







[GitHub] [airflow] OmairK commented on a change in pull request #9277: [WIP] Health endpoint spec

2020-07-02 Thread GitBox


OmairK commented on a change in pull request #9277:
URL: https://github.com/apache/airflow/pull/9277#discussion_r448972246



##
File path: airflow/api_connexion/endpoints/health_endpoint.py
##
@@ -14,13 +14,34 @@
 # KIND, either express or implied.  See the License for the
 # specific language governing permissions and limitations
 # under the License.
-
-# TODO(mik-laj): We have to implement it.
-# Do you want to help? Please look at: 
https://github.com/apache/airflow/issues/8144
+from airflow.api_connexion.schemas.health_schema import health_schema
+from airflow.configuration import conf
+from airflow.jobs.scheduler_job import SchedulerJob
+from airflow.utils.session import provide_session
 
 
 def get_health():
 """
-Checks if the API works
+Return the health of the airflow scheduler and metadatabase
 """
-return "OK"
+payload = {
+'metadatabase': {'status': 'unhealthy'}
+}
+
+latest_scheduler_heartbeat = None
+scheduler_status = 'unhealthy'
+payload['metadatabase'] = {'status': 'healthy'}

Review comment:
   Thanks, fixed it.
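   The logic under review boils down to comparing the scheduler's last
recorded heartbeat against a freshness threshold. A hedged sketch of that
check (the threshold name and the 30-second default are assumptions for
illustration, not Airflow's actual configuration values):

```python
from datetime import datetime, timedelta

SCHEDULER_HEALTH_CHECK_THRESHOLD = 30  # seconds; assumed default for this sketch


def scheduler_status(latest_heartbeat, now=None):
    """Classify the scheduler as healthy/unhealthy from its last heartbeat."""
    now = now or datetime.utcnow()
    if latest_heartbeat is None:
        # No SchedulerJob heartbeat recorded at all.
        return "unhealthy"
    age = (now - latest_heartbeat).total_seconds()
    return "healthy" if age < SCHEDULER_HEALTH_CHECK_THRESHOLD else "unhealthy"


now = datetime(2020, 7, 2, 12, 0, 0)
assert scheduler_status(now - timedelta(seconds=5), now=now) == "healthy"
assert scheduler_status(now - timedelta(minutes=5), now=now) == "unhealthy"
assert scheduler_status(None, now=now) == "unhealthy"
```

   The metadatabase entry in the payload is implicitly "healthy" if the
heartbeat query itself succeeds, which is why the endpoint flips it from the
initial "unhealthy" default once the query completes.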









[GitHub] [airflow] OmairK commented on a change in pull request #9277: [WIP] Health endpoint spec

2020-07-02 Thread GitBox


OmairK commented on a change in pull request #9277:
URL: https://github.com/apache/airflow/pull/9277#discussion_r448971809



##
File path: airflow/api_connexion/endpoints/health_endpoint.py
##
@@ -14,13 +14,34 @@
 # KIND, either express or implied.  See the License for the
 # specific language governing permissions and limitations
 # under the License.
-
-# TODO(mik-laj): We have to implement it.
-# Do you want to help? Please look at: 
https://github.com/apache/airflow/issues/8144
+from airflow.api_connexion.schemas.health_schema import health_schema
+from airflow.configuration import conf
+from airflow.jobs.scheduler_job import SchedulerJob
+from airflow.utils.session import provide_session
 
 
 def get_health():
 """
-Checks if the API works
+Return the health of the airflow scheduler and metadatabase
 """
-return "OK"
+payload = {
+'metadatabase': {'status': 'unhealthy'}
+}
+
+latest_scheduler_heartbeat = None
+scheduler_status = 'unhealthy'
+payload['metadatabase'] = {'status': 'healthy'}

Review comment:
   Thanks. [Fixed 
it](https://github.com/apache/airflow/pull/9277/files#diff-d7bb321505e9703c67dc7c78ff5b55deR25-R48)
 









[GitHub] [airflow] j-y-matsubara commented on a change in pull request #9531: Support .airflowignore for plugins

2020-07-02 Thread GitBox


j-y-matsubara commented on a change in pull request #9531:
URL: https://github.com/apache/airflow/pull/9531#discussion_r448993794



##
File path: airflow/utils/file.py
##
@@ -90,6 +90,47 @@ def open_maybe_zipped(fileloc, mode='r'):
 return io.open(fileloc, mode=mode)
 
 
+def find_path_from_directory(
+base_dir_path: str,
+ignore_list_file: str) -> Generator[str, None, None]:
+"""
+Search the file and return the path of the file that should not be ignored.
+:param base_dir_path: the base path to be searched for.
+:param ignore_file_list_name: the file name in which specifies a regular 
expression pattern is written.

Review comment:
   I'm sorry, it was a simple mistake on my part. I've fixed it.
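   For reference, the behaviour the docstring describes (walk a tree, collect
regex patterns from per-directory ignore files, and yield only non-ignored
paths) can be sketched as below. This is an illustrative approximation, not
the exact Airflow implementation; in particular, pattern scoping here simply
accumulates patterns as the walk descends.

```python
import os
import re
import tempfile
from typing import Generator


def find_path_from_directory(base_dir_path: str,
                             ignore_list_file: str) -> Generator[str, None, None]:
    """Yield paths under base_dir_path that are not ignored.

    ignore_list_file is the name of the per-directory file (e.g.
    ".airflowignore") holding one regular expression per line.
    """
    patterns = []
    for root, dirs, files in os.walk(base_dir_path, topdown=True):
        ignore_path = os.path.join(root, ignore_list_file)
        if os.path.isfile(ignore_path):
            with open(ignore_path) as f:
                patterns += [re.compile(line.strip())
                             for line in f if line.strip()]
        # Prune ignored directories so we never descend into them.
        dirs[:] = [d for d in dirs
                   if not any(p.search(os.path.join(root, d)) for p in patterns)]
        for name in files:
            path = os.path.join(root, name)
            if name != ignore_list_file and not any(p.search(path) for p in patterns):
                yield path


# Tiny self-check: one ignored file, one kept.
with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "keep.py"), "w").close()
    open(os.path.join(tmp, "skip_me.py"), "w").close()
    with open(os.path.join(tmp, ".airflowignore"), "w") as f:
        f.write("skip_.*\n")
    found = [os.path.basename(p) for p in find_path_from_directory(tmp, ".airflowignore")]
assert found == ["keep.py"]
```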









[GitHub] [airflow] turbaszek commented on a change in pull request #9615: Mask other forms of password arguments in SparkSubmitOperator

2020-07-02 Thread GitBox


turbaszek commented on a change in pull request #9615:
URL: https://github.com/apache/airflow/pull/9615#discussion_r449012333



##
File path: airflow/providers/apache/spark/hooks/spark_submit.py
##
@@ -237,8 +237,8 @@ def _mask_cmd(self, connection_cmd):
 # Mask any password related fields in application args with key value 
pair
 # where key contains password (case insensitive), e.g. 
HivePassword='abc'
 connection_cmd_masked = re.sub(
-r"(\S*?(?:secret|password)\S*?\s*=\s*')[^']*(?=')",
-r'\1**', ' '.join(connection_cmd), flags=re.I)
+r"(\S*?(?:secret|password)\S*?\s*(?:=|\s+)(['\"]?))[^'^\"]+(\2)",

Review comment:
   It would be nice  
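   To see what the proposed pattern does, here is a small standalone exercise
of it. Only the pattern itself is quoted from the diff; the replacement string
`r"\1**\3"` is an assumption chosen so the quoting style around the masked
value is preserved.

```python
import re

# Pattern quoted from the diff above; group 1 captures the key plus any
# opening quote, group 2 the quote character, group 3 the matching close.
PATTERN = r"(\S*?(?:secret|password)\S*?\s*(?:=|\s+)(['\"]?))[^'^\"]+(\2)"


def mask_cmd(connection_cmd):
    """Mask password-like values in a spark-submit command (sketch)."""
    return re.sub(PATTERN, r"\1**\3", " ".join(connection_cmd), flags=re.I)


assert mask_cmd(("spark-submit", "foo", "--bar", "baz", "--password='secret'")) == \
    "spark-submit foo --bar baz --password='**'"
assert mask_cmd(("spark-submit", "foo", "--bar", "baz", "--password=secret")) == \
    "spark-submit foo --bar baz --password=**"
assert mask_cmd(("spark-submit",)) == "spark-submit"
```

   Note that the `[^'^\"]+` value class excludes quotes but not spaces, which
is why the PR's test cases keep the password argument at the end of the command.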









[GitHub] [airflow] turbaszek commented on a change in pull request #9615: Mask other forms of password arguments in SparkSubmitOperator

2020-07-02 Thread GitBox


turbaszek commented on a change in pull request #9615:
URL: https://github.com/apache/airflow/pull/9615#discussion_r449011845



##
File path: tests/providers/apache/spark/hooks/test_spark_submit.py
##
@@ -748,3 +750,64 @@ def test_k8s_process_on_kill(self, mock_popen, 
mock_client_method):
 client.delete_namespaced_pod.assert_called_once_with(
 'spark-pi-edf2ace37be7353a958b38733a12f8e6-driver',
 'mynamespace', **kwargs)
+
+
+@pytest.mark.parametrize(
+("command", "expected"),
+(
+(
+("spark-submit", "foo", "--bar", "baz", "--password='secret'"),
+"spark-submit foo --bar baz --password='**'",
+),
+(
+("spark-submit", "foo", "--bar", "baz", '--password="secret"'),
+'spark-submit foo --bar baz --password="**"',
+),
+(
+("spark-submit", "foo", "--bar", "baz", "--password=secret"),
+"spark-submit foo --bar baz --password=**",
+),
+(
+("spark-submit", "foo", "--bar", "baz", "--password 'secret'"),
+"spark-submit foo --bar baz --password '**'",
+),
+(
+("spark-submit", "foo", "--bar", "baz", "--password secret"),
+"spark-submit foo --bar baz --password **",
+),
+(
+("spark-submit", "foo", "--bar", "baz", '--password "secret"'),
+'spark-submit foo --bar baz --password "**"',
+),
+(
+("spark-submit", "foo", "--bar", "baz", "--secret='secret'"),
+"spark-submit foo --bar baz --secret='**'",
+),
+(
+("spark-submit", "foo", "--bar", "baz", "--foo.password='secret'"),
+"spark-submit foo --bar baz --foo.password='**'",
+),
+(
+("spark-submit",),
+"spark-submit",
+),
+
+(
+("spark-submit", "foo", "--bar", "baz", "--password \"secret'"),
+"spark-submit foo --bar baz --password \"secret'",
+),
+(
+("spark-submit", "foo", "--bar", "baz", "--password 'secret\""),
+"spark-submit foo --bar baz --password 'secret\"",
+),
+),
+)
+def test_masks_passwords(command: str, expected: str) -> None:

Review comment:
   You can use https://pypi.org/project/parameterized/; in the future we will 
probably migrate to pytest
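   Until the suite moves to pytest, the standard library also offers a
table-driven style via `unittest.TestCase.subTest`. A minimal sketch (the
test target here is a trivial stand-in for the hook, not the real one):

```python
import unittest


class TestJoinCases(unittest.TestCase):
    # Table of (input, expected) pairs, each reported individually via subTest.
    CASES = [
        (("spark-submit",), "spark-submit"),
        (("spark-submit", "foo", "--bar"), "spark-submit foo --bar"),
    ]

    def test_cases(self):
        for command, expected in self.CASES:
            with self.subTest(command=command):
                self.assertEqual(" ".join(command), expected)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestJoinCases)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

   Unlike a plain loop of asserts, a failing case does not stop the remaining
cases from running, which is the same benefit `parameterized` provides.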









[GitHub] [airflow] ashb opened a new issue #9630: Officially support HA for scheduler component (AIP-15)

2020-07-02 Thread GitBox


ashb opened a new issue #9630:
URL: https://github.com/apache/airflow/issues/9630


   Placeholder issue - details to follow.







[airflow] annotated tag 1.10.11rc1 updated (317b041 -> 96e8507)

2020-07-02 Thread kaxilnaik
This is an automated email from the ASF dual-hosted git repository.

kaxilnaik pushed a change to annotated tag 1.10.11rc1
in repository https://gitbox.apache.org/repos/asf/airflow.git.


*** WARNING: tag 1.10.11rc1 was modified! ***

from 317b041  (commit)
  to 96e8507  (tag)
 tagging 317b0412383ccda571fbef568c9eabd70ab8e666 (commit)
 replaces 1.10.10rc4
  by Kaxil Naik
  on Thu Jul 2 15:31:42 2020 +0100

- Log -
Airflow 1.10.11rc1
-BEGIN PGP SIGNATURE-

iQEzBAABCAAdFiEEEnF1VgQO7y7q8bnCdfzNCiX6DksFAl7979kACgkQdfzNCiX6
DkvNRQgAguHDONDBZnEfcsLonuSEq48F61dCzp8ox8rDK/yA+mhl5SmQEFiv/45A
iDQD9aEAoW67tHngElTO5wagAYtccVbCHRoMKSIc8EadrSWWDyy0VxoiDEMkalI2
bMVwsSHDxGDyA0nkl4QWRDOdaGe5xcsYWm+k4QgAz0GCeOWKaCup6TZmGTtQelNH
Fiz2r1njpdlQWXrl1L0ncXtS0hfmiaGQaaG58j+wUqKhWvhurHWcae+EuBEQYbuy
APBmyBn1m3xkBc41pZCr8/0FGpasKMDxwSWH0a7QMfh7HAG/fO/YbS+5XwRlGz/o
paGdSBlwzX8Fd469l8f5VPfGManLcA==
=MBko
-END PGP SIGNATURE-
---


No new revisions were added by this update.

Summary of changes:



[GitHub] [airflow] dossett commented on a change in pull request #9593: Improve handling Dataproc cluster creation with ERROR state

2020-07-02 Thread GitBox


dossett commented on a change in pull request #9593:
URL: https://github.com/apache/airflow/pull/9593#discussion_r449064584



##
File path: airflow/providers/google/cloud/operators/dataproc.py
##
@@ -502,32 +506,79 @@ def __init__(self,
 self.timeout = timeout
 self.metadata = metadata
 self.gcp_conn_id = gcp_conn_id
+self.delete_on_error = delete_on_error
+
+def _create_cluster(self, hook):
+operation = hook.create_cluster(
+project_id=self.project_id,
+region=self.region,
+cluster=self.cluster,
+request_id=self.request_id,
+retry=self.retry,
+timeout=self.timeout,
+metadata=self.metadata,
+)
+cluster = operation.result()
+self.log.info("Cluster created.")
+return cluster
+
+def _delete_cluster(self, hook):
+self.log.info("Deleting the cluster")
+hook.delete_cluster(
+region=self.region,
+cluster_name=self.cluster_name,
+project_id=self.project_id,
+)
+self.log.info("Cluster %s deleted", self.cluster_name)

Review comment:
   If the cluster isn't deleted, then the retries will fail because the 
cluster already exists (even if it exists in an ERROR state and is not usable).
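   The create/delete/raise pattern being discussed could look roughly like the
sketch below. The hook interface here is a stand-in fake, not the real
`DataprocHook` API: the point is only the control flow of cleaning up an
ERROR-state cluster before raising so a retry starts from a clean slate.

```python
class AirflowException(Exception):
    """Stand-in for airflow.exceptions.AirflowException."""


def create_cluster_or_raise(hook, delete_on_error=True):
    """Create a cluster; if it lands in ERROR state, optionally delete it
    and raise so the task retry can attempt creation again."""
    cluster = hook.create_cluster()
    if cluster["status"] == "ERROR":
        if delete_on_error:
            hook.delete_cluster()
        raise AirflowException("Cluster was created but is in ERROR state")
    return cluster


class FakeHook:
    """Minimal fake standing in for the Dataproc hook."""

    def __init__(self, status):
        self._status = status
        self.deleted = False

    def create_cluster(self):
        return {"status": self._status}

    def delete_cluster(self):
        self.deleted = True


bad = FakeHook("ERROR")
try:
    create_cluster_or_raise(bad)
except AirflowException:
    pass
assert bad.deleted  # the ERROR cluster was cleaned up before raising

good = FakeHook("RUNNING")
assert create_cluster_or_raise(good)["status"] == "RUNNING"
```

   Without the delete, as noted above, retries would fail on the existing
(unusable) cluster; without the raise, the operator would silently return a
reference to a cluster that no longer exists.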









[GitHub] [airflow] potiuk commented on pull request #9623: Move ElasticsearchTaskHandler to the provider package

2020-07-02 Thread GitBox


potiuk commented on pull request #9623:
URL: https://github.com/apache/airflow/pull/9623#issuecomment-653057551


   We can't use ExternalLoggingMixin in Airflow 1.10 providers because it only 
appeared in master TODAY







[airflow] branch v1-10-test updated: Update README.md for 1.10.11

2020-07-02 Thread kaxilnaik
This is an automated email from the ASF dual-hosted git repository.

kaxilnaik pushed a commit to branch v1-10-test
in repository https://gitbox.apache.org/repos/asf/airflow.git


The following commit(s) were added to refs/heads/v1-10-test by this push:
 new 317b041  Update README.md for 1.10.11
317b041 is described below

commit 317b0412383ccda571fbef568c9eabd70ab8e666
Author: Kaxil Naik 
AuthorDate: Thu Jul 2 15:29:41 2020 +0100

Update README.md for 1.10.11
---
 README.md | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/README.md b/README.md
index d5883bf..81b935d 100644
--- a/README.md
+++ b/README.md
@@ -67,7 +67,7 @@ Apache Airflow is tested with:
 * Sqlite - latest stable (it is used mainly for development purpose)
 * Kubernetes - 1.16.2, 1.17.0
 
-### Stable version (1.10.10)
+### Stable version
 
 * Python versions: 2.7, 3.5, 3.6, 3.7, 3.8
 * Postgres DB: 9.6, 10
@@ -107,14 +107,14 @@ in the URL.
 1. Installing just airflow:
 
 ```bash
-pip install apache-airflow==1.10.10 \
- --constraint 
https://raw.githubusercontent.com/apache/airflow/1.10.10/requirements/requirements-python3.7.txt
+pip install apache-airflow==1.10.11 \
+ --constraint 
https://raw.githubusercontent.com/apache/airflow/1.10.11/requirements/requirements-python3.7.txt
 ```
 
 2. Installing with extras (for example postgres,gcp)
 ```bash
-pip install apache-airflow[postgres,gcp]==1.10.10 \
- --constraint 
https://raw.githubusercontent.com/apache/airflow/1.10.10/requirements/requirements-python3.7.txt
+pip install apache-airflow[postgres,gcp]==1.10.11 \
+ --constraint 
https://raw.githubusercontent.com/apache/airflow/1.10.11/requirements/requirements-python3.7.txt
 ```
 
 ## Beyond the Horizon



[GitHub] [airflow] turbaszek commented on pull request #9593: Improve handling Dataproc cluster creation with ERROR state

2020-07-02 Thread GitBox


turbaszek commented on pull request #9593:
URL: https://github.com/apache/airflow/pull/9593#issuecomment-652980847


   @olchas would you mind taking a look?







[GitHub] [airflow] j-y-matsubara commented on a change in pull request #9531: Support .airflowignore for plugins

2020-07-02 Thread GitBox


j-y-matsubara commented on a change in pull request #9531:
URL: https://github.com/apache/airflow/pull/9531#discussion_r448993794



##
File path: airflow/utils/file.py
##
@@ -90,6 +90,47 @@ def open_maybe_zipped(fileloc, mode='r'):
 return io.open(fileloc, mode=mode)
 
 
+def find_path_from_directory(
+base_dir_path: str,
+ignore_list_file: str) -> Generator[str, None, None]:
+"""
+Search the file and return the path of the file that should not be ignored.
+:param base_dir_path: the base path to be searched for.
+:param ignore_file_list_name: the file name in which specifies a regular 
expression pattern is written.

Review comment:
   I'm sorry, it was a simple mistake on my part. I've fixed it.









[GitHub] [airflow] boring-cyborg[bot] commented on pull request #9629: Updated link to official documentation

2020-07-02 Thread GitBox


boring-cyborg[bot] commented on pull request #9629:
URL: https://github.com/apache/airflow/pull/9629#issuecomment-653031230


   Congratulations on your first Pull Request and welcome to the Apache Airflow 
community! If you have any issues or are unsure about anything, please check 
our Contribution Guide 
(https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst)
   Here are some useful points:
   - Pay attention to the quality of your code (flake8, pylint and type 
annotations). Our [pre-commits]( 
https://github.com/apache/airflow/blob/master/STATIC_CODE_CHECKS.rst#prerequisites-for-pre-commit-hooks)
 will help you with that.
   - In case of a new feature add useful documentation (in docstrings or in 
`docs/` directory). Adding a new operator? Check this short 
[guide](https://github.com/apache/airflow/blob/master/docs/howto/custom-operator.rst)
 Consider adding an example DAG that shows how users should use it.
   - Consider using [Breeze 
environment](https://github.com/apache/airflow/blob/master/BREEZE.rst) for 
testing locally, it’s a heavy docker but it ships with a working Airflow and a 
lot of integrations.
   - Be patient and persistent. It might take some time to get a review or get 
the final approval from Committers.
   - Please follow [ASF Code of 
Conduct](https://www.apache.org/foundation/policies/conduct) for all 
communication including (but not limited to) comments on Pull Requests, Mailing 
list and Slack.
   - Be sure to read the [Airflow Coding style]( 
https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#coding-style-and-best-practices).
   Apache Airflow is a community-driven project and together we are making it 
better.
   In case of doubts contact the developers at:
   Mailing List: d...@airflow.apache.org
   Slack: https://apache-airflow-slack.herokuapp.com/
   







[GitHub] [airflow] aviralwal opened a new pull request #9629: Updated link to official documentation

2020-07-02 Thread GitBox


aviralwal opened a new pull request #9629:
URL: https://github.com/apache/airflow/pull/9629


   The link to official documentation should point to the documentation page 
instead of the home page.
   
   ---
   Make sure to mark the boxes below before creating PR: [x]
   
   - [x] Description above provides context of the change
   - [x] Unit tests coverage for changes (not needed for documentation changes)
   - [x] Target Github ISSUE in description if exists
   - [x] Commits follow "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)"
   - [ ] Relevant documentation is updated including usage instructions.
   - [ ] I will engage committers as explained in [Contribution Workflow 
Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example).
   
   ---
   In case of fundamental code change, Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals))
 is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party 
License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in 
[UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
   Read the [Pull Request 
Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines)
 for more information.
   







[airflow] 01/02: Update the tree view of dag on Concepts Last Run Only (#8268)

2020-07-02 Thread kaxilnaik
This is an automated email from the ASF dual-hosted git repository.

kaxilnaik pushed a commit to branch v1-10-test
in repository https://gitbox.apache.org/repos/asf/airflow.git

commit 34aabacb1a7bf8ad34fe66b54be152dbee250134
Author: Rafael Bottega 
AuthorDate: Thu Apr 16 23:13:18 2020 +0100

Update the tree view of dag on Concepts Last Run Only (#8268)

Resolves #8246

(cherry picked from commit 44ddf54adf7cfe57bfea98cd2726152a2ba19e18)
---
 docs/img/latest_only_with_trigger.png | Bin 49510 -> 42887 bytes
 1 file changed, 0 insertions(+), 0 deletions(-)

diff --git a/docs/img/latest_only_with_trigger.png 
b/docs/img/latest_only_with_trigger.png
index 623f8ee..8fc2df9 100644
Binary files a/docs/img/latest_only_with_trigger.png and 
b/docs/img/latest_only_with_trigger.png differ



[jira] [Commented] (AIRFLOW-5391) Clearing a task skipped by BranchPythonOperator will cause the task to execute

2020-07-02 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17150314#comment-17150314
 ] 

ASF GitHub Bot commented on AIRFLOW-5391:
-

kaxil commented on pull request #8992:
URL: https://github.com/apache/airflow/pull/8992#issuecomment-653022789


   Hi @yuqian90, apologies, this won't make it into 1.10.11, as the 
LatestOnlyOperator change alters existing behaviour. Would it be possible to 
achieve this without changing the behaviour (per the note in UPDATING.md), or 
by adding a flag that chooses between the old and new behaviour (defaulting 
to the old)?
   
   
   
   





> Clearing a task skipped by BranchPythonOperator will cause the task to execute
> --
>
> Key: AIRFLOW-5391
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5391
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: operators
>Affects Versions: 1.10.4
>Reporter: Qian Yu
>Assignee: Qian Yu
>Priority: Major
> Fix For: 2.0.0
>
>
> I tried this on 1.10.3 and 1.10.4, both have this issue: 
> E.g. in this example from the doc, branch_a executed, branch_false was 
> skipped because of branching condition. However if someone Clear 
> branch_false, it'll cause branch_false to execute. 
> !https://airflow.apache.org/_images/branch_good.png!
> This behaviour is understandable given how BranchPythonOperator is 
> implemented. BranchPythonOperator does not store its decision anywhere. It 
> skips its own downstream tasks in the branch at runtime. So there's currently 
> no way for branch_false to know it should be skipped without rerunning the 
> branching task.
> This is obviously counter-intuitive from the user's perspective. In this 
> example, users would not expect branch_false to execute when they clear it 
> because the branching task should have skipped it.
> There are a few ways to improve this:
> Option 1): Make downstream tasks skipped by BranchPythonOperator not 
> clearable without also clearing the upstream BranchPythonOperator. In this 
> example, if someone clears branch_false without clearing branching, the Clear 
> action should just fail with an error telling the user he needs to clear the 
> branching task as well.
> Option 2): Make BranchPythonOperator store the result of its skip condition 
> somewhere. Make downstream tasks check for this stored decision and skip 
> themselves if they should have been skipped by the condition. This probably 
> means the decision of BranchPythonOperator needs to be stored in the db.
>  
> [kevcampb|https://blog.diffractive.io/author/kevcampb/] attempted a 
> workaround and on this blog. And he acknowledged his workaround is not 
> perfect and a better permanent fix is needed:
> [https://blog.diffractive.io/2018/08/07/replacement-shortcircuitoperator-for-airflow/]
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[airflow] 02/02: Add Changelog for 1.10.11

2020-07-02 Thread kaxilnaik
This is an automated email from the ASF dual-hosted git repository.

kaxilnaik pushed a commit to branch v1-10-test
in repository https://gitbox.apache.org/repos/asf/airflow.git

commit 3e080c240e9968c4086c32943a22b3c4010453df
Author: Kaxil Naik 
AuthorDate: Wed Jul 1 19:40:29 2020 +0100

Add Changelog for 1.10.11
---
 .pre-commit-config.yaml |   3 +-
 CHANGELOG.txt   | 194 
 2 files changed, 196 insertions(+), 1 deletion(-)

diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml
index e7b90db..a017dad 100644
--- a/.pre-commit-config.yaml
+++ b/.pre-commit-config.yaml
@@ -245,7 +245,8 @@ repos:
   (?x)
   ^airflow/contrib/hooks/cassandra_hook.py$|
   ^airflow/operators/hive_stats_operator.py$|
-  ^tests/contrib/hooks/test_cassandra_hook.py
+  ^tests/contrib/hooks/test_cassandra_hook.py|
+  ^CHANGELOG.txt
   - id: dont-use-safe-filter
 language: pygrep
 name: Don't use safe in templates
diff --git a/CHANGELOG.txt b/CHANGELOG.txt
index a8aa353..0313e7b 100644
--- a/CHANGELOG.txt
+++ b/CHANGELOG.txt
@@ -1,3 +1,196 @@
+Airflow 1.10.11, 2020-07-05
+---------------------------
+
+New Features
+""""""""""""
+
+- Add task instance mutation hook (#8852)
+- Allow changing Task States Colors (#9520)
+- Add support for AWS Secrets Manager as Secrets Backend (#8186)
+- Add airflow info command to the CLI (#8704)
+- Add Local Filesystem Secret Backend (#8596)
+- Add Airflow config CLI command (#8694)
+- Add Support for Python 3.8 (#8836)(#8823)
+- Allow K8S worker pod to be configured from JSON/YAML file (#6230)
+- Add quarterly to crontab presets (#6873)
+- Add support for ephemeral storage on KubernetesPodOperator (#6337)
+- Add AirflowFailException to fail without any retry (#7133)
+- Add SQL Branch Operator (#8942)
+
+Bug Fixes
+"""""""""
+
+- Use NULL as dag.description default value (#7593)
+- BugFix: DAG trigger via UI error in RBAC UI (#8411)
+- Fix logging issue when running tasks (#9363)
+- Fix JSON encoding error in DockerOperator (#8287)
+- Fix alembic crash due to typing import (#6547)
+- Correctly restore upstream_task_ids when deserializing Operators (#8775)
+- Correctly store non-default Nones in serialized tasks/dags (#8772)
+- Correctly deserialize dagrun_timeout field on DAGs (#8735)
+- Fix tree view if config contains " (#9250)
+- Fix Dag Run UI execution date with timezone cannot be saved issue (#8902)
+- Fix Migration for MSSQL (#8385)
+- RBAC ui: Fix missing Y-axis labels with units in plots (#8252)
+- RBAC ui: Fix missing task runs being rendered as circles instead (#8253)
+- Fix: DagRuns page renders the state column with artifacts in old UI (#9612)
+- Fix task and dag stats on home page (#8865)
+- Fix the trigger_dag api in the case of nested subdags (#8081)
+- UX Fix: Prevent undesired text selection with DAG title selection in Chrome 
(#8912)
+- Fix connection add/edit for spark (#8685)
+- Fix retries causing constraint violation on MySQL with DAG Serialization 
(#9336)
+- [AIRFLOW-4472] Use json.dumps/loads for templating lineage data (#5253)
+- Restrict google-cloud-texttospeach to  committer (#7392)
 - [AIRFLOW-] Remove duplicated paragraph in docs (#7662)
 - Fix reference to KubernetesPodOperator (#8100)
+- Update the tree view of dag on Concepts Last Run Only (#8268)
 
 
 Airflow 1.10.9, 2020-02-07



[airflow] branch v1-10-test updated (a5a588e -> 3e080c2)

2020-07-02 Thread kaxilnaik
This is an automated email from the ASF dual-hosted git repository.

kaxilnaik pushed a change to branch v1-10-test
in repository https://gitbox.apache.org/repos/asf/airflow.git.


omit a5a588e  Add Changelog for 1.10.11
 new 34aabac  Update the tree view of dag on Concepts Last Run Only (#8268)
 new 3e080c2  Add Changelog for 1.10.11

This update added new revisions after undoing existing revisions.
That is to say, some revisions that were in the old version of the
branch are not in the new version.  This situation occurs
when a user --force pushes a change and generates a repository
containing something like this:

 * -- * -- B -- O -- O -- O   (a5a588e)
\
 N -- N -- N   refs/heads/v1-10-test (3e080c2)

You should already have received notification emails for all of the O
revisions, and so the following emails describe only the N revisions
from the common base, B.

Any revisions marked "omit" are not gone; other references still
refer to them.  Any revisions marked "discard" are gone forever.

The 2 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 CHANGELOG.txt |   1 +
 docs/img/latest_only_with_trigger.png | Bin 49510 -> 42887 bytes
 2 files changed, 1 insertion(+)



[airflow] branch v1-10-stable updated (8b05289 -> 317b041)

2020-07-02 Thread kaxilnaik
This is an automated email from the ASF dual-hosted git repository.

kaxilnaik pushed a change to branch v1-10-stable
in repository https://gitbox.apache.org/repos/asf/airflow.git.


from 8b05289  Cache 1 10 ci images (#8955)
 add 69eeeda  Add Local Filesystem Secret Backend (v1-10) (#8596)
 add ac257fe  Reduce response payload size of /dag_stats and /task_stats 
(#8655)
 add 313d09e  Backport Airflow config command (1.10.*) (#8694)
 add 8eb4565  Add airflow info command (v1-10-test) (#8704)
 add c79e7df  Latest debian-buster release broke image build (#8758)
 add a8d8903  Show Deprecation warning on duplicate Task ids (#8728)
 add 3b70308  [8650] Add Yandex.Cloud custom connection to 1.10 (#8791)
 add 908962a  [AIRFLOW-4052] Allow filtering using "event" and "owner" in 
"Log" view (#4881)
 add cd32afa  Azure storage 0.37.0 is not installable any more (#8833)
 add 0c17935  Avoid failure on transient requirements in CI image
 add f5d89ed  Use Debian's provided JRE from Buster (#8919)
 add a2d3acd  Hive/Hadoop minicluster needs JDK8 and JAVA_HOME to work 
(#8938)
 add d0b0207  Fix new flake8 warnings on v1-10-test branch (#8953)
 add b2a4032  [AIRFLOW-3367] Run celery integration test with redis broker. 
(#4207)
 add 64db6e6  Fix race in Celery tests by pre-creating result tables (#8909)
 add a3aa995  Pin Version of Azure Cosmos to <4 (#8956)
 add 5664e36  Fix timing-based flakey test in TestLocalTaskJob (#8405)
 add 23d5ea0  Use production image for k8s tests (#9038)
 add 3437663  Move k8sexecutor out of contrib to closer match master (#8904)
 add 0925741  [AIRFLOW-4851] Refactor K8S codebase with k8s API models 
(#5481)
 add a5e7b99  [AIRFLOW-5443] Use alpine image in Kubernetes's sidecar 
(#6059)
 add 9444b4c  [AIRFLOW-5445] Reduce the required resources for the 
Kubernetes's sidecar (#6062)
 add 4c484ef  [AIRFLOW-5873] KubernetesPodOperator fixes and test (#6524)
 add 4918b85  [AIRFLOW-6959] Use NULL as dag.description default value 
(#7593)
 add 2fa5157  Add note about using dag_run.conf in BashOperator (#9143)
 add 570c9fa  Fix --forward-credentials flag in Breeze (#8554)
 add 79d34ea  Fixed optimistions of non-py-code builds (#8601)
 add c3d4396  Fix the process of requirements generations (#8648)
 add 264a94b  Fixed test-target command (#8795)
 add c057430  Add comments to breeze scripts (#8797)
 add 6ba874b  Useful help information in test-target and docker-compose 
commands (#8796)
 add 1831e79  The librabbitmq library stopped installing for python3.7 
(#8853)
 add 41808a7  Use Debian's provided JRE from Buster (#8919)
 add cf25e53  Re-run all tests when Dockerfile or Github worflow change 
(#8924)
 add 0efaa00  Hive/Hadoop minicluster needs JDK8 and JAVA_HOME to work 
(#8938)
 add 7b4e1a4  Python base images are stored in cache (#8943)
 add a41801c  Add ADDITIONAL_PYTHON_DEPS (#9031)
 add ffe496a  Add ADDITIONAL_AIRFLOW_EXTRAS (#9032)
 add 5683783  Additional python extras and deps can be set in breeze (#9035)
 add dbb4284  detect incompatible docker server version in breeze (#9042)
 add 214b508  Adds hive as extra in pyhive (#9075)
 add 7d3dab1  Prevents failure on fixing permissions for files with space 
in it (#9076)
 add 4f1a319  Enable configurable git sync depth  (#9094)
 add 5c45091  Don't reuse MY_DIR in breeze to mean different folder from 
ci/_utils.sh (#9098)
 add e5df858  You can push with Breeze as separate command and to cache 
(#8976)
 add d83331b  Produce less verbose output when building docker mount 
options (#9103)
 add f099416  Display docs errors summary (#8392)
 add 66ab8c3  Remove Hive/Hadoop/Java dependency from unit tests (#9029)
 add 32ed3c6  Kubernetes Cluster is started on host not in the container 
(#8265)
 add d505e8d  Fixes a bug where `build-image` command did not calculate md5 
(#9130)
 add c7c3561  Fix INTEGRATIONS[*]: unbound variable error in breeze (#9135)
 add 77998f5  Cope with multiple processes get_remote_image_info in 
parallel (#9105)
 add 6d07eac  Remove remnant kubernetes stuff from breeze scripts (#9138)
 add 19f6065  Restrict google-cloud-texttospeach to …
 … }/executors/kubernetes_executor.py |  195 +--
 airflow/executors/local_executor.py|   32 +-
 airflow/executors/sequential_executor.py   |9 +
 airflow/hooks/base_hook.py |4 +-
 airflow/hooks/dbapi_hook.py|2 +-
 airflow/hooks/hive_hooks.py|7 +-
 airflow/hooks/webhdfs_hook.py  |6 +-
 airflow/jobs/backfill_job.py   |5 +-
 airflow/jobs/base_job.py   |   23 +-
 airflow/jobs/local_task_job.py |8 -
 airflow/jobs/scheduler_job.py  |   64 +-
 .../__init__.py 


[GitHub] [airflow] freget commented on issue #9609: TimeSensor triggers immediately when used over midnight (UTC)

2020-07-02 Thread GitBox


freget commented on issue #9609:
URL: https://github.com/apache/airflow/issues/9609#issuecomment-65311


   I would also be fine with a new sensor. I still believe that this problem 
should at least be documented, as it is anything but obvious if the DAG is 
initialized timezone-aware. 
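
   For context, one way a DAG author might sidestep the midnight wrap-around 
discussed here is to anchor the target *time* to an aware datetime and roll it 
to the next day once it has passed, instead of comparing bare time-of-day 
values. This is only an illustrative sketch — the function name and signature 
are ad hoc, not the Airflow `TimeSensor` API:

```python
import datetime


def seconds_until_next(now_utc, target_time):
    """Seconds until the next occurrence of target_time (a UTC
    datetime.time) at or after now_utc (an aware UTC datetime).

    Comparing aware datetimes instead of bare times means a sensor
    started at 23:50 waiting for 00:15 sleeps ~25 minutes rather
    than firing immediately."""
    candidate = datetime.datetime.combine(
        now_utc.date(), target_time, tzinfo=datetime.timezone.utc)
    if candidate < now_utc:
        candidate += datetime.timedelta(days=1)  # roll past midnight
    return (candidate - now_utc).total_seconds()
```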



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] kaxil commented on pull request #9586: support new released version of sendgrid

2020-07-02 Thread GitBox


kaxil commented on pull request #9586:
URL: https://github.com/apache/airflow/pull/9586#issuecomment-653175384


   we will need another rebase @ephraimbuddy sorry :(



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] mik-laj edited a comment on pull request #9618: Fix typos, older versions, and deprecated operators with AI platform example DAG

2020-07-02 Thread GitBox


mik-laj edited a comment on pull request #9618:
URL: https://github.com/apache/airflow/pull/9618#issuecomment-653176749







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] mik-laj commented on pull request #9618: Fix typos, older versions, and deprecated operators with AI platform example DAG

2020-07-02 Thread GitBox


mik-laj commented on pull request #9618:
URL: https://github.com/apache/airflow/pull/9618#issuecomment-653176749


   @vuppalli  We don't need unit tests for DAGs. We use them in the 
documentation and in system tests; system tests are sufficient in this case.  
Have the files that are needed to run the tests changed? 
   
   It is better if the contribution is small, because small changes are much 
easier to review and merge. In this case, I would prefer that you create a new 
PR containing only the documentation. If you need any changes that are 
included in this PR, you can copy them from this PR and add an annotation in 
the PR title `[depends on ]`. You can also add a note in the description, but 
not everyone reads the description of the changes.
   Example:
   ```
   Add guide for MLEngine [depends on #9618]
   ```
   
   In order for this change to be merged, you must fix static check errors in 
this change.
   ```
   airflow/providers/google/cloud/example_dags/example_mlengine.py:29:89: W291 
trailing whitespace
   airflow/providers/google/cloud/example_dags/example_mlengine.py:30:80: W291 
trailing whitespace
   ```
   + isort
   For more information: 
   https://github.com/apache/airflow/blob/master/STATIC_CODE_CHECKS.rst#id1
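   As an aside, W291 only flags lines whose content ends in whitespace, and 
stripping that whitespace is the entire fix. A minimal illustration (these 
helper names are ad hoc, not flake8 internals):

   ```python
   def trailing_ws_lines(text):
       """Return the 1-based numbers of lines ending in stray whitespace,
       the condition flake8 reports as W291 (trailing whitespace)."""
       return [i for i, line in enumerate(text.splitlines(), 1)
               if line != line.rstrip()]


   def strip_trailing_ws(text):
       """Rewrite the text with every line's trailing whitespace removed."""
       return "\n".join(line.rstrip() for line in text.splitlines()) + "\n"
   ```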
   
   How do you like an internship at Google? Do you have any concerns about 
contributing to Open Source? The community of this project is very open to new 
contributors and interns. We currently have two active interns who contribute 
to the project. If you would like to get more involved in this project, we'll 
be happy to help. Apache Airflow is the core of the Cloud Composer service, so 
contributions to this project will be appreciated by your company.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] ephraimbuddy commented on pull request #9482: Add CRUD endpoint for XCom

2020-07-02 Thread GitBox


ephraimbuddy commented on pull request #9482:
URL: https://github.com/apache/airflow/pull/9482#issuecomment-653119162


   Hi, @turbaszek can I ask for review?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] vanka56 commented on a change in pull request #9472: Add drop_partition functionality for HiveMetastoreHook

2020-07-02 Thread GitBox


vanka56 commented on a change in pull request #9472:
URL: https://github.com/apache/airflow/pull/9472#discussion_r449187394



##
File path: tests/providers/apache/hive/hooks/test_hive.py
##
@@ -383,6 +383,10 @@ def test_table_exists(self):
 self.hook.table_exists(str(random.randint(1, 1)))
 )
 
+def test_drop_partition(self):
+self.assertTrue(self.hook.drop_partitions(self.table, db=self.database,
+  part_vals=[DEFAULT_DATE_DS]))
+

Review comment:
   @turbaszek Yes. It uses the Hive Metastore Thrift client. It drops the 
partition from the test table set up for unit testing.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] albertocalderari commented on a change in pull request #9590: Improve idempotency of BigQueryInsertJobOperator

2020-07-02 Thread GitBox


albertocalderari commented on a change in pull request #9590:
URL: https://github.com/apache/airflow/pull/9590#discussion_r449173278



##
File path: airflow/providers/google/cloud/operators/bigquery.py
##
@@ -1692,32 +1692,52 @@ def prepare_template(self) -> None:
 with open(self.configuration, 'r') as file:
 self.configuration = json.loads(file.read())
 
+def _submit_job(self, hook: BigQueryHook, job_id: str):
+# Submit a new job
+job = hook.insert_job(
+configuration=self.configuration,
+project_id=self.project_id,
+location=self.location,
+job_id=job_id,
+)
+# Start the job and wait for it to complete and get the result.
+job.result()
+return job
+
 def execute(self, context: Any):
 hook = BigQueryHook(
 gcp_conn_id=self.gcp_conn_id,
 delegate_to=self.delegate_to,
 )
 
-job_id = self.job_id or f"airflow_{self.task_id}_{int(time())}"
+exec_date = context['execution_date'].isoformat()
+job_id = self.job_id or 
f"airflow_{self.dag_id}_{self.task_id}_{exec_date}"
+
 try:
-job = hook.insert_job(
-configuration=self.configuration,
-project_id=self.project_id,
-location=self.location,
-job_id=job_id,
-)
-# Start the job and wait for it to complete and get the result.
-job.result()
+# Submit a new job
+job = self._submit_job(hook, job_id)
 except Conflict:
+# If the job already exists retrieve it
 job = hook.get_job(
 project_id=self.project_id,
 location=self.location,
 job_id=job_id,
 )
-# Get existing job and wait for it to be ready
-for time_to_wait in exponential_sleep_generator(initial=10, 
maximum=120):
-sleep(time_to_wait)
-job.reload()
-if job.done():
-break
+
+if job.done() and job.error_result:
+# The job exists and finished with an error and we are probably rerunning it
+# So we have to make a new job_id because it has to be unique
+job_id = f"{self.job_id}_{int(time())}"
+job = self._submit_job(hook, job_id)
+elif not job.done():
+# The job is still running so wait for it to be ready
+for time_to_wait in exponential_sleep_generator(initial=10, 
maximum=120):

Review comment:
   In case the job is already running I still do ```job.result()``` and it 
polls, but without doing ```reload```
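
   The control flow under review can be condensed into a small, self-contained 
sketch. Everything here — `Conflict`, `FakeJob`, `FakeHook`, `submit_or_reuse` 
— is a stand-in that only models the retry logic, not the real 
google-cloud-bigquery or Airflow API:

   ```python
   import time


   class Conflict(Exception):
       """Stand-in for the 409 error raised when a job id already exists."""


   class FakeJob:
       """Mimics the bits of a BigQuery job the operator touches."""
       def __init__(self, job_id, done=True, error_result=None):
           self.job_id = job_id
           self._done = done
           self.error_result = error_result

       def done(self):
           return self._done

       def result(self):
           return self  # a real job would block until completion here


   def submit_or_reuse(hook, job_id):
       """Submit under a deterministic id; on Conflict reuse the existing
       job, or mint a fresh id when the previous attempt ended in error."""
       try:
           return hook.insert_job(job_id).result()
       except Conflict:
           job = hook.get_job(job_id)
           if job.done() and job.error_result:
               # Failed earlier run: job ids are write-once, so rerun
               # under "<id>_<timestamp>" as the diff does.
               return hook.insert_job(f"{job_id}_{int(time.time())}").result()
           return job.result()  # still running or succeeded: wait on it


   class FakeHook:
       """First submission conflicts; the stored job failed previously."""
       def insert_job(self, job_id):
           if job_id == "airflow_dag_task_2020-07-02":
               raise Conflict()
           return FakeJob(job_id)

       def get_job(self, job_id):
           return FakeJob(job_id, done=True, error_result={"reason": "boom"})


   job = submit_or_reuse(FakeHook(), "airflow_dag_task_2020-07-02")
   ```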





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] ashb commented on pull request #9044: Ensure Kerberos token is valid in SparkSubmitOperator before running `yarn kill`

2020-07-02 Thread GitBox


ashb commented on pull request #9044:
URL: https://github.com/apache/airflow/pull/9044#issuecomment-653168107


   Could you rebase to latest master, hopefully that should fix the failing 
Kube tests



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[airflow-pgbouncer-exporter] branch master created (now 160f560)

2020-07-02 Thread potiuk
This is an automated email from the ASF dual-hosted git repository.

potiuk pushed a change to branch master
in repository 
https://gitbox.apache.org/repos/asf/airflow-pgbouncer-exporter.git.


  at 160f560  Merge pull request #4 from jbub/const-metrics

This branch includes the following new commits:

 new 7ce3d9d  Initial commit.
 new 8ac8035  Fix docker badge image url in README.
 new 4f3a283  Add code comments.
 new c276e65  Add missing method comments.
 new 6389264  Make database column ForceUser nullable.
 new 5ef12a0  Fill missing Active field in sql store GetPools method.
 new bf8a081  Update CHANGELOG.
 new d38b321  Capture build version using prometheus/common/version package.
 new c9de406  Refactor http server.
 new df56966  Return error from store Close method.
 new 7d075b5  Refactor http server to improve testability.
 new 7efe5dd  Update vendor.
 new b14b1a1  Update CHANGELOG.
 new 4d8d9f1  Cleanup unused fields from HTTPServer.
 new 816c0c9  Reorder logging in server command to be able to actually see 
any logs.
 new 99c7f24  Add healthcheck.
 new b887927  Update CHANGELOG.
 new 920f2c9  Build with Go 1.9.2.
 new 7b4ae95  Add docker config to goreleaser config.
 new 64d97f3  Update changelog for 0.1.5.
 new 2ce51ba  Add new fields to support PgBouncer 1.8.
 new 2f68ad4  Update CHANGELOG for 0.2.0.
 new e6fcc3d  Build with Go 1.9.4.
 new 1f3b701  Add golangci.yml.
 new 7492ef7  Update vendored libs, prune tests and unused pkgs.
 new 58ea421  Use Go 1.10.3.
 new 08b31eb  Bump testify version.
 new 6d45552  Update CHANGELOG for 0.2.2.
 new 23071e1  Fix duplication of release field in .goreleaser.yml.
 new f7a5640  Expose more stats metrics
 new d38d0a8  Expose more pool metrics
 new d10b51d  Merge pull request #3 from Ometria/extra-stats-metrics
 new a997ab3  Use Go 1.11.2 in travis config.
 new 5e5a68c  Add Go modules support.
 new ee127f5  Drop dep support.
 new 599d095  Update CHANGELOG for 0.3.0.
 new d5392f9  Fix build version passing in .goreleaser.yml.
 new 08553d9  Enable GO111MODULE in travis build.
 new 3df8ff9  Run install step in travis.
 new 10aaf7e  Add initial drone config.
 new e3db533  Use drone ci, drop coveralls and golangci.
 new 90b1b9c  Drop travis.
 new 044c3e8  Welcome 2019.
 new b30fa9b  Pin dependencies versions to tags.
 new 2451596  Move code to internal.
 new e9edb61  Add go version to go.mod.
 new d214032  Test drone with golang:1.12.
 new 97437a2  Update CHANGELOG for 0.4.0.
 new 2dc730b  Add docker example to README.
 new bc8ea94  Update to Go 1.13.
 new 7cecfdb  Bump github.com/lib/pq to v1.3.0.
 new c2c0876  Bump github.com/prometheus/client_golang to v1.3.0.
 new 01fe33e  Update to github.com/urfave/cli/v2.
 new aebee06  Add docker compose for testing.
 new c1a4faf  Update CHANGELOG for 0.5.0.
 new 4bac8ef  Update goreleaser yaml to be compatible with latest release.
 new 3be0522  Use skip_push auto in docker release.
 new e55c5f0  Update CHANGELOG for 0.5.1.
 new 1f60abd  Do not use draft release in goreleaser.
 new 03e87c1  Update CHANGELOG for 0.5.2.
 new 24e1fc3  Use custom query in store.Check.
 new 777e244  Use sqlx.Open instead of sqlx.Connect to skip calling Ping.
 new d4e8ae3  Check store on startup.
 new fe94b3a  Rename store.NewSQLStore to store.NewSQL.
 new da78521  Update CHANGELOG for 0.5.0.
 new 470d1d6  Refactor exporter to use NewConstMetric.
 new 160f560  Merge pull request #4 from jbub/const-metrics

The 67 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.




[airflow] 02/09: Fix typo in helm chart upgrade command for 2.0 (#9484)

2020-07-02 Thread kaxilnaik
This is an automated email from the ASF dual-hosted git repository.

kaxilnaik pushed a commit to branch v1-10-test
in repository https://gitbox.apache.org/repos/asf/airflow.git

commit 47e1a875229657251fdeddaae5a2dd083572079b
Author: Ash Berlin-Taylor 
AuthorDate: Tue Jun 23 10:38:06 2020 +0100

Fix typo in helm chart upgrade command for 2.0 (#9484)


(cherry picked from commit b1cd382db9367ec828b8ee16899ecea9fcf824a7)
---
 chart/templates/scheduler/scheduler-deployment.yaml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/chart/templates/scheduler/scheduler-deployment.yaml 
b/chart/templates/scheduler/scheduler-deployment.yaml
index 1b46f6a..d5c3a06 100644
--- a/chart/templates/scheduler/scheduler-deployment.yaml
+++ b/chart/templates/scheduler/scheduler-deployment.yaml
@@ -96,7 +96,7 @@ spec:
   image: {{ template "airflow_image" . }}
   imagePullPolicy: {{ .Values.images.airflow.pullPolicy }}
   # Support running against 1.10.x and 2.0.0dev/master
-  args: ["bash", "-c", "airflow upgradedb || airfow db upgrade"]
+  args: ["bash", "-c", "airflow upgradedb || airflow db upgrade"]
   env:
   {{- include "custom_airflow_environment" . | indent 10 }}
   {{- include "standard_airflow_environment" . | indent 10 }}



[airflow] 05/09: Remove redundant airflowVersion from Helm Chart readme (#9592)

2020-07-02 Thread kaxilnaik
This is an automated email from the ASF dual-hosted git repository.

kaxilnaik pushed a commit to branch v1-10-test
in repository https://gitbox.apache.org/repos/asf/airflow.git

commit 9bc18c1938ebd5f19e1de4a2ce7270957fd3fea6
Author: Kaxil Naik 
AuthorDate: Tue Jun 30 17:02:56 2020 +0100

Remove redundant airflowVersion from Helm Chart readme (#9592)

We no longer use `airflowVersion` , we instead use 
`defaultAirflowRepository` and `defaultAirflowTag`

(cherry picked from commit d6b323b0cd9be2aa941cbb1e1e15d766b4d6539b)
---
 chart/README.md | 1 -
 1 file changed, 1 deletion(-)

diff --git a/chart/README.md b/chart/README.md
index 402a9d7..089ea22 100644
--- a/chart/README.md
+++ b/chart/README.md
@@ -91,7 +91,6 @@ The following tables lists the configurable parameters of the 
Airflow chart and
 | `networkPolicies.enabled` | Enable Network 
Policies to restrict traffic
  | `true`|
 | `airflowHome` | Location of airflow 
home directory  
 | `/opt/airflow`|
 | `rbacEnabled` | Deploy pods with 
Kubernetes RBAC enabled  
| `true`|
-| `airflowVersion`  | Default Airflow 
image version   
 | `1.10.5`  |
 | `executor`| Airflow executor (eg 
SequentialExecutor, LocalExecutor, CeleryExecutor, KubernetesExecutor)  
| `KubernetesExecutor`  |
 | `allowPodLaunching`   | Allow airflow pods 
to talk to Kubernetes API to launch more pods   
  | `true`|
 | `defaultAirflowRepository`| Fallback docker 
repository to pull airflow image from   
 | `apache/airflow`  |



[airflow] 04/09: Fix typo of resultBackendConnection in chart README (#9537)

2020-07-02 Thread kaxilnaik
This is an automated email from the ASF dual-hosted git repository.

kaxilnaik pushed a commit to branch v1-10-test
in repository https://gitbox.apache.org/repos/asf/airflow.git

commit a213847e99b4b9f0afb7a8c0b6fc3968d04d6e40
Author: Vicken Simonian 
AuthorDate: Fri Jun 26 11:40:30 2020 -0700

Fix typo of resultBackendConnection in chart README (#9537)


(cherry picked from commit 096f5c5cba963b364ee75f6686d128cd4d34d66e)
---
 chart/README.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/chart/README.md b/chart/README.md
index d0366f6..402a9d7 100644
--- a/chart/README.md
+++ b/chart/README.md
@@ -119,11 +119,11 @@ The following tables lists the configurable parameters of 
the Airflow chart and
 | `data.metadataSecretName` | Secret name to mount 
Airflow connection string from  
| `~`   |
 | `data.resultBackendSecretName`| Secret name to mount 
Celery result backend connection string from
| `~`   |
 | `data.metadataConection`  | Field separated 
connection data (alternative to secret name)
 | `{}`  |
-| `data.resultBakcnedConnection`| Field separated 
connection data (alternative to secret name)
 | `{}`  |
+| `data.resultBackendConnection`| Field separated 
connection data (alternative to secret name)
 | `{}`  |
 | `fernetKey`   | String representing 
an Airflow fernet key   
 | `~`   |
 | `fernetKeySecretName` | Secret name for 
Airflow fernet key   
 | `~`   |
 | `workers.replicas`| Replica count for 
Celery workers (if applicable)  
   | `1`   |
-| `workers.keda.enabled` | Enable KEDA 
autoscaling features
 | `false`   |
+| `workers.keda.enabled`| Enable KEDA 
autoscaling features
 | `false`   |
 | `workers.keda.pollingInverval`| How often KEDA 
should poll the backend database for metrics in seconds 
  | `5`   |
 | `workers.keda.cooldownPeriod` | How often KEDA 
should wait before scaling down in seconds  
  | `30`  |
 | `workers.keda.maxReplicaCount`| Maximum number of 
Celery workers KEDA can scale to
   | `10`  |



[airflow] 07/09: Switches to Helm Chart for Kubernetes tests (#9468)

2020-07-02 Thread kaxilnaik
This is an automated email from the ASF dual-hosted git repository.

kaxilnaik pushed a commit to branch v1-10-test
in repository https://gitbox.apache.org/repos/asf/airflow.git

commit b4a620cc1abfee581e5a6291914efc5479e72c18
Author: Jarek Potiuk 
AuthorDate: Wed Jul 1 14:50:30 2020 +0200

Switches to Helm Chart for Kubernetes tests (#9468)

The Kubernetes tests are now run using Helm chart
rather than the custom templates we used to have.

The Helm Chart uses locally build production image
[GitHub] [airflow] kaxil commented on pull request #9364: Add option for Vault token to automatically be renewed

2020-07-02 Thread GitBox


kaxil commented on pull request #9364:
URL: https://github.com/apache/airflow/pull/9364#issuecomment-653091216


   Yup, let's add it to the client



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[airflow] 07/09: Switches to Helm Chart for Kubernetes tests (#9468)

2020-07-02 Thread kaxilnaik
This is an automated email from the ASF dual-hosted git repository.

kaxilnaik pushed a commit to branch v1-10-test
in repository https://gitbox.apache.org/repos/asf/airflow.git

commit b4a620cc1abfee581e5a6291914efc5479e72c18
Author: Jarek Potiuk 
AuthorDate: Wed Jul 1 14:50:30 2020 +0200

Switches to Helm Chart for Kubernetes tests (#9468)

The Kubernetes tests are now run using Helm chart
rather than the custom templates we used to have.

The Helm Chart uses locally build production image
so the tests are testing not only Airflow but also
Helm Chart and a Production image - all at the
same time. Later on we will add more tests
covering more functionalities of both Helm Chart
and Production Image. This is the first step to
get all of those bundled together and become
testable.

This change also introduces a 'shell' sub-command
for Breeze's kind-cluster command and an
EMBEDDED_DAGS build arg for the production image -
both of them useful to run the Kubernetes tests
more easily - without building two images
and with an easy-to-iterate-over-tests
shell command - which works without any
other development environment.

Co-authored-by: Jarek Potiuk 
Co-authored-by: Daniel Imberman 
(cherry picked from commit 8bd15ef634cca40f3cf6ca3442262f3e05144512)
---
 .github/workflows/ci.yml   |  23 +-
 BREEZE.rst |  81 +++--
 CI.rst |   2 +-
 Dockerfile |   4 +
 IMAGES.rst |   3 +
 TESTING.rst|  67 ++--
 airflow/kubernetes/pod_launcher.py |   2 +-
 breeze |  51 ++-
 breeze-complete|  14 +-
 chart/README.md|   5 +-
 chart/charts/postgresql-6.3.12.tgz | Bin 22754 -> 0 bytes
 chart/requirements.lock|   4 +-
 chart/templates/configmap.yaml |   2 +
 chart/templates/rbac/pod-launcher-role.yaml|   2 +-
 chart/templates/rbac/pod-launcher-rolebinding.yaml |   4 +-
 kubernetes_tests/test_kubernetes_executor.py   |  40 ++-
 scripts/ci/ci_build_production_images.sh   |  25 --
 scripts/ci/ci_count_changed_files.sh   |   2 +-
 scripts/ci/ci_deploy_app_to_kubernetes.sh  |  16 +-
 scripts/ci/ci_docs.sh  |   2 +-
 scripts/ci/ci_flake8.sh|   2 +-
 scripts/ci/ci_generate_requirements.sh |   2 +-
 scripts/ci/ci_load_image_to_kind.sh|   7 +-
 scripts/ci/ci_mypy.sh  |   2 +-
 scripts/ci/ci_perform_kind_cluster_operation.sh|   6 +-
 scripts/ci/ci_run_airflow_testing.sh   |   2 +-
 scripts/ci/ci_run_kubernetes_tests.sh  |   6 +-
 scripts/ci/ci_run_static_checks.sh |   2 +-
 scripts/ci/kubernetes/app/postgres.yaml|  94 -
 .../kubernetes/app/templates/airflow.template.yaml | 207 ---
 .../app/templates/configmaps.template.yaml | 395 -
 .../app/templates/init_git_sync.template.yaml  |  36 --
 scripts/ci/kubernetes/app/volumes.yaml |  87 -
 .../docker/airflow-test-env-init-dags.sh   |  36 --
 .../kubernetes/docker/airflow-test-env-init-db.sh  |  46 ---
 scripts/ci/kubernetes/docker/bootstrap.sh  |  74 
 scripts/ci/kubernetes/kind-cluster-conf.yaml   |   3 -
 .../kubernetes/{app/secrets.yaml => volumes.yaml}  |  29 +-
 scripts/ci/libraries/_build_images.sh  |  11 +-
 scripts/ci/libraries/_initialization.sh|  27 +-
 scripts/ci/libraries/_kind.sh  | 380 +++-
 scripts/ci/libraries/_verbosity.sh |  31 ++
 42 files changed, 424 insertions(+), 1410 deletions(-)

diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml
index d091d2e..195f7f7 100644
--- a/.github/workflows/ci.yml
+++ b/.github/workflows/ci.yml
@@ -100,7 +100,7 @@ jobs:
 steps:
   - uses: actions/checkout@master
   - name: "Build PROD image ${{ matrix.python-version }}"
-run: ./scripts/ci/ci_build_production_images.sh
+run: ./scripts/ci/ci_prepare_prod_image_on_ci.sh
 
   tests-kubernetes:
 timeout-minutes: 80
@@ -113,7 +113,11 @@ jobs:
 kube-mode:
   - image
 kubernetes-version:
-  - "v1.15.3"
+  - "v1.18.2"
+kind-version:
+  - "v0.8.0"
+helm-version:
+  - "v3.2.4"
   fail-fast: false
 env:
   BACKEND: postgres
@@ -126,6 +130,8 @@ jobs:
   PYTHON_MAJOR_MINOR_VERSION: "${{ matrix.python-version }}"
   KUBERNETES_MODE: "${{ matrix.kube-mode }}"
   KUBERNETES_VERSION: "${{ 

[airflow] 08/09: Removes importlib usage - it's not needed (fails on Airflow 1.10) (#9613)

2020-07-02 Thread kaxilnaik
This is an automated email from the ASF dual-hosted git repository.

kaxilnaik pushed a commit to branch v1-10-test
in repository https://gitbox.apache.org/repos/asf/airflow.git

commit 74ecb8ac1c275d19f8f27161a61e723905bce23e
Author: Jarek Potiuk 
AuthorDate: Wed Jul 1 18:07:12 2020 +0200

Removes importlib usage - it's not needed (fails on Airflow 1.10) (#9613)


(cherry picked from commit a3a52c78b274483f2035ad975fc218abd8ffdf8a)
---
 chart/templates/_helpers.yaml | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/chart/templates/_helpers.yaml b/chart/templates/_helpers.yaml
index 66d1850..ac121a4 100644
--- a/chart/templates/_helpers.yaml
+++ b/chart/templates/_helpers.yaml
@@ -205,7 +205,7 @@ log_connections = {{ .Values.pgbouncer.logConnections }}
   - python
   - -c
   - |
-import importlib
+import airflow
 import os
 import time
 
@@ -215,7 +215,7 @@ log_connections = {{ .Values.pgbouncer.logConnections }}
 
 from airflow import settings
 
-package_dir = 
os.path.dirname(importlib.util.find_spec('airflow').origin)
+package_dir = os.path.abspath(os.path.dirname(airflow.__file__))
 directory = os.path.join(package_dir, 'migrations')
 config = Config(os.path.join(package_dir, 'alembic.ini'))
 config.set_main_option('script_location', directory)
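For context, the change swaps `importlib.util.find_spec('airflow').origin` for `airflow.__file__` when locating the installed package directory; per the commit title the importlib call fails on Airflow 1.10, which still supported Python 2 (where `importlib.util.find_spec` does not exist). A minimal sketch of why the two expressions are interchangeable on Python 3, using the stdlib `json` package as a stand-in since Airflow itself may not be installed where this runs:

```python
# Two ways to locate a package's directory on Python 3. The chart moved
# from the first form to the second, which also works on Python 2.
# 'json' stands in for 'airflow' so the snippet is self-contained.
import importlib.util
import json
import os

# Form removed by the commit: resolve the package spec, take its origin
# (the path of json/__init__.py), then its directory.
spec_dir = os.path.dirname(importlib.util.find_spec("json").origin)

# Form added by the commit: import the package and use its __file__.
file_dir = os.path.abspath(os.path.dirname(json.__file__))

assert spec_dir == file_dir
```

In the template itself the resulting directory is then used to point Alembic at Airflow's bundled `migrations` folder, as the surrounding diff context shows.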



[airflow] 09/09: Update Breeze documentation (#9608)

2020-07-02 Thread kaxilnaik
This is an automated email from the ASF dual-hosted git repository.

kaxilnaik pushed a commit to branch v1-10-test
in repository https://gitbox.apache.org/repos/asf/airflow.git

commit 99d37e2d35a9d82103b35e4042c27a7f5620b568
Author: Jarek Potiuk 
AuthorDate: Wed Jul 1 16:02:24 2020 +0200

Update Breeze documentation (#9608)

* Update Breeze documentation

(cherry picked from commit f3e1f9a313d8a6f841f6a5c9f2663518fee16b8f)
---
 BREEZE.rst  | 293 
 TESTING.rst |   2 +-
 2 files changed, 198 insertions(+), 97 deletions(-)

diff --git a/BREEZE.rst b/BREEZE.rst
index 9b318e2..735286a 100644
--- a/BREEZE.rst
+++ b/BREEZE.rst
@@ -232,44 +232,6 @@ from your ``logs`` directory in the Airflow sources, so all logs created in the
 visible in the host as well. Every time you enter the container, the ``logs`` directory is
 cleaned so that logs do not accumulate.
 
-CLIs for cloud providers
-------------------------
-
-For development convenience we installed simple wrappers for the most common cloud providers CLIs. Those
-CLIs are not installed when you build or pull the image - they will be downloaded as docker images
-the first time you attempt to use them. It is downloaded and executed in your host's docker engine so once
-it is downloaded, it will stay until you remove the downloaded images from your host container.
-
-For each of those CLI credentials are taken (automatically) from the credentials you have defined in
-your ${HOME} directory on host.
-
-Those tools also have host Airflow source directory mounted in /opt/airflow path
-so you can directly transfer files to/from your airflow host sources.
-
-Those are currently installed CLIs (they are available as aliases to the docker commands):
-
-+-----------------------+----------+-------------------------------------------------+-------------------+
-| Cloud Provider        | CLI tool | Docker image                                    | Configuration dir |
-+=======================+==========+=================================================+===================+
-| Amazon Web Services   | aws      | amazon/aws-cli:latest                           | .aws              |
-+-----------------------+----------+-------------------------------------------------+-------------------+
-| Microsoft Azure       | az       | mcr.microsoft.com/azure-cli:latest              | .azure            |
-+-----------------------+----------+-------------------------------------------------+-------------------+
-| Google Cloud Platform | bq       | gcr.io/google.com/cloudsdktool/cloud-sdk:latest | .config/gcloud    |
-|                       +----------+-------------------------------------------------+-------------------+
-|                       | gcloud   | gcr.io/google.com/cloudsdktool/cloud-sdk:latest | .config/gcloud    |
-|                       +----------+-------------------------------------------------+-------------------+
-|                       | gsutil   | gcr.io/google.com/cloudsdktool/cloud-sdk:latest | .config/gcloud    |
-+-----------------------+----------+-------------------------------------------------+-------------------+
-
-For each of the CLIs we have also an accompanying ``*-update`` alias (for example ``aws-update``) which
-will pull the latest image for the tool. Note that all Google Cloud Platform tools are served by one
-image and they are updated together.
-
-Also - in case you run several different Breeze containers in parallel (from different directories,
-with different versions) - they docker images for CLI Cloud Providers tools are shared so if you update it
-for one Breeze container, they will also get updated for all the other containers.
-
 Using the Airflow Breeze Environment
 ====================================
 
@@ -287,6 +249,7 @@ Managing CI environment:
 * Stop running interactive environment with ``breeze stop`` command
 * Restart running interactive environment with ``breeze restart`` command
 * Run test specified with ``breeze tests`` command
+* Generate requirements with ``breeze generate-requirements`` command
 * Execute arbitrary command in the test environment with ``breeze shell`` command
 * Execute arbitrary docker-compose command with ``breeze docker-compose`` command
 * Push docker images with ``breeze push-image`` command (require committer's rights to push images)
@@ -319,7 +282,7 @@ Manage and Interact with Kubernetes tests environment:
 Run static checks:
 
 * Run static checks - either for currently staged change or for all files with
-  ``breeze static-check`` or ``breeze static-check-all-files`` command
+  ``breeze static-check`` command
 
 Build documentation:
 
@@ -330,10 +293,12 @@ Set up local development environment:
 * Setup local virtualenv with ``breeze setup-virtualenv`` command
 * Setup autocomplete for itself with ``breeze setup-autocomplete`` command
 
-

[airflow] 03/09: Remove non-existent chart value from readme (#9511)

2020-07-02 Thread kaxilnaik
This is an automated email from the ASF dual-hosted git repository.

kaxilnaik pushed a commit to branch v1-10-test
in repository https://gitbox.apache.org/repos/asf/airflow.git

commit 20e106d31dc125812d2c5c9e421a406cdb3e6958
Author: Ash Berlin-Taylor 
AuthorDate: Thu Jun 25 12:19:26 2020 +0100

Remove non-existent chart value from readme (#9511)

This was accidentally left over when this was extracted from
Astronomer's chart.

(cherry picked from commit 561060aaa82ddb63fe2a38473bfd920a5aeff786)
---
 chart/README.md | 1 -
 1 file changed, 1 deletion(-)

diff --git a/chart/README.md b/chart/README.md
index 8657eee..d0366f6 100644
--- a/chart/README.md
+++ b/chart/README.md
@@ -157,7 +157,6 @@ The following tables lists the configurable parameters of the Airflow chart and
 | `webserver.resources.limits.memory`           | Memory Limit of webserver                                               | `~`  |
 | `webserver.resources.requests.cpu`            | CPU Request of webserver                                                | `~`  |
 | `webserver.resources.requests.memory`         | Memory Request of webserver                                             | `~`  |
-| `webserver.jwtSigningCertificateSecretName`   | Name of secret to mount Airflow Webserver JWT singing certificate from  | `~`  |
 | `webserver.defaultUser`                       | Optional default airflow user information                               | `{}` |
 
 



[airflow] 02/09: Fix typo in helm chart upgrade command for 2.0 (#9484)

2020-07-02 Thread kaxilnaik
This is an automated email from the ASF dual-hosted git repository.

kaxilnaik pushed a commit to branch v1-10-test
in repository https://gitbox.apache.org/repos/asf/airflow.git

commit 47e1a875229657251fdeddaae5a2dd083572079b
Author: Ash Berlin-Taylor 
AuthorDate: Tue Jun 23 10:38:06 2020 +0100

Fix typo in helm chart upgrade command for 2.0 (#9484)


(cherry picked from commit b1cd382db9367ec828b8ee16899ecea9fcf824a7)
---
 chart/templates/scheduler/scheduler-deployment.yaml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/chart/templates/scheduler/scheduler-deployment.yaml 
b/chart/templates/scheduler/scheduler-deployment.yaml
index 1b46f6a..d5c3a06 100644
--- a/chart/templates/scheduler/scheduler-deployment.yaml
+++ b/chart/templates/scheduler/scheduler-deployment.yaml
@@ -96,7 +96,7 @@ spec:
   image: {{ template "airflow_image" . }}
   imagePullPolicy: {{ .Values.images.airflow.pullPolicy }}
   # Support running against 1.10.x and 2.0.0dev/master
-  args: ["bash", "-c", "airflow upgradedb || airfow db upgrade"]
+  args: ["bash", "-c", "airflow upgradedb || airflow db upgrade"]
   env:
   {{- include "custom_airflow_environment" . | indent 10 }}
   {{- include "standard_airflow_environment" . | indent 10 }}
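The fixed args line relies on bash's `||` short-circuit: `airflow db upgrade` runs only if `airflow upgradedb` exits non-zero, which is what lets one container spec support both the 1.10.x and 2.0 CLIs. A small demonstration of the short-circuit pattern with stand-in commands (assuming bash is on PATH; the Airflow CLI itself is not invoked here):

```python
import subprocess

# 'false' plays the role of the old command failing on a newer install;
# bash then falls through to the command after '||'.
out = subprocess.run(
    ["bash", "-c", "false || echo fallback"],
    capture_output=True,
    text=True,
)
assert out.returncode == 0
assert out.stdout.strip() == "fallback"
```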



[airflow] 04/09: Fix typo of resultBackendConnection in chart README (#9537)

2020-07-02 Thread kaxilnaik
This is an automated email from the ASF dual-hosted git repository.

kaxilnaik pushed a commit to branch v1-10-test
in repository https://gitbox.apache.org/repos/asf/airflow.git

commit a213847e99b4b9f0afb7a8c0b6fc3968d04d6e40
Author: Vicken Simonian 
AuthorDate: Fri Jun 26 11:40:30 2020 -0700

Fix typo of resultBackendConnection in chart README (#9537)


(cherry picked from commit 096f5c5cba963b364ee75f6686d128cd4d34d66e)
---
 chart/README.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/chart/README.md b/chart/README.md
index d0366f6..402a9d7 100644
--- a/chart/README.md
+++ b/chart/README.md
@@ -119,11 +119,11 @@ The following tables lists the configurable parameters of the Airflow chart and
 | `data.metadataSecretName`      | Secret name to mount Airflow connection string from                     | `~`     |
 | `data.resultBackendSecretName` | Secret name to mount Celery result backend connection string from       | `~`     |
 | `data.metadataConection`       | Field separated connection data (alternative to secret name)            | `{}`    |
-| `data.resultBakcnedConnection` | Field separated connection data (alternative to secret name)            | `{}`    |
+| `data.resultBackendConnection` | Field separated connection data (alternative to secret name)            | `{}`    |
 | `fernetKey`                    | String representing an Airflow fernet key                               | `~`     |
 | `fernetKeySecretName`          | Secret name for Airlow fernet key                                       | `~`     |
 | `workers.replicas`             | Replica count for Celery workers (if applicable)                        | `1`     |
-| `workers.keda.enabled` | Enable KEDA autoscaling features | `false` |
+| `workers.keda.enabled`         | Enable KEDA autoscaling features                                        | `false` |
 | `workers.keda.pollingInverval` | How often KEDA should poll the backend database for metrics in seconds  | `5`     |
 | `workers.keda.cooldownPeriod`  | How often KEDA should wait before scaling down in seconds               | `30`    |
 | `workers.keda.maxReplicaCount` | Maximum number of Celery workers KEDA can scale to                      | `10`    |



[airflow] 05/09: Remove redundant airflowVersion from Helm Chart readme (#9592)

2020-07-02 Thread kaxilnaik
This is an automated email from the ASF dual-hosted git repository.

kaxilnaik pushed a commit to branch v1-10-test
in repository https://gitbox.apache.org/repos/asf/airflow.git

commit 9bc18c1938ebd5f19e1de4a2ce7270957fd3fea6
Author: Kaxil Naik 
AuthorDate: Tue Jun 30 17:02:56 2020 +0100

Remove redundant airflowVersion from Helm Chart readme (#9592)

We no longer use `airflowVersion` , we instead use 
`defaultAirflowRepository` and `defaultAirflowTag`

(cherry picked from commit d6b323b0cd9be2aa941cbb1e1e15d766b4d6539b)
---
 chart/README.md | 1 -
 1 file changed, 1 deletion(-)

diff --git a/chart/README.md b/chart/README.md
index 402a9d7..089ea22 100644
--- a/chart/README.md
+++ b/chart/README.md
@@ -91,7 +91,6 @@ The following tables lists the configurable parameters of the Airflow chart and
 | `networkPolicies.enabled`  | Enable Network Policies to restrict traffic                                                 | `true`               |
 | `airflowHome`              | Location of airflow home directory                                                          | `/opt/airflow`       |
 | `rbacEnabled`              | Deploy pods with Kubernets RBAC enabled                                                     | `true`               |
-| `airflowVersion`           | Default Airflow image version                                                               | `1.10.5`             |
 | `executor`                 | Airflow executor (eg SequentialExecutor, LocalExecutor, CeleryExecutor, KubernetesExecutor) | `KubernetesExecutor` |
 | `allowPodLaunching`        | Allow airflow pods to talk to Kubernetes API to launch more pods                            | `true`               |
 | `defaultAirflowRepository` | Fallback docker repository to pull airflow image from                                       | `apache/airflow`     |



[airflow] 06/09: Fix broken link in chart/README.md (#9591)

2020-07-02 Thread kaxilnaik
This is an automated email from the ASF dual-hosted git repository.

kaxilnaik pushed a commit to branch v1-10-test
in repository https://gitbox.apache.org/repos/asf/airflow.git

commit e6b2d0f89d5e9348869c15dd78f5c1a03906ae38
Author: Kaxil Naik 
AuthorDate: Tue Jun 30 17:03:11 2020 +0100

Fix broken link in chart/README.md (#9591)

`CONTRIBUTING.md` -> `../CONTRIBUTING.rst`

(cherry picked from commit bbfaafeb552b48560960ab4aba84723b7ccbf386)
---
 chart/README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/chart/README.md b/chart/README.md
index 089ea22..76d14b4 100644
--- a/chart/README.md
+++ b/chart/README.md
@@ -264,4 +264,4 @@ to port-forward the Airflow UI to http://localhost:8080/ to cofirm Airflow is working
 
 ## Contributing
 
-Check out [our contributing guide!](CONTRIBUTING.md)
+Check out [our contributing guide!](../CONTRIBUTING.rst)



[airflow] branch v1-10-test updated (317b041 -> 99d37e2)

2020-07-02 Thread kaxilnaik
This is an automated email from the ASF dual-hosted git repository.

kaxilnaik pushed a change to branch v1-10-test
in repository https://gitbox.apache.org/repos/asf/airflow.git.


from 317b041  Update README.md for 1.10.11
 new d43ca01  Add Production Helm chart support (#8777)
 new 47e1a87  Fix typo in helm chart upgrade command for 2.0 (#9484)
 new 20e106d  Remove non-existent chart value from readme (#9511)
 new a213847  Fix typo of resultBackendConnection in chart README (#9537)
 new 9bc18c1  Remove redundant airflowVersion from Helm Chart readme (#9592)
 new e6b2d0f  Fix broken link in chart/README.md (#9591)
 new b4a620c  Switches to Helm Chart for Kubernetes tests (#9468)
 new 74ecb8a  Removes importlib usage - it's not needed (fails on Airflow 
1.10) (#9613)
 new 99d37e2  Update Breeze documentation (#9608)

The 9 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 .github/workflows/ci.yml   |  23 +-
 .pre-commit-config.yaml|   2 +-
 BREEZE.rst | 374 --
 CI.rst |   2 +-
 Dockerfile |   4 +
 IMAGES.rst |   3 +
 TESTING.rst|  67 ++--
 airflow/kubernetes/pod_launcher.py |   2 +-
 breeze |  51 ++-
 breeze-complete|  14 +-
 chart/.gitignore   |   9 +
 .../.helmignore|  33 +-
 .readthedocs.yml => chart/Chart.yaml   |  20 +-
 chart/README.md| 270 +
 chart/charts/postgresql-6.3.12.tgz | Bin 22754 -> 0 bytes
 chart/requirements.lock|   6 +
 .../libs/helper.py => chart/requirements.yaml  |  11 +-
 .../LICENSE.txt => chart/templates/NOTES.txt   |  13 +
 chart/templates/_helpers.yaml  | 260 
 chart/templates/cleanup/cleanup-cronjob.yaml   |  67 
 .../templates/cleanup/cleanup-serviceaccount.yaml  |  24 +-
 chart/templates/configmap.yaml | 119 ++
 chart/templates/create-user-job.yaml   |  87 
 chart/templates/flower/flower-deployment.yaml  | 102 +
 chart/templates/flower/flower-networkpolicy.yaml   |  51 +++
 .../templates/flower/flower-service.yaml   |  41 +-
 .../pod.yaml => chart/templates/limitrange.yaml|  33 +-
 .../templates/pgbouncer/pgbouncer-deployment.yaml  | 128 ++
 .../pgbouncer/pgbouncer-networkpolicy.yaml |  69 
 .../pgbouncer/pgbouncer-poddisruptionbudget.yaml   |  56 ++-
 chart/templates/pgbouncer/pgbouncer-service.yaml   |  56 +++
 .../templates/rbac/pod-cleanup-role.yaml   |  34 +-
 .../templates/rbac/pod-cleanup-rolebinding.yaml|  32 +-
 chart/templates/rbac/pod-launcher-role.yaml|  58 +++
 chart/templates/rbac/pod-launcher-rolebinding.yaml |  51 +++
 chart/templates/redis/redis-networkpolicy.yaml |  63 +++
 .../templates/redis/redis-service.yaml |  41 +-
 chart/templates/redis/redis-statefulset.yaml   |  99 +
 .../pod.yaml => chart/templates/resourcequota.yaml |  33 +-
 .../templates/scheduler/scheduler-deployment.yaml  | 195 +
 .../scheduler/scheduler-networkpolicy.yaml |  55 +++
 .../scheduler/scheduler-poddisruptionbudget.yaml   |  39 +-
 .../templates/scheduler/scheduler-service.yaml |  41 +-
 .../scheduler/scheduler-serviceaccount.yaml|  24 +-
 .../templates/secrets/elasticsearch-secret.yaml|  22 +-
 .../templates/secrets/fernetkey-secret.yaml|  27 +-
 .../secrets/metadata-connection-secret.yaml|  42 ++
 .../templates/secrets/pgbouncer-config-secret.yaml |  23 +-
 .../templates/secrets/pgbouncer-stats-secret.yaml  |  22 +-
 chart/templates/secrets/redis-secrets.yaml |  61 +++
 .../templates/secrets/registry-secret.yaml |  24 +-
 .../secrets/result-backend-connection-secret.yaml  |  37 ++
 chart/templates/statsd/statsd-deployment.yaml  |  87 
 chart/templates/statsd/statsd-networkpolicy.yaml   |  57 +++
 chart/templates/statsd/statsd-service.yaml |  56 +++
 .../templates/webserver/webserver-deployment.yaml  | 139 +++
 .../webserver/webserver-networkpolicy.yaml |  51 +++
 .../templates/webserver/webserver-service.yaml |  39 +-
 chart/templates/workers/worker-deployment.yaml | 161 
 chart/templates/workers/worker-kedaautoscaler.yaml |  47 +++
 chart/templates/workers/worker-networkpolicy.yaml  |  53 +++
 .../templates/workers/worker-service.yaml  |  

[GitHub] [airflow] mik-laj commented on pull request #9431: Move API page limit and offset parameters to views as kwargs Arguments

2020-07-02 Thread GitBox


mik-laj commented on pull request #9431:
URL: https://github.com/apache/airflow/pull/9431#issuecomment-653185780


   > This PR depends on #9503 , 
   
   I don't see the common parts. Can you say something more?
   
   Can you do a rebase? I would like to merge this change.







[GitHub] [airflow] aneesh-joseph commented on pull request #9044: Ensure Kerberos token is valid in SparkSubmitOperator before running `yarn kill`

2020-07-02 Thread GitBox


aneesh-joseph commented on pull request #9044:
URL: https://github.com/apache/airflow/pull/9044#issuecomment-653191413


   > Could you rebase to latest master, hopefully that should fix the failing 
Kube tests
   
   done, but failed again, is there a way to re-run them?







[GitHub] [airflow] kblibr opened a new pull request #9632: Fixing typo in chart/README.me

2020-07-02 Thread GitBox


kblibr opened a new pull request #9632:
URL: https://github.com/apache/airflow/pull/9632


   I found a simple typo in the readme and thought it would be good to fix it.
   -- No issue in github issues.
   -- No change to source. (no unit test changed.)
   -- Change did not alter the meaning of the readme.
   
   Make sure to mark the boxes below before creating PR: [x]
   
   - [X ] Description above provides context of the change
   - [ ] Unit tests coverage for changes (not needed for documentation changes)
   - [ ] Target Github ISSUE in description if exists
   - [X ] Commits follow "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)"
   - [X ] Relevant documentation is updated including usage instructions.
   - [ X] I will engage committers as explained in [Contribution Workflow Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example).
   
   ---
   In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
   Read the [Pull Request Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines) for more information.
   







[GitHub] [airflow] boring-cyborg[bot] commented on pull request #9632: Fixing typo in chart/README.me

2020-07-02 Thread GitBox


boring-cyborg[bot] commented on pull request #9632:
URL: https://github.com/apache/airflow/pull/9632#issuecomment-653198140


   Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about any anything please check our Contribution Guide (https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst)
   Here are some useful points:
   - Pay attention to the quality of your code (flake8, pylint and type annotations). Our [pre-commits](https://github.com/apache/airflow/blob/master/STATIC_CODE_CHECKS.rst#prerequisites-for-pre-commit-hooks) will help you with that.
   - In case of a new feature add useful documentation (in docstrings or in `docs/` directory). Adding a new operator? Check this short [guide](https://github.com/apache/airflow/blob/master/docs/howto/custom-operator.rst) Consider adding an example DAG that shows how users should use it.
   - Consider using [Breeze environment](https://github.com/apache/airflow/blob/master/BREEZE.rst) for testing locally, it’s a heavy docker but it ships with a working Airflow and a lot of integrations.
   - Be patient and persistent. It might take some time to get a review or get the final approval from Committers.
   - Please follow [ASF Code of Conduct](https://www.apache.org/foundation/policies/conduct) for all communication including (but not limited to) comments on Pull Requests, Mailing list and Slack.
   - Be sure to read the [Airflow Coding style](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#coding-style-and-best-practices).
   Apache Airflow is a community-driven project and together we are making it better .
   In case of doubts contact the developers at:
   Mailing List: d...@airflow.apache.org
   Slack: https://apache-airflow-slack.herokuapp.com/
   







[GitHub] [airflow] vanka56 commented on a change in pull request #9472: Add drop_partition functionality for HiveMetastoreHook

2020-07-02 Thread GitBox


vanka56 commented on a change in pull request #9472:
URL: https://github.com/apache/airflow/pull/9472#discussion_r449299190



##
File path: airflow/providers/apache/hive/hooks/hive.py
##
@@ -775,6 +775,23 @@ def table_exists(self, table_name, db='default'):
 except Exception:  # pylint: disable=broad-except
 return False
 
+def drop_partitions(self, table_name, part_vals, delete_data=False, db='default'):
+"""
+Drop partitions matching param_names input
+>>> hh = HiveMetastoreHook()
+>>> hh.drop_partitions(db='airflow', table_name='static_babynames',
+part_vals="['2020-05-01']")
+True

Review comment:
   Done!
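For readers following the review, a hypothetical sketch of how a `drop_partitions` helper with the signature quoted above might delegate to a metastore client; the stand-in client below is invented so the example stays self-contained and does not represent Airflow's actual Hive hook implementation:

```python
# Hypothetical sketch only: a recording client stands in for a real
# Hive metastore connection, which the actual hook would use.
class FakeMetastoreClient:
    """Records drop_partition calls instead of contacting a metastore."""
    def __init__(self):
        self.dropped = []

    def drop_partition(self, db, table_name, part_vals, delete_data):
        self.dropped.append((db, table_name, tuple(part_vals), delete_data))
        return True

def drop_partitions(client, table_name, part_vals, delete_data=False, db="default"):
    """Drop the partition identified by part_vals; True on success."""
    return client.drop_partition(db, table_name, part_vals, delete_data)

# Mirrors the docstring usage quoted in the diff.
client = FakeMetastoreClient()
ok = drop_partitions(client, "static_babynames", ["2020-05-01"], db="airflow")
assert ok is True
```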









[GitHub] [airflow] vanka56 commented on a change in pull request #9472: Add drop_partition functionality for HiveMetastoreHook

2020-07-02 Thread GitBox


vanka56 commented on a change in pull request #9472:
URL: https://github.com/apache/airflow/pull/9472#discussion_r449299122



##
File path: airflow/providers/apache/hive/hooks/hive.py
##
@@ -775,6 +775,23 @@ def table_exists(self, table_name, db='default'):
 except Exception:  # pylint: disable=broad-except
 return False
 
+def drop_partitions(self, table_name, part_vals, delete_data=False, db='default'):

Review comment:
   Done









[GitHub] [airflow] kaxil commented on a change in pull request #9632: Fixing typo in chart/README.me

2020-07-02 Thread GitBox


kaxil commented on a change in pull request #9632:
URL: https://github.com/apache/airflow/pull/9632#discussion_r449260079



##
File path: chart/README.md
##
@@ -74,8 +74,7 @@ helm upgrade airflow . \
   --set images.airflow.tag=8a0da78
 ```
 
-For local development purppose you can also u
-You can also build the image locally and use it via deployment method described by Breeze.
+For local development purppose you can also build the image locally and use it via deployment method described by Breeze.

Review comment:
   ```suggestion
   For local development purpose you can also build the image locally and use it via deployment method described by Breeze.
   ```









[GitHub] [airflow] Acehaidrey commented on pull request #9544: Add metric for scheduling delay between first run task & expected start time

2020-07-02 Thread GitBox


Acehaidrey commented on pull request #9544:
URL: https://github.com/apache/airflow/pull/9544#issuecomment-653242280


   @mik-laj please when you get a chance







[GitHub] [airflow] jpuffer opened a new issue #9633: "Try Number" is off by 1 in the Gantt view

2020-07-02 Thread GitBox


jpuffer opened a new issue #9633:
URL: https://github.com/apache/airflow/issues/9633


   **Apache Airflow version**: 1.10.10
   
   **Environment**:
   
   - **Cloud provider or hardware configuration**: Astronomer
   
   **What happened**:
   In the Gantt view, all tasks start at "Try number: 2" and increment from 
there.
   
   **What you expected to happen**:
   I expect the first try to be called try number 1.
   
   **How to reproduce it**:
   Look at gantt view for any DAG run
   
   
   **Anything else we need to know**:
   
   JIRA says this was solved as of v1.10.7, but I'm finding that's not the case:
   https://issues.apache.org/jira/browse/AIRFLOW-2143
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] jkbngl commented on pull request #9295: added mssql to oracle transfer operator

2020-07-02 Thread GitBox


jkbngl commented on pull request #9295:
URL: https://github.com/apache/airflow/pull/9295#issuecomment-653231283


   Hi @ephraimbuddy, would you mind helping me with the naming convention for operators? I got an error that my operator did not match `.*Operator$`, so I renamed it to MsSqlToOracleOperator; now I get an error for matching `.*To[A-Z0-9].*Operator$`. From my understanding, the issue is the "**To**".
   What would be the correct name for my operator? I also checked other operators, and they match `.*To[A-Z0-9].*Operator$` as well, e.g.:
   - FileToWasbOperator
   - OracleToAzureDataLakeTransferOperator
   - OracleToOracleTransferOperator

   Maybe it would be correct to move it into a transfer folder instead of the operator folder, but that is also not the case for the other transfer operators. Your help would be appreciated! Thanks
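As an aside, the patterns quoted above can be checked directly. This small snippet (illustration only, not part of the PR or the pre-commit hook) shows which candidate names match the two regexes mentioned in the comment:

```python
import re

# The two patterns quoted in the comment above.
operator_suffix = re.compile(r".*Operator$")
transfer_shape = re.compile(r".*To[A-Z0-9].*Operator$")

for name in ["MsSqlToOracle", "MsSqlToOracleOperator", "FileToWasbOperator",
             "OracleToOracleTransferOperator"]:
    print(name, bool(operator_suffix.match(name)), bool(transfer_shape.match(name)))
# MsSqlToOracle fails both checks; the other three names match both patterns.
```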



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] mik-laj commented on pull request #9295: added mssql to oracle transfer operator

2020-07-02 Thread GitBox


mik-laj commented on pull request #9295:
URL: https://github.com/apache/airflow/pull/9295#issuecomment-653234573


   Here is the naming convention:
   https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#naming-conventions-for-provider-packages
   Is this helpful for you?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] mik-laj commented on pull request #9623: Move ElasticsearchTaskHandler to the provider package

2020-07-02 Thread GitBox


mik-laj commented on pull request #9623:
URL: https://github.com/apache/airflow/pull/9623#issuecomment-653188369


   @potiuk  This will cherry-pick changes to the JavaScript code. If you are willing, we can do it. For now, we should delete references to this class in the backport packages. This will allow these classes to be used without this one new feature.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] boring-cyborg[bot] commented on pull request #9632: Fixing typo in chart/README.me

2020-07-02 Thread GitBox


boring-cyborg[bot] commented on pull request #9632:
URL: https://github.com/apache/airflow/pull/9632#issuecomment-653217267


   Awesome work, congrats on your first merged pull request!
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[airflow] branch master updated: Fixing typo in chart/README.me (#9632)

2020-07-02 Thread potiuk
This is an automated email from the ASF dual-hosted git repository.

potiuk pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/airflow.git


The following commit(s) were added to refs/heads/master by this push:
 new 72d5a58  Fixing typo in chart/README.me (#9632)
72d5a58 is described below

commit 72d5a58fd734cf4a02e351210b7db5ae2fae7e4e
Author: Bryant Larsen 
AuthorDate: Thu Jul 2 14:57:55 2020 -0600

Fixing typo in chart/README.me (#9632)

* Fixing typo in readme
---
 chart/README.md | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/chart/README.md b/chart/README.md
index 6cc361e..fb04517 100644
--- a/chart/README.md
+++ b/chart/README.md
@@ -74,8 +74,7 @@ helm upgrade airflow . \
   --set images.airflow.tag=8a0da78
 ```
 
-For local development purppose you can also u
-You can also build the image locally and use it via deployment method described by Breeze.
+For local development purpose you can also build the image locally and use it via deployment method described by Breeze.
 
 ## Parameters
 



[GitHub] [airflow] potiuk merged pull request #9632: Fixing typo in chart/README.me

2020-07-02 Thread GitBox


potiuk merged pull request #9632:
URL: https://github.com/apache/airflow/pull/9632


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] mik-laj commented on pull request #9295: added mssql to oracle transfer operator

2020-07-02 Thread GitBox


mik-laj commented on pull request #9295:
URL: https://github.com/apache/airflow/pull/9295#issuecomment-653235179


   Here is voting and discussion about transfer package: 
https://lists.apache.org/x/thread.html/r3514ef575b437b9eb368111b1e4b03ad7455e63d64c359c22fd6ea9a@%3Cdev.airflow.apache.org%3E



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] mik-laj commented on a change in pull request #9503: add date-time format validation for API spec

2020-07-02 Thread GitBox


mik-laj commented on a change in pull request #9503:
URL: https://github.com/apache/airflow/pull/9503#discussion_r449274053



##
File path: airflow/api_connexion/endpoints/dag_run_endpoint.py
##
@@ -69,24 +61,24 @@ def get_dag_runs(session, dag_id, start_date_gte=None, start_date_lte=None,
 
 # filter start date
 if start_date_gte:
-query = query.filter(DagRun.start_date >= start_date_gte)
+query = query.filter(DagRun.start_date >= timezone.parse(start_date_gte))

Review comment:
   We cannot make such a change to the specification because API clients 
must support this type of field. It will not be easy to add custom type support 
for all generated clients. We can try to make such a change only in memory when 
loading the file, but I'm not sure if we need it. The current solution meets 
all our requirements.
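A minimal sketch of the pattern under discussion: keep the field a plain string in the OpenAPI spec, and parse it into an aware datetime server-side before building the filter. `datetime.fromisoformat` stands in here for Airflow's `timezone.parse` helper (assumption: the real helper likewise returns an aware datetime):

```python
from datetime import datetime, timezone

def parse_iso8601(value: str) -> datetime:
    # Stand-in for airflow.utils.timezone.parse: accept an ISO-8601 string
    # from the query parameter and return an aware UTC datetime.
    dt = datetime.fromisoformat(value.replace("Z", "+00:00"))
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=timezone.utc)
    return dt

start_date_gte = "2020-06-30T16:56:02Z"
# The endpoint would then filter with the parsed value, e.g.:
# query = query.filter(DagRun.start_date >= parse_iso8601(start_date_gte))
print(parse_iso8601(start_date_gte))  # 2020-06-30 16:56:02+00:00
```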





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (AIRFLOW-6497) Scheduler creates DagBag in the same process with outdated info

2020-07-02 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-6497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17150617#comment-17150617
 ] 

ASF GitHub Bot commented on AIRFLOW-6497:
-

feng-tao commented on pull request #7597:
URL: https://github.com/apache/airflow/pull/7597#issuecomment-653247511


   just learn this pr, nice!



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Scheduler creates DagBag in the same process with outdated info
> ---
>
> Key: AIRFLOW-6497
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6497
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: scheduler
>Affects Versions: 1.10.7
>Reporter: Qian Yu
>Assignee: Kamil Bregula
>Priority: Major
> Fix For: 2.0.0
>
>
> The following code in scheduler_job.py seems to be called in the same process 
> as the scheduler. It creates a DagBag. But since scheduler is a long running 
> process, it does not pick up the latest changes made to DAGs. For example, 
> changes to retries count, on_failure_callback, newly added tasks, etc are not 
> reflected.
>  
> {code:python}
> if ti.try_number == try_number and ti.state == State.QUEUED:
> msg = ("Executor reports task instance {} finished ({}) "
>"although the task says its {}. Was the task "
>"killed externally?".format(ti, state, ti.state))
> Stats.incr('scheduler.tasks.killed_externally')
> self.log.error(msg)
> try:
> simple_dag = simple_dag_bag.get_dag(dag_id)
> dagbag = models.DagBag(simple_dag.full_filepath)
> dag = dagbag.get_dag(dag_id)
> ti.task = dag.get_task(task_id)
> ti.handle_failure(msg)
> except Exception:
> self.log.error("Cannot load the dag bag to handle 
> failure for %s"
>". Setting task to FAILED without 
> callbacks or "
>"retries. Do you have enough 
> resources?", ti)
> ti.state = State.FAILED
> session.merge(ti)
> session.commit()
> {code}
> This causes errors such as AttributeError due to stale code being hit. E.g. 
> when someone added a .join attribute to CustomOperator without bouncing the 
> scheduler, this is what he would get after a CeleryWorker timeout error 
> causes this line to be hit:
> {code}
> [2020-01-05 22:25:45,951] {dagbag.py:207} ERROR - Failed to import: 
> /dags/dag1.py
> Traceback (most recent call last):
>   File "/lib/python3.6/site-packages/airflow/models/dagbag.py", line 204, in 
> process_file
> m = imp.load_source(mod_name, filepath)
>   File "/usr/lib/python3.6/imp.py", line 172, in load_source
> module = _load(spec)
>   File "", line 684, in _load
>   File "", line 665, in _load_unlocked
>   File "", line 678, in exec_module
>   File "", line 219, in _call_with_frames_removed
>   File "/dags/dag1.py", line 280, in 
> task1 >> task2.join
> AttributeError: 'CustomOperator' object has no attribute 'join'
> [2020-01-05 22:25:45,951] {scheduler_job.py:1314} ERROR - Cannot load the dag 
> bag to handle failure for  [queued]>. Setting task to FAILED without callbacks or retries. Do you have 
> enough resources?
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
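The staleness described in the issue above is easy to reproduce outside Airflow. This self-contained illustration (not Airflow code) shows a long-lived module object missing an attribute that was added to its source file after import, until the source is re-executed, which is effectively what a fresh DagBag load does:

```python
import importlib.util
import pathlib
import sys
import tempfile

tmp = pathlib.Path(tempfile.mkdtemp())
mod_file = tmp / "custom_op.py"
mod_file.write_text("class CustomOperator:\n    pass\n")

# First import -- what a long-running scheduler did at startup.
spec = importlib.util.spec_from_file_location("custom_op", mod_file)
module = importlib.util.module_from_spec(spec)
sys.modules["custom_op"] = module
spec.loader.exec_module(module)
print(hasattr(module.CustomOperator, "join"))  # False

# Someone edits the DAG file while the process keeps running...
mod_file.write_text("class CustomOperator:\n    join = 'downstream'\n")
print(hasattr(module.CustomOperator, "join"))  # still False: stale module object

# ...and only re-executing the source (a fresh load) picks up the change.
spec.loader.exec_module(module)
print(hasattr(module.CustomOperator, "join"))  # True
```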


[jira] [Commented] (AIRFLOW-6497) Scheduler creates DagBag in the same process with outdated info

2020-07-02 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-6497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17150618#comment-17150618
 ] 

ASF GitHub Bot commented on AIRFLOW-6497:
-

feng-tao edited a comment on pull request #7597:
URL: https://github.com/apache/airflow/pull/7597#issuecomment-653247511


   just found out this pr, nice!



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org





--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [airflow] feng-tao edited a comment on pull request #7597: [AIRFLOW-6497] Avoid loading DAGs in the main scheduler loop

2020-07-02 Thread GitBox


feng-tao edited a comment on pull request #7597:
URL: https://github.com/apache/airflow/pull/7597#issuecomment-653247511


   just found out this pr, nice!



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] feng-tao commented on pull request #7597: [AIRFLOW-6497] Avoid loading DAGs in the main scheduler loop

2020-07-02 Thread GitBox


feng-tao commented on pull request #7597:
URL: https://github.com/apache/airflow/pull/7597#issuecomment-653247511


   just learn this pr, nice!



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] ephraimbuddy commented on pull request #9431: Move API page limit and offset parameters to views as kwargs Arguments

2020-07-02 Thread GitBox


ephraimbuddy commented on pull request #9431:
URL: https://github.com/apache/airflow/pull/9431#issuecomment-653202146


   > > This PR depends on #9503 ,
   > 
   > I don't see the common parts. Can you say something more?
   > 
   It is because I removed the `format_parameters` tests in #9503. I was hoping to add them back with this PR. However, I will rebase now and then add the tests on the other PR.
   
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] ITriangle opened a new pull request #9635: fix : _run_task_by_executor pickle_id is None

2020-07-02 Thread GitBox


ITriangle opened a new pull request #9635:
URL: https://github.com/apache/airflow/pull/9635


   `session.add` needs a commit; otherwise, `pickle_id` is None.
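For context, this is standard ORM behavior: adding an object to a session only buffers it, and the autoincrement primary key is assigned when the INSERT is actually flushed (a commit, or an explicit flush, triggers that). The class below is a toy, stdlib-only stand-in invented for illustration, not SQLAlchemy itself:

```python
import sqlite3

class MiniSession:
    """Toy stand-in for an ORM session: add() only buffers the object;
    the primary key is assigned when the INSERT is flushed at commit."""

    def __init__(self, conn):
        self.conn = conn
        self.pending = []

    def add(self, obj):
        self.pending.append(obj)          # no SQL issued yet

    def commit(self):
        for obj in self.pending:
            cur = self.conn.execute(
                "INSERT INTO dag_pickle (pickle) VALUES (?)", (obj["pickle"],))
            obj["id"] = cur.lastrowid     # id exists only after the INSERT ran
        self.pending.clear()
        self.conn.commit()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dag_pickle (id INTEGER PRIMARY KEY, pickle BLOB)")

session = MiniSession(conn)
dp = {"pickle": b"...", "id": None}
session.add(dp)
print(dp["id"])    # None -- session.add only queued the row
session.commit()
print(dp["id"])    # 1 -- assigned when the INSERT was flushed
```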
   
   
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] freedom1989 opened a new issue #9634: Cannot update XCOMs via RBAC UI

2020-07-02 Thread GitBox


freedom1989 opened a new issue #9634:
URL: https://github.com/apache/airflow/issues/9634


   **Apache Airflow version**: 1.10.10
   
   
   **Kubernetes version (if you are using kubernetes)** (use `kubectl version`):
   
   **Environment**:
   
   - **Cloud provider or hardware configuration**:
   - **OS** (e.g. from /etc/os-release):
   - **Kernel** (e.g. `uname -a`):
   - **Install tools**:
   - **Others**:
   
   **What happened**: 
   
   I cannot update XCOMs via the RBAC UI. It always shows me "Not a valid datetime value."
   I even tried `2020-06-30 16:56:02+00:00` and `2020-06-30 16:56:02` as the input.
   
   
   **What you expected to happen**:
   
   **How to reproduce it**:
   
   
![image](https://user-images.githubusercontent.com/3204415/86428573-a817f680-bd1f-11ea-913e-4ee011cf6671.png)
   
   **Anything else we need to know**:
   
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] stale[bot] closed pull request #8768: [POC] Mark keywords-only arguments in hook method signatures

2020-07-02 Thread GitBox


stale[bot] closed pull request #8768:
URL: https://github.com/apache/airflow/pull/8768


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



