[GitHub] [airflow] aneesh-joseph opened a new pull request #9628: fix static checks
aneesh-joseph opened a new pull request #9628:
URL: https://github.com/apache/airflow/pull/9628

---
Make sure to mark the boxes below before creating PR:

- [ ] Description above provides context of the change
- [ ] Unit tests coverage for changes (not needed for documentation changes)
- [ ] Target Github ISSUE in description if exists
- [ ] Commits follow "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)"
- [ ] Relevant documentation is updated including usage instructions.
- [ ] I will engage committers as explained in [Contribution Workflow Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example).

---
In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed.
In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x).
In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
Read the [Pull Request Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines) for more information.

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] turbaszek commented on pull request #9593: Improve handling Dataproc cluster creation with ERROR state
turbaszek commented on pull request #9593: URL: https://github.com/apache/airflow/pull/9593#issuecomment-652954399 @dossett I've added handling of this case
[GitHub] [airflow] potiuk commented on issue #9627: MySQL support in Helm airflow database config
potiuk commented on issue #9627: URL: https://github.com/apache/airflow/issues/9627#issuecomment-652966722 Ach sorry. CC: @schnie (you are both just one line apart when typing "Greg" ).
[GitHub] [airflow] albertocalderari commented on a change in pull request #9590: Improve idempotency of BigQueryInsertJobOperator
albertocalderari commented on a change in pull request #9590:
URL: https://github.com/apache/airflow/pull/9590#discussion_r449065871

## File path: airflow/providers/google/cloud/operators/bigquery.py

```diff
@@ -1692,32 +1692,52 @@ def prepare_template(self) -> None:
         with open(self.configuration, 'r') as file:
             self.configuration = json.loads(file.read())
 
+    def _submit_job(self, hook: BigQueryHook, job_id: str):
+        # Submit a new job
+        job = hook.insert_job(
+            configuration=self.configuration,
+            project_id=self.project_id,
+            location=self.location,
+            job_id=job_id,
+        )
+        # Start the job and wait for it to complete and get the result.
+        job.result()
+        return job
+
     def execute(self, context: Any):
         hook = BigQueryHook(
             gcp_conn_id=self.gcp_conn_id,
             delegate_to=self.delegate_to,
         )
-        job_id = self.job_id or f"airflow_{self.task_id}_{int(time())}"
+        exec_date = context['execution_date'].isoformat()
+        job_id = self.job_id or f"airflow_{self.dag_id}_{self.task_id}_{exec_date}"
+
         try:
-            job = hook.insert_job(
-                configuration=self.configuration,
-                project_id=self.project_id,
-                location=self.location,
-                job_id=job_id,
-            )
-            # Start the job and wait for it to complete and get the result.
-            job.result()
+            # Submit a new job
+            job = self._submit_job(hook, job_id)
         except Conflict:
+            # If the job already exists retrieve it
             job = hook.get_job(
                 project_id=self.project_id,
                 location=self.location,
                 job_id=job_id,
             )
-            # Get existing job and wait for it to be ready
-            for time_to_wait in exponential_sleep_generator(initial=10, maximum=120):
-                sleep(time_to_wait)
-                job.reload()
-                if job.done():
-                    break
+
+            if job.done() and job.error_result:
+                # The job exists and finished with an error and we are probably reruning it
+                # So we have to make a new job_id because it has to be unique
+                job_id = f"{self.job_id}_{int(time())}"
+                job = self._submit_job(hook, job_id)
+            elif not job.done():
+                # The job is still running so wait for it to be ready
+                for time_to_wait in exponential_sleep_generator(initial=10, maximum=120):
```

Review comment: When I hit result it polls too
[GitHub] [airflow] potiuk commented on pull request #9623: Move ElasticsearchTaskHandler to the provider package
potiuk commented on pull request #9623: URL: https://github.com/apache/airflow/pull/9623#issuecomment-653059073 If we want to support ExternalLoggingMixin we should also backport it to 1.10, but this might be a bit more complex - using Bowler refactoring (@turbaszek WDYT?)
[GitHub] [airflow] albertocalderari commented on a change in pull request #9590: Improve idempotency of BigQueryInsertJobOperator
albertocalderari commented on a change in pull request #9590:
URL: https://github.com/apache/airflow/pull/9590#discussion_r449065662

## File path: airflow/providers/google/cloud/operators/bigquery.py

```diff
@@ -1692,32 +1692,52 @@ def prepare_template(self) -> None:
         with open(self.configuration, 'r') as file:
             self.configuration = json.loads(file.read())
 
+    def _submit_job(self, hook: BigQueryHook, job_id: str):
+        # Submit a new job
+        job = hook.insert_job(
+            configuration=self.configuration,
+            project_id=self.project_id,
+            location=self.location,
+            job_id=job_id,
+        )
+        # Start the job and wait for it to complete and get the result.
+        job.result()
+        return job
+
     def execute(self, context: Any):
         hook = BigQueryHook(
             gcp_conn_id=self.gcp_conn_id,
             delegate_to=self.delegate_to,
         )
-        job_id = self.job_id or f"airflow_{self.task_id}_{int(time())}"
+        exec_date = context['execution_date'].isoformat()
+        job_id = self.job_id or f"airflow_{self.dag_id}_{self.task_id}_{exec_date}"
+
         try:
-            job = hook.insert_job(
-                configuration=self.configuration,
-                project_id=self.project_id,
-                location=self.location,
-                job_id=job_id,
-            )
-            # Start the job and wait for it to complete and get the result.
-            job.result()
+            # Submit a new job
+            job = self._submit_job(hook, job_id)
         except Conflict:
+            # If the job already exists retrieve it
             job = hook.get_job(
                 project_id=self.project_id,
                 location=self.location,
                 job_id=job_id,
             )
-            # Get existing job and wait for it to be ready
-            for time_to_wait in exponential_sleep_generator(initial=10, maximum=120):
-                sleep(time_to_wait)
-                job.reload()
-                if job.done():
-                    break
+
+            if job.done() and job.error_result:
+                # The job exists and finished with an error and we are probably reruning it
+                # So we have to make a new job_id because it has to be unique
+                job_id = f"{self.job_id}_{int(time())}"
+                job = self._submit_job(hook, job_id)
+            elif not job.done():
+                # The job is still running so wait for it to be ready
+                for time_to_wait in exponential_sleep_generator(initial=10, maximum=120):
```

Review comment:

```python
def bq_load(config: Config, client: bigquery.Client, event: Event, source_uri: str) -> JobResultEvent:
    logger.info("Loading GSheet to BQ")
    job_config = LoadJobConfig(
        autodetect=True,
        create_disposition="CREATE_IF_NEEDED",
        write_disposition="WRITE_TRUNCATE",
        source_format="NEWLINE_DELIMITED_JSON"
    )
    logger.info("loading_table")
    table_name = f"{config.project_id}.{config.dataset_name}.{event.destination_table}"
    job: LoadJob = client.load_table_from_uri("gs://" + source_uri, table_name, job_config=job_config)
    try:
        job.result()
        result_event = JobResultEvent.from_job_result_and_event(job, event)
    except (GoogleCloudError, TimeoutError, ClientError) as _:
        result_event = JobResultEvent.from_job_result_and_event(job, event)
        logger.error("BQ Job Failed")
    return result_event
```

That's one example of how I do it in an app
[GitHub] [airflow] albertocalderari commented on a change in pull request #9590: Improve idempotency of BigQueryInsertJobOperator
albertocalderari commented on a change in pull request #9590:
URL: https://github.com/apache/airflow/pull/9590#discussion_r449076100

## File path: airflow/providers/google/cloud/operators/bigquery.py

```diff
@@ -1692,32 +1692,52 @@ def prepare_template(self) -> None:
         with open(self.configuration, 'r') as file:
             self.configuration = json.loads(file.read())
 
+    def _submit_job(self, hook: BigQueryHook, job_id: str):
+        # Submit a new job
+        job = hook.insert_job(
+            configuration=self.configuration,
+            project_id=self.project_id,
+            location=self.location,
+            job_id=job_id,
+        )
+        # Start the job and wait for it to complete and get the result.
+        job.result()
+        return job
+
     def execute(self, context: Any):
         hook = BigQueryHook(
             gcp_conn_id=self.gcp_conn_id,
             delegate_to=self.delegate_to,
         )
-        job_id = self.job_id or f"airflow_{self.task_id}_{int(time())}"
+        exec_date = context['execution_date'].isoformat()
+        job_id = self.job_id or f"airflow_{self.dag_id}_{self.task_id}_{exec_date}"
+
         try:
-            job = hook.insert_job(
-                configuration=self.configuration,
-                project_id=self.project_id,
-                location=self.location,
-                job_id=job_id,
-            )
-            # Start the job and wait for it to complete and get the result.
-            job.result()
+            # Submit a new job
+            job = self._submit_job(hook, job_id)
         except Conflict:
+            # If the job already exists retrieve it
             job = hook.get_job(
                 project_id=self.project_id,
                 location=self.location,
                 job_id=job_id,
             )
-            # Get existing job and wait for it to be ready
-            for time_to_wait in exponential_sleep_generator(initial=10, maximum=120):
-                sleep(time_to_wait)
-                job.reload()
-                if job.done():
-                    break
+
+            if job.done() and job.error_result:
+                # The job exists and finished with an error and we are probably reruning it
+                # So we have to make a new job_id because it has to be unique
+                job_id = f"{self.job_id}_{int(time())}"
```

Review comment: I see - yet you won't be able to re-poll for this job, since it uses the current time, which is not reproducible on an "eventual" next run. Though it's much better than what is there now.
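The reproducibility concern raised above is that a `job_id` built from `int(time())` cannot be re-derived on a later retry, so the retry cannot re-poll the same job. A hedged sketch of a deterministic id, in the spirit of the PR's `airflow_{dag_id}_{task_id}_{exec_date}` scheme (the helper name and sanitization rule below are my own illustration, not the operator's actual code):

```python
import re

def make_job_id(dag_id: str, task_id: str, exec_date: str) -> str:
    """Build a BigQuery job_id that is stable across retries of the same run.

    BigQuery job ids may only contain letters, digits, underscores and dashes,
    so characters like ':' and '+' in an ISO timestamp must be replaced.
    """
    raw = f"airflow_{dag_id}_{task_id}_{exec_date}"
    return re.sub(r"[^0-9a-zA-Z_-]", "_", raw)
```

Because the id depends only on DAG, task and execution date, a rerun of the same task instance produces the same id and hits the `Conflict` branch, where the existing job can be retrieved and inspected instead of blindly resubmitted.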
[GitHub] [airflow] Dewsmen commented on issue #8557: Count SKIPPED as SUCCESS if wait_for_downstream=True
Dewsmen commented on issue #8557: URL: https://github.com/apache/airflow/issues/8557#issuecomment-652951751 The issue was fixed and merged with [PR](https://github.com/apache/airflow/pull/7735)
[GitHub] [airflow] Dewsmen closed issue #8557: Count SKIPPED as SUCCESS if wait_for_downstream=True
Dewsmen closed issue #8557: URL: https://github.com/apache/airflow/issues/8557
[GitHub] [airflow] boring-cyborg[bot] commented on pull request #9628: fix PR checks
boring-cyborg[bot] commented on pull request #9628: URL: https://github.com/apache/airflow/pull/9628#issuecomment-653020698 Awesome work, congrats on your first merged pull request!
[airflow] branch master updated (ee20086 -> 611d449)
This is an automated email from the ASF dual-hosted git repository.

turbaszek pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/airflow.git.

    from ee20086  Move S3TaskHandler to the AWS provider package (#9602)
     add 611d449  Use supports_read instead of is_supported in log endpoint (#9628)

No new revisions were added by this update.

Summary of changes:
 airflow/api_connexion/endpoints/log_endpoint.py    | 2 +-
 tests/api_connexion/endpoints/test_log_endpoint.py | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)
svn commit: r40270 - /dev/airflow/1.10.11rc1/
Author: kaxilnaik
Date: Thu Jul  2 14:43:49 2020
New Revision: 40270

Log:
Add artifacts for Airflow 1.10.11rc1

Added:
    dev/airflow/1.10.11rc1/
    dev/airflow/1.10.11rc1/apache-airflow-1.10.11rc1-bin.tar.gz   (with props)
    dev/airflow/1.10.11rc1/apache-airflow-1.10.11rc1-bin.tar.gz.asc
    dev/airflow/1.10.11rc1/apache-airflow-1.10.11rc1-bin.tar.gz.sha512
    dev/airflow/1.10.11rc1/apache-airflow-1.10.11rc1-source.tar.gz   (with props)
    dev/airflow/1.10.11rc1/apache-airflow-1.10.11rc1-source.tar.gz.asc
    dev/airflow/1.10.11rc1/apache-airflow-1.10.11rc1-source.tar.gz.sha512
    dev/airflow/1.10.11rc1/apache_airflow-1.10.11rc1-py2.py3-none-any.whl   (with props)
    dev/airflow/1.10.11rc1/apache_airflow-1.10.11rc1-py2.py3-none-any.whl.asc
    dev/airflow/1.10.11rc1/apache_airflow-1.10.11rc1-py2.py3-none-any.whl.sha512

Added: dev/airflow/1.10.11rc1/apache-airflow-1.10.11rc1-bin.tar.gz
==============================================================================
Binary file - no diff available.

Propchange: dev/airflow/1.10.11rc1/apache-airflow-1.10.11rc1-bin.tar.gz
------------------------------------------------------------------------------
    svn:mime-type = application/octet-stream

Added: dev/airflow/1.10.11rc1/apache-airflow-1.10.11rc1-bin.tar.gz.asc
==============================================================================
--- dev/airflow/1.10.11rc1/apache-airflow-1.10.11rc1-bin.tar.gz.asc (added)
+++ dev/airflow/1.10.11rc1/apache-airflow-1.10.11rc1-bin.tar.gz.asc Thu Jul  2 14:43:49 2020
@@ -0,0 +1,11 @@
+-----BEGIN PGP SIGNATURE-----
+
+iQEzBAABCAAdFiEEEnF1VgQO7y7q8bnCdfzNCiX6DksFAl798HMACgkQdfzNCiX6
+Dktsagf/Tq6SH4KyouwL9EuNMKQirkFg1HUDRubZS4xFKB4AmXtw/fuqTmwFun0E
+b00tmfIwVRaVRyC6sYx4OUp8MPrMklf7xugwpr0phd0244jcZcwsclm+W0oRkFen
+q8I0f+51Gjt1+NIUOrwS+HQxPQmUwdU8xvEXXTLN9hdijrixUBlEM7iV7zv2OFZy
+C+2/IDzxO0cP/YSwwOiqtynm03WI7skmBtGjBeEi7YU6+FiFnpKj+I+GZea8pmCw
+QToyrji07b1OH+6XE0xsLp1X1kwZYvWGCQMK1HWS2CMyQ3jMbp+lDAVLrpJ8BWPf
+IY3jW6w70aSzFaNXonQFSx9CSGbLdQ==
+=Dy08
+-----END PGP SIGNATURE-----

Added: dev/airflow/1.10.11rc1/apache-airflow-1.10.11rc1-bin.tar.gz.sha512
==============================================================================
--- dev/airflow/1.10.11rc1/apache-airflow-1.10.11rc1-bin.tar.gz.sha512 (added)
+++ dev/airflow/1.10.11rc1/apache-airflow-1.10.11rc1-bin.tar.gz.sha512 Thu Jul  2 14:43:49 2020
@@ -0,0 +1,4 @@
+apache-airflow-1.10.11rc1-bin.tar.gz: DDE50E9D 01513C8F D033F2C5 4A54AB83
+                                      94DB2E00 673F83CE F154F0D3 01284256
+                                      4E04ACDD 249F5F1D 505D3676 56B3EB74
+                                      001C2B5C B429431F 2D298DC8 DD9C18F9

Added: dev/airflow/1.10.11rc1/apache-airflow-1.10.11rc1-source.tar.gz
==============================================================================
Binary file - no diff available.

Propchange: dev/airflow/1.10.11rc1/apache-airflow-1.10.11rc1-source.tar.gz
------------------------------------------------------------------------------
    svn:mime-type = application/octet-stream

Added: dev/airflow/1.10.11rc1/apache-airflow-1.10.11rc1-source.tar.gz.asc
==============================================================================
--- dev/airflow/1.10.11rc1/apache-airflow-1.10.11rc1-source.tar.gz.asc (added)
+++ dev/airflow/1.10.11rc1/apache-airflow-1.10.11rc1-source.tar.gz.asc Thu Jul  2 14:43:49 2020
@@ -0,0 +1,11 @@
+-----BEGIN PGP SIGNATURE-----
+
+iQEzBAABCAAdFiEEEnF1VgQO7y7q8bnCdfzNCiX6DksFAl798G0ACgkQdfzNCiX6
+Dkvd6Af/QgtMu+Alj/aysZhLJHsZBKJ7EJJ40cmotrbRLuKtmpx0MXw37Vyb9R9b
+f9fiCVvRVNhDqBE+B9FdkXlGrmmC3z1veB+i3HCcTmC3MT1IxtZqAb020DGbMG02
+OueXILrkisAHLo+FH0anHSARzqoW1UaN/H1fPWSzQVz0yfkeRL11bsgeqLzYzORX
+IXzI9y6M85mdTyeTxGhn0CuzXctOUgXKgFkJ+fdIIhRZ2PWs/Q1+vsvngI8cu9QJ
+kEtAL9Y27keRbLmCC+w7Ps3ns3DeBkjEXQ6/cE72+ce/DZrg7sApfnXAd+SWLtX2
+w2wCswqYW1mUCFQRZUngMpkYLQFgoQ==
+=ytP8
+-----END PGP SIGNATURE-----

Added: dev/airflow/1.10.11rc1/apache-airflow-1.10.11rc1-source.tar.gz.sha512
==============================================================================
--- dev/airflow/1.10.11rc1/apache-airflow-1.10.11rc1-source.tar.gz.sha512 (added)
+++ dev/airflow/1.10.11rc1/apache-airflow-1.10.11rc1-source.tar.gz.sha512 Thu Jul  2 14:43:49 2020
@@ -0,0 +1,3 @@
+apache-airflow-1.10.11rc1-source.tar.gz:
+877F7562 7EA81ABF 17C97958 5A272147 BB756FBE 83518A81 14353207 42082370 6504A88E
+F07DCE3C 61FBAAFB B1C6C747 E69D04F2 DA092F99 4C020BCC 9083FEDE

Added: dev/airflow/1.10.11rc1/apache_airflow-1.10.11rc1-py2.py3-none-any.whl
==============================================================================
Binary file - no diff available.

Propchange: dev/airflow/1.10.11rc1/apache_airflow-1.10.11rc1-py2.py3-none-any.whl
------------------------------------------------------------------------------
    svn:mime-type = application/octet-stream

Added: dev/airflow/1.10.11rc1/apache_airflow-1.10.11rc1-py2.py3-none-any.whl.asc
[GitHub] [airflow] dossett commented on a change in pull request #9593: Improve handling Dataproc cluster creation with ERROR state
dossett commented on a change in pull request #9593:
URL: https://github.com/apache/airflow/pull/9593#discussion_r449067489

## File path: airflow/providers/google/cloud/operators/dataproc.py

```diff
@@ -502,32 +506,79 @@ def __init__(self,
         self.timeout = timeout
         self.metadata = metadata
         self.gcp_conn_id = gcp_conn_id
+        self.delete_on_error = delete_on_error
+
+    def _create_cluster(self, hook):
+        operation = hook.create_cluster(
+            project_id=self.project_id,
+            region=self.region,
+            cluster=self.cluster,
+            request_id=self.request_id,
+            retry=self.retry,
+            timeout=self.timeout,
+            metadata=self.metadata,
+        )
+        cluster = operation.result()
+        self.log.info("Cluster created.")
+        return cluster
+
+    def _delete_cluster(self, hook):
+        self.log.info("Deleting the cluster")
+        hook.delete_cluster(
+            region=self.region,
+            cluster_name=self.cluster_name,
+            project_id=self.project_id,
+        )
+        self.log.info("Cluster %s deleted", self.cluster_name)
```

Review comment: The ERROR cluster will eventually be lifecycled by dataproc, but in my experience that often takes longer (20-30 minutes) than typical retry counts / retry waits will cover.
[GitHub] [airflow] ephraimbuddy commented on pull request #9623: Move ElasticsearchTaskHandler to the provider package
ephraimbuddy commented on pull request #9623: URL: https://github.com/apache/airflow/pull/9623#issuecomment-653060319 @potiuk Thanks!
[GitHub] [airflow] olchas commented on a change in pull request #9593: Improve handling Dataproc cluster creation with ERROR state
olchas commented on a change in pull request #9593:
URL: https://github.com/apache/airflow/pull/9593#discussion_r449082822

## File path: airflow/providers/google/cloud/operators/dataproc.py

```diff
@@ -502,32 +506,79 @@ def __init__(self,
         self.timeout = timeout
         self.metadata = metadata
         self.gcp_conn_id = gcp_conn_id
+        self.delete_on_error = delete_on_error
+
+    def _create_cluster(self, hook):
+        operation = hook.create_cluster(
+            project_id=self.project_id,
+            region=self.region,
+            cluster=self.cluster,
+            request_id=self.request_id,
+            retry=self.retry,
+            timeout=self.timeout,
+            metadata=self.metadata,
+        )
+        cluster = operation.result()
+        self.log.info("Cluster created.")
+        return cluster
+
+    def _delete_cluster(self, hook):
+        self.log.info("Deleting the cluster")
+        hook.delete_cluster(
+            region=self.region,
+            cluster_name=self.cluster_name,
+            project_id=self.project_id,
+        )
+        self.log.info("Cluster %s deleted", self.cluster_name)
```

Review comment: But I meant to raise an exception **after** the cluster is deleted. Then, as far as I understand, on retry we would follow the logic of the cluster existing but being in 'DELETING' state, so we would get another chance to create it.
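The control flow olchas proposes above, delete the ERROR cluster and then raise so Airflow's normal task retry gets a clean attempt, can be sketched as follows. The hook method names mirror the quoted diff, but this is an illustration of the proposed flow, not the operator's actual code:

```python
class ClusterCreateError(Exception):
    """Raised when a Dataproc cluster comes up in ERROR state."""

def handle_error_state(hook, cluster_name: str, delete_on_error: bool = True) -> None:
    # Collect diagnostics first, while the broken cluster still exists.
    gcs_uri = hook.diagnose_cluster(cluster_name=cluster_name)
    if delete_on_error:
        hook.delete_cluster(cluster_name=cluster_name)
    # Raising after deletion means the next Airflow retry finds the cluster
    # gone (or still DELETING) and can attempt creation again.
    raise ClusterCreateError(
        f"Cluster {cluster_name} was created in ERROR state; diagnostics at {gcs_uri}"
    )
```

The key design point is ordering: diagnose, delete, then raise, so the retry path is never blocked by a half-dead cluster.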
[GitHub] [airflow] turbaszek commented on pull request #9624: Move StackdriverTaskHandler to the provider package
turbaszek commented on pull request #9624: URL: https://github.com/apache/airflow/pull/9624#issuecomment-653009381 @ephraimbuddy can you please take a look at the CI errors?
[GitHub] [airflow] turbaszek commented on pull request #9623: Move ElasticsearchTaskHandler to the provider package
turbaszek commented on pull request #9623: URL: https://github.com/apache/airflow/pull/9623#issuecomment-653009423 @ephraimbuddy can you please take a look at the CI errors?
[GitHub] [airflow] turbaszek merged pull request #9628: fix PR checks
turbaszek merged pull request #9628: URL: https://github.com/apache/airflow/pull/9628
[GitHub] [airflow] olchas commented on a change in pull request #9593: Improve handling Dataproc cluster creation with ERROR state
olchas commented on a change in pull request #9593:
URL: https://github.com/apache/airflow/pull/9593#discussion_r449056748

## File path: airflow/providers/google/cloud/operators/dataproc.py

```diff
@@ -502,32 +506,79 @@ def __init__(self,
         self.timeout = timeout
         self.metadata = metadata
         self.gcp_conn_id = gcp_conn_id
+        self.delete_on_error = delete_on_error
+
+    def _create_cluster(self, hook):
+        operation = hook.create_cluster(
+            project_id=self.project_id,
+            region=self.region,
+            cluster=self.cluster,
+            request_id=self.request_id,
+            retry=self.retry,
+            timeout=self.timeout,
+            metadata=self.metadata,
+        )
+        cluster = operation.result()
+        self.log.info("Cluster created.")
+        return cluster
+
+    def _delete_cluster(self, hook):
+        self.log.info("Deleting the cluster")
+        hook.delete_cluster(
+            region=self.region,
+            cluster_name=self.cluster_name,
+            project_id=self.project_id,
+        )
+        self.log.info("Cluster %s deleted", self.cluster_name)
+
+    def _get_cluster(self, hook):
+        return hook.get_cluster(
+            project_id=self.project_id,
+            region=self.region,
+            cluster_name=self.cluster_name,
+            retry=self.retry,
+            timeout=self.timeout,
+            metadata=self.metadata,
+        )
+
+    def _handle_error_state(self, hook):
+        self.log.info("Cluster is in ERROR state")
+        gcs_uri = hook.diagnose_cluster(
+            region=self.region,
+            cluster_name=self.cluster_name,
+            project_id=self.project_id,
+        )
+        self.log.info(
+            'Diagnostic information for cluster %s available at: %s',
+            self.cluster_name, gcs_uri
+        )
+        if self.delete_on_error:
+            self._delete_cluster(hook)
 
     def execute(self, context):
         self.log.info('Creating cluster: %s', self.cluster_name)
         hook = DataprocHook(gcp_conn_id=self.gcp_conn_id)
         try:
-            operation = hook.create_cluster(
-                project_id=self.project_id,
-                region=self.region,
-                cluster=self.cluster,
-                request_id=self.request_id,
-                retry=self.retry,
-                timeout=self.timeout,
-                metadata=self.metadata,
-            )
-            cluster = operation.result()
-            self.log.info("Cluster created.")
+            cluster = self._create_cluster(hook)
         except AlreadyExists:
-            cluster = hook.get_cluster(
-                project_id=self.project_id,
-                region=self.region,
-                cluster_name=self.cluster_name,
-                retry=self.retry,
-                timeout=self.timeout,
-                metadata=self.metadata,
-            )
             self.log.info("Cluster already exists.")
+            cluster = self._get_cluster(hook)
+
+        if cluster.status.state == cluster.status.ERROR:
+            self._handle_error_state(hook)
+        elif cluster.status.state == cluster.status.DELETING:
+            # Wait for cluster to delete
+            for time_to_sleep in exponential_sleep_generator(initial=10, maximum=120):
```

Review comment: I know that we are waiting here for an operation that is supposed to finish within a finite amount of time, but what do you think about adding a maximum amount of total sleep before raising an exception?
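The cap olchas asks for above, bounding total sleep rather than only the per-attempt delay, can be sketched like this. The doubling-with-a-ceiling backoff mimics the semantics of `exponential_sleep_generator(initial, maximum)`; the function name and `total_timeout` parameter are illustrative, not the PR's code:

```python
import time

def wait_until(condition, initial: float = 10.0, maximum: float = 120.0,
               total_timeout: float = 600.0) -> None:
    """Poll `condition` with exponential backoff, but raise TimeoutError once
    the accumulated sleep would exceed `total_timeout`."""
    waited = 0.0
    delay = initial
    while True:
        if condition():
            return
        if waited + delay > total_timeout:
            raise TimeoutError(f"condition not met after {waited:.0f}s of waiting")
        time.sleep(delay)
        waited += delay
        delay = min(delay * 2, maximum)  # exponential growth, capped per attempt
```

With such a bound, a cluster stuck in DELETING fails the task with a clear error instead of blocking the worker slot indefinitely.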
[GitHub] [airflow] potiuk commented on pull request #9623: Move ElasticsearchTaskHandler to the provider package
potiuk commented on pull request #9623: URL: https://github.com/apache/airflow/pull/9623#issuecomment-653056290

Look at the logs @ephraimbuddy -> the "ExternalLoggingMixin" is a bit strange as I cannot see it in the logging_mixin. Where is it from?

```
Traceback (most recent call last):
  File "/import_all_provider_classes.py", line 61, in import_all_provider_classes
    _module = importlib.import_module(module_name)
  File "/usr/local/lib/python3.6/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 994, in _gcd_import
  File "<frozen importlib._bootstrap>", line 971, in _find_and_load
  File "<frozen importlib._bootstrap>", line 955, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 665, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 678, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/usr/local/lib/python3.6/site-packages/airflow/providers/elasticsearch/log/es_task_handler.py", line 34, in <module>
    from airflow.utils.log.logging_mixin import ExternalLoggingMixin, LoggingMixin
ImportError: cannot import name 'ExternalLoggingMixin'
ERROR ENCOUNTERED!
```
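The ImportError above is the classic failure mode when a provider module imports a symbol that only exists in a newer Airflow core. One common defensive pattern (shown purely as an assumption-laden sketch, not the fix this PR ultimately used) is a guarded import with a fallback stub so the module stays importable on older cores:

```python
# Hedged sketch: guard an import of a symbol (here ExternalLoggingMixin) that
# may not exist in the installed Airflow version. The fallback stub is a
# hypothetical placeholder, not part of Airflow's API.
try:
    from airflow.utils.log.logging_mixin import ExternalLoggingMixin
except ImportError:
    class ExternalLoggingMixin:  # fallback stub for cores without the mixin
        pass
```

Whether a stub is acceptable depends on what callers expect from the mixin; for backported provider packages, a conditional feature flag may be the safer choice.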
[GitHub] [airflow] turbaszek commented on a change in pull request #9277: [WIP] Health endpoint spec
turbaszek commented on a change in pull request #9277:
URL: https://github.com/apache/airflow/pull/9277#discussion_r448975124

## File path: airflow/api_connexion/endpoints/health_endpoint.py

```diff
@@ -14,13 +14,35 @@
 # KIND, either express or implied. See the License for the
 # specific language governing permissions and limitations
 # under the License.
-
-# TODO(mik-laj): We have to implement it.
-# Do you want to help? Please look at: https://github.com/apache/airflow/issues/8144
+from airflow.api_connexion.schemas.health_schema import health_schema
+from airflow.jobs.scheduler_job import SchedulerJob
 
 
 def get_health():
     """
-    Checks if the API works
+    Return the health of the airflow scheduler and metadatabase
     """
-    return "OK"
+    HEALTHY = "healthy"  # pylint: disable=invalid-name
+    UNHEALTHY = "unhealthy"  # pylint: disable=invalid-name
```

Review comment: If you move those out of the function, there will be no need to disable pylint :)
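The refactor suggested above works because pylint's `invalid-name` check expects UPPER_CASE names only at module level, not inside a function. A minimal sketch of the hoisted-constant shape (the health-payload details are illustrative, not the actual endpoint code):

```python
# Module-level constants satisfy pylint's naming convention, so no
# "# pylint: disable=invalid-name" comments are needed.
HEALTHY = "healthy"
UNHEALTHY = "unhealthy"

def get_health(scheduler_alive: bool) -> dict:
    """Return a minimal health payload for the metadatabase and scheduler.

    `scheduler_alive` stands in for the real heartbeat check against
    the scheduler job, which is out of scope for this sketch.
    """
    return {
        "metadatabase": {"status": HEALTHY},
        "scheduler": {"status": HEALTHY if scheduler_alive else UNHEALTHY},
    }
```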
[GitHub] [airflow] turbaszek commented on pull request #9616: local job heartbeat callback should use session from provide_session
turbaszek commented on pull request #9616: URL: https://github.com/apache/airflow/pull/9616#issuecomment-653012832 @pingzh can you please take a look at the failing tests?
[GitHub] [airflow] kaxil commented on pull request #8992: [AIRFLOW-5391] Do not re-run skipped tasks when they are cleared (#7276)
kaxil commented on pull request #8992: URL: https://github.com/apache/airflow/pull/8992#issuecomment-653022789 Hi @yuqian90, apologies, this won't make it into 1.10.11 as the LatestOnlyOperator change alters behaviour. Is it possible for you to achieve this without changing the behaviour (referring to the note in UPDATING.md), or by adding a flag to choose between the old and new behaviour (with the old behaviour as the default)?
[GitHub] [airflow] turbaszek commented on pull request #9604: Move CloudwatchTaskHandler to the provider package
turbaszek commented on pull request #9604: URL: https://github.com/apache/airflow/pull/9604#issuecomment-653022472 @ephraimbuddy please rebase, I've just merged this fix in #9628
[GitHub] [airflow] boring-cyborg[bot] commented on pull request #9629: Updated link to official documentation
boring-cyborg[bot] commented on pull request #9629: URL: https://github.com/apache/airflow/pull/9629#issuecomment-653036293 Awesome work, congrats on your first merged pull request!
[GitHub] [airflow] turbaszek merged pull request #9629: Updated link to official documentation
turbaszek merged pull request #9629: URL: https://github.com/apache/airflow/pull/9629
[airflow] branch master updated (611d449 -> 37ca8ad)
This is an automated email from the ASF dual-hosted git repository.

turbaszek pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/airflow.git.

    from 611d449  Use supports_read instead of is_supported in log endpoint (#9628)
     add 37ca8ad  Updated link to official documentation (#9629)

No new revisions were added by this update.

Summary of changes:
 docs/project.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
[GitHub] [airflow] ephraimbuddy commented on pull request #9623: Move ElasticsearchTaskHandler to the provider package
ephraimbuddy commented on pull request #9623: URL: https://github.com/apache/airflow/pull/9623#issuecomment-653044347 Hi @turbaszek , what could be the cause of the backport packages CI build error? I can't seem to figure it out
[GitHub] [airflow] turbaszek opened a new pull request #9631: Add function to get current context
turbaszek opened a new pull request #9631: URL: https://github.com/apache/airflow/pull/9631 Support for getting the current context at any code location that runs under the scope of the BaseOperator.execute function. This functionality is part of AIP-31. closes: #8058 Work based on @jonathanshir's PR #8651 --- Make sure to mark the boxes below before creating PR: [x] - [x] Description above provides context of the change - [ ] Unit tests coverage for changes (not needed for documentation changes) - [x] Target Github ISSUE in description if exists - [x] Commits follow "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)" - [ ] Relevant documentation is updated including usage instructions. - [x] I will engage committers as explained in [Contribution Workflow Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example). --- In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed. In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x). In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md). Read the [Pull Request Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines) for more information. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
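The "get current context from any code location under execute()" mechanism described in #9631 can be approximated with a small context-stack sketch. All names here (`set_current_context`, `_context_stack`) are illustrative stand-ins; the actual implementation in the PR may differ:

```python
from contextlib import contextmanager
from typing import Any, Dict, Iterator, List

# Stand-in for the stack the operator would push to around execute().
_context_stack: List[Dict[str, Any]] = []


@contextmanager
def set_current_context(context: Dict[str, Any]) -> Iterator[None]:
    """Push a context for the duration of a task's execute() call."""
    _context_stack.append(context)
    try:
        yield
    finally:
        _context_stack.pop()


def get_current_context() -> Dict[str, Any]:
    """Return the context of the currently executing task, from any call depth."""
    if not _context_stack:
        raise RuntimeError(
            "No context found; get_current_context() only works inside execute()"
        )
    return _context_stack[-1]
```

Any helper called (directly or indirectly) from inside the `with set_current_context(...)` block can then reach the task's context without it being threaded through every function signature.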
[GitHub] [airflow] olchas commented on a change in pull request #9593: Improve handling Dataproc cluster creation with ERROR state
olchas commented on a change in pull request #9593: URL: https://github.com/apache/airflow/pull/9593#discussion_r449062517 ## File path: airflow/providers/google/cloud/operators/dataproc.py ## @@ -437,6 +437,9 @@ class DataprocCreateClusterOperator(BaseOperator): :type project_id: str :param region: leave as 'global', might become relevant in the future. (templated) :type region: str +:parm delete_on_error: If true the claster will be deleted if created with ERROR state. Default Review comment: ```suggestion :param delete_on_error: If true the cluster will be deleted if created with ERROR state. Default ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] olchas commented on a change in pull request #9593: Improve handling Dataproc cluster creation with ERROR state
olchas commented on a change in pull request #9593: URL: https://github.com/apache/airflow/pull/9593#discussion_r449062072 ## File path: airflow/providers/google/cloud/operators/dataproc.py ## @@ -502,32 +506,79 @@ def __init__(self, self.timeout = timeout self.metadata = metadata self.gcp_conn_id = gcp_conn_id +self.delete_on_error = delete_on_error + +def _create_cluster(self, hook): +operation = hook.create_cluster( +project_id=self.project_id, +region=self.region, +cluster=self.cluster, +request_id=self.request_id, +retry=self.retry, +timeout=self.timeout, +metadata=self.metadata, +) +cluster = operation.result() +self.log.info("Cluster created.") +return cluster + +def _delete_cluster(self, hook): +self.log.info("Deleting the cluster") +hook.delete_cluster( +region=self.region, +cluster_name=self.cluster_name, +project_id=self.project_id, +) +self.log.info("Cluster %s deleted", self.cluster_name) Review comment: I wonder if it would not be a good idea to raise an exception here. It just seems weird to me that an operator that is supposed to create a cluster ends up deleting one instead and even returns a reference to no-longer-existing cluster. Raising an exception would also allow a second attempt to create the cluster on retry (if retries apply, of course). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] potiuk commented on pull request #9628: fix PR checks
potiuk commented on pull request #9628: URL: https://github.com/apache/airflow/pull/9628#issuecomment-652965696 I think this was a transient error - rerunning it This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] OmairK commented on a change in pull request #9277: [WIP] Health endpoint spec
OmairK commented on a change in pull request #9277: URL: https://github.com/apache/airflow/pull/9277#discussion_r448972246 ## File path: airflow/api_connexion/endpoints/health_endpoint.py ## @@ -14,13 +14,34 @@ # KIND, either express or implied. See the License for the # specific language governing permissions and limitations # under the License. - -# TODO(mik-laj): We have to implement it. -# Do you want to help? Please look at: https://github.com/apache/airflow/issues/8144 +from airflow.api_connexion.schemas.health_schema import health_schema +from airflow.configuration import conf +from airflow.jobs.scheduler_job import SchedulerJob +from airflow.utils.session import provide_session def get_health(): """ -Checks if the API works +Return the health of the airflow scheduler and metadatabase """ -return "OK" +payload = { +'metadatabase': {'status': 'unhealthy'} +} + +latest_scheduler_heartbeat = None +scheduler_status = 'unhealthy' +payload['metadatabase'] = {'status': 'healthy'} Review comment: Thanks, fixed it. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] OmairK commented on a change in pull request #9277: [WIP] Health endpoint spec
OmairK commented on a change in pull request #9277: URL: https://github.com/apache/airflow/pull/9277#discussion_r448971809 ## File path: airflow/api_connexion/endpoints/health_endpoint.py ## @@ -14,13 +14,34 @@ # KIND, either express or implied. See the License for the # specific language governing permissions and limitations # under the License. - -# TODO(mik-laj): We have to implement it. -# Do you want to help? Please look at: https://github.com/apache/airflow/issues/8144 +from airflow.api_connexion.schemas.health_schema import health_schema +from airflow.configuration import conf +from airflow.jobs.scheduler_job import SchedulerJob +from airflow.utils.session import provide_session def get_health(): """ -Checks if the API works +Return the health of the airflow scheduler and metadatabase """ -return "OK" +payload = { +'metadatabase': {'status': 'unhealthy'} +} + +latest_scheduler_heartbeat = None +scheduler_status = 'unhealthy' +payload['metadatabase'] = {'status': 'healthy'} Review comment: Thanks. [Fixed it](https://github.com/apache/airflow/pull/9277/files#diff-d7bb321505e9703c67dc7c78ff5b55deR25-R48) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
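The heartbeat check that the health endpoint in #9277 builds its payload around can be sketched as a pure function. The 30-second threshold and the field names are assumptions for illustration (Airflow reads the real threshold from its configuration):

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Dict, Optional


def build_health_payload(
    latest_heartbeat: Optional[datetime],
    threshold_secs: int = 30,  # assumed default; the real value comes from config
    now: Optional[datetime] = None,
) -> Dict[str, Any]:
    """Report metadatabase and scheduler health from the latest scheduler heartbeat."""
    now = now or datetime.now(timezone.utc)
    scheduler_status = "unhealthy"
    if latest_heartbeat is not None and now - latest_heartbeat < timedelta(seconds=threshold_secs):
        scheduler_status = "healthy"
    return {
        # Reaching this code at all implies the metadatabase query succeeded.
        "metadatabase": {"status": "healthy"},
        "scheduler": {
            "status": scheduler_status,
            "latest_scheduler_heartbeat": (
                latest_heartbeat.isoformat() if latest_heartbeat else None
            ),
        },
    }
```

Separating the decision from the database query makes the stale-heartbeat and no-heartbeat branches easy to unit-test without a running scheduler.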
[GitHub] [airflow] j-y-matsubara commented on a change in pull request #9531: Support .airflowignore for plugins
j-y-matsubara commented on a change in pull request #9531: URL: https://github.com/apache/airflow/pull/9531#discussion_r448993794 ## File path: airflow/utils/file.py ## @@ -90,6 +90,47 @@ def open_maybe_zipped(fileloc, mode='r'): return io.open(fileloc, mode=mode) +def find_path_from_directory( +base_dir_path: str, +ignore_list_file: str) -> Generator[str, None, None]: +""" +Search the file and return the path of the file that should not be ignored. +:param base_dir_path: the base path to be searched for. +:param ignore_file_list_name: the file name in which specifies a regular expression pattern is written. Review comment: I'm sorry. It's my simple mistake. I've fixed it. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
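A simplified sketch of the `.airflowignore`-style traversal discussed in #9531: walk a directory and yield only files not matched by the regex patterns in an ignore file. This only honours a single top-level ignore file, unlike the real DAG-folder traversal, which also reads nested ignore files:

```python
import os
import re
from typing import Iterator, List, Pattern


def find_path_from_directory(base_dir_path: str, ignore_list_file: str) -> Iterator[str]:
    """Yield relative paths of files under base_dir_path that are not matched
    by any regex pattern listed in the ignore file (simplified sketch)."""
    patterns: List[Pattern[str]] = []
    ignore_path = os.path.join(base_dir_path, ignore_list_file)
    if os.path.isfile(ignore_path):
        with open(ignore_path) as f:
            # One regex per line; blank lines and comments are skipped.
            patterns = [
                re.compile(line.strip())
                for line in f
                if line.strip() and not line.startswith("#")
            ]
    for root, _dirs, files in os.walk(base_dir_path):
        for name in files:
            if name == ignore_list_file:
                continue  # don't yield the ignore file itself
            rel_path = os.path.relpath(os.path.join(root, name), base_dir_path)
            if not any(p.search(rel_path) for p in patterns):
                yield rel_path
```

Note that, matching `.airflowignore` semantics, the lines are regular expressions (matched with `search`), not gitignore-style globs.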
[GitHub] [airflow] turbaszek commented on a change in pull request #9615: Mask other forms of password arguments in SparkSubmitOperator
turbaszek commented on a change in pull request #9615: URL: https://github.com/apache/airflow/pull/9615#discussion_r449012333 ## File path: airflow/providers/apache/spark/hooks/spark_submit.py ## @@ -237,8 +237,8 @@ def _mask_cmd(self, connection_cmd): # Mask any password related fields in application args with key value pair # where key contains password (case insensitive), e.g. HivePassword='abc' connection_cmd_masked = re.sub( -r"(\S*?(?:secret|password)\S*?\s*=\s*')[^']*(?=')", -r'\1**', ' '.join(connection_cmd), flags=re.I) +r"(\S*?(?:secret|password)\S*?\s*(?:=|\s+)(['\"]?))[^'^\"]+(\2)", Review comment: It would be nice This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] turbaszek commented on a change in pull request #9615: Mask other forms of password arguments in SparkSubmitOperator
turbaszek commented on a change in pull request #9615: URL: https://github.com/apache/airflow/pull/9615#discussion_r449011845 ## File path: tests/providers/apache/spark/hooks/test_spark_submit.py ## @@ -748,3 +750,64 @@ def test_k8s_process_on_kill(self, mock_popen, mock_client_method): client.delete_namespaced_pod.assert_called_once_with( 'spark-pi-edf2ace37be7353a958b38733a12f8e6-driver', 'mynamespace', **kwargs) + + +@pytest.mark.parametrize( +("command", "expected"), +( +( +("spark-submit", "foo", "--bar", "baz", "--password='secret'"), +"spark-submit foo --bar baz --password='**'", +), +( +("spark-submit", "foo", "--bar", "baz", '--password="secret"'), +'spark-submit foo --bar baz --password="**"', +), +( +("spark-submit", "foo", "--bar", "baz", "--password=secret"), +"spark-submit foo --bar baz --password=**", +), +( +("spark-submit", "foo", "--bar", "baz", "--password 'secret'"), +"spark-submit foo --bar baz --password '**'", +), +( +("spark-submit", "foo", "--bar", "baz", "--password secret"), +"spark-submit foo --bar baz --password **", +), +( +("spark-submit", "foo", "--bar", "baz", '--password "secret"'), +'spark-submit foo --bar baz --password "**"', +), +( +("spark-submit", "foo", "--bar", "baz", "--secret='secret'"), +"spark-submit foo --bar baz --secret='**'", +), +( +("spark-submit", "foo", "--bar", "baz", "--foo.password='secret'"), +"spark-submit foo --bar baz --foo.password='**'", +), +( +("spark-submit",), +"spark-submit", +), + +( +("spark-submit", "foo", "--bar", "baz", "--password \"secret'"), +"spark-submit foo --bar baz --password \"secret'", +), +( +("spark-submit", "foo", "--bar", "baz", "--password 'secret\""), +"spark-submit foo --bar baz --password 'secret\"", +), +), +) +def test_masks_passwords(command: str, expected: str) -> None: Review comment: You can use https://pypi.org/project/parameterized/ in future we will probably migrate to pytest This is an automated message from the Apache Git Service. 
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
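The masking regex under review in #9615 can be exercised standalone. The pattern below is taken verbatim from the quoted diff; the replacement string `r"\1**\3"` is an assumption, since the diff hunk only shows the changed pattern line:

```python
import re
from typing import Sequence

# Pattern copied from the PR diff: group 1 captures everything up to and
# including the optional opening quote (group 2); group 3 must match the
# same quote again, so mismatched quotes are deliberately left unmasked.
_MASK_PATTERN = r"(\S*?(?:secret|password)\S*?\s*(?:=|\s+)(['\"]?))[^'^\"]+(\2)"


def mask_cmd(connection_cmd: Sequence[str]) -> str:
    """Mask password-like values (quoted or bare) in a spark-submit command."""
    return re.sub(_MASK_PATTERN, r"\1**\3", " ".join(connection_cmd), flags=re.I)
```

This reproduces the expectations in the quoted test table: `--password='secret'`, `--password secret`, and `--foo.password="secret"` all get masked, while a mismatched pair like `--password "secret'` is left as-is.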
[GitHub] [airflow] ashb opened a new issue #9630: Officially support HA for scheduler component (AIP-15)
ashb opened a new issue #9630: URL: https://github.com/apache/airflow/issues/9630 Placeholder issue - details to follow. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[airflow] annotated tag 1.10.11rc1 updated (317b041 -> 96e8507)
This is an automated email from the ASF dual-hosted git repository. kaxilnaik pushed a change to annotated tag 1.10.11rc1 in repository https://gitbox.apache.org/repos/asf/airflow.git. *** WARNING: tag 1.10.11rc1 was modified! *** from 317b041 (commit) to 96e8507 (tag) tagging 317b0412383ccda571fbef568c9eabd70ab8e666 (commit) replaces 1.10.10rc4 by Kaxil Naik on Thu Jul 2 15:31:42 2020 +0100 - Log - Airflow 1.10.11rc1 -BEGIN PGP SIGNATURE- iQEzBAABCAAdFiEEEnF1VgQO7y7q8bnCdfzNCiX6DksFAl7979kACgkQdfzNCiX6 DkvNRQgAguHDONDBZnEfcsLonuSEq48F61dCzp8ox8rDK/yA+mhl5SmQEFiv/45A iDQD9aEAoW67tHngElTO5wagAYtccVbCHRoMKSIc8EadrSWWDyy0VxoiDEMkalI2 bMVwsSHDxGDyA0nkl4QWRDOdaGe5xcsYWm+k4QgAz0GCeOWKaCup6TZmGTtQelNH Fiz2r1njpdlQWXrl1L0ncXtS0hfmiaGQaaG58j+wUqKhWvhurHWcae+EuBEQYbuy APBmyBn1m3xkBc41pZCr8/0FGpasKMDxwSWH0a7QMfh7HAG/fO/YbS+5XwRlGz/o paGdSBlwzX8Fd469l8f5VPfGManLcA== =MBko -END PGP SIGNATURE- --- No new revisions were added by this update. Summary of changes:
[GitHub] [airflow] dossett commented on a change in pull request #9593: Improve handling Dataproc cluster creation with ERROR state
dossett commented on a change in pull request #9593: URL: https://github.com/apache/airflow/pull/9593#discussion_r449064584 ## File path: airflow/providers/google/cloud/operators/dataproc.py ## @@ -502,32 +506,79 @@ def __init__(self, self.timeout = timeout self.metadata = metadata self.gcp_conn_id = gcp_conn_id +self.delete_on_error = delete_on_error + +def _create_cluster(self, hook): +operation = hook.create_cluster( +project_id=self.project_id, +region=self.region, +cluster=self.cluster, +request_id=self.request_id, +retry=self.retry, +timeout=self.timeout, +metadata=self.metadata, +) +cluster = operation.result() +self.log.info("Cluster created.") +return cluster + +def _delete_cluster(self, hook): +self.log.info("Deleting the cluster") +hook.delete_cluster( +region=self.region, +cluster_name=self.cluster_name, +project_id=self.project_id, +) +self.log.info("Cluster %s deleted", self.cluster_name) Review comment: If the cluster isn't deleted, then the retries will fail because the cluster already exists (even if it exists in an ERROR state and is not usable). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
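The behaviour being negotiated in this thread — delete the ERROR-state cluster, then raise so retries can attempt creation again — can be sketched with a stub hook. The function and exception names here are illustrative, not the PR's final code:

```python
class ClusterCreationError(Exception):
    """Raised when a cluster comes up in ERROR state."""


def create_cluster_or_cleanup(hook, cluster_name: str, delete_on_error: bool = True):
    """Create a cluster; if it lands in ERROR state, optionally delete it and raise.

    Raising (rather than returning a reference to a doomed cluster) lets task
    retries attempt creation again — which only works if the broken cluster is
    deleted first, since a retry would otherwise fail on "cluster already exists".
    """
    cluster = hook.create_cluster(cluster_name)
    if cluster["status"] != "ERROR":
        return cluster
    if delete_on_error:
        hook.delete_cluster(cluster_name)
    raise ClusterCreationError(f"Cluster {cluster_name} was created in ERROR state")
```

This combines both reviewers' points: olchas's "raise instead of silently returning" and dossett's "delete first so retries aren't blocked by the existing broken cluster".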
[GitHub] [airflow] potiuk commented on pull request #9623: Move ElasticsearchTaskHandler to the provider package
potiuk commented on pull request #9623: URL: https://github.com/apache/airflow/pull/9623#issuecomment-653057551 We can't use ExternalLoggingMixin in Airflow 1.10 providers because it only appeared TODAY in master. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[airflow] branch v1-10-test updated: Update README.md for 1.10.11
This is an automated email from the ASF dual-hosted git repository. kaxilnaik pushed a commit to branch v1-10-test in repository https://gitbox.apache.org/repos/asf/airflow.git The following commit(s) were added to refs/heads/v1-10-test by this push: new 317b041 Update README.md for 1.10.11 317b041 is described below commit 317b0412383ccda571fbef568c9eabd70ab8e666 Author: Kaxil Naik AuthorDate: Thu Jul 2 15:29:41 2020 +0100 Update README.md for 1.10.11 --- README.md | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/README.md b/README.md index d5883bf..81b935d 100644 --- a/README.md +++ b/README.md @@ -67,7 +67,7 @@ Apache Airflow is tested with: * Sqlite - latest stable (it is used mainly for development purpose) * Kubernetes - 1.16.2, 1.17.0 -### Stable version (1.10.10) +### Stable version * Python versions: 2.7, 3.5, 3.6, 3.7, 3.8 * Postgres DB: 9.6, 10 @@ -107,14 +107,14 @@ in the URL. 1. Installing just airflow: ```bash -pip install apache-airflow==1.10.10 \ - --constraint https://raw.githubusercontent.com/apache/airflow/1.10.10/requirements/requirements-python3.7.txt +pip install apache-airflow==1.10.11 \ + --constraint https://raw.githubusercontent.com/apache/airflow/1.10.11/requirements/requirements-python3.7.txt ``` 2. Installing with extras (for example postgres,gcp) ```bash -pip install apache-airflow[postgres,gcp]==1.10.10 \ - --constraint https://raw.githubusercontent.com/apache/airflow/1.10.10/requirements/requirements-python3.7.txt +pip install apache-airflow[postgres,gcp]==1.10.11 \ + --constraint https://raw.githubusercontent.com/apache/airflow/1.10.11/requirements/requirements-python3.7.txt ``` ## Beyond the Horizon
[GitHub] [airflow] turbaszek commented on pull request #9593: Improve handling Dataproc cluster creation with ERROR state
turbaszek commented on pull request #9593: URL: https://github.com/apache/airflow/pull/9593#issuecomment-652980847 @olchas would you mind taking a look? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] j-y-matsubara commented on a change in pull request #9531: Support .airflowignore for plugins
j-y-matsubara commented on a change in pull request #9531: URL: https://github.com/apache/airflow/pull/9531#discussion_r448993794 ## File path: airflow/utils/file.py ## @@ -90,6 +90,47 @@ def open_maybe_zipped(fileloc, mode='r'): return io.open(fileloc, mode=mode) +def find_path_from_directory( +base_dir_path: str, +ignore_list_file: str) -> Generator[str, None, None]: +""" +Search the file and return the path of the file that should not be ignored. +:param base_dir_path: the base path to be searched for. +:param ignore_file_list_name: the file name in which specifies a regular expression pattern is written. Review comment: I'm sorry. It's a simple mistake on my part. I've fixed it. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] boring-cyborg[bot] commented on pull request #9629: Updated link to official documentation
boring-cyborg[bot] commented on pull request #9629: URL: https://github.com/apache/airflow/pull/9629#issuecomment-653031230 Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about anything, please check our Contribution Guide (https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst). Here are some useful points: - Pay attention to the quality of your code (flake8, pylint and type annotations). Our [pre-commits]( https://github.com/apache/airflow/blob/master/STATIC_CODE_CHECKS.rst#prerequisites-for-pre-commit-hooks) will help you with that. - In case of a new feature add useful documentation (in docstrings or in `docs/` directory). Adding a new operator? Check this short [guide](https://github.com/apache/airflow/blob/master/docs/howto/custom-operator.rst) Consider adding an example DAG that shows how users should use it. - Consider using [Breeze environment](https://github.com/apache/airflow/blob/master/BREEZE.rst) for testing locally, it's a heavy Docker environment, but it ships with a working Airflow and a lot of integrations. - Be patient and persistent. It might take some time to get a review or get the final approval from Committers. - Please follow [ASF Code of Conduct](https://www.apache.org/foundation/policies/conduct) for all communication including (but not limited to) comments on Pull Requests, Mailing list and Slack. - Be sure to read the [Airflow Coding style]( https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#coding-style-and-best-practices). Apache Airflow is a community-driven project and together we are making it better. In case of doubt, contact the developers at: Mailing List: d...@airflow.apache.org Slack: https://apache-airflow-slack.herokuapp.com/ This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] aviralwal opened a new pull request #9629: Updated link to official documentation
aviralwal opened a new pull request #9629: URL: https://github.com/apache/airflow/pull/9629 The link to official documentation should point to the documentation page instead of the home page. --- Make sure to mark the boxes below before creating PR: [x] - [x] Description above provides context of the change - [x] Unit tests coverage for changes (not needed for documentation changes) - [x] Target Github ISSUE in description if exists - [x] Commits follow "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)" - [ ] Relevant documentation is updated including usage instructions. - [ ] I will engage committers as explained in [Contribution Workflow Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example). --- In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed. In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x). In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md). Read the [Pull Request Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines) for more information. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[airflow] 01/02: Update the tree view of dag on Concepts Last Run Only (#8268)
This is an automated email from the ASF dual-hosted git repository. kaxilnaik pushed a commit to branch v1-10-test in repository https://gitbox.apache.org/repos/asf/airflow.git commit 34aabacb1a7bf8ad34fe66b54be152dbee250134 Author: Rafael Bottega AuthorDate: Thu Apr 16 23:13:18 2020 +0100 Update the tree view of dag on Concepts Last Run Only (#8268) Resolves #8246 (cherry picked from commit 44ddf54adf7cfe57bfea98cd2726152a2ba19e18) --- docs/img/latest_only_with_trigger.png | Bin 49510 -> 42887 bytes 1 file changed, 0 insertions(+), 0 deletions(-) diff --git a/docs/img/latest_only_with_trigger.png b/docs/img/latest_only_with_trigger.png index 623f8ee..8fc2df9 100644 Binary files a/docs/img/latest_only_with_trigger.png and b/docs/img/latest_only_with_trigger.png differ
[jira] [Commented] (AIRFLOW-5391) Clearing a task skipped by BranchPythonOperator will cause the task to execute
[ https://issues.apache.org/jira/browse/AIRFLOW-5391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17150314#comment-17150314 ] ASF GitHub Bot commented on AIRFLOW-5391: - kaxil commented on pull request #8992: URL: https://github.com/apache/airflow/pull/8992#issuecomment-653022789 Hi @yuqian90, apologies this won't make it to 1.10.11 as the LatestOnlyOperator causes a change in behaviour. Is it possible for you to achieve this without changing the behaviour (talking about the note in Updating.md) or adding a flag to have old behavior vs new behaviour (default would be old behaviour). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Clearing a task skipped by BranchPythonOperator will cause the task to execute > -- > > Key: AIRFLOW-5391 > URL: https://issues.apache.org/jira/browse/AIRFLOW-5391 > Project: Apache Airflow > Issue Type: Bug > Components: operators >Affects Versions: 1.10.4 >Reporter: Qian Yu >Assignee: Qian Yu >Priority: Major > Fix For: 2.0.0 > > > I tried this on 1.10.3 and 1.10.4, both have this issue: > E.g. in this example from the doc, branch_a executed, branch_false was > skipped because of branching condition. However if someone Clear > branch_false, it'll cause branch_false to execute. > !https://airflow.apache.org/_images/branch_good.png! > This behaviour is understandable given how BranchPythonOperator is > implemented. BranchPythonOperator does not store its decision anywhere. It > skips its own downstream tasks in the branch at runtime. So there's currently > no way for branch_false to know it should be skipped without rerunning the > branching task. > This is obviously counter-intuitive from the user's perspective. 
In this > example, users would not expect branch_false to execute when they clear it > because the branching task should have skipped it. > There are a few ways to improve this: > Option 1): Make downstream tasks skipped by BranchPythonOperator not > clearable without also clearing the upstream BranchPythonOperator. In this > example, if someone clears branch_false without clearing branching, the Clear > action should just fail with an error telling the user he needs to clear the > branching task as well. > Option 2): Make BranchPythonOperator store the result of its skip condition > somewhere. Make downstream tasks check for this stored decision and skip > themselves if they should have been skipped by the condition. This probably > means the decision of BranchPythonOperator needs to be stored in the db. > > [kevcampb|https://blog.diffractive.io/author/kevcampb/] attempted a > workaround and on this blog. And he acknowledged his workaround is not > perfect and a better permanent fix is needed: > [https://blog.diffractive.io/2018/08/07/replacement-shortcircuitoperator-for-airflow/] > -- This message was sent by Atlassian Jira (v8.3.4#803005)
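Option 2 from the Jira discussion above — persist the branch decision so that cleared downstream tasks can re-skip themselves — can be sketched in miniature. The dict stands in for durable storage (a DB table or XCom row keyed by dag run); all names are hypothetical:

```python
from typing import Dict, Optional

# Stand-in for durable storage, e.g. an XCom row keyed by dag run.
_branch_decisions: Dict[str, str] = {}


def record_branch_decision(run_id: str, chosen_task_id: str) -> None:
    """Called by the branching task when it picks a branch."""
    _branch_decisions[run_id] = chosen_task_id


def should_skip(run_id: str, task_id: str) -> bool:
    """A cleared downstream task consults the stored decision instead of
    blindly executing; with no recorded decision it falls back to running."""
    chosen: Optional[str] = _branch_decisions.get(run_id)
    return chosen is not None and task_id != chosen
```

With this in place, clearing `branch_false` after the branching task chose `branch_a` would re-skip it rather than execute it, which is the user-visible fix the ticket asks for.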
[airflow] 02/02: Add Changelog for 1.10.11
This is an automated email from the ASF dual-hosted git repository. kaxilnaik pushed a commit to branch v1-10-test in repository https://gitbox.apache.org/repos/asf/airflow.git commit 3e080c240e9968c4086c32943a22b3c4010453df Author: Kaxil Naik AuthorDate: Wed Jul 1 19:40:29 2020 +0100 Add Changelog for 1.10.11 --- .pre-commit-config.yaml | 3 +- CHANGELOG.txt | 194 2 files changed, 196 insertions(+), 1 deletion(-) diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml index e7b90db..a017dad 100644 --- a/.pre-commit-config.yaml +++ b/.pre-commit-config.yaml @@ -245,7 +245,8 @@ repos: (?x) ^airflow/contrib/hooks/cassandra_hook.py$| ^airflow/operators/hive_stats_operator.py$| - ^tests/contrib/hooks/test_cassandra_hook.py + ^tests/contrib/hooks/test_cassandra_hook.py| + ^CHANGELOG.txt - id: dont-use-safe-filter language: pygrep name: Don't use safe in templates diff --git a/CHANGELOG.txt b/CHANGELOG.txt index a8aa353..0313e7b 100644 --- a/CHANGELOG.txt +++ b/CHANGELOG.txt @@ -1,3 +1,196 @@ +Airflow 1.10.11, 2020-07-05 +- + +New Features + + +- Add task instance mutation hook (#8852) +- Allow changing Task States Colors (#9520) +- Add support for AWS Secrets Manager as Secrets Backend (#8186) +- Add airflow info command to the CLI (#8704) +- Add Local Filesystem Secret Backend (#8596) +- Add Airflow config CLI command (#8694) +- Add Support for Python 3.8 (#8836)(#8823) +- Allow K8S worker pod to be configured from JSON/YAML file (#6230) +- Add quarterly to crontab presets (#6873) +- Add support for ephemeral storage on KubernetesPodOperator (#6337) +- Add AirflowFailException to fail without any retry (#7133) +- Add SQL Branch Operator (#8942) + +Bug Fixes +" + +- Use NULL as dag.description default value (#7593) +- BugFix: DAG trigger via UI error in RBAC UI (#8411) +- Fix logging issue when running tasks (#9363) +- Fix JSON encoding error in DockerOperator (#8287) +- Fix alembic crash due to typing import (#6547) +- Correctly restore upstream_task_ids when 
deserializing Operators (#8775) +- Correctly store non-default Nones in serialized tasks/dags (#8772) +- Correctly deserialize dagrun_timeout field on DAGs (#8735) +- Fix tree view if config contains " (#9250) +- Fix Dag Run UI execution date with timezone cannot be saved issue (#8902) +- Fix Migration for MSSQL (#8385) +- RBAC ui: Fix missing Y-axis labels with units in plots (#8252) +- RBAC ui: Fix missing task runs being rendered as circles instead (#8253) +- Fix: DagRuns page renders the state column with artifacts in old UI (#9612) +- Fix task and dag stats on home page (#8865) +- Fix the trigger_dag api in the case of nested subdags (#8081) +- UX Fix: Prevent undesired text selection with DAG title selection in Chrome (#8912) +- Fix connection add/edit for spark (#8685) +- Fix retries causing constraint violation on MySQL with DAG Serialization (#9336) +- [AIRFLOW-4472] Use json.dumps/loads for templating lineage data (#5253) +- Restrict google-cloud-texttospeach to committer (#7392) - [AIRFLOW-] Remove duplicated paragraph in docs (#7662) - Fix reference to KubernetesPodOperator (#8100) +- Update the tree view of dag on Concepts Last Run Only (#8268) Airflow 1.10.9, 2020-02-07
[airflow] branch v1-10-test updated (a5a588e -> 3e080c2)
This is an automated email from the ASF dual-hosted git repository. kaxilnaik pushed a change to branch v1-10-test in repository https://gitbox.apache.org/repos/asf/airflow.git. omit a5a588e Add Changelog for 1.10.11 new 34aabac Update the tree view of dag on Concepts Last Run Only (#8268) new 3e080c2 Add Changelog for 1.10.11 This update added new revisions after undoing existing revisions. That is to say, some revisions that were in the old version of the branch are not in the new version. This situation occurs when a user --force pushes a change and generates a repository containing something like this: * -- * -- B -- O -- O -- O (a5a588e) \ N -- N -- N refs/heads/v1-10-test (3e080c2) You should already have received notification emails for all of the O revisions, and so the following emails describe only the N revisions from the common base, B. Any revisions marked "omit" are not gone; other references still refer to them. Any revisions marked "discard" are gone forever. The 2 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: CHANGELOG.txt | 1 + docs/img/latest_only_with_trigger.png | Bin 49510 -> 42887 bytes 2 files changed, 1 insertion(+)
[airflow] branch v1-10-stable updated (8b05289 -> 317b041)
This is an automated email from the ASF dual-hosted git repository. kaxilnaik pushed a change to branch v1-10-stable in repository https://gitbox.apache.org/repos/asf/airflow.git. from 8b05289 Cache 1 10 ci images (#8955) add 69eeeda Add Local Filesystem Secret Backend (v1-10) (#8596) add ac257fe Reduce response payload size of /dag_stats and /task_stats (#8655) add 313d09e Backport Airflow config command (1.10.*) (#8694) add 8eb4565 Add airflow info command (v1-10-test) (#8704) add c79e7df Latest debian-buster release broke image build (#8758) add a8d8903 Show Deprecation warning on duplicate Task ids (#8728) add 3b70308 [8650] Add Yandex.Cloud custom connection to 1.10 (#8791) add 908962a [AIRFLOW-4052] Allow filtering using "event" and "owner" in "Log" view (#4881) add cd32afa Azure storage 0.37.0 is not installable any more (#8833) add 0c17935 Avoid failure on transient requirements in CI image add f5d89ed Use Debian's provided JRE from Buster (#8919) add a2d3acd Hive/Hadoop minicluster needs JDK8 and JAVA_HOME to work (#8938) add d0b0207 Fix new flake8 warnings on v1-10-test branch (#8953) add b2a4032 [AIRFLOW-3367] Run celery integration test with redis broker. 
(#4207) add 64db6e6 Fix race in Celery tests by pre-creating result tables (#8909) add a3aa995 Pin Version of Azure Cosmos to <4 (#8956) add 5664e36 Fix timing-based flakey test in TestLocalTaskJob (#8405) add 23d5ea0 Use production image for k8s tests (#9038) add 3437663 Move k8sexecutor out of contrib to closer match master (#8904) add 0925741 [AIRFLOW-4851] Refactor K8S codebase with k8s API models (#5481) add a5e7b99 [AIRFLOW-5443] Use alpine image in Kubernetes's sidecar (#6059) add 9444b4c [AIRFLOW-5445] Reduce the required resources for the Kubernetes's sidecar (#6062) add 4c484ef [AIRFLOW-5873] KubernetesPodOperator fixes and test (#6524) add 4918b85 [AIRFLOW-6959] Use NULL as dag.description default value (#7593) add 2fa5157 Add note about using dag_run.conf in BashOperator (#9143) add 570c9fa Fix --forward-credentials flag in Breeze (#8554) add 79d34ea Fixed optimistions of non-py-code builds (#8601) add c3d4396 Fix the process of requirements generations (#8648) add 264a94b Fixed test-target command (#8795) add c057430 Add comments to breeze scripts (#8797) add 6ba874b Useful help information in test-target and docker-compose commands (#8796) add 1831e79 The librabbitmq library stopped installing for python3.7 (#8853) add 41808a7 Use Debian's provided JRE from Buster (#8919) add cf25e53 Re-run all tests when Dockerfile or Github worflow change (#8924) add 0efaa00 Hive/Hadoop minicluster needs JDK8 and JAVA_HOME to work (#8938) add 7b4e1a4 Python base images are stored in cache (#8943) add a41801c Add ADDITIONAL_PYTHON_DEPS (#9031) add ffe496a Add ADDITIONAL_AIRFLOW_EXTRAS (#9032) add 5683783 Additional python extras and deps can be set in breeze (#9035) add dbb4284 detect incompatible docker server version in breeze (#9042) add 214b508 Adds hive as extra in pyhive (#9075) add 7d3dab1 Prevents failure on fixing permissions for files with space in it (#9076) add 4f1a319 Enable configurable git sync depth (#9094) add 5c45091 Don't reuse MY_DIR in breeze to 
mean different folder from ci/_utils.sh (#9098) add e5df858 You can push with Breeze as separate command and to cache (#8976) add d83331b Produce less verbose output when building docker mount options (#9103) add f099416 Display docs errors summary (#8392) add 66ab8c3 Remove Hive/Hadoop/Java dependency from unit tests (#9029) add 32ed3c6 Kubernetes Cluster is started on host not in the container (#8265) add d505e8d Fixes a bug where `build-image` command did not calculate md5 (#9130) add c7c3561 Fix INTEGRATIONS[*]: unbound variable error in breeze (#9135) add 77998f5 Cope with multiple processes get_remote_image_info in parallel (#9105) add 6d07eac Remove remnant kubernetes stuff from breeze scripts (#9138) add 19f6065 Restrict google-cloud-texttospeach to }/executors/kubernetes_executor.py | 195 +-- airflow/executors/local_executor.py| 32 +- airflow/executors/sequential_executor.py |9 + airflow/hooks/base_hook.py |4 +- airflow/hooks/dbapi_hook.py|2 +- airflow/hooks/hive_hooks.py|7 +- airflow/hooks/webhdfs_hook.py |6 +- airflow/jobs/backfill_job.py |5 +- airflow/jobs/base_job.py | 23 +- airflow/jobs/local_task_job.py |8 - airflow/jobs/scheduler_job.py | 64 +- .../__init__.py
[GitHub] [airflow] freget commented on issue #9609: TimeSensor triggers immediately when used over midnight (UTC)
freget commented on issue #9609: URL: https://github.com/apache/airflow/issues/9609#issuecomment-65311 I would also be fine with a new sensor. I still believe that this problem should be at least documented, as it is anything but obvious if the DAG is initialized timezone-aware. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
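The midnight wrap-around freget describes is easy to reproduce with plain stdlib datetimes: a poke condition that compares wall-clock time alone fires immediately whenever the run starts before midnight but the target falls after it, while anchoring the target to a full datetime and rolling it forward a day behaves as expected. This is an illustrative sketch only — the helper names are mine, not Airflow's sensor code:

```python
from datetime import datetime, time, timedelta

def naive_poke(now: datetime, target: time) -> bool:
    """Roughly how a time-only sensor pokes: compare wall-clock times directly."""
    return now.time() > target

def rolling_target(anchor: datetime, target: time) -> datetime:
    """Resolve the target on the anchor's day; roll to the next day if already past."""
    candidate = datetime.combine(anchor.date(), target)
    if candidate <= anchor:
        candidate += timedelta(days=1)
    return candidate

run_start = datetime(2020, 7, 1, 23, 30)  # run starts at 23:30 UTC
target = time(0, 15)                      # intent: wait until 00:15 UTC next day

fires_immediately = naive_poke(run_start, target)  # the reported bug: True right away
resolved = rolling_target(run_start, target)       # the expected wait target
```

A sensor built on `rolling_target` would wait the ~45 minutes the user intended instead of succeeding instantly.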
[GitHub] [airflow] kaxil commented on pull request #9586: support new released version of sendgrid
kaxil commented on pull request #9586: URL: https://github.com/apache/airflow/pull/9586#issuecomment-653175384 we will need another rebase @ephraimbuddy sorry :(
[GitHub] [airflow] mik-laj edited a comment on pull request #9618: Fix typos, older versions, and deprecated operators with AI platform example DAG
mik-laj edited a comment on pull request #9618: URL: https://github.com/apache/airflow/pull/9618#issuecomment-653176749
[GitHub] [airflow] mik-laj commented on pull request #9618: Fix typos, older versions, and deprecated operators with AI platform example DAG
mik-laj commented on pull request #9618: URL: https://github.com/apache/airflow/pull/9618#issuecomment-653176749 @vuppalli We don't need unit tests for DAGs. We use them in the documentation and in system tests. System tests are sufficient in this case. Have the files that are needed to run the tests changed? It is better if the contribution is small because then it is much easier to review and merge in the project. In this case, I would prefer you to create a new PR that will only contain documentation. If you need any changes that are included in this PR then you can copy the changes from this PR and add annotations in the PR title `[depends on ]`. Additionally, you can add a note in the description, but not everyone reads the description of the changes. Example: ``` Add guide for MLEngine [depends on #9618] ``` In order for this change to be merged, you must fix static check errors in this change. ``` airflow/providers/google/cloud/example_dags/example_mlengine.py:29:89: W291 trailing whitespace airflow/providers/google/cloud/example_dags/example_mlengine.py:30:80: W291 trailing whitespace ``` + isort For more information: https://github.com/apache/airflow/blob/master/STATIC_CODE_CHECKS.rst#id1 How do you like an internship at Google? Do you have any concerns about contributing to Open Source? The community of this project is very open to new contributors and interns. We currently have two active interns who contribute to the project. If you would like to get more involved in this project, we'll be happy to help. Apache Airflow is the core of the Cloud Composer service, so contributions to this project will be appreciated by your company.
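For reference, W291 is flake8's trailing-whitespace check, and the `29:89` / `30:80` pairs in the errors above are 1-based line and column numbers pointing at the first trailing blank. A small self-contained sketch of the same detection — my own helper, not flake8's actual implementation:

```python
def find_trailing_whitespace(source: str):
    """Return (line, column) pairs in flake8's W291 style: column is the
    1-based position of the first trailing space or tab on the line."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        stripped = line.rstrip(" \t")
        if stripped != line:
            hits.append((lineno, len(stripped) + 1))
    return hits

sample = "clean = 1\ndirty = 2  \n"
w291_hits = find_trailing_whitespace(sample)  # flags line 2, column 10
```

Running the real check via `pre-commit` (as STATIC_CODE_CHECKS.rst describes) reports the same coordinates.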
[GitHub] [airflow] ephraimbuddy commented on pull request #9482: Add CRUD endpoint for XCom
ephraimbuddy commented on pull request #9482: URL: https://github.com/apache/airflow/pull/9482#issuecomment-653119162 Hi, @turbaszek can I ask for review?
[GitHub] [airflow] vanka56 commented on a change in pull request #9472: Add drop_partition functionality for HiveMetastoreHook
vanka56 commented on a change in pull request #9472: URL: https://github.com/apache/airflow/pull/9472#discussion_r449187394 ## File path: tests/providers/apache/hive/hooks/test_hive.py ## @@ -383,6 +383,10 @@ def test_table_exists(self): self.hook.table_exists(str(random.randint(1, 1))) ) +def test_drop_partition(self): +self.assertTrue(self.hook.drop_partitions(self.table, db=self.database, + part_vals=[DEFAULT_DATE_DS])) + Review comment: @turbaszek Yes, it uses the HiveMetastore Thrift client. It drops the partition from the test table set up for unit testing.
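The call under test — `drop_partitions(table, db=..., part_vals=[...])` returning a truthy result — can be mimicked with an in-memory stand-in for the metastore, which is useful for reasoning about the expected contract without a running Thrift client. Only the call signature below comes from the test above; the fake class and its data are hypothetical:

```python
class FakeMetastoreHook:
    """In-memory stand-in mirroring drop_partitions(table, db, part_vals) semantics."""

    def __init__(self):
        # {(db, table): set of partition-value tuples}, seeded like a test fixture
        self.partitions = {("default", "test_table"): {("2015-01-01",)}}

    def drop_partitions(self, table, db, part_vals):
        """Drop one partition identified by its value list; True if it existed."""
        key = (db, table)
        target = tuple(part_vals)
        if target in self.partitions.get(key, set()):
            self.partitions[key].discard(target)
            return True
        return False

hook = FakeMetastoreHook()
dropped = hook.drop_partitions("test_table", db="default", part_vals=["2015-01-01"])
```

A second drop of the same partition returns False, which is the behaviour a unit test like the one above would pin down.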
[GitHub] [airflow] albertocalderari commented on a change in pull request #9590: Improve idempotency of BigQueryInsertJobOperator
albertocalderari commented on a change in pull request #9590: URL: https://github.com/apache/airflow/pull/9590#discussion_r449173278 ## File path: airflow/providers/google/cloud/operators/bigquery.py ## @@ -1692,32 +1692,52 @@ def prepare_template(self) -> None: with open(self.configuration, 'r') as file: self.configuration = json.loads(file.read()) +def _submit_job(self, hook: BigQueryHook, job_id: str): +# Submit a new job +job = hook.insert_job( +configuration=self.configuration, +project_id=self.project_id, +location=self.location, +job_id=job_id, +) +# Start the job and wait for it to complete and get the result. +job.result() +return job + def execute(self, context: Any): hook = BigQueryHook( gcp_conn_id=self.gcp_conn_id, delegate_to=self.delegate_to, ) -job_id = self.job_id or f"airflow_{self.task_id}_{int(time())}" +exec_date = context['execution_date'].isoformat() +job_id = self.job_id or f"airflow_{self.dag_id}_{self.task_id}_{exec_date}" + try: -job = hook.insert_job( -configuration=self.configuration, -project_id=self.project_id, -location=self.location, -job_id=job_id, -) -# Start the job and wait for it to complete and get the result. 
-job.result() +# Submit a new job job = self._submit_job(hook, job_id) except Conflict: +# If the job already exists retrieve it job = hook.get_job( project_id=self.project_id, location=self.location, job_id=job_id, ) -# Get existing job and wait for it to be ready -for time_to_wait in exponential_sleep_generator(initial=10, maximum=120): -sleep(time_to_wait) -job.reload() -if job.done(): -break + +if job.done() and job.error_result: +# The job exists and finished with an error and we are probably reruning it +# So we have to make a new job_id because it has to be unique +job_id = f"{self.job_id}_{int(time())}" +job = self._submit_job(hook, job_id) +elif not job.done(): +# The job is still running so wait for it to be ready +for time_to_wait in exponential_sleep_generator(initial=10, maximum=120): Review comment: In case the job is already running I still do ```job.result()``` and it polls, but without doing a reload
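The control flow the diff introduces — a deterministic job_id built from the DAG, task, and execution date; a Conflict on resubmission resolved by fetching the stored job; a fresh suffixed id only when the prior attempt failed — can be sketched independently of the Google client. Everything below is an in-memory illustration under that reading of the diff; `JobConflict` and `FakeJob` stand in for `google.api_core.exceptions.Conflict` and a real BigQuery job handle (with `done` as an attribute rather than a method, for brevity):

```python
import time

class JobConflict(Exception):
    """Stand-in for google.api_core.exceptions.Conflict (duplicate job_id)."""

class FakeJob:
    """Minimal stand-in for a BigQuery job handle."""
    def __init__(self, done=True, error_result=None):
        self.done = done
        self.error_result = error_result

def build_job_id(dag_id, task_id, exec_date_iso):
    """Deterministic id, as in the diff, so a rerun of the same task instance
    collides with (and can reuse) the original job instead of starting anew."""
    return f"airflow_{dag_id}_{task_id}_{exec_date_iso}"

def resolve_job(submit, fetch, job_id, now=time.time):
    """Submit; on a duplicate-id conflict fetch the stored job, and resubmit
    under a fresh suffixed id only if that job finished with an error.
    (The still-running branch, where the operator just polls, is omitted.)"""
    try:
        return submit(job_id)
    except JobConflict:
        job = fetch(job_id)
        if job.done and job.error_result:
            return submit(f"{job_id}_{int(now())}")
        return job

# Demo: the first submit conflicts, the stored job failed, so we resubmit once.
attempts = []
def _submit(job_id):
    attempts.append(job_id)
    if len(attempts) == 1:
        raise JobConflict()
    return FakeJob()

demo = resolve_job(_submit, lambda jid: FakeJob(error_result="err"),
                   build_job_id("dag", "task", "2020-07-01"), now=lambda: 42)
```

This makes the idempotency property visible: a successful prior run is reused as-is, while a failed prior run triggers exactly one resubmission under a unique id.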
[GitHub] [airflow] ashb commented on pull request #9044: Ensure Kerberos token is valid in SparkSubmitOperator before running `yarn kill`
ashb commented on pull request #9044: URL: https://github.com/apache/airflow/pull/9044#issuecomment-653168107 Could you rebase to latest master, hopefully that should fix the failing Kube tests
[airflow-pgbouncer-exporter] branch master created (now 160f560)
This is an automated email from the ASF dual-hosted git repository. potiuk pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/airflow-pgbouncer-exporter.git. at 160f560 Merge pull request #4 from jbub/const-metrics This branch includes the following new commits: new 7ce3d9d Initial commit. new 8ac8035 Fix docker badge image url in README. new 4f3a283 Add code comments. new c276e65 Add missing method comments. new 6389264 Make database column ForceUser nullable. new 5ef12a0 Fill missing Active field in sql store GetPools method. new bf8a081 Update CHANGELOG. new d38b321 Capture build version using prometheus/common/version package. new c9de406 Refactor http server. new df56966 Return error from store Close method. new 7d075b5 Refactor http server to improve testability. new 7efe5dd Update vendor. new b14b1a1 Update CHANGELOG. new 4d8d9f1 Cleanup unused fields from HTTPServer. new 816c0c9 Reorder logging in server command to be able to actually see any logs. new 99c7f24 Add healthcheck. new b887927 Update CHANGELOG. new 920f2c9 Build with Go 1.9.2. new 7b4ae95 Add docker config to goreleaser config. new 64d97f3 Update changelog for 0.1.5. new 2ce51ba Add new fields to support PgBouncer 1.8. new 2f68ad4 Update CHANGELOG for 0.2.0. new e6fcc3d Build with Go 1.9.4. new 1f3b701 Add golangci.yml. new 7492ef7 Update vendored libs, prune tests and unused pkgs. new 58ea421 Use Go 1.10.3. new 08b31eb Bump testify version. new 6d45552 Update CHANGELOG for 0.2.2. new 23071e1 Fix duplication of release field in .goreleaser.yml. new f7a5640 Expose more stats metrics new d38d0a8 Expose more pool metrics new d10b51d Merge pull request #3 from Ometria/extra-stats-metrics new a997ab3 Use Go 1.11.2 in travis config. new 5e5a68c Add Go modules support. new ee127f5 Drop dep support. new 599d095 Update CHANGELOG for 0.3.0. new d5392f9 Fix build version passing in .goreleaser.yml. new 08553d9 Enable GO111MODULE in travis build. 
new 3df8ff9 Run install step in travis. new 10aaf7e Add initial drone config. new e3db533 Use drone ci, drop coveralls and golangci. new 90b1b9c Drop travis. new 044c3e8 Welcome 2019. new b30fa9b Pin dependencies versions to tags. new 2451596 Move code to internal. new e9edb61 Add go version to go.mod. new d214032 Test drone with golang:1.12. new 97437a2 Update CHANGELOG for 0.4.0. new 2dc730b Add docker example to README. new bc8ea94 Update to Go 1.13. new 7cecfdb Bump github.com/lib/pq to v1.3.0. new c2c0876 Bump github.com/prometheus/client_golang to v1.3.0. new 01fe33e Update to github.com/urfave/cli/v2. new aebee06 Add docker compose for testing. new c1a4faf Update CHANGELOG for 0.5.0. new 4bac8ef Update goreleaser yaml to be compatible with latest release. new 3be0522 Use skip_push auto in docker release. new e55c5f0 Update CHANGELOG for 0.5.1. new 1f60abd Do not use draft release in goreleaser. new 03e87c1 Update CHANGELOG for 0.5.2. new 24e1fc3 Use custom query in store.Check. new 777e244 Use sqlx.Open instead of sqlx.Connect to skip calling Ping. new d4e8ae3 Check store on startup. new fe94b3a Rename store.NewSQLStore to store.NewSQL. new da78521 Update CHANGELOG for 0.5.0. new 470d1d6 Refactor exporter to use NewConstMetric. new 160f560 Merge pull request #4 from jbub/const-metrics The 67 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference.
[airflow] 02/09: Fix typo in helm chart upgrade command for 2.0 (#9484)
This is an automated email from the ASF dual-hosted git repository. kaxilnaik pushed a commit to branch v1-10-test in repository https://gitbox.apache.org/repos/asf/airflow.git commit 47e1a875229657251fdeddaae5a2dd083572079b Author: Ash Berlin-Taylor AuthorDate: Tue Jun 23 10:38:06 2020 +0100 Fix typo in helm chart upgrade command for 2.0 (#9484) (cherry picked from commit b1cd382db9367ec828b8ee16899ecea9fcf824a7) --- chart/templates/scheduler/scheduler-deployment.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/chart/templates/scheduler/scheduler-deployment.yaml b/chart/templates/scheduler/scheduler-deployment.yaml index 1b46f6a..d5c3a06 100644 --- a/chart/templates/scheduler/scheduler-deployment.yaml +++ b/chart/templates/scheduler/scheduler-deployment.yaml @@ -96,7 +96,7 @@ spec: image: {{ template "airflow_image" . }} imagePullPolicy: {{ .Values.images.airflow.pullPolicy }} # Support running against 1.10.x and 2.0.0dev/master - args: ["bash", "-c", "airflow upgradedb || airfow db upgrade"] + args: ["bash", "-c", "airflow upgradedb || airflow db upgrade"] env: {{- include "custom_airflow_environment" . | indent 10 }} {{- include "standard_airflow_environment" . | indent 10 }}
[airflow] 05/09: Remove redundant airflowVersion from Helm Chart readme (#9592)
This is an automated email from the ASF dual-hosted git repository. kaxilnaik pushed a commit to branch v1-10-test in repository https://gitbox.apache.org/repos/asf/airflow.git commit 9bc18c1938ebd5f19e1de4a2ce7270957fd3fea6 Author: Kaxil Naik AuthorDate: Tue Jun 30 17:02:56 2020 +0100 Remove redundant airflowVersion from Helm Chart readme (#9592) We no longer use `airflowVersion` , we instead use `defaultAirflowRepository` and `defaultAirflowTag` (cherry picked from commit d6b323b0cd9be2aa941cbb1e1e15d766b4d6539b) --- chart/README.md | 1 - 1 file changed, 1 deletion(-) diff --git a/chart/README.md b/chart/README.md index 402a9d7..089ea22 100644 --- a/chart/README.md +++ b/chart/README.md @@ -91,7 +91,6 @@ The following tables lists the configurable parameters of the Airflow chart and | `networkPolicies.enabled` | Enable Network Policies to restrict traffic | `true`| | `airflowHome` | Location of airflow home directory | `/opt/airflow`| | `rbacEnabled` | Deploy pods with Kubernets RBAC enabled | `true`| -| `airflowVersion` | Default Airflow image version | `1.10.5` | | `executor`| Airflow executor (eg SequentialExecutor, LocalExecutor, CeleryExecutor, KubernetesExecutor) | `KubernetesExecutor` | | `allowPodLaunching` | Allow airflow pods to talk to Kubernetes API to launch more pods | `true`| | `defaultAirflowRepository`| Fallback docker repository to pull airflow image from | `apache/airflow` |
[airflow] 04/09: Fix typo of resultBackendConnection in chart README (#9537)
This is an automated email from the ASF dual-hosted git repository. kaxilnaik pushed a commit to branch v1-10-test in repository https://gitbox.apache.org/repos/asf/airflow.git commit a213847e99b4b9f0afb7a8c0b6fc3968d04d6e40 Author: Vicken Simonian AuthorDate: Fri Jun 26 11:40:30 2020 -0700 Fix typo of resultBackendConnection in chart README (#9537) (cherry picked from commit 096f5c5cba963b364ee75f6686d128cd4d34d66e) --- chart/README.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/chart/README.md b/chart/README.md index d0366f6..402a9d7 100644 --- a/chart/README.md +++ b/chart/README.md @@ -119,11 +119,11 @@ The following tables lists the configurable parameters of the Airflow chart and | `data.metadataSecretName` | Secret name to mount Airflow connection string from | `~` | | `data.resultBackendSecretName`| Secret name to mount Celery result backend connection string from | `~` | | `data.metadataConection` | Field separated connection data (alternative to secret name) | `{}` | -| `data.resultBakcnedConnection`| Field separated connection data (alternative to secret name) | `{}` | +| `data.resultBackendConnection`| Field separated connection data (alternative to secret name) | `{}` | | `fernetKey` | String representing an Airflow fernet key | `~` | | `fernetKeySecretName` | Secret name for Airlow fernet key | `~` | | `workers.replicas`| Replica count for Celery workers (if applicable) | `1` | -| `workers.keda.enabled` | Enable KEDA autoscaling features | `false` | +| `workers.keda.enabled`| Enable KEDA autoscaling features | `false` | | `workers.keda.pollingInverval`| How often KEDA should poll the backend database for metrics in seconds | `5` | | `workers.keda.cooldownPeriod` | How often KEDA should wait before scaling down in seconds | `30` | | `workers.keda.maxReplicaCount`| Maximum number of Celery workers KEDA can scale to | `10` |
[airflow] 07/09: Switches to Helm Chart for Kubernetes tests (#9468)
This is an automated email from the ASF dual-hosted git repository. kaxilnaik pushed a commit to branch v1-10-test in repository https://gitbox.apache.org/repos/asf/airflow.git commit b4a620cc1abfee581e5a6291914efc5479e72c18 Author: Jarek Potiuk AuthorDate: Wed Jul 1 14:50:30 2020 +0200 Switches to Helm Chart for Kubernetes tests (#9468) The Kubernetes tests are now run using Helm chart rather than the custom templates we used to have. The Helm Chart uses locally build production image so the tests are testing not only Airflow but also Helm Chart and a Production image - all at the same time. Later on we will add more tests covering more functionalities of both Helm Chart and Production Image. This is the first step to get all of those bundle together and become testable. This change introduces also 'shell' sub-command for Breeze's kind-cluster command and EMBEDDED_DAGS build args for production image - both of them useful to run the Kubernetes tests more easily - without building two images and with an easy-to-iterate-over-tests shell command - which works without any other development environment. 
Co-authored-by: Jarek Potiuk Co-authored-by: Daniel Imberman (cherry picked from commit 8bd15ef634cca40f3cf6ca3442262f3e05144512) --- .github/workflows/ci.yml | 23 +- BREEZE.rst | 81 +++-- CI.rst | 2 +- Dockerfile | 4 + IMAGES.rst | 3 + TESTING.rst| 67 ++-- airflow/kubernetes/pod_launcher.py | 2 +- breeze | 51 ++- breeze-complete| 14 +- chart/README.md| 5 +- chart/charts/postgresql-6.3.12.tgz | Bin 22754 -> 0 bytes chart/requirements.lock| 4 +- chart/templates/configmap.yaml | 2 + chart/templates/rbac/pod-launcher-role.yaml| 2 +- chart/templates/rbac/pod-launcher-rolebinding.yaml | 4 +- kubernetes_tests/test_kubernetes_executor.py | 40 ++- scripts/ci/ci_build_production_images.sh | 25 -- scripts/ci/ci_count_changed_files.sh | 2 +- scripts/ci/ci_deploy_app_to_kubernetes.sh | 16 +- scripts/ci/ci_docs.sh | 2 +- scripts/ci/ci_flake8.sh| 2 +- scripts/ci/ci_generate_requirements.sh | 2 +- scripts/ci/ci_load_image_to_kind.sh| 7 +- scripts/ci/ci_mypy.sh | 2 +- scripts/ci/ci_perform_kind_cluster_operation.sh| 6 +- scripts/ci/ci_run_airflow_testing.sh | 2 +- scripts/ci/ci_run_kubernetes_tests.sh | 6 +- scripts/ci/ci_run_static_checks.sh | 2 +- scripts/ci/kubernetes/app/postgres.yaml| 94 - .../kubernetes/app/templates/airflow.template.yaml | 207 --- .../app/templates/configmaps.template.yaml | 395 - .../app/templates/init_git_sync.template.yaml | 36 -- scripts/ci/kubernetes/app/volumes.yaml | 87 - .../docker/airflow-test-env-init-dags.sh | 36 -- .../kubernetes/docker/airflow-test-env-init-db.sh | 46 --- scripts/ci/kubernetes/docker/bootstrap.sh | 74 scripts/ci/kubernetes/kind-cluster-conf.yaml | 3 - .../kubernetes/{app/secrets.yaml => volumes.yaml} | 29 +- scripts/ci/libraries/_build_images.sh | 11 +- scripts/ci/libraries/_initialization.sh| 27 +- scripts/ci/libraries/_kind.sh | 380 +++- scripts/ci/libraries/_verbosity.sh | 31 ++ 42 files changed, 424 insertions(+), 1410 deletions(-) diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml index d091d2e..195f7f7 
100644 --- a/.github/workflows/ci.yml +++ b/.github/workflows/ci.yml @@ -100,7 +100,7 @@ jobs: steps: - uses: actions/checkout@master - name: "Build PROD image ${{ matrix.python-version }}" -run: ./scripts/ci/ci_build_production_images.sh +run: ./scripts/ci/ci_prepare_prod_image_on_ci.sh tests-kubernetes: timeout-minutes: 80 @@ -113,7 +113,11 @@ jobs: kube-mode: - image kubernetes-version: - - "v1.15.3" + - "v1.18.2" +kind-version: + - "v0.8.0" +helm-version: + - "v3.2.4" fail-fast: false env: BACKEND: postgres @@ -126,6 +130,8 @@ jobs: PYTHON_MAJOR_MINOR_VERSION: "${{ matrix.python-version }}" KUBERNETES_MODE: "${{ matrix.kube-mode }}" KUBERNETES_VERSION: "${{
[GitHub] [airflow] kaxil commented on pull request #9364: Add option for Vault token to automatically be renewed
kaxil commented on pull request #9364: URL: https://github.com/apache/airflow/pull/9364#issuecomment-653091216 Yup, let's add it to the client
[airflow] 08/09: Removes importlib usage - it's not needed (fails on Airflow 1.10) (#9613)
This is an automated email from the ASF dual-hosted git repository.

kaxilnaik pushed a commit to branch v1-10-test in repository https://gitbox.apache.org/repos/asf/airflow.git

commit 74ecb8ac1c275d19f8f27161a61e723905bce23e
Author: Jarek Potiuk
AuthorDate: Wed Jul 1 18:07:12 2020 +0200

    Removes importlib usage - it's not needed (fails on Airflow 1.10) (#9613)

    (cherry picked from commit a3a52c78b274483f2035ad975fc218abd8ffdf8a)
---
 chart/templates/_helpers.yaml | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/chart/templates/_helpers.yaml b/chart/templates/_helpers.yaml
index 66d1850..ac121a4 100644
--- a/chart/templates/_helpers.yaml
+++ b/chart/templates/_helpers.yaml
@@ -205,7 +205,7 @@ log_connections = {{ .Values.pgbouncer.logConnections }}
         - python
         - -c
         - |
-          import importlib
+          import airflow
           import os
           import time
@@ -215,7 +215,7 @@ log_connections = {{ .Values.pgbouncer.logConnections }}
           from airflow import settings
-          package_dir = os.path.dirname(importlib.util.find_spec('airflow').origin)
+          package_dir = os.path.abspath(os.path.dirname(airflow.__file__))
           directory = os.path.join(package_dir, 'migrations')
           config = Config(os.path.join(package_dir, 'alembic.ini'))
           config.set_main_option('script_location', directory)
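The swap above can be illustrated with a small sketch. Both techniques locate an installed package's directory, but `importlib.util.find_spec` only exists on Python 3.4+, which is presumably why it broke on Airflow 1.10 (that line still supported Python 2), while `package.__file__` works everywhere. The stdlib `email` package stands in for `airflow` here, since Airflow itself may not be installed where this runs:

```python
import importlib.util
import os

# Approach removed by the commit: importlib-based lookup.
# find_spec(...).origin is the absolute path of the package's __init__.py.
spec_dir = os.path.dirname(importlib.util.find_spec('email').origin)

# Replacement approach: works on any interpreter that can import the package,
# including the Python 2 interpreters Airflow 1.10 still supported.
import email
attr_dir = os.path.abspath(os.path.dirname(email.__file__))

print(spec_dir)
print(spec_dir == attr_dir)  # expected True on a standard CPython install
```

Both expressions resolve the same directory, so the replacement is a drop-in that only widens interpreter compatibility.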
[airflow] 09/09: Update Breeze documentation (#9608)
This is an automated email from the ASF dual-hosted git repository. kaxilnaik pushed a commit to branch v1-10-test in repository https://gitbox.apache.org/repos/asf/airflow.git commit 99d37e2d35a9d82103b35e4042c27a7f5620b568 Author: Jarek Potiuk AuthorDate: Wed Jul 1 16:02:24 2020 +0200 Update Breeze documentation (#9608) * Update Breeze documentation (cherry picked from commit f3e1f9a313d8a6f841f6a5c9f2663518fee16b8f) --- BREEZE.rst | 293 TESTING.rst | 2 +- 2 files changed, 198 insertions(+), 97 deletions(-) diff --git a/BREEZE.rst b/BREEZE.rst index 9b318e2..735286a 100644 --- a/BREEZE.rst +++ b/BREEZE.rst @@ -232,44 +232,6 @@ from your ``logs`` directory in the Airflow sources, so all logs created in the visible in the host as well. Every time you enter the container, the ``logs`` directory is cleaned so that logs do not accumulate. -CLIs for cloud providers - - -For development convenience we installed simple wrappers for the most common cloud providers CLIs. Those -CLIs are not installed when you build or pull the image - they will be downloaded as docker images -the first time you attempt to use them. It is downloaded and executed in your host's docker engine so once -it is downloaded, it will stay until you remove the downloaded images from your host container. - -For each of those CLI credentials are taken (automatically) from the credentials you have defined in -your ${HOME} directory on host. - -Those tools also have host Airflow source directory mounted in /opt/airflow path -so you can directly transfer files to/from your airflow host sources. 
- -Those are currently installed CLIs (they are available as aliases to the docker commands): - -+---+--+-+---+ -| Cloud Provider| CLI tool | Docker image | Configuration dir | -+===+==+=+===+ -| Amazon Web Services | aws | amazon/aws-cli:latest | .aws | -+---+--+-+---+ -| Microsoft Azure | az | mcr.microsoft.com/azure-cli:latest | .azure| -+---+--+-+---+ -| Google Cloud Platform | bq | gcr.io/google.com/cloudsdktool/cloud-sdk:latest | .config/gcloud| -| +--+-+---+ -| | gcloud | gcr.io/google.com/cloudsdktool/cloud-sdk:latest | .config/gcloud| -| +--+-+---+ -| | gsutil | gcr.io/google.com/cloudsdktool/cloud-sdk:latest | .config/gcloud| -+---+--+-+---+ - -For each of the CLIs we have also an accompanying ``*-update`` alias (for example ``aws-update``) which -will pull the latest image for the tool. Note that all Google Cloud Platform tools are served by one -image and they are updated together. - -Also - in case you run several different Breeze containers in parallel (from different directories, -with different versions) - they docker images for CLI Cloud Providers tools are shared so if you update it -for one Breeze container, they will also get updated for all the other containers. 
- Using the Airflow Breeze Environment = @@ -287,6 +249,7 @@ Managing CI environment: * Stop running interactive environment with ``breeze stop`` command * Restart running interactive environment with ``breeze restart`` command * Run test specified with ``breeze tests`` command +* Generate requirements with ``breeze generate-requirements`` command * Execute arbitrary command in the test environment with ``breeze shell`` command * Execute arbitrary docker-compose command with ``breeze docker-compose`` command * Push docker images with ``breeze push-image`` command (require committer's rights to push images) @@ -319,7 +282,7 @@ Manage and Interact with Kubernetes tests environment: Run static checks: * Run static checks - either for currently staged change or for all files with - ``breeze static-check`` or ``breeze static-check-all-files`` command + ``breeze static-check`` command Build documentation: @@ -330,10 +293,12 @@ Set up local development environment: * Setup local virtualenv with ``breeze setup-virtualenv`` command * Setup autocomplete for itself with ``breeze setup-autocomplete`` command -
[airflow] 03/09: Remove non-existent chart value from readme (#9511)
This is an automated email from the ASF dual-hosted git repository. kaxilnaik pushed a commit to branch v1-10-test in repository https://gitbox.apache.org/repos/asf/airflow.git commit 20e106d31dc125812d2c5c9e421a406cdb3e6958 Author: Ash Berlin-Taylor AuthorDate: Thu Jun 25 12:19:26 2020 +0100 Remove non-existent chart value from readme (#9511) This was accidentally left over when this was extracted from Astronomer's chart. (cherry picked from commit 561060aaa82ddb63fe2a38473bfd920a5aeff786) --- chart/README.md | 1 - 1 file changed, 1 deletion(-) diff --git a/chart/README.md b/chart/README.md index 8657eee..d0366f6 100644 --- a/chart/README.md +++ b/chart/README.md @@ -157,7 +157,6 @@ The following tables lists the configurable parameters of the Airflow chart and | `webserver.resources.limits.memory` | Memory Limit of webserver | `~` | | `webserver.resources.requests.cpu`| CPU Request of webserver | `~` | | `webserver.resources.requests.memory` | Memory Request of webserver | `~` | -| `webserver.jwtSigningCertificateSecretName` | Name of secret to mount Airflow Webserver JWT singing certificate from | `~` | | `webserver.defaultUser` | Optional default airflow user information | `{}` |
[airflow] 02/09: Fix typo in helm chart upgrade command for 2.0 (#9484)
This is an automated email from the ASF dual-hosted git repository.

kaxilnaik pushed a commit to branch v1-10-test in repository https://gitbox.apache.org/repos/asf/airflow.git

commit 47e1a875229657251fdeddaae5a2dd083572079b
Author: Ash Berlin-Taylor
AuthorDate: Tue Jun 23 10:38:06 2020 +0100

    Fix typo in helm chart upgrade command for 2.0 (#9484)

    (cherry picked from commit b1cd382db9367ec828b8ee16899ecea9fcf824a7)
---
 chart/templates/scheduler/scheduler-deployment.yaml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/chart/templates/scheduler/scheduler-deployment.yaml b/chart/templates/scheduler/scheduler-deployment.yaml
index 1b46f6a..d5c3a06 100644
--- a/chart/templates/scheduler/scheduler-deployment.yaml
+++ b/chart/templates/scheduler/scheduler-deployment.yaml
@@ -96,7 +96,7 @@ spec:
           image: {{ template "airflow_image" . }}
           imagePullPolicy: {{ .Values.images.airflow.pullPolicy }}
           # Support running against 1.10.x and 2.0.0dev/master
-          args: ["bash", "-c", "airflow upgradedb || airfow db upgrade"]
+          args: ["bash", "-c", "airflow upgradedb || airflow db upgrade"]
           env:
           {{- include "custom_airflow_environment" . | indent 10 }}
           {{- include "standard_airflow_environment" . | indent 10 }}
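The fixed `args` rely on the shell's `||` operator: if `airflow upgradedb` (the 1.10.x command) exits non-zero, the shell falls through to `airflow db upgrade` (the 2.0 command), so whichever the installed version understands wins. The misspelled `airfow` meant the fallback could never succeed. A minimal sketch of the same fallback logic in Python, using `false`/`true` as stand-in commands since Airflow itself is not assumed to be installed:

```python
import subprocess

def run_with_fallback(primary, fallback):
    """Run `primary`; if it exits non-zero, run `fallback` instead.

    Mirrors the shell idiom `cmd1 || cmd2` from the chart, where
    cmd1 is the Airflow 1.10 spelling and cmd2 the 2.0 spelling.
    Returns the argv of the command that succeeded, or None.
    """
    if subprocess.run(primary).returncode == 0:
        return primary
    if subprocess.run(fallback).returncode == 0:
        return fallback
    return None

# `false` stands in for the command the installed version rejects,
# `true` for the one it accepts.
print(run_with_fallback(["false"], ["true"]))  # -> ['true']
```

A command that is merely misspelled (like `airfow`) also exits non-zero at the shell level (command not found, status 127), which is why the typo silently disabled the 2.0 path rather than erroring loudly.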
[airflow] 04/09: Fix typo of resultBackendConnection in chart README (#9537)
This is an automated email from the ASF dual-hosted git repository. kaxilnaik pushed a commit to branch v1-10-test in repository https://gitbox.apache.org/repos/asf/airflow.git commit a213847e99b4b9f0afb7a8c0b6fc3968d04d6e40 Author: Vicken Simonian AuthorDate: Fri Jun 26 11:40:30 2020 -0700 Fix typo of resultBackendConnection in chart README (#9537) (cherry picked from commit 096f5c5cba963b364ee75f6686d128cd4d34d66e) --- chart/README.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/chart/README.md b/chart/README.md index d0366f6..402a9d7 100644 --- a/chart/README.md +++ b/chart/README.md @@ -119,11 +119,11 @@ The following tables lists the configurable parameters of the Airflow chart and | `data.metadataSecretName` | Secret name to mount Airflow connection string from | `~` | | `data.resultBackendSecretName`| Secret name to mount Celery result backend connection string from | `~` | | `data.metadataConection` | Field separated connection data (alternative to secret name) | `{}` | -| `data.resultBakcnedConnection`| Field separated connection data (alternative to secret name) | `{}` | +| `data.resultBackendConnection`| Field separated connection data (alternative to secret name) | `{}` | | `fernetKey` | String representing an Airflow fernet key | `~` | | `fernetKeySecretName` | Secret name for Airlow fernet key | `~` | | `workers.replicas`| Replica count for Celery workers (if applicable) | `1` | -| `workers.keda.enabled` | Enable KEDA autoscaling features | `false` | +| `workers.keda.enabled`| Enable KEDA autoscaling features | `false` | | `workers.keda.pollingInverval`| How often KEDA should poll the backend database for metrics in seconds | `5` | | `workers.keda.cooldownPeriod` | How often KEDA should wait before scaling down in seconds | `30` | | `workers.keda.maxReplicaCount`| Maximum number of Celery workers KEDA can scale to | `10` |
[airflow] 05/09: Remove redundant airflowVersion from Helm Chart readme (#9592)
This is an automated email from the ASF dual-hosted git repository. kaxilnaik pushed a commit to branch v1-10-test in repository https://gitbox.apache.org/repos/asf/airflow.git commit 9bc18c1938ebd5f19e1de4a2ce7270957fd3fea6 Author: Kaxil Naik AuthorDate: Tue Jun 30 17:02:56 2020 +0100 Remove redundant airflowVersion from Helm Chart readme (#9592) We no longer use `airflowVersion` , we instead use `defaultAirflowRepository` and `defaultAirflowTag` (cherry picked from commit d6b323b0cd9be2aa941cbb1e1e15d766b4d6539b) --- chart/README.md | 1 - 1 file changed, 1 deletion(-) diff --git a/chart/README.md b/chart/README.md index 402a9d7..089ea22 100644 --- a/chart/README.md +++ b/chart/README.md @@ -91,7 +91,6 @@ The following tables lists the configurable parameters of the Airflow chart and | `networkPolicies.enabled` | Enable Network Policies to restrict traffic | `true`| | `airflowHome` | Location of airflow home directory | `/opt/airflow`| | `rbacEnabled` | Deploy pods with Kubernets RBAC enabled | `true`| -| `airflowVersion` | Default Airflow image version | `1.10.5` | | `executor`| Airflow executor (eg SequentialExecutor, LocalExecutor, CeleryExecutor, KubernetesExecutor) | `KubernetesExecutor` | | `allowPodLaunching` | Allow airflow pods to talk to Kubernetes API to launch more pods | `true`| | `defaultAirflowRepository`| Fallback docker repository to pull airflow image from | `apache/airflow` |
[airflow] 06/09: Fix broken link in chart/README.md (#9591)
This is an automated email from the ASF dual-hosted git repository. kaxilnaik pushed a commit to branch v1-10-test in repository https://gitbox.apache.org/repos/asf/airflow.git commit e6b2d0f89d5e9348869c15dd78f5c1a03906ae38 Author: Kaxil Naik AuthorDate: Tue Jun 30 17:03:11 2020 +0100 Fix broken link in chart/README.md (#9591) `CONTRIBUTING.md` -> `../CONTRIBUTING.rst` (cherry picked from commit bbfaafeb552b48560960ab4aba84723b7ccbf386) --- chart/README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/chart/README.md b/chart/README.md index 089ea22..76d14b4 100644 --- a/chart/README.md +++ b/chart/README.md @@ -264,4 +264,4 @@ to port-forward the Airflow UI to http://localhost:8080/ to cofirm Airflow is wo ## Contributing -Check out [our contributing guide!](CONTRIBUTING.md) +Check out [our contributing guide!](../CONTRIBUTING.rst)
[airflow] branch v1-10-test updated (317b041 -> 99d37e2)
This is an automated email from the ASF dual-hosted git repository. kaxilnaik pushed a change to branch v1-10-test in repository https://gitbox.apache.org/repos/asf/airflow.git. from 317b041 Update README.md for 1.10.11 new d43ca01 Add Production Helm chart support (#8777) new 47e1a87 Fix typo in helm chart upgrade command for 2.0 (#9484) new 20e106d Remove non-existent chart value from readme (#9511) new a213847 Fix typo of resultBackendConnection in chart README (#9537) new 9bc18c1 Remove redundant airflowVersion from Helm Chart readme (#9592) new e6b2d0f Fix broken link in chart/README.md (#9591) new b4a620c Switches to Helm Chart for Kubernetes tests (#9468) new 74ecb8a Removes importlib usage - it's not needed (fails on Airflow 1.10) (#9613) new 99d37e2 Update Breeze documentation (#9608) The 9 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. 
Summary of changes: .github/workflows/ci.yml | 23 +- .pre-commit-config.yaml| 2 +- BREEZE.rst | 374 -- CI.rst | 2 +- Dockerfile | 4 + IMAGES.rst | 3 + TESTING.rst| 67 ++-- airflow/kubernetes/pod_launcher.py | 2 +- breeze | 51 ++- breeze-complete| 14 +- chart/.gitignore | 9 + .../.helmignore| 33 +- .readthedocs.yml => chart/Chart.yaml | 20 +- chart/README.md| 270 + chart/charts/postgresql-6.3.12.tgz | Bin 22754 -> 0 bytes chart/requirements.lock| 6 + .../libs/helper.py => chart/requirements.yaml | 11 +- .../LICENSE.txt => chart/templates/NOTES.txt | 13 + chart/templates/_helpers.yaml | 260 chart/templates/cleanup/cleanup-cronjob.yaml | 67 .../templates/cleanup/cleanup-serviceaccount.yaml | 24 +- chart/templates/configmap.yaml | 119 ++ chart/templates/create-user-job.yaml | 87 chart/templates/flower/flower-deployment.yaml | 102 + chart/templates/flower/flower-networkpolicy.yaml | 51 +++ .../templates/flower/flower-service.yaml | 41 +- .../pod.yaml => chart/templates/limitrange.yaml| 33 +- .../templates/pgbouncer/pgbouncer-deployment.yaml | 128 ++ .../pgbouncer/pgbouncer-networkpolicy.yaml | 69 .../pgbouncer/pgbouncer-poddisruptionbudget.yaml | 56 ++- chart/templates/pgbouncer/pgbouncer-service.yaml | 56 +++ .../templates/rbac/pod-cleanup-role.yaml | 34 +- .../templates/rbac/pod-cleanup-rolebinding.yaml| 32 +- chart/templates/rbac/pod-launcher-role.yaml| 58 +++ chart/templates/rbac/pod-launcher-rolebinding.yaml | 51 +++ chart/templates/redis/redis-networkpolicy.yaml | 63 +++ .../templates/redis/redis-service.yaml | 41 +- chart/templates/redis/redis-statefulset.yaml | 99 + .../pod.yaml => chart/templates/resourcequota.yaml | 33 +- .../templates/scheduler/scheduler-deployment.yaml | 195 + .../scheduler/scheduler-networkpolicy.yaml | 55 +++ .../scheduler/scheduler-poddisruptionbudget.yaml | 39 +- .../templates/scheduler/scheduler-service.yaml | 41 +- .../scheduler/scheduler-serviceaccount.yaml| 24 +- .../templates/secrets/elasticsearch-secret.yaml| 22 +- 
.../templates/secrets/fernetkey-secret.yaml| 27 +- .../secrets/metadata-connection-secret.yaml| 42 ++ .../templates/secrets/pgbouncer-config-secret.yaml | 23 +- .../templates/secrets/pgbouncer-stats-secret.yaml | 22 +- chart/templates/secrets/redis-secrets.yaml | 61 +++ .../templates/secrets/registry-secret.yaml | 24 +- .../secrets/result-backend-connection-secret.yaml | 37 ++ chart/templates/statsd/statsd-deployment.yaml | 87 chart/templates/statsd/statsd-networkpolicy.yaml | 57 +++ chart/templates/statsd/statsd-service.yaml | 56 +++ .../templates/webserver/webserver-deployment.yaml | 139 +++ .../webserver/webserver-networkpolicy.yaml | 51 +++ .../templates/webserver/webserver-service.yaml | 39 +- chart/templates/workers/worker-deployment.yaml | 161 chart/templates/workers/worker-kedaautoscaler.yaml | 47 +++ chart/templates/workers/worker-networkpolicy.yaml | 53 +++ .../templates/workers/worker-service.yaml |
[GitHub] [airflow] mik-laj commented on pull request #9431: Move API page limit and offset parameters to views as kwargs Arguments
mik-laj commented on pull request #9431: URL: https://github.com/apache/airflow/pull/9431#issuecomment-653185780

> This PR depends on #9503

I don't see the common parts. Can you say something more? Can you do a rebase? I would like to merge this change.
[GitHub] [airflow] aneesh-joseph commented on pull request #9044: Ensure Kerberos token is valid in SparkSubmitOperator before running `yarn kill`
aneesh-joseph commented on pull request #9044: URL: https://github.com/apache/airflow/pull/9044#issuecomment-653191413

> Could you rebase to latest master, hopefully that should fix the failing Kube tests

Done, but they failed again. Is there a way to re-run them?
[GitHub] [airflow] kblibr opened a new pull request #9632: Fixing typo in chart/README.me
kblibr opened a new pull request #9632: URL: https://github.com/apache/airflow/pull/9632

I found a simple typo in the readme and thought it would be good to fix it.

- No issue in GitHub issues.
- No change to source (no unit test changed).
- Change did not alter the meaning of the readme.

Make sure to mark the boxes below before creating PR:

- [x] Description above provides context of the change
- [ ] Unit tests coverage for changes (not needed for documentation changes)
- [ ] Target Github ISSUE in description if exists
- [x] Commits follow "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)"
- [x] Relevant documentation is updated including usage instructions.
- [x] I will engage committers as explained in [Contribution Workflow Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example).

---

In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed. In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x). In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md). Read the [Pull Request Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines) for more information.
[GitHub] [airflow] boring-cyborg[bot] commented on pull request #9632: Fixing typo in chart/README.me
boring-cyborg[bot] commented on pull request #9632: URL: https://github.com/apache/airflow/pull/9632#issuecomment-653198140

Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about anything, please check our Contribution Guide (https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst). Here are some useful points:

- Pay attention to the quality of your code (flake8, pylint and type annotations). Our [pre-commits](https://github.com/apache/airflow/blob/master/STATIC_CODE_CHECKS.rst#prerequisites-for-pre-commit-hooks) will help you with that.
- In case of a new feature add useful documentation (in docstrings or in the `docs/` directory). Adding a new operator? Check this short [guide](https://github.com/apache/airflow/blob/master/docs/howto/custom-operator.rst). Consider adding an example DAG that shows how users should use it.
- Consider using [Breeze environment](https://github.com/apache/airflow/blob/master/BREEZE.rst) for testing locally; it's a heavy Docker setup, but it ships with a working Airflow and a lot of integrations.
- Be patient and persistent. It might take some time to get a review or get the final approval from Committers.
- Please follow [ASF Code of Conduct](https://www.apache.org/foundation/policies/conduct) for all communication including (but not limited to) comments on Pull Requests, Mailing list and Slack.
- Be sure to read the [Airflow Coding style](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#coding-style-and-best-practices).

Apache Airflow is a community-driven project and together we are making it better. In case of doubts contact the developers at: Mailing List: d...@airflow.apache.org Slack: https://apache-airflow-slack.herokuapp.com/
[GitHub] [airflow] vanka56 commented on a change in pull request #9472: Add drop_partition functionality for HiveMetastoreHook
vanka56 commented on a change in pull request #9472: URL: https://github.com/apache/airflow/pull/9472#discussion_r449299190

## File path: airflow/providers/apache/hive/hooks/hive.py

@@ -775,6 +775,23 @@ def table_exists(self, table_name, db='default'):
         except Exception:  # pylint: disable=broad-except
             return False
+    def drop_partitions(self, table_name, part_vals, delete_data=False, db='default'):
+        """
+        Drop partitions matching param_names input
+        >>> hh = HiveMetastoreHook()
+        >>> hh.drop_partitions(db='airflow', table_name='static_babynames',
+        part_vals="['2020-05-01']")
+        True

Review comment: Done!
[GitHub] [airflow] vanka56 commented on a change in pull request #9472: Add drop_partition functionality for HiveMetastoreHook
vanka56 commented on a change in pull request #9472: URL: https://github.com/apache/airflow/pull/9472#discussion_r449299122

## File path: airflow/providers/apache/hive/hooks/hive.py

@@ -775,6 +775,23 @@ def table_exists(self, table_name, db='default'):
         except Exception:  # pylint: disable=broad-except
             return False
+    def drop_partitions(self, table_name, part_vals, delete_data=False, db='default'):

Review comment: Done
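The method under review in the two comments above can be sketched as follows. This is an assumption-laden illustration, not the merged implementation: `FakeClient` is a hypothetical stand-in for the Thrift metastore client that the real `HiveMetastoreHook.get_conn()` returns, and the True/False return convention mirrors the broad-except style of the adjacent `table_exists` method visible in the diff:

```python
class FakeClient:
    """Hypothetical metastore client, for illustration only."""
    def __init__(self):
        # One table with one partition, matching the docstring example.
        self._parts = {('airflow', 'static_babynames'): [['2020-05-01']]}

    def drop_partition(self, db, table, part_vals, delete_data):
        parts = self._parts.get((db, table), [])
        if part_vals not in parts:
            raise RuntimeError('no such partition')
        parts.remove(part_vals)
        # delete_data would control whether underlying files are also removed
        return True

def drop_partitions(client, table_name, part_vals, delete_data=False, db='default'):
    """Drop the partition matching part_vals; return True on success,
    False on any failure (same style as table_exists above)."""
    try:
        return client.drop_partition(db, table_name, part_vals, delete_data)
    except Exception:  # pylint: disable=broad-except
        return False

client = FakeClient()
print(drop_partitions(client, 'static_babynames', ['2020-05-01'], db='airflow'))  # True
print(drop_partitions(client, 'static_babynames', ['2020-05-01'], db='airflow'))  # False: already dropped
```

The second call returns False because the partition no longer exists, which is the behavior the docstring's doctest implies for repeated drops.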
[GitHub] [airflow] kaxil commented on a change in pull request #9632: Fixing typo in chart/README.me
kaxil commented on a change in pull request #9632: URL: https://github.com/apache/airflow/pull/9632#discussion_r449260079

## File path: chart/README.md

@@ -74,8 +74,7 @@ helm upgrade airflow . \
   --set images.airflow.tag=8a0da78
 ```
-For local development purppose you can also u
-You can also build the image locally and use it via deployment method described by Breeze.
+For local development purppose you can also build the image locally and use it via deployment method described by Breeze.

Review comment:
```suggestion
For local development purpose you can also build the image locally and use it via deployment method described by Breeze.
```
[GitHub] [airflow] Acehaidrey commented on pull request #9544: Add metric for scheduling delay between first run task & expected start time
Acehaidrey commented on pull request #9544: URL: https://github.com/apache/airflow/pull/9544#issuecomment-653242280 @mik-laj please when you get a chance
[GitHub] [airflow] jpuffer opened a new issue #9633: "Try Number" is off by 1 in the Gantt view
jpuffer opened a new issue #9633: URL: https://github.com/apache/airflow/issues/9633

**Apache Airflow version**: 1.10.10

**Environment**:
- **Cloud provider or hardware configuration**: Astronomer

**What happened**: In the Gantt view, all tasks start at "Try number: 2" and increment from there.

**What you expected to happen**: I expect the first try to be called try number 1.

**How to reproduce it**: Look at the Gantt view for any DAG run.

**Anything else we need to know**: JIRA says this was solved as of v1.10.7, but I'm finding that's not the case: https://issues.apache.org/jira/browse/AIRFLOW-2143
[GitHub] [airflow] jkbngl commented on pull request #9295: added mssql to oracle transfer operator
jkbngl commented on pull request #9295: URL: https://github.com/apache/airflow/pull/9295#issuecomment-653231283

Hi @ephraimbuddy, would you mind helping me with the naming convention for the operators? I got an error because my operator did not match `.*Operator$`, so I renamed it to MsSqlToOracleOperator; now I get an error for matching `.*To[A-Z0-9].*Operator$`. The issue is the "**To**", from my understanding. What would be the correct name for my operator? I also checked other operators and they match `.*To[A-Z0-9].*Operator$` as well, e.g.:

- FileToWasbOperator
- OracleToAzureDataLakeTransferOperator
- OracleToOracleTransferOperator

Maybe it would be correct to move it into a transfer folder instead of the operator folder, but this is also not the case for all other transfer operators? Your help would be appreciated! Thanks
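The two patterns quoted in the comment above can be checked directly. Assuming they are applied to the class name with a full match from the start of the string (as a pre-commit naming check would), MsSqlToOracleOperator matches the transfer pattern exactly like the existing examples listed, which explains why the check flags it:

```python
import re

OPERATOR_RE = re.compile(r".*Operator$")
TRANSFER_RE = re.compile(r".*To[A-Z0-9].*Operator$")

names = [
    "MsSqlToOracleOperator",
    "FileToWasbOperator",
    "OracleToAzureDataLakeTransferOperator",
    "OracleToOracleTransferOperator",
]

for name in names:
    # Each name ends in "Operator" and contains "To" followed by an
    # uppercase letter, so both patterns match every entry.
    print(name, bool(OPERATOR_RE.match(name)), bool(TRANSFER_RE.match(name)))
```

So the error is not about the name itself being wrong; the `To[A-Z0-9]` pattern is how the check identifies transfer operators, which is consistent with the module-placement convention linked in the replies below.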
[GitHub] [airflow] mik-laj commented on pull request #9295: added mssql to oracle transfer operator
mik-laj commented on pull request #9295: URL: https://github.com/apache/airflow/pull/9295#issuecomment-653234573

Here is the naming convention: https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#naming-conventions-for-provider-packages. Is this helpful for you?
[GitHub] [airflow] mik-laj commented on pull request #9623: Move ElasticsearchTaskHandler to the provider package
mik-laj commented on pull request #9623: URL: https://github.com/apache/airflow/pull/9623#issuecomment-653188369 @potiuk This will cherry-pick changes to the Javascript code. If you are willing, we can do it. For now, we should delete references to this class in backport packages. This will allow these classes to be used without this one new feature.
[GitHub] [airflow] boring-cyborg[bot] commented on pull request #9632: Fixing typo in chart/README.me
boring-cyborg[bot] commented on pull request #9632: URL: https://github.com/apache/airflow/pull/9632#issuecomment-653217267 Awesome work, congrats on your first merged pull request!
[airflow] branch master updated: Fixing typo in chart/README.me (#9632)
This is an automated email from the ASF dual-hosted git repository. potiuk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/airflow.git The following commit(s) were added to refs/heads/master by this push: new 72d5a58 Fixing typo in chart/README.me (#9632) 72d5a58 is described below commit 72d5a58fd734cf4a02e351210b7db5ae2fae7e4e Author: Bryant Larsen AuthorDate: Thu Jul 2 14:57:55 2020 -0600 Fixing typo in chart/README.me (#9632) * Fixing typo in readme --- chart/README.md | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/chart/README.md b/chart/README.md index 6cc361e..fb04517 100644 --- a/chart/README.md +++ b/chart/README.md @@ -74,8 +74,7 @@ helm upgrade airflow . \ --set images.airflow.tag=8a0da78 ``` -For local development purppose you can also u -You can also build the image locally and use it via deployment method described by Breeze. +For local development purpose you can also build the image locally and use it via deployment method described by Breeze. ## Parameters
[GitHub] [airflow] potiuk merged pull request #9632: Fixing typo in chart/README.me
potiuk merged pull request #9632: URL: https://github.com/apache/airflow/pull/9632
[GitHub] [airflow] mik-laj commented on pull request #9295: added mssql to oracle transfer operator
mik-laj commented on pull request #9295: URL: https://github.com/apache/airflow/pull/9295#issuecomment-653235179 Here is voting and discussion about transfer package: https://lists.apache.org/x/thread.html/r3514ef575b437b9eb368111b1e4b03ad7455e63d64c359c22fd6ea9a@%3Cdev.airflow.apache.org%3E
[GitHub] [airflow] mik-laj commented on a change in pull request #9503: add date-time format validation for API spec
mik-laj commented on a change in pull request #9503: URL: https://github.com/apache/airflow/pull/9503#discussion_r449274053

## File path: airflow/api_connexion/endpoints/dag_run_endpoint.py

@@ -69,24 +61,24 @@ def get_dag_runs(session, dag_id, start_date_gte=None, start_date_lte=None,
     # filter start date
     if start_date_gte:
-        query = query.filter(DagRun.start_date >= start_date_gte)
+        query = query.filter(DagRun.start_date >= timezone.parse(start_date_gte))

Review comment: We cannot make such a change to the specification because API clients must support this type of field. It will not be easy to add custom type support for all generated clients. We can try to make such a change only in memory when loading the file, but I'm not sure if we need it. The current solution meets all our requirements.
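The point of the diff above is that the query parameter stays a plain string in the OpenAPI spec and is parsed server-side before being compared against `DagRun.start_date`. A minimal stdlib sketch of that parse-then-filter step, using `datetime.fromisoformat` as a rough stand-in for Airflow's `timezone.parse` and an in-memory list instead of a SQLAlchemy query:

```python
from datetime import datetime, timezone

def parse_dt(value):
    """Parse an ISO-8601 timestamp with a trailing 'Z' into an aware
    datetime. A simplified stand-in for airflow.utils.timezone.parse;
    raises ValueError for malformed input, which an endpoint would
    translate into a 400 response."""
    return datetime.fromisoformat(value.replace('Z', '+00:00'))

dag_runs = [
    {'run_id': 'a', 'start_date': datetime(2020, 7, 1, tzinfo=timezone.utc)},
    {'run_id': 'b', 'start_date': datetime(2020, 7, 3, tzinfo=timezone.utc)},
]

start_date_gte = '2020-07-02T00:00:00Z'
cutoff = parse_dt(start_date_gte)

# Equivalent of query.filter(DagRun.start_date >= timezone.parse(...)):
selected = [r['run_id'] for r in dag_runs if r['start_date'] >= cutoff]
print(selected)  # ['b']
```

Parsing up front (rather than passing the raw string into the comparison) is what makes the `>=` filter compare datetimes instead of strings, and it gives one place to reject malformed timestamps.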
[jira] [Commented] (AIRFLOW-6497) Scheduler creates DagBag in the same process with outdated info
[ https://issues.apache.org/jira/browse/AIRFLOW-6497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17150617#comment-17150617 ]

ASF GitHub Bot commented on AIRFLOW-6497:

feng-tao commented on pull request #7597: URL: https://github.com/apache/airflow/pull/7597#issuecomment-653247511 just learn this pr, nice!

> Scheduler creates DagBag in the same process with outdated info
> ----------------------------------------------------------------
>
>                 Key: AIRFLOW-6497
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-6497
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: scheduler
>    Affects Versions: 1.10.7
>            Reporter: Qian Yu
>            Assignee: Kamil Bregula
>            Priority: Major
>             Fix For: 2.0.0
>
> The following code in scheduler_job.py seems to be called in the same process
> as the scheduler. It creates a DagBag. But since the scheduler is a long-running
> process, it does not pick up the latest changes made to DAGs. For example,
> changes to retries count, on_failure_callback, newly added tasks, etc. are not
> reflected.
>
> {code:python}
> if ti.try_number == try_number and ti.state == State.QUEUED:
>     msg = ("Executor reports task instance {} finished ({}) "
>            "although the task says its {}. Was the task "
>            "killed externally?".format(ti, state, ti.state))
>     Stats.incr('scheduler.tasks.killed_externally')
>     self.log.error(msg)
>     try:
>         simple_dag = simple_dag_bag.get_dag(dag_id)
>         dagbag = models.DagBag(simple_dag.full_filepath)
>         dag = dagbag.get_dag(dag_id)
>         ti.task = dag.get_task(task_id)
>         ti.handle_failure(msg)
>     except Exception:
>         self.log.error("Cannot load the dag bag to handle failure for %s"
>                        ". Setting task to FAILED without callbacks or "
>                        "retries. Do you have enough resources?", ti)
>         ti.state = State.FAILED
>         session.merge(ti)
>         session.commit()
> {code}
>
> This causes errors such as AttributeError due to stale code being hit. E.g.
> when someone added a .join attribute to CustomOperator without bouncing the
> scheduler, this is what he would get after a CeleryWorker timeout error
> causes this line to be hit:
>
> {code}
> [2020-01-05 22:25:45,951] {dagbag.py:207} ERROR - Failed to import: /dags/dag1.py
> Traceback (most recent call last):
>   File "/lib/python3.6/site-packages/airflow/models/dagbag.py", line 204, in process_file
>     m = imp.load_source(mod_name, filepath)
>   File "/usr/lib/python3.6/imp.py", line 172, in load_source
>     module = _load(spec)
>   File "<frozen importlib._bootstrap>", line 684, in _load
>   File "<frozen importlib._bootstrap>", line 665, in _load_unlocked
>   File "<frozen importlib._bootstrap_external>", line 678, in exec_module
>   File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
>   File "/dags/dag1.py", line 280, in <module>
>     task1 >> task2.join
> AttributeError: 'CustomOperator' object has no attribute 'join'
> [2020-01-05 22:25:45,951] {scheduler_job.py:1314} ERROR - Cannot load the dag bag to handle failure for <TaskInstance: ... [queued]>. Setting task to FAILED without callbacks or retries. Do you have enough resou
> {code}

-- This message was sent by Atlassian Jira (v8.3.4#803005)
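The root cause described in the issue is Python's module caching: a long-running process that imported a DAG file once keeps executing the old code even after the file changes on disk. A minimal stdlib sketch of that behavior (not Airflow code; the file name `my_dag.py` and the `RETRIES` constant are made up for illustration):

```python
import importlib
import pathlib
import sys
import tempfile

sys.dont_write_bytecode = True  # keep the demo honest: no stale .pyc cache

# Write a throwaway "DAG file", import it, then edit it on disk.
tmp = pathlib.Path(tempfile.mkdtemp())
(tmp / "my_dag.py").write_text("RETRIES = 1\n")
sys.path.insert(0, str(tmp))

import my_dag                               # first import: executes the file
(tmp / "my_dag.py").write_text("RETRIES = 30\n")  # edit after import

stale = importlib.import_module("my_dag")   # returns the cached module object
assert stale.RETRIES == 1                   # the on-disk edit is invisible

importlib.reload(my_dag)                    # forces re-execution from disk
assert my_dag.RETRIES == 30                 # now the new code is picked up
```

This is why the scheduler (or a reloading mechanism such as a subprocess per file, which the linked PR #7597 moves toward) must re-read DAG files rather than rely on in-process imports.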
[jira] [Commented] (AIRFLOW-6497) Scheduler creates DagBag in the same process with outdated info
[ https://issues.apache.org/jira/browse/AIRFLOW-6497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17150618#comment-17150618 ]

ASF GitHub Bot commented on AIRFLOW-6497:

feng-tao edited a comment on pull request #7597: URL: https://github.com/apache/airflow/pull/7597#issuecomment-653247511 just found out this pr, nice!
[GitHub] [airflow] feng-tao edited a comment on pull request #7597: [AIRFLOW-6497] Avoid loading DAGs in the main scheduler loop
feng-tao edited a comment on pull request #7597: URL: https://github.com/apache/airflow/pull/7597#issuecomment-653247511 just found out this pr, nice!
[GitHub] [airflow] feng-tao commented on pull request #7597: [AIRFLOW-6497] Avoid loading DAGs in the main scheduler loop
feng-tao commented on pull request #7597: URL: https://github.com/apache/airflow/pull/7597#issuecomment-653247511 just learn this pr, nice!
[GitHub] [airflow] ephraimbuddy commented on pull request #9431: Move API page limit and offset parameters to views as kwargs Arguments
ephraimbuddy commented on pull request #9431: URL: https://github.com/apache/airflow/pull/9431#issuecomment-653202146

> > This PR depends on #9503,
>
> I don't see the common parts. Can you say something more?

It is because I removed `format_parameters` tests in #9503. I was hoping to add it back with this PR. However, I will rebase now, then add the tests on the other PR.
[GitHub] [airflow] ITriangle opened a new pull request #9635: fix : _run_task_by_executor pickle_id is None
ITriangle opened a new pull request #9635: URL: https://github.com/apache/airflow/pull/9635 `session.add` needs a commit; otherwise, `pickle_id` is None.
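The point behind this fix is that a database-generated primary key does not exist until the INSERT actually reaches the database — in ORM terms, `session.add(obj)` alone leaves `obj.id` as None until the session is flushed or committed. A rough stdlib analogy (sqlite3 standing in for Airflow's SQLAlchemy session; the `dag_pickle` table here is only illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dag_pickle (id INTEGER PRIMARY KEY, payload BLOB)")

# The id is assigned by the database when the INSERT executes -- the ORM
# equivalent is that pickle.id stays None until flush/commit sends the row.
cur = conn.execute(
    "INSERT INTO dag_pickle (payload) VALUES (?)", (b"pickled-dag",)
)
conn.commit()

pickle_id = cur.lastrowid  # populated only once the row actually exists
```

Reading `pickle_id` before the write reaches the database is exactly the failure mode the PR title describes.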
[GitHub] [airflow] freedom1989 opened a new issue #9634: Cannot update XCOMs via RBAC UI
freedom1989 opened a new issue #9634: URL: https://github.com/apache/airflow/issues/9634

**Apache Airflow version**: 1.10.10

**Kubernetes version (if you are using kubernetes)** (use `kubectl version`):

**Environment**:
- **Cloud provider or hardware configuration**:
- **OS** (e.g. from /etc/os-release):
- **Kernel** (e.g. `uname -a`):
- **Install tools**:
- **Others**:

**What happened**: I cannot update the XCOMs via the RBAC UI. It always shows me "Not a valid datetime value." I even tried `2020-06-30 16:56:02+00:00` and `2020-06-30 16:56:02` as the input.

**What you expected to happen**:

**How to reproduce it**:

![image](https://user-images.githubusercontent.com/3204415/86428573-a817f680-bd1f-11ea-913e-4ee011cf6671.png)

**Anything else we need to know**:
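The exact validator behind the RBAC form isn't shown in the report, but "Not a valid datetime value" errors of this kind typically come from a validator pinned to one `strptime` pattern that rejects any other rendering of the same timestamp. A hypothetical illustration of the mismatch (the `is_valid` helper and the chosen formats are assumptions, not the actual form code):

```python
from datetime import datetime

def is_valid(value: str, fmt: str) -> bool:
    """Return True if value parses under the given strptime format."""
    try:
        datetime.strptime(value, fmt)
        return True
    except ValueError:
        return False

# A validator expecting a naive 'YYYY-MM-DD HH:MM:SS' pattern rejects the
# timezone-aware string that the UI itself displays for XCom timestamps:
print(is_valid("2020-06-30 16:56:02+00:00", "%Y-%m-%d %H:%M:%S"))    # False
print(is_valid("2020-06-30 16:56:02", "%Y-%m-%d %H:%M:%S"))          # True
print(is_valid("2020-06-30 16:56:02+00:00", "%Y-%m-%d %H:%M:%S%z"))  # True
```

Since the reporter says even the naive form failed, the real validator may use yet another pattern; the sketch only shows how easily such format pinning breaks round-tripping a displayed value back through the edit form.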
[GitHub] [airflow] stale[bot] closed pull request #8768: [POC] Mark keywords-only arguments in hook method signatures
stale[bot] closed pull request #8768: URL: https://github.com/apache/airflow/pull/8768