[GitHub] feng-tao commented on a change in pull request #4287: [AIRFLOW-3479] Keeps records in Log Table when delete DAG & refine related tests
feng-tao commented on a change in pull request #4287: [AIRFLOW-3479] Keeps records in Log Table when delete DAG & refine related tests
URL: https://github.com/apache/incubator-airflow/pull/4287#discussion_r239715180

## File path: airflow/api/common/experimental/delete_dag.py

```diff
@@ -41,6 +49,8 @@ def delete_dag(dag_id):
     # noinspection PyUnresolvedReferences,PyProtectedMember
     for m in models.Base._decl_class_registry.values():
         if hasattr(m, "dag_id"):
+            if keep_records_in_log and m.__name__ == 'Log':
```

Review comment: thanks for the explanation, makes sense.

This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org

With regards, Apache Git Services
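The guard in the diff above can be read as a filter over the SQLAlchemy model registry. The sketch below is illustrative only: `models_to_purge` and the stub `Log`/`TaskInstance` classes are invented stand-ins for Airflow's real models, with only the `keep_records_in_log` check taken from the PR.

```python
class Log:
    """Stand-in for Airflow's Log model (audit trail)."""
    dag_id = None


class TaskInstance:
    """Stand-in for any other model carrying a dag_id column."""
    dag_id = None


def models_to_purge(registry, keep_records_in_log=True):
    """Return the models whose rows should be deleted for a DAG.

    Mirrors the loop in delete_dag.py: every model with a dag_id
    attribute is a candidate, but Log is skipped by default so the
    audit records survive DAG deletion.
    """
    targets = []
    for m in registry:
        if hasattr(m, "dag_id"):
            if keep_records_in_log and m.__name__ == "Log":
                continue
            targets.append(m)
    return targets
```

With the default `keep_records_in_log=True`, `Log` is excluded from the purge list while other dag-scoped models are kept in it.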
[GitHub] jzucker2 commented on a change in pull request #3770: [AIRFLOW-3281] Fix Kubernetes operator with git-sync
jzucker2 commented on a change in pull request #3770: [AIRFLOW-3281] Fix Kubernetes operator with git-sync
URL: https://github.com/apache/incubator-airflow/pull/3770#discussion_r239688963

## File path: airflow/contrib/kubernetes/worker_configuration.py

```diff
@@ -89,11 +100,18 @@ def _get_environment(self):
     if self.kube_config.airflow_configmap:
         env['AIRFLOW__CORE__AIRFLOW_HOME'] = self.worker_airflow_home
-    if self.kube_config.worker_dags_folder:
-        env['AIRFLOW__CORE__DAGS_FOLDER'] = self.kube_config.worker_dags_folder
+    env['AIRFLOW__CORE__DAGS_FOLDER'] = self.worker_airflow_dags
```

Review comment: it always seemed like an issue that this was hard-coded in the `master` branch; I'm glad it's being addressed.
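The behavioral change in that hunk is small but easy to miss: `AIRFLOW__CORE__DAGS_FOLDER` is now set unconditionally to the worker's dags path, rather than only when `worker_dags_folder` was configured. A minimal sketch, assuming a free function instead of Airflow's actual `WorkerConfiguration` class (`build_worker_env` and its parameters are invented for illustration):

```python
def build_worker_env(worker_airflow_home, worker_airflow_dags,
                     airflow_configmap=None):
    """Assemble the worker pod's Airflow environment variables."""
    env = {}
    # AIRFLOW_HOME is only overridden when a configmap is mounted.
    if airflow_configmap:
        env['AIRFLOW__CORE__AIRFLOW_HOME'] = worker_airflow_home
    # Unconditional after the fix: workers always read DAGs from the
    # synced dags directory, with no hard-coded fallback path.
    env['AIRFLOW__CORE__DAGS_FOLDER'] = worker_airflow_dags
    return env
```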
[GitHub] jzucker2 commented on a change in pull request #3770: [AIRFLOW-3281] Fix Kubernetes operator with git-sync
jzucker2 commented on a change in pull request #3770: [AIRFLOW-3281] Fix Kubernetes operator with git-sync
URL: https://github.com/apache/incubator-airflow/pull/3770#discussion_r23968

## File path: airflow/contrib/kubernetes/worker_configuration.py

```diff
@@ -50,13 +55,13 @@ def _get_init_containers(self, volume_mounts):
     'value': self.kube_config.git_branch
 }, {
     'name': 'GIT_SYNC_ROOT',
-    'value': os.path.join(
-        self.worker_airflow_dags,
-        self.kube_config.git_subpath
-    )
+    'value': self.kube_config.git_sync_root
 }, {
     'name': 'GIT_SYNC_DEST',
-    'value': 'dags'
+    'value': self.kube_config.git_sync_dest
+}, {
+    'name': 'GIT_SYNC_DEPTH',
```

Review comment: does this fix the issue of the revision being in the path directory?
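The diff above replaces hard-coded git-sync paths with values taken from configuration. A hedged sketch of the resulting environment list for the git-sync init container; the helper name and signature are assumptions for illustration, not Airflow's real `_get_init_containers`:

```python
def git_sync_env(git_repo, git_branch, git_sync_root, git_sync_dest,
                 git_sync_depth="1"):
    """Build the env var list for a git-sync sidecar/init container."""
    return [
        {'name': 'GIT_SYNC_REPO', 'value': git_repo},
        {'name': 'GIT_SYNC_BRANCH', 'value': git_branch},
        # Root and destination now come from kube_config instead of a
        # path joined with git_subpath or a hard-coded 'dags' literal.
        {'name': 'GIT_SYNC_ROOT', 'value': git_sync_root},
        {'name': 'GIT_SYNC_DEST', 'value': git_sync_dest},
        # Shallow clone depth, one of the new knobs added by the PR.
        {'name': 'GIT_SYNC_DEPTH', 'value': git_sync_depth},
    ]
```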
[jira] [Commented] (AIRFLOW-2229) Scheduler cannot retry abrupt task failures within factory-generated DAGs
[ https://issues.apache.org/jira/browse/AIRFLOW-2229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16712299#comment-16712299 ]

Terrence Szymanski commented on AIRFLOW-2229:
---------------------------------------------

Also getting these errors, and I agree the exception handling and generic log message make it impossible to figure out what is going wrong.

> Scheduler cannot retry abrupt task failures within factory-generated DAGs
> -------------------------------------------------------------------------
>
>                 Key: AIRFLOW-2229
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2229
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: scheduler
>    Affects Versions: 1.9.0
>            Reporter: James Meickle
>            Priority: Major
>
> We had an issue where one of our tasks failed without the worker updating state (unclear why, but let's assume it was an OOM), resulting in this series of error messages:
> {noformat}
> Mar 20 14:27:05 airflow-core-i-0fc1f995414837b8b.stg.int.dynoquant.com airflow_scheduler-stdout.log: [2018-03-20 14:27:04,993] models.py:1595 ERROR - Executor reports task instance %s finished (%s) although the task says its %s. Was the task killed externally?
> Mar 20 14:27:05 airflow-core-i-0fc1f995414837b8b.stg.int.dynoquant.com airflow_scheduler-stdout.log: NoneType
> Mar 20 14:27:05 airflow-core-i-0fc1f995414837b8b.stg.int.dynoquant.com airflow_scheduler-stdout.log: [2018-03-20 14:27:04,994] jobs.py:1435 ERROR - Cannot load the dag bag to handle failure for <TaskInstance: nightly_dataload.dummy_operator 2018-03-19 00:00:00 [queued]>. Setting task to FAILED without callbacks or retries. Do you have enough resources?
> {noformat}
> Mysterious failures are not unexpected, because we are in the cloud, after all. The concern is the last line: ignoring callbacks and retries, implying that it's a lack of resources. However, the machine was totally underutilized at the time.
> I dug into this code a bit more and as far as I can tell this error is happening in this code path:
> https://github.com/apache/incubator-airflow/blob/1.9.0/airflow/jobs.py#L1427
> {code}
> self.log.error(msg)
> try:
>     simple_dag = simple_dag_bag.get_dag(dag_id)
>     dagbag = models.DagBag(simple_dag.full_filepath)
>     dag = dagbag.get_dag(dag_id)
>     ti.task = dag.get_task(task_id)
>     ti.handle_failure(msg)
> except Exception:
>     self.log.error("Cannot load the dag bag to handle failure for %s"
>                    ". Setting task to FAILED without callbacks or "
>                    "retries. Do you have enough resources?", ti)
>     ti.state = State.FAILED
>     session.merge(ti)
>     session.commit()
> {code}
> I am not very familiar with this code, nor do I have time to attach a debugger at the moment, but I think what is happening here is:
> * I have a factory Python file, which imports and instantiates DAG code from other files.
> * The scheduler loads the DAGs from the factory file on the filesystem. It gets a fileloc (as represented in the DB) not of the factory file, but of the file it loaded code from.
> * The scheduler makes a simple DAGBag from the instantiated DAGs.
> * This line of code uses the simple DAG, which references the original DAG object's fileloc, to create a new DAGBag object.
> * This DAGBag looks for the original DAG in the fileloc, which is the file containing that DAG's _code_, but is not actually importable by Airflow.
> * An exception is raised trying to load the DAG from the DAGBag, which found nothing.
> * Handling of the task failure never occurs.
> * The over-broad Exception clause swallows all of the above.
> * There's just a generic error message that is not helpful to a system operator.
> If this is the case, at minimum, the try/except should be rewritten to be more graceful and to have a better error message.
> But I question whether this level of DAGBag abstraction/indirection isn't making this failure case worse than it needs to be; under normal conditions the scheduler is definitely able to find the relevant factory-generated DAGs and execute tasks within them as expected, even with the fileloc set to the code path and not the import path.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
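The reporter asks for a "more graceful" try/except with a better error message. One possible shape for that is sketched below: report which step failed and what the exception was, instead of a generic resources hint. All names here (`handle_externally_killed`, `dagbag_loader`) are invented for illustration and are not Airflow's actual code.

```python
import logging

log = logging.getLogger(__name__)


def handle_externally_killed(ti, dagbag_loader, msg):
    """Try to run failure callbacks for a task killed externally.

    On lookup failure, log a specific diagnostic (which DAG, which
    fileloc, which exception) and mark the task failed without
    callbacks, returning False so callers can tell the paths apart.
    """
    try:
        dag = dagbag_loader(ti.dag_id)
        ti.task = dag.get_task(ti.task_id)
    except Exception as exc:  # still broad, but the cause is reported
        log.error(
            "Could not load DAG %s from %s to handle failure of %s: %s. "
            "Setting task to FAILED without callbacks or retries.",
            ti.dag_id, getattr(ti, 'fileloc', '?'), ti, exc)
        ti.state = 'failed'
        return False
    # DAG resolved: run the normal failure handling (callbacks, retries).
    ti.handle_failure(msg)
    return True
```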
[GitHub] jcao219 commented on issue #3930: [AIRFLOW-2548] Output plugin import errors to web UI
jcao219 commented on issue #3930: [AIRFLOW-2548] Output plugin import errors to web UI
URL: https://github.com/apache/incubator-airflow/pull/3930#issuecomment-445111625

@ron819 The tests are passing now.
[GitHub] dimberman commented on issue #3770: [AIRFLOW-3281] Fix Kubernetes operator with git-sync
dimberman commented on issue #3770: [AIRFLOW-3281] Fix Kubernetes operator with git-sync
URL: https://github.com/apache/incubator-airflow/pull/3770#issuecomment-445102544

Interesting. Seems that changing ownership of .minikube isn't allowed on Macs:

```
+ sudo chown -R airflow.airflow . /home/airflow/.cache /home/airflow/.wheelhouse/ /home/airflow/.cache/pip /home/airflow/.kube /home/airflow/.minikube
sudo: unable to resolve host linuxkit-0251
chown: changing ownership of '/home/airflow/.minikube/cache/images/k8s.gcr.io/k8s-dns-kube-dns-amd64_1.14.5': Operation not permitted
chown: changing ownership of '/home/airflow/.minikube/cache/images/k8s.gcr.io/k8s-dns-sidecar-amd64_1.14.5': Operation not permitted
chown: changing ownership of '/home/airflow/.minikube/cache/images/k8s.gcr.io/k8s-dns-dnsmasq-nanny-amd64_1.14.5': Operation not permitted
chown: changing ownership of '/home/airflow/.minikube/machines/minikube/console-ring': Operation not permitted
chown: changing ownership of '/home/airflow/.minikube/machines/minikube/initrd': Operation not permitted
chown: changing ownership of '/home/airflow/.minikube/machines/minikube/tty': Operation not permitted
chown: changing ownership of '/home/airflow/.minikube/machines/minikube/hyperkit.json': Operation not permitted
chown: changing ownership of '/home/airflow/.minikube/machines/minikube/hyperkit.pid': Operation not permitted
chown: changing ownership of '/home/airflow/.minikube/machines/minikube/bzimage': Operation not permitted
chown: changing ownership of '/home/airflow/.minikube/machines/minikube/isolinux.cfg': Operation not permitted
```

Wonder how they accomplish this on Linux, and whether this is a security concern.
[jira] [Commented] (AIRFLOW-2508) Handle non string types in render_template_from_field
[ https://issues.apache.org/jira/browse/AIRFLOW-2508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16712268#comment-16712268 ]

ASF GitHub Bot commented on AIRFLOW-2508:
-----------------------------------------

galak75 opened a new pull request #4292: [AIRFLOW-2508] Handle non string types in Operators templatized fields
URL: https://github.com/apache/incubator-airflow/pull/4292

### Jira
This PR addresses [[AIRFLOW-2508] Handle non string types in render_template_from_field](https://issues.apache.org/jira/browse/AIRFLOW-2508)

### Description
This PR changes the `BaseOperator.render_template_from_field` method's behavior:
- it previously rendered `string_types` and supported `Number`s (and collections or dictionaries of these types), but raised an `AirflowException` on any other type.
- it now supports any other type, by returning the value as is.

### Tests
This PR adds the following unit tests on the `BaseOperator.render_template_from_field` method:
- rendering a list of strings
- rendering a tuple of strings
- rendering dictionary values
- showing dictionary keys are not templatized
- returning a `date` as is
- returning a `datetime` as is
- returning a `UUID` as is
- returning an `object` as is

### Notice
**This PR adds [pyhamcrest](http://hamcrest.org/) to Airflow _test dependencies_.** This module helps write meaningful assertions, and also provides very clear and helpful messages on test failures. If Airflow maintainers do not want to include this test module, just let me know, and I'll rework the unit tests.
> Handle non string types in render_template_from_field
> -----------------------------------------------------
>
>                 Key: AIRFLOW-2508
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2508
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: models
>    Affects Versions: 2.0.0
>            Reporter: Eugene Brown
>            Assignee: Galak
>            Priority: Minor
>              Labels: easyfix, newbie
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> The render_template_from_field method of the BaseOperator class raises an exception when it encounters content that is not a string_type, list, tuple or dict.
> Example exception:
> {noformat}
> airflow.exceptions.AirflowException: Type '' used for parameter 'job_flow_overrides[Instances][InstanceGroups][InstanceCount]' is not supported for templating
> {noformat}
> I propose instead that when it encounters content of other types it returns the content unchanged, rather than raising an exception.
> Consider this case: I extended the EmrCreateJobFlowOperator to make the job_flow_overrides argument a templatable field. job_flow_overrides is a dictionary with a mix of strings, integers and booleans for values.
> When I extended the class as such:
> {code:java}
> class EmrCreateJobFlowOperatorTemplateOverrides(EmrCreateJobFlowOperator):
>     template_fields = ['job_flow_overrides']
> {code}
> And added a task to my dag with this format:
> {code:java}
> step_create_cluster = EmrCreateJobFlowOperatorTemplateOverrides(
>     task_id="create_cluster",
>     job_flow_overrides={
>         "Name": "my-cluster {{ dag_run.conf['run_date'] }}",
>         "Instances": {
>             "InstanceGroups": [
>                 {
>                     "Name": "Master nodes",
>                     "InstanceType": "c3.4xlarge",
>                     "InstanceCount": 1
>                 },
>                 {
>                     "Name": "Slave nodes",
>                     "InstanceType": "c3.4xlarge",
>                     "InstanceCount": 4
>                 }
>             ],
>             "TerminationProtected": False
>         },
>         "BootstrapActions": [{
>             "Name": "Custom action",
>             "ScriptBootstrapAction": {
>                 "Path": "s3://repo/{{ dag_run.conf['branch'] }}/requirements.txt"
>             }
>         }],
>     },
>     aws_conn_id='aws_default',
>     emr_conn_id='aws_default',
>     dag=dag
> )
> {code}
> The exception I gave above was raised and the step failed. I think it would be preferable for the method to instead pass over numeric and boolean values, as users may want to use template_fields the way I have: to template string values in dictionaries or lists of mixed types.
> Here is the render_template_from_field method from the BaseOperator:
> {code:java}
> def render_template_from_field(self, attr, content, context, jinja_env):
>
[GitHub] galak75 opened a new pull request #4292: [AIRFLOW-2508] Handle non string types in Operators templatized fields
galak75 opened a new pull request #4292: [AIRFLOW-2508] Handle non string types in Operators templatized fields
URL: https://github.com/apache/incubator-airflow/pull/4292

### Jira
This PR addresses [[AIRFLOW-2508] Handle non string types in render_template_from_field](https://issues.apache.org/jira/browse/AIRFLOW-2508)

### Description
This PR changes the `BaseOperator.render_template_from_field` method's behavior:
- it previously rendered `string_types` and supported `Number`s (and collections or dictionaries of these types), but raised an `AirflowException` on any other type.
- it now supports any other type, by returning the value as is.

### Tests
This PR adds the following unit tests on the `BaseOperator.render_template_from_field` method:
- rendering a list of strings
- rendering a tuple of strings
- rendering dictionary values
- showing dictionary keys are not templatized
- returning a `date` as is
- returning a `datetime` as is
- returning a `UUID` as is
- returning an `object` as is

### Notice
**This PR adds [pyhamcrest](http://hamcrest.org/) to Airflow _test dependencies_.** This module helps write meaningful assertions, and also provides very clear and helpful messages on test failures. If Airflow maintainers do not want to include this test module, just let me know, and I'll rework the unit tests.
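The recursive rendering behavior this PR describes can be sketched compactly. This is a hedged illustration, not the PR's actual code: `render_from_field` is an invented name, and the Jinja rendering step is abstracted into a `render_str` callable so the sketch stays self-contained.

```python
def render_from_field(content, render_str):
    """Render templated strings; pass every other type through as-is.

    - strings go through the template renderer
    - lists/tuples and dict values recurse (dict keys are untouched)
    - date, datetime, UUID, numbers, arbitrary objects return unchanged,
      instead of raising AirflowException as the old code did
    """
    if isinstance(content, str):
        return render_str(content)
    if isinstance(content, (list, tuple)):
        return type(content)(render_from_field(c, render_str) for c in content)
    if isinstance(content, dict):
        return {k: render_from_field(v, render_str) for k, v in content.items()}
    return content
```

Applied to the EMR example from the issue, the string values (cluster name, bootstrap script path) would be templated while `InstanceCount` integers and the `TerminationProtected` boolean pass through untouched.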
[jira] [Commented] (AIRFLOW-1488) Add a sensor operator to wait on DagRuns
[ https://issues.apache.org/jira/browse/AIRFLOW-1488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16712260#comment-16712260 ]

ASF GitHub Bot commented on AIRFLOW-1488:
-----------------------------------------

ybendana opened a new pull request #4291: [AIRFLOW-1488] Add the DagRunSensor operator
URL: https://github.com/apache/incubator-airflow/pull/4291

ExternalTaskSensor allows one to wait on any valid combination of (dag_id, task_id). It is desirable, though, to be able to wait on entire DagRuns, as opposed to specific task instances in those DAGs. This pull request adds a new sensor in contrib called DagRunSensor. This version is a different approach from previous pull requests addressing the same issue. In this case, the DagRunSensor takes a trigger_task_id, the id of a task that triggers DagRuns. The trigger task returns a list of run_ids of the DagRuns it triggered, and the DagRunSensor polls their status. For this purpose the TriggerDagRunOperator was modified so that it stores the run_id of the triggered DagRun.

Make sure you have checked _all_ steps below.

### Jira
- [ ] My PR addresses the following [Airflow Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
  - https://issues.apache.org/jira/browse/AIRFLOW-XXX
  - In case you are fixing a typo in the documentation you can prepend your commit with \[AIRFLOW-XXX\]; code changes always need a Jira issue.

### Description
- [ ] Here are some details about my PR, including screenshots of any UI changes:
  See the above description. I also copied quite a bit from the previous pull requests on this issue, #2500 and #3234. Unlike those attempts, the focus here is on triggered DagRuns. I realize it may not be generally applicable to everyone's workflows, but it has been very useful for us. At one point I considered using subdags, but since they have had their issues I think this is a good alternative.

### Tests
- [ ] My PR adds the following unit tests __OR__ does not need testing for this extremely good reason:
  I included test_dagrun_sensor.py and test_dagrun_sensor_dag.py. Currently, I'm running into an issue executing the test in the Docker environment because the sensor task has to wait on the triggered DagRun. The DagRun is triggered but is never scheduled. I think this is because the unit test uses the SequentialExecutor and SQLite. If I run the scheduler manually, the triggered DagRun executes. I'm not sure what to do about this. For now, I've commented out the successful DagRun test and have only the failed DagRun test active.

### Commits
- [ ] My commits all reference Jira issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)":
  1. Subject is separated from body by a blank line
  1. Subject is limited to 50 characters (not including Jira issue reference)
  1. Subject does not end with a period
  1. Subject uses the imperative mood ("add", not "adding")
  1. Body wraps at 72 characters
  1. Body explains "what" and "why", not "how"

### Documentation
- [ ] In case of new functionality, my PR adds documentation that describes how to use it.
  - When adding new operators/hooks/sensors, the autoclass documentation generation needs to be added.
  - All the public functions and the classes in the PR contain docstrings that explain what they do

### Code Quality
- [ ] Passes `flake8`
> Add a sensor operator to wait on DagRuns
> ----------------------------------------
>
>                 Key: AIRFLOW-1488
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-1488
>             Project: Apache Airflow
>          Issue Type: New Feature
>          Components: contrib, operators
>            Reporter: Yati
>            Assignee: Yati
>            Priority: Major
>
> The [ExternalTaskSensor|https://airflow.incubator.apache.org/code.html#airflow.operators.ExternalTaskSensor] operator already allows for encoding dependencies on tasks in external DAGs. However, when you have teams, each owning multiple small-to-medium sized DAGs, it is desirable to be able to wait on an external DagRun as a whole. This allows the owners of an upstream DAG to refactor their code freely by splitting/squashing task responsibilities, without
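The polling logic the PR describes (a trigger task hands back run_ids, and the sensor pokes until every triggered DagRun completes) can be sketched as below. This is an illustration under stated assumptions, not the PR's code: `poke_dag_runs` and `get_dagrun_state` are invented names, with `get_dagrun_state` standing in for a database query on the DagRun table.

```python
def poke_dag_runs(run_ids, get_dagrun_state):
    """Return True once all triggered DagRuns have succeeded.

    Mirrors typical sensor semantics: keep returning False while any
    run is still in flight, and raise if any run has failed so the
    sensor task itself fails fast.
    """
    states = [get_dagrun_state(run_id) for run_id in run_ids]
    if any(s == 'failed' for s in states):
        raise RuntimeError('one or more triggered DagRuns failed')
    return all(s == 'success' for s in states)
```

In an actual sensor, this body would live in `poke(self, context)`, with the run_ids pulled via XCom from the trigger task identified by `trigger_task_id`.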
[GitHub] dimberman commented on issue #3770: [AIRFLOW-3281] Fix Kubernetes operator with git-sync
dimberman commented on issue #3770: [AIRFLOW-3281] Fix Kubernetes operator with git-sync
URL: https://github.com/apache/incubator-airflow/pull/3770#issuecomment-445099589

I think those might be because I was in the wrong directory. Gonna try to make a PR to simplify local development if I can get this all working.
[GitHub] dimberman commented on issue #3770: [AIRFLOW-3281] Fix Kubernetes operator with git-sync
dimberman commented on issue #3770: [AIRFLOW-3281] Fix Kubernetes operator with git-sync
URL: https://github.com/apache/incubator-airflow/pull/3770#issuecomment-445099010

@odracci @Fokko are either of you able to actually run the docker-compose based integration tests on your local machines? I've had to make a fair number of modifications to get it to semi-work, and it's actually crashed my Docker a few times.

```
Failure: OSError ([Errno 107] Transport endpoint is not connected: '/app/.tox/py35-backend_postgres-env_kubernetes/lib/python3.5/site-packages/flask') ... ERROR
...
Traceback (most recent call last):
  File "/app/.tox/py35-backend_postgres-env_kubernetes/bin/nosetests", line 11, in
  File "/app/.tox/py35-backend_postgres-env_kubernetes/lib/python3.5/site-packages/nose/core.py", line 121, in __init__
  File "/usr/lib/python3.5/unittest/main.py", line 94, in __init__
    self.runTests()
  File "/app/.tox/py35-backend_postgres-env_kubernetes/lib/python3.5/site-packages/nose/core.py", line 207, in runTests
  File "/app/.tox/py35-backend_postgres-env_kubernetes/lib/python3.5/site-packages/nose/core.py", line 66, in run
  File "/app/.tox/py35-backend_postgres-env_kubernetes/lib/python3.5/site-packages/rednose.py", line 442, in printErrors
  File "/app/.tox/py35-backend_postgres-env_kubernetes/lib/python3.5/site-packages/nose/plugins/manager.py", line 99, in __call__
  File "/app/.tox/py35-backend_postgres-env_kubernetes/lib/python3.5/site-packages/nose/plugins/manager.py", line 167, in simple
  File "/app/.tox/py35-backend_postgres-env_kubernetes/lib/python3.5/site-packages/nose/plugins/cover.py", line 183, in report
  File "/app/.tox/py35-backend_postgres-env_kubernetes/lib/python3.5/site-packages/coverage/control.py", line 818, in combine
  File "/app/.tox/py35-backend_postgres-env_kubernetes/lib/python3.5/site-packages/coverage/data.py", line 720, in combine_parallel_data
coverage.misc.CoverageException: Couldn't combine from non-existent path '/app'
ERRO[0191] error waiting for container: EOF
```
[GitHub] ybendana opened a new pull request #4291: [AIRFLOW-1488] Add the DagRunSensor operator
ybendana opened a new pull request #4291: [AIRFLOW-1488] Add the DagRunSensor operator
URL: https://github.com/apache/incubator-airflow/pull/4291

ExternalTaskSensor allows one to wait on any valid combination of (dag_id, task_id). It is desirable, though, to be able to wait on entire DagRuns, as opposed to specific task instances in those DAGs. This pull request adds a new sensor in contrib called DagRunSensor. This version is a different approach from previous pull requests addressing the same issue. In this case, the DagRunSensor takes a trigger_task_id, the id of a task that triggers DagRuns. The trigger task returns a list of run_ids of the DagRuns it triggered, and the DagRunSensor polls their status. For this purpose the TriggerDagRunOperator was modified so that it stores the run_id of the triggered DagRun.

Make sure you have checked _all_ steps below.

### Jira
- [ ] My PR addresses the following [Airflow Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
  - https://issues.apache.org/jira/browse/AIRFLOW-XXX
  - In case you are fixing a typo in the documentation you can prepend your commit with \[AIRFLOW-XXX\]; code changes always need a Jira issue.

### Description
- [ ] Here are some details about my PR, including screenshots of any UI changes:
  See the above description. I also copied quite a bit from the previous pull requests on this issue, #2500 and #3234. Unlike those attempts, the focus here is on triggered DagRuns. I realize it may not be generally applicable to everyone's workflows, but it has been very useful for us. At one point I considered using subdags, but since they have had their issues I think this is a good alternative.

### Tests
- [ ] My PR adds the following unit tests __OR__ does not need testing for this extremely good reason:
  I included test_dagrun_sensor.py and test_dagrun_sensor_dag.py. Currently, I'm running into an issue executing the test in the Docker environment because the sensor task has to wait on the triggered DagRun. The DagRun is triggered but is never scheduled. I think this is because the unit test uses the SequentialExecutor and SQLite. If I run the scheduler manually, the triggered DagRun executes. I'm not sure what to do about this. For now, I've commented out the successful DagRun test and have only the failed DagRun test active.

### Commits
- [ ] My commits all reference Jira issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)":
  1. Subject is separated from body by a blank line
  1. Subject is limited to 50 characters (not including Jira issue reference)
  1. Subject does not end with a period
  1. Subject uses the imperative mood ("add", not "adding")
  1. Body wraps at 72 characters
  1. Body explains "what" and "why", not "how"

### Documentation
- [ ] In case of new functionality, my PR adds documentation that describes how to use it.
  - When adding new operators/hooks/sensors, the autoclass documentation generation needs to be added.
  - All the public functions and the classes in the PR contain docstrings that explain what they do

### Code Quality
- [ ] Passes `flake8`
[GitHub] codecov-io edited a comment on issue #3930: [AIRFLOW-2548] Output plugin import errors to web UI
codecov-io edited a comment on issue #3930: [AIRFLOW-2548] Output plugin import errors to web UI
URL: https://github.com/apache/incubator-airflow/pull/3930#issuecomment-445096814

# [Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3930?src=pr=h1) Report
> Merging [#3930](https://codecov.io/gh/apache/incubator-airflow/pull/3930?src=pr=desc) into [master](https://codecov.io/gh/apache/incubator-airflow/commit/788bd6fcb35ce6354ae874a508772e58a850683e?src=pr=desc) will **decrease** coverage by `0.01%`.
> The diff coverage is `66.66%`.

[![Impacted file tree graph](https://codecov.io/gh/apache/incubator-airflow/pull/3930/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/3930?src=pr=tree)

```diff
@@            Coverage Diff             @@
##           master    #3930      +/-   ##
==========================================
- Coverage   78.08%   78.06%   -0.02%
==========================================
  Files         201      201
  Lines       16462    16470       +8
==========================================
+ Hits        12854    12858       +4
- Misses       3608     3612       +4
```

| [Impacted Files](https://codecov.io/gh/apache/incubator-airflow/pull/3930?src=pr=tree) | Coverage Δ | |
|---|---|---|
| [airflow/plugins\_manager.py](https://codecov.io/gh/apache/incubator-airflow/pull/3930/diff?src=pr=tree#diff-YWlyZmxvdy9wbHVnaW5zX21hbmFnZXIucHk=) | `92.13% <50%> (-0.97%)` | :arrow_down: |
| [airflow/www/views.py](https://codecov.io/gh/apache/incubator-airflow/pull/3930/diff?src=pr=tree#diff-YWlyZmxvdy93d3cvdmlld3MucHk=) | `69.36% <66.66%> (-0.01%)` | :arrow_down: |
| [airflow/www\_rbac/views.py](https://codecov.io/gh/apache/incubator-airflow/pull/3930/diff?src=pr=tree#diff-YWlyZmxvdy93d3dfcmJhYy92aWV3cy5weQ==) | `72.39% <75%> (-0.02%)` | :arrow_down: |
| [airflow/models.py](https://codecov.io/gh/apache/incubator-airflow/pull/3930/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMucHk=) | `92.3% <0%> (-0.05%)` | :arrow_down: |

[Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3930?src=pr=continue).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
> `Δ = absolute (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3930?src=pr=footer). Last update [788bd6f...4d41b04](https://codecov.io/gh/apache/incubator-airflow/pull/3930?src=pr=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
[GitHub] codecov-io commented on issue #3930: [AIRFLOW-2548] Output plugin import errors to web UI
codecov-io commented on issue #3930: [AIRFLOW-2548] Output plugin import errors to web UI URL: https://github.com/apache/incubator-airflow/pull/3930#issuecomment-445096814

# [Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3930?src=pr=h1) Report
> Merging [#3930](https://codecov.io/gh/apache/incubator-airflow/pull/3930?src=pr=desc) into [master](https://codecov.io/gh/apache/incubator-airflow/commit/788bd6fcb35ce6354ae874a508772e58a850683e?src=pr=desc) will **decrease** coverage by `0.2%`.
> The diff coverage is `66.66%`.

[![Impacted file tree graph](https://codecov.io/gh/apache/incubator-airflow/pull/3930/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/3930?src=pr=tree)

```diff
@@            Coverage Diff             @@
##           master    #3930      +/-   ##
==========================================
- Coverage   78.08%   77.88%   -0.21%     
==========================================
  Files         201      201              
  Lines       16462    16470       +8     
==========================================
- Hits        12854    12827      -27     
- Misses       3608     3643      +35     
```

| [Impacted Files](https://codecov.io/gh/apache/incubator-airflow/pull/3930?src=pr=tree) | Coverage Δ | |
|---|---|---|
| [airflow/plugins\_manager.py](https://codecov.io/gh/apache/incubator-airflow/pull/3930/diff?src=pr=tree#diff-YWlyZmxvdy9wbHVnaW5zX21hbmFnZXIucHk=) | `92.13% <50%> (-0.97%)` | :arrow_down: |
| [airflow/www/views.py](https://codecov.io/gh/apache/incubator-airflow/pull/3930/diff?src=pr=tree#diff-YWlyZmxvdy93d3cvdmlld3MucHk=) | `69.36% <66.66%> (-0.01%)` | :arrow_down: |
| [airflow/www\_rbac/views.py](https://codecov.io/gh/apache/incubator-airflow/pull/3930/diff?src=pr=tree#diff-YWlyZmxvdy93d3dfcmJhYy92aWV3cy5weQ==) | `72.39% <75%> (-0.02%)` | :arrow_down: |
| [airflow/executors/sequential\_executor.py](https://codecov.io/gh/apache/incubator-airflow/pull/3930/diff?src=pr=tree#diff-YWlyZmxvdy9leGVjdXRvcnMvc2VxdWVudGlhbF9leGVjdXRvci5weQ==) | `50% <0%> (-50%)` | :arrow_down: |
| [airflow/utils/sqlalchemy.py](https://codecov.io/gh/apache/incubator-airflow/pull/3930/diff?src=pr=tree#diff-YWlyZmxvdy91dGlscy9zcWxhbGNoZW15LnB5) | `72.85% <0%> (-5.72%)` | :arrow_down: |
| [airflow/executors/\_\_init\_\_.py](https://codecov.io/gh/apache/incubator-airflow/pull/3930/diff?src=pr=tree#diff-YWlyZmxvdy9leGVjdXRvcnMvX19pbml0X18ucHk=) | `59.61% <0%> (-3.85%)` | :arrow_down: |
| [airflow/utils/dag\_processing.py](https://codecov.io/gh/apache/incubator-airflow/pull/3930/diff?src=pr=tree#diff-YWlyZmxvdy91dGlscy9kYWdfcHJvY2Vzc2luZy5weQ==) | `58.6% <0%> (-1.08%)` | :arrow_down: |
| [airflow/jobs.py](https://codecov.io/gh/apache/incubator-airflow/pull/3930/diff?src=pr=tree#diff-YWlyZmxvdy9qb2JzLnB5) | `76.66% <0%> (-0.73%)` | :arrow_down: |
| [airflow/models.py](https://codecov.io/gh/apache/incubator-airflow/pull/3930/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMucHk=) | `92.3% <0%> (-0.05%)` | :arrow_down: |

-- [Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3930?src=pr=continue).

> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
> `Δ = absolute (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3930?src=pr=footer). Last update [788bd6f...4d41b04](https://codecov.io/gh/apache/incubator-airflow/pull/3930?src=pr=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).

This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Assigned] (AIRFLOW-2508) Handle non string types in render_template_from_field
[ https://issues.apache.org/jira/browse/AIRFLOW-2508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Galak reassigned AIRFLOW-2508:
------------------------------

    Assignee: Galak  (was: Eugene Brown)

> Handle non string types in render_template_from_field
> -----------------------------------------------------
>
>                 Key: AIRFLOW-2508
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2508
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: models
>    Affects Versions: 2.0.0
>            Reporter: Eugene Brown
>            Assignee: Galak
>            Priority: Minor
>              Labels: easyfix, newbie
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> The render_template_from_field method of the BaseOperator class raises an
> exception when it encounters content that is not a string_type, list, tuple
> or dict. Example exception:
> {noformat}
> airflow.exceptions.AirflowException: Type '<type 'int'>' used for parameter
> 'job_flow_overrides[Instances][InstanceGroups][InstanceCount]' is not
> supported for templating{noformat}
> I propose instead that when it encounters content of other types it returns
> the content unchanged, rather than raising an exception.
> Consider this case: I extended the EmrCreateJobFlowOperator to make the
> job_flow_overrides argument a templatable field. job_flow_overrides is a
> dictionary with a mix of strings, integers and booleans for values.
> When I extended the class as such:
> {code:java}
> class EmrCreateJobFlowOperatorTemplateOverrides(EmrCreateJobFlowOperator):
>     template_fields = ['job_flow_overrides']{code}
> And added a task to my dag with this format:
> {code:java}
> step_create_cluster = EmrCreateJobFlowOperatorTemplateOverrides(
>     task_id="create_cluster",
>     job_flow_overrides={
>         "Name": "my-cluster {{ dag_run.conf['run_date'] }}",
>         "Instances": {
>             "InstanceGroups": [
>                 {
>                     "Name": "Master nodes",
>                     "InstanceType": "c3.4xlarge",
>                     "InstanceCount": 1
>                 },
>                 {
>                     "Name": "Slave nodes",
>                     "InstanceType": "c3.4xlarge",
>                     "InstanceCount": 4
>                 },
>                 "TerminationProtected": False
>             ]
>         },
>         "BootstrapActions": [{
>             "Name": "Custom action",
>             "ScriptBootstrapAction": {
>                 "Path": "s3://repo/{{ dag_run.conf['branch'] }}/requirements.txt"
>             }
>         }],
>     },
>     aws_conn_id='aws_default',
>     emr_conn_id='aws_default',
>     dag=dag
> )
> {code}
> The exception I gave above was raised and the step failed. I think it would
> be preferable for the method to instead pass over numeric and boolean values,
> as users may want to use template_fields in the way I have: to template string
> values in dictionaries or lists of mixed types.
> Here is the render_template_from_field method from the BaseOperator:
> {code:java}
> def render_template_from_field(self, attr, content, context, jinja_env):
>     """
>     Renders a template from a field. If the field is a string, it will
>     simply render the string and return the result. If it is a collection or
>     nested set of collections, it will traverse the structure and render
>     all strings in it.
>     """
>     rt = self.render_template
>     if isinstance(content, six.string_types):
>         result = jinja_env.from_string(content).render(**context)
>     elif isinstance(content, (list, tuple)):
>         result = [rt(attr, e, context) for e in content]
>     elif isinstance(content, dict):
>         result = {
>             k: rt("{}[{}]".format(attr, k), v, context)
>             for k, v in list(content.items())}
>     else:
>         param_type = type(content)
>         msg = (
>             "Type '{param_type}' used for parameter '{attr}' is "
>             "not supported for templating").format(**locals())
>         raise AirflowException(msg)
>     return result{code}
> I propose that the method returns content unchanged if the content is of one
> of the (int, float, complex, bool) types. So my solution would include an extra
> elif in the form:
> {code}
> elif isinstance(content, (int, float, complex, bool)):
>     result = content
> {code}
> Are there any reasons this would be a bad idea?

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[GitHub] odracci edited a comment on issue #3770: [AIRFLOW-3281] Fix Kubernetes operator with git-sync
odracci edited a comment on issue #3770: [AIRFLOW-3281] Fix Kubernetes operator with git-sync URL: https://github.com/apache/incubator-airflow/pull/3770#issuecomment-445072628 @dimberman @jzucker2 please go ahead, any help is appreciated. I tried to debug this PR without success. This https://github.com/apache/incubator-airflow/pull/3855 is suspicious, the volume name is `test-volume` but there is already a volume named like that in the deployment https://github.com/apache/incubator-airflow/blob/788bd6fcb35ce6354ae874a508772e58a850683e/scripts/ci/kubernetes/kube/airflow.yaml#L129-L135 This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] odracci commented on issue #3770: [AIRFLOW-3281] Fix Kubernetes operator with git-sync
odracci commented on issue #3770: [AIRFLOW-3281] Fix Kubernetes operator with git-sync URL: https://github.com/apache/incubator-airflow/pull/3770#issuecomment-445072628 @dimberman please go ahead, any help is appreciated. I tried to debug this PR without success. This https://github.com/apache/incubator-airflow/pull/3855 is suspicious, the volume name is `test-volume` but there is already a volume named like that in the deployment https://github.com/apache/incubator-airflow/blob/788bd6fcb35ce6354ae874a508772e58a850683e/scripts/ci/kubernetes/kube/airflow.yaml#L133 This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
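The clash described here, two volumes sharing the name `test-volume` in one pod spec, can be detected mechanically. The helper below is purely illustrative and not part of the PR; Kubernetes requires volume names to be unique within a pod, which is why such a deployment breaks.

```python
# Illustrative check for duplicate volume names in a pod spec.
# Kubernetes rejects pods whose volumes do not have unique names, so
# two volumes both named "test-volume" make the spec invalid.
def duplicate_volume_names(volumes):
    seen, dups = set(), []
    for vol in volumes:
        name = vol.get("name")
        if name in seen and name not in dups:
            dups.append(name)
        seen.add(name)
    return dups


# Hypothetical spec mirroring the situation in the comment: one
# "test-volume" already defined in airflow.yaml, a second one added later.
spec_volumes = [
    {"name": "airflow-dags", "emptyDir": {}},
    {"name": "test-volume", "hostPath": {"path": "/tmp"}},
    {"name": "test-volume", "emptyDir": {}},
]
```

Running the helper over `spec_volumes` flags `test-volume` as the conflicting name.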
[GitHub] XD-DENG commented on a change in pull request #4287: [AIRFLOW-3479] Keeps records in Log Table when delete DAG & refine related tests
XD-DENG commented on a change in pull request #4287: [AIRFLOW-3479] Keeps records in Log Table when delete DAG & refine related tests URL: https://github.com/apache/incubator-airflow/pull/4287#discussion_r239656929

## File path: airflow/api/common/experimental/delete_dag.py

## @@ -41,6 +49,8 @@ def delete_dag(dag_id):
 # noinspection PyUnresolvedReferences,PyProtectedMember
 for m in models.Base._decl_class_registry.values():
     if hasattr(m, "dag_id"):
+        if keep_records_in_log and m.__name__ == 'Log':

Review comment: Hi @feng-tao, regarding your questions:

1. The `delete_dag()` function does not touch the log files (the plain-text files in the `/log` folder) at all. It only deletes records for the given dag_id from the tables in the backend database, including "dag", "task_instance", "dag_run", etc. (in fact, every table that has a `dag_id` column). Previously it also deleted the records in the "log" table in the database, which may not be ideal (logs should not be touched by anyone other than an Admin). This is what this PR is trying to solve.
2. The behaviour (API vs webserver/UI) will be consistent: when a user deletes a DAG from the UI, what happens under the hood is that the URL behind the button invokes the API endpoint with the _default_ parameter (`keep_records_in_log=True`). The only difference is that users can delete records in the `log` table using the API if they want (of course they need to explicitly specify `keep_records_in_log=False`).

Please let me know if you need any other clarification. Thanks for reviewing!

This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
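The deletion flow described in this comment can be sketched as follows. It is an illustration of the control flow only: the registry, session, and model classes are simplified stand-ins for `models.Base._decl_class_registry` and the SQLAlchemy session, and `delete_dag_records` is a hypothetical helper name rather than Airflow's actual function.

```python
# Sketch of the deletion loop with the keep_records_in_log guard.
# Every model exposing a dag_id column is purged for the given DAG,
# except the Log model when keep_records_in_log is True (the default,
# which is also what the UI delete button ends up using).
def delete_dag_records(dag_id, registry, session, keep_records_in_log=True):
    deleted = 0
    for model in registry:
        if not hasattr(model, "dag_id"):
            continue
        if keep_records_in_log and model.__name__ == "Log":
            # Spare the audit log; only an explicit
            # keep_records_in_log=False removes these rows.
            continue
        deleted += session.query(model).filter(
            model.dag_id == dag_id).delete()
    return deleted
```

With the default flag the `Log` rows survive, matching the consistency argument made above for API and UI deletes.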
[GitHub] XD-DENG commented on a change in pull request #4287: [AIRFLOW-3479] Keeps records in Log Table when delete DAG & refine related tests
XD-DENG commented on a change in pull request #4287: [AIRFLOW-3479] Keeps records in Log Table when delete DAG & refine related tests URL: https://github.com/apache/incubator-airflow/pull/4287#discussion_r239655182 ## File path: tests/api/common/experimental/test_delete_dag.py ## @@ -0,0 +1,141 @@ +# -*- coding: utf-8 -*- +# +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. 
+
+import unittest
+
+from airflow import models
+from airflow import settings
+from airflow.api.common.experimental.delete_dag import delete_dag
+from airflow.exceptions import DagNotFound, DagFileExists
+from airflow.operators.dummy_operator import DummyOperator
+from airflow.utils.dates import days_ago
+from airflow.utils.state import State
+
+DM = models.DagModel
+DS = models.DagStat
+DR = models.DagRun
+TI = models.TaskInstance
+LOG = models.Log
+
+
+class TestDeleteDAG_catch_error(unittest.TestCase):
+
+    def setUp(self):
+        self.session = settings.Session()
+        self.dagbag = models.DagBag(include_examples=True)
+        self.dag_id = 'example_bash_operator'
+        self.dag = self.dagbag.dags[self.dag_id]
+
+    def tearDown(self):
+        self.dag.clear()
+        self.session.close()
+
+    def test_delete_dag_non_existent_dag(self):
+        with self.assertRaises(DagNotFound):
+            delete_dag("non-existent DAG")
+
+    def test_delete_dag_dag_still_in_dagbag(self):
+        models_to_check = ['DagModel', 'DagStat', 'DagRun', 'TaskInstance']
+        record_counts = {}
+
+        for model_name in models_to_check:
+            m = getattr(models, model_name)
+            record_counts[model_name] = self.session.query(m).filter(m.dag_id == self.dag_id).count()
+
+        with self.assertRaises(DagFileExists):
+            delete_dag(self.dag_id)
+
+        # No change should happen in DB
+        for model_name in models_to_check:
+            m = getattr(models, model_name)
+            self.assertEqual(
+                self.session.query(m).filter(
+                    m.dag_id == self.dag_id
+                ).count(),
+                record_counts[model_name]
+            )
+
+
+class TestDeleteDAG_successful_delete(unittest.TestCase):

Review comment: Addressed. Thanks! This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] XD-DENG commented on a change in pull request #4287: [AIRFLOW-3479] Keeps records in Log Table when delete DAG & refine related tests
XD-DENG commented on a change in pull request #4287: [AIRFLOW-3479] Keeps records in Log Table when delete DAG & refine related tests URL: https://github.com/apache/incubator-airflow/pull/4287#discussion_r239655165 ## File path: tests/api/common/experimental/test_delete_dag.py ## @@ -0,0 +1,141 @@ +# -*- coding: utf-8 -*- +# +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +import unittest + +from airflow import models +from airflow import settings +from airflow.api.common.experimental.delete_dag import delete_dag +from airflow.exceptions import DagNotFound, DagFileExists +from airflow.operators.dummy_operator import DummyOperator +from airflow.utils.dates import days_ago +from airflow.utils.state import State + +DM = models.DagModel +DS = models.DagStat +DR = models.DagRun +TI = models.TaskInstance +LOG = models.Log + + +class TestDeleteDAG_catch_error(unittest.TestCase): Review comment: Thanks @feng-tao ! Have addressed this. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Commented] (AIRFLOW-3452) Cannot view dags at /home page
[ https://issues.apache.org/jira/browse/AIRFLOW-3452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16712050#comment-16712050 ] Ash Berlin-Taylor commented on AIRFLOW-3452: If you are checking out from Git you will need to follow the steps in https://github.com/apache/incubator-airflow/blob/master/CONTRIBUTING.md#setting-up-the-node--npm-javascript-environment-only-for-www_rbac to build the assets. The {{display:none}} is odd, but I suspect something else is overriding that later? > Cannot view dags at /home page > -- > > Key: AIRFLOW-3452 > URL: https://issues.apache.org/jira/browse/AIRFLOW-3452 > Project: Apache Airflow > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Jinhui Zhang >Priority: Blocker > > I checked out the latest master branch(commit > {{[9dce1f0|https://github.com/apache/incubator-airflow/commit/9dce1f0740f69af0ee86709a1a34a002b245aa3e]}}) > and restarted my Airflow webserver. But I cannot view any dag at the home > page. I inspected the frontend code and found there's a > {{style="display:none;"}} on the \{{main-content}}, and the source code says > so at > [https://github.com/apache/incubator-airflow/blob/master/airflow/www_rbac/templates/airflow/dags.html#L31] > . Is this a known issue? How should I fix it? -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (AIRFLOW-3433) Create Google Cloud Spanner Hook
[ https://issues.apache.org/jira/browse/AIRFLOW-3433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16712048#comment-16712048 ]

ASF GitHub Bot commented on AIRFLOW-3433:
-----------------------------------------

ryanyuan closed pull request #4284: [AIRFLOW-3433] Create Google Cloud Spanner Hook URL: https://github.com/apache/incubator-airflow/pull/4284

This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance. As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic):

diff --git a/airflow/contrib/hooks/gcp_spanner_hook.py b/airflow/contrib/hooks/gcp_spanner_hook.py
new file mode 100644
index 00..5f98cced9a
--- /dev/null
+++ b/airflow/contrib/hooks/gcp_spanner_hook.py
@@ -0,0 +1,500 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+#
+
+from airflow import AirflowException
+from airflow.contrib.hooks.gcp_api_base_hook import GoogleCloudBaseHook
+from apiclient.discovery import HttpError, build
+
+
+class SpannerHook(GoogleCloudBaseHook):
+    """
+    Interact with Google Cloud Spanner. This hook uses the Google Cloud Platform
+    connection.
+
+    This object is not thread safe. If you want to make multiple requests
+    simultaneously, you will need to create a hook per thread.
+    """
+
+    def __init__(
+        self, gcp_conn_id="google_cloud_default", delegate_to=None, version="v1"
+    ):
+        super(SpannerHook, self).__init__(gcp_conn_id, delegate_to)
+        self.version = version
+
+    def get_conn(self):
+        """
+        Returns a Google Cloud Spanner service object.
+        """
+        http_authorized = self._authorize()
+        return build(
+            "spanner", self.version, http=http_authorized, cache_discovery=False
+        )
+
+    def list_instance_configs(self, project_id):
+        """
+        Lists the supported instance configurations for a given project.
+
+        .. seealso::
+            For more information, see:
+            https://cloud.google.com/spanner/docs/reference/rest/v1/projects.instanceConfigs/list
+
+        :param project_id: The name of the project for which a list of supported instance
+            configurations is requested
+        :type project_id: str
+        """
+
+        project_id = project_id if project_id is not None else self.project_id
+
+        self.log.info("Retrieving Spanner instance configs")
+
+        try:
+            resp = (
+                self.get_conn()
+                .projects()
+                .instanceConfigs()
+                .list(parent="projects/{}".format(project_id))
+                .execute()
+            )
+
+            self.log.info("Spanner instance configs retrieved successfully")
+
+            return resp
+        except HttpError as err:
+            raise AirflowException(
+                "Spanner request failed. Error was: {}".format(err.content)
+            )
+
+    def create_instance(self, project_id, body):
+        """
+        Method to create a Cloud Spanner instance and begin preparing it to serve.
+        If the named instance already exists, it will return 409 Instance Already Exists.
+
+        .. seealso::
+            For more information, see:
+            https://cloud.google.com/spanner/docs/reference/rest/v1/projects.instances/create
+
+        :param project_id: The name of the project in which to create the instance
+        :type project_id: str
+        :param body: the request body containing instance creation data
+        :type body: dict
+
+        **Example of body**:
+            body = {
+                "instanceId": "spanner-instance-1",
+                "instance": {
+                    "nodeCount": 1,
+                    "config": "projects/spanner-project/instanceConfigs/eur3",
+                    "displayName": "Spanner Instance 1",
+                },
+            }
+        """
+
+        project_id = project_id if project_id is not None else self.project_id
+
+        if "instanceId" not in body:
+            raise ValueError("instanceId is undefined in the body")
[GitHub] ryanyuan commented on issue #4284: [AIRFLOW-3433] Create Google Cloud Spanner Hook
ryanyuan commented on issue #4284: [AIRFLOW-3433] Create Google Cloud Spanner Hook URL: https://github.com/apache/incubator-airflow/pull/4284#issuecomment-445038810 Hi @potiuk. I didn’t notice you guys were also implementing Spanner hook. I will close this PR. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] ryanyuan closed pull request #4284: [AIRFLOW-3433] Create Google Cloud Spanner Hook
ryanyuan closed pull request #4284: [AIRFLOW-3433] Create Google Cloud Spanner Hook URL: https://github.com/apache/incubator-airflow/pull/4284

This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance. As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic):

diff --git a/airflow/contrib/hooks/gcp_spanner_hook.py b/airflow/contrib/hooks/gcp_spanner_hook.py
new file mode 100644
index 00..5f98cced9a
--- /dev/null
+++ b/airflow/contrib/hooks/gcp_spanner_hook.py
@@ -0,0 +1,500 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+#
+
+from airflow import AirflowException
+from airflow.contrib.hooks.gcp_api_base_hook import GoogleCloudBaseHook
+from apiclient.discovery import HttpError, build
+
+
+class SpannerHook(GoogleCloudBaseHook):
+    """
+    Interact with Google Cloud Spanner. This hook uses the Google Cloud Platform
+    connection.
+
+    This object is not thread safe. If you want to make multiple requests
+    simultaneously, you will need to create a hook per thread.
+    """
+
+    def __init__(
+        self, gcp_conn_id="google_cloud_default", delegate_to=None, version="v1"
+    ):
+        super(SpannerHook, self).__init__(gcp_conn_id, delegate_to)
+        self.version = version
+
+    def get_conn(self):
+        """
+        Returns a Google Cloud Spanner service object.
+        """
+        http_authorized = self._authorize()
+        return build(
+            "spanner", self.version, http=http_authorized, cache_discovery=False
+        )
+
+    def list_instance_configs(self, project_id):
+        """
+        Lists the supported instance configurations for a given project.
+
+        .. seealso::
+            For more information, see:
+            https://cloud.google.com/spanner/docs/reference/rest/v1/projects.instanceConfigs/list
+
+        :param project_id: The name of the project for which a list of supported instance
+            configurations is requested
+        :type project_id: str
+        """
+
+        project_id = project_id if project_id is not None else self.project_id
+
+        self.log.info("Retrieving Spanner instance configs")
+
+        try:
+            resp = (
+                self.get_conn()
+                .projects()
+                .instanceConfigs()
+                .list(parent="projects/{}".format(project_id))
+                .execute()
+            )
+
+            self.log.info("Spanner instance configs retrieved successfully")
+
+            return resp
+        except HttpError as err:
+            raise AirflowException(
+                "Spanner request failed. Error was: {}".format(err.content)
+            )
+
+    def create_instance(self, project_id, body):
+        """
+        Method to create a Cloud Spanner instance and begin preparing it to serve.
+        If the named instance already exists, it will return 409 Instance Already Exists.
+
+        .. seealso::
+            For more information, see:
+            https://cloud.google.com/spanner/docs/reference/rest/v1/projects.instances/create
+
+        :param project_id: The name of the project in which to create the instance
+        :type project_id: str
+        :param body: the request body containing instance creation data
+        :type body: dict
+
+        **Example of body**:
+            body = {
+                "instanceId": "spanner-instance-1",
+                "instance": {
+                    "nodeCount": 1,
+                    "config": "projects/spanner-project/instanceConfigs/eur3",
+                    "displayName": "Spanner Instance 1",
+                },
+            }
+        """
+
+        project_id = project_id if project_id is not None else self.project_id
+
+        if "instanceId" not in body:
+            raise ValueError("instanceId is undefined in the body")
+
+        self.log.info("Creating Spanner instance")
+
+        try:
+            resp = (
+                self.get_conn()
+                .projects()
+                .instances()
+
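For context, the only behaviour of `create_instance` fully visible in the diff excerpt is its request-body validation. The sketch below reproduces just that check as a standalone function (the hook itself needs the Google API client and a configured GCP connection, so it is not reproduced here); the body dict mirrors the docstring example, and the project and config names in it are illustrative values.

```python
# Standalone model of the create_instance precondition shown in the
# diff: the request body must carry an instanceId before any API call
# is attempted.
def validate_create_instance_body(body):
    if "instanceId" not in body:
        raise ValueError("instanceId is undefined in the body")
    return body


# Example body from the hook's docstring; "spanner-project" and "eur3"
# are illustrative, not required identifiers.
body = validate_create_instance_body({
    "instanceId": "spanner-instance-1",
    "instance": {
        "nodeCount": 1,
        "config": "projects/spanner-project/instanceConfigs/eur3",
        "displayName": "Spanner Instance 1",
    },
})
```

A body missing `instanceId` fails fast with the same `ValueError` the hook raises, before any network call would be made.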
[GitHub] apraovjr opened a new pull request #4290: [AIRFLOW-3282] Azure Kubernetes Service Operator
apraovjr opened a new pull request #4290: [AIRFLOW-3282] Azure Kubernetes Service Operator URL: https://github.com/apache/incubator-airflow/pull/4290 Add an operator to spin up Azure Kubernetes Service. Azure Kubernetes Service is used to deploy a managed Kubernetes cluster in Azure. The operator supports creating different AKS clusters. It checks whether a cluster already exists and, if not, creates one. ### Jira - [ ] My PR addresses the following [Airflow Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR" - https://issues.apache.org/jira/browse/AIRFLOW-3282 - In case you are fixing a typo in the documentation you can prepend your commit with \[AIRFLOW-XXX\], code changes always need a Jira issue. ### Description - [ ] Here are some details about my PR, including screenshots of any UI changes: ### Tests - [ ] My PR adds the following unit tests __OR__ does not need testing for this extremely good reason: ### Commits - [ ] My commits all reference Jira issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)": 1. Subject is separated from body by a blank line 1. Subject is limited to 50 characters (not including Jira issue reference) 1. Subject does not end with a period 1. Subject uses the imperative mood ("add", not "adding") 1. Body wraps at 72 characters 1. Body explains "what" and "why", not "how" ### Documentation - [ ] In case of new functionality, my PR adds documentation that describes how to use it. - When adding new operators/hooks/sensors, the autoclass documentation generation needs to be added. - All the public functions and the classes in the PR contain docstrings that explain what it does ### Code Quality - [ ] Passes `flake8` This is an automated message from the Apache Git Service. 
To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
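The existence check the PR describes — reuse an already-running AKS cluster, otherwise create one — can be sketched as follows. This is an illustration only: `ensure_aks_cluster` and the `client` interface are hypothetical stand-ins, not the PR's actual code or the Azure SDK (the real operator would go through something like `ContainerServiceClient`).

```python
# Sketch of an idempotent "create cluster unless it already exists" check.
# `client` is a hypothetical wrapper object, not the Azure SDK.

def ensure_aks_cluster(client, resource_group, cluster_name, config):
    """Return the existing AKS cluster, or create one if absent."""
    existing = client.get_cluster(resource_group, cluster_name)
    if existing is not None:
        return existing  # reuse the running cluster instead of recreating it
    return client.create_cluster(resource_group, cluster_name, config)
```

With this shape, re-running the task is safe: a second execution finds the cluster and returns it without issuing another create call.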
[jira] [Commented] (AIRFLOW-3282) Implement a Azure Kubernetes Service Operator
[ https://issues.apache.org/jira/browse/AIRFLOW-3282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16712037#comment-16712037 ] ASF GitHub Bot commented on AIRFLOW-3282: - apraovjr opened a new pull request #4290: [AIRFLOW-3282] Azure Kubernetes Service Operator URL: https://github.com/apache/incubator-airflow/pull/4290
> Implement a Azure Kubernetes Service Operator
> ---------------------------------------------
>
> Key: AIRFLOW-3282
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3282
> Project: Apache Airflow
> Issue Type: New Feature
> Reporter: Aparna
> Assignee: Aparna
> Priority: Major
>
> Add AKS Operator
--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] mchalek commented on issue #2823: fix to inconsistent task instance state log message
mchalek commented on issue #2823: fix to inconsistent task instance state log message URL: https://github.com/apache/incubator-airflow/pull/2823#issuecomment-445024915 @ron819 it looks like someone else fixed it. i will close this PR.
[GitHub] mchalek closed pull request #2823: fix to inconsistent task instance state log message
mchalek closed pull request #2823: fix to inconsistent task instance state log message URL: https://github.com/apache/incubator-airflow/pull/2823 This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance:

diff --git a/airflow/jobs.py b/airflow/jobs.py
index 868e785c8c..230c418cf4 100644
--- a/airflow/jobs.py
+++ b/airflow/jobs.py
@@ -1421,7 +1421,7 @@ def _process_executor_events(self, simple_dag_bag, session=None):
             if ti.state == State.QUEUED:
                 msg = ("Executor reports task instance %s finished (%s) "
                        "although the task says its %s. Was the task "
-                       "killed externally?".format(ti, state, ti.state))
+                       "killed externally?" % (ti, state, ti.state))
                 self.log.error(msg)
                 try:
                     simple_dag = simple_dag_bag.get_dag(dag_id)
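The one-line change in this PR fixes a classic mix-up: the log message uses %-style placeholders but was being filled with `str.format()`. A minimal reproduction, with made-up values standing in for the real task instance objects:

```python
ti, state, ti_state = "task_a", "success", "queued"

# Buggy: the template holds %-style placeholders, but .format() only
# substitutes {}-style fields, so the %s markers survive untouched.
buggy = ("Executor reports task instance %s finished (%s) "
         "although the task says its %s.".format(ti, state, ti_state))

# Fixed: %-style placeholders must be filled with the % operator.
fixed = ("Executor reports task instance %s finished (%s) "
         "although the task says its %s." % (ti, state, ti_state))

print(buggy)  # the %s placeholders are still literal
print(fixed)  # "Executor reports task instance task_a finished (success) ..."
```

Note that `.format()` with extra positional arguments and no `{}` fields raises no error, which is why the original bug went unnoticed: the message was merely logged with unfilled placeholders.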
[jira] [Resolved] (AIRFLOW-3408) Systemd setup instructions mention deprecated variable
[ https://issues.apache.org/jira/browse/AIRFLOW-3408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kaxil Naik resolved AIRFLOW-3408. Resolution: Fixed
> Systemd setup instructions mention deprecated variable
> ------------------------------------------------------
>
> Key: AIRFLOW-3408
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3408
> Project: Apache Airflow
> Issue Type: Improvement
> Components: Documentation
> Reporter: Victor Villas Bôas Chaves
> Assignee: Kaxil Naik
> Priority: Minor
> Fix For: 1.10.2
>
> AIRFLOW-1698 was solved in code, but the documentation drifted.
> Places where {{SCHEDULER_RUNS}} is still mentioned as necessary adjustment:
> [https://github.com/apache/incubator-airflow/blob/53b89b98371c7bb993b242c341d3941e9ce09f9a/scripts/systemd/README]
> [https://github.com/apache/incubator-airflow/blob/b9fc03ea1ad5cea3c3aa668fcaca103f84167b9c/docs/howto/run-with-systemd.rst]
[GitHub] kaxil commented on issue #4289: [AIRFLOW-XXX] Fix Minor issues with Azure Cosmos Operator
kaxil commented on issue #4289: [AIRFLOW-XXX] Fix Minor issues with Azure Cosmos Operator URL: https://github.com/apache/incubator-airflow/pull/4289#issuecomment-445023414 @feng-tao Hmm.. that is strange. Raise a ticket with Apache Infra to get permission to resolve issues.
[GitHub] feng-tao commented on issue #4289: [AIRFLOW-XXX] Fix Minor issues with Azure Cosmos Operator
feng-tao commented on issue #4289: [AIRFLOW-XXX] Fix Minor issues with Azure Cosmos Operator URL: https://github.com/apache/incubator-airflow/pull/4289#issuecomment-445022613 (screenshot: https://user-images.githubusercontent.com/3223098/49610765-58f7bd00-f954-11e8-975a-0b5cd14cd28b.png) This is what I see for a jira filed/assigned by others.
[GitHub] kaxil commented on issue #4289: [AIRFLOW-XXX] Fix Minor issues with Azure Cosmos Operator
kaxil commented on issue #4289: [AIRFLOW-XXX] Fix Minor issues with Azure Cosmos Operator URL: https://github.com/apache/incubator-airflow/pull/4289#issuecomment-445021717 @feng-tao You should have the permission. Can you give it a try on https://issues.apache.org/jira/browse/AIRFLOW-3408 . Click on **Resolve Issue**
[GitHub] codecov-io edited a comment on issue #4083: [AIRFLOW-3211] Reattach to GCP Dataproc jobs upon Airflow restart
codecov-io edited a comment on issue #4083: [AIRFLOW-3211] Reattach to GCP Dataproc jobs upon Airflow restart URL: https://github.com/apache/incubator-airflow/pull/4083#issuecomment-432075536

# [Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4083) Report
> Merging [#4083](https://codecov.io/gh/apache/incubator-airflow/pull/4083) into [master](https://codecov.io/gh/apache/incubator-airflow/commit/be6d35d3a415c5e06b55c830b25b57253dab0634) will **decrease** coverage by `6.08%`.
> The diff coverage is `n/a`.

```diff
@@            Coverage Diff             @@
##           master    #4083      +/-   ##
==========================================
- Coverage   77.82%   71.73%     -6.09%
==========================================
  Files         201      201
  Lines       16366    18389      +2023
==========================================
+ Hits        12737    13192       +455
- Misses       3629     5197      +1568
```

| Impacted Files | Coverage Δ | |
|---|---|---|
| airflow/executors/celery_executor.py | `41.36% <0%> (-39.26%)` | :arrow_down: |
| airflow/jobs.py | `54.01% <0%> (-23.36%)` | :arrow_down: |
| airflow/executors/base_executor.py | `71.42% <0%> (-22.33%)` | :arrow_down: |
| airflow/models.py | `75.6% <0%> (-16.69%)` | :arrow_down: |
| airflow/utils/dag_processing.py | `43.69% <0%> (-14.17%)` | :arrow_down: |
| airflow/utils/timeout.py | `70.58% <0%> (-7.19%)` | :arrow_down: |
| airflow/utils/db.py | `29.33% <0%> (-4.27%)` | :arrow_down: |
| airflow/logging_config.py | `97.56% <0%> (+0.06%)` | :arrow_up: |

[Continue to review the full report at Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4083). Last update be6d35d...c3c019e.
[GitHub] feng-tao commented on issue #4289: [AIRFLOW-XXX] Fix Minor issues with Azure Cosmos Operator
feng-tao commented on issue #4289: [AIRFLOW-XXX] Fix Minor issues with Azure Cosmos Operator URL: https://github.com/apache/incubator-airflow/pull/4289#issuecomment-445020858 thanks @kaxil, cc @tmiller-msft. btw, I can't close others' jiras with my current account; do I need to file any specific request to Apache for that?
[GitHub] kaxil commented on issue #4265: [AIRFLOW-3406] Implement an Azure CosmosDB operator
kaxil commented on issue #4265: [AIRFLOW-3406] Implement an Azure CosmosDB operator URL: https://github.com/apache/incubator-airflow/pull/4265#issuecomment-445020800 I have created a new Merge Request to address some of the minor changes related to this PR in https://github.com/apache/incubator-airflow/pull/4289/files
[GitHub] kaxil opened a new pull request #4289: [AIRFLOW-XXX] Fix Minor issues with Azure Cosmos Operator
kaxil opened a new pull request #4289: [AIRFLOW-XXX] Fix Minor issues with Azure Cosmos Operator URL: https://github.com/apache/incubator-airflow/pull/4289 Make sure you have checked _all_ steps below. ### Jira - [x] My PR addresses the following [Airflow Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR" - https://issues.apache.org/jira/browse/AIRFLOW-XXX - In case you are fixing a typo in the documentation you can prepend your commit with \[AIRFLOW-XXX\], code changes always need a Jira issue. ### Description - [x] Here are some details about my PR, including screenshots of any UI changes: - Fixed Documentation in integration.rst - Fixed Incorrect type in docstring of `AzureCosmosInsertDocumentOperator` - Added the Hook, Sensor and Operator in code.rst - Updated the name of example DAG and its filename to follow the convention ### Tests - [x] My PR adds the following unit tests __OR__ does not need testing for this extremely good reason: ### Commits - [x] My commits all reference Jira issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)": 1. Subject is separated from body by a blank line 1. Subject is limited to 50 characters (not including Jira issue reference) 1. Subject does not end with a period 1. Subject uses the imperative mood ("add", not "adding") 1. Body wraps at 72 characters 1. Body explains "what" and "why", not "how" ### Documentation - [x] In case of new functionality, my PR adds documentation that describes how to use it. - When adding new operators/hooks/sensors, the autoclass documentation generation needs to be added. 
- All the public functions and the classes in the PR contain docstrings that explain what it does ### Code Quality - [x] Passes `flake8`
[jira] [Commented] (AIRFLOW-3418) Task stuck in running state, unable to clear
[ https://issues.apache.org/jira/browse/AIRFLOW-3418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711972#comment-16711972 ] Gabriel Silk commented on AIRFLOW-3418: --- I'm seeing this issue as well, and I would re-iterate the criticality of this issue. It's currently breaking our production clusters. > Task stuck in running state, unable to clear > > > Key: AIRFLOW-3418 > URL: https://issues.apache.org/jira/browse/AIRFLOW-3418 > Project: Apache Airflow > Issue Type: Bug > Components: worker >Affects Versions: 1.10.1 >Reporter: James Meickle >Priority: Critical > > One of our tasks (a custom operator that sleep-waits until NYSE market close) > got stuck in a "running" state in the metadata db without making any > progress. This is what it looked like in the logs: > {code:java} > [2018-11-29 00:01:14,064] {{base_task_runner.py:101}} INFO - Job 38275: > Subtask after_close [2018-11-29 00:01:14,063] {{cli.py:484}} INFO - Running > [running]> on host airflow-core-i-0a53cac37067d957d.dlg.fnd.dynoquant.com > [2018-11-29 06:03:57,643] {{models.py:1355}} INFO - Dependencies not met for > [running]>, dependency 'Task Instance State' FAILED: Task is in the 'running' > state which is not a valid state for execution. The task must be cleared in > order to be run. > [2018-11-29 06:03:57,644] {{models.py:1355}} INFO - Dependencies not met for > [running]>, dependency 'Task Instance Not Already Running' FAILED: Task is > already running, it started on 2018-11-29 00:01:10.876344+00:00. > [2018-11-29 06:03:57,646] {{logging_mixin.py:95}} INFO - [2018-11-29 > 06:03:57,646] {{jobs.py:2614}} INFO - Task is not able to be run > {code} > Seeing this state, we attempted to "clear" it in the web UI. 
This yielded a > complex backtrace: > {code:java} > Traceback (most recent call last): > File > "/home/airflow/virtualenvs/airflow/lib/python3.5/site-packages/flask/app.py", > line 1982, in wsgi_app > response = self.full_dispatch_request() > File > "/home/airflow/virtualenvs/airflow/lib/python3.5/site-packages/flask/app.py", > line 1614, in full_dispatch_request > rv = self.handle_user_exception(e) > File > "/home/airflow/virtualenvs/airflow/lib/python3.5/site-packages/flask/app.py", > line 1517, in handle_user_exception > reraise(exc_type, exc_value, tb) > File > "/home/airflow/virtualenvs/airflow/lib/python3.5/site-packages/flask/_compat.py", > line 33, in reraise > raise value > File > "/home/airflow/virtualenvs/airflow/lib/python3.5/site-packages/flask/app.py", > line 1612, in full_dispatch_request > rv = self.dispatch_request() > File > "/home/airflow/virtualenvs/airflow/lib/python3.5/site-packages/flask/app.py", > line 1598, in dispatch_request > return self.view_functions[rule.endpoint](**req.view_args) > File > "/home/airflow/virtualenvs/airflow/lib/python3.5/site-packages/flask_appbuilder/security/decorators.py", > line 26, in wraps > return f(self, *args, **kwargs) > File > "/home/airflow/virtualenvs/airflow/lib/python3.5/site-packages/airflow/www_rbac/decorators.py", > line 55, in wrapper > return f(*args, **kwargs) > File > "/home/airflow/virtualenvs/airflow/lib/python3.5/site-packages/airflow/www_rbac/views.py", > line 837, in clear > include_upstream=upstream) > File > "/home/airflow/virtualenvs/airflow/lib/python3.5/site-packages/airflow/models.py", > line 4011, in sub_dag > dag = copy.deepcopy(self) > File "/home/airflow/virtualenvs/airflow/lib/python3.5/copy.py", line 166, > in deepcopy > y = copier(memo) > File > "/home/airflow/virtualenvs/airflow/lib/python3.5/site-packages/airflow/models.py", > line 3996, in __deepcopy__ > setattr(result, k, copy.deepcopy(v, memo)) > File "/home/airflow/virtualenvs/airflow/lib/python3.5/copy.py", line 155, > in 
deepcopy > y = copier(x, memo) > File "/home/airflow/virtualenvs/airflow/lib/python3.5/copy.py", line 243, > in _deepcopy_dict > y[deepcopy(key, memo)] = deepcopy(value, memo) > File "/home/airflow/virtualenvs/airflow/lib/python3.5/copy.py", line 166, > in deepcopy > y = copier(memo) > File > "/home/airflow/virtualenvs/airflow/lib/python3.5/site-packages/airflow/models.py", > line 2740, in __deepcopy__ > setattr(result, k, copy.deepcopy(v, memo)) > File "/home/airflow/virtualenvs/airflow/lib/python3.5/copy.py", line 182, > in deepcopy > y = _reconstruct(x, rv, 1, memo) > File "/home/airflow/virtualenvs/airflow/lib/python3.5/copy.py", line 297, > in _reconstruct > state = deepcopy(state, memo) > File "/home/airflow/virtualenvs/airflow/lib/python3.5/copy.py", line 155, > in deepcopy > y = copier(x, memo) > File "/home/airflow/virtualenvs/airflow/lib/python3.5/copy.py", line 243, > in _deepcopy_dict >
[jira] [Created] (AIRFLOW-3481) Add SFTP conn type
cixuuz created AIRFLOW-3481:
---
Summary: Add SFTP conn type
Key: AIRFLOW-3481
URL: https://issues.apache.org/jira/browse/AIRFLOW-3481
Project: Apache Airflow
Issue Type: Improvement
Components: configuration
Affects Versions: 1.10.1
Reporter: cixuuz
Assignee: cixuuz

Add a connection type `SFTP` to distinguish it from FTP, because FTPHook and SFTPHook have different implementations. The use case is a factory method that chooses a hook based on the conn type.
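The factory-method use case from the ticket might look like the sketch below. The hook classes are local stand-ins for Airflow's `FTPHook`/`SFTPHook`, and the registry and `hook_for` helper are illustrative names, not Airflow API — the point is only that a distinct `sftp` conn type makes the lookup unambiguous:

```python
class FTPHook:     # stand-in for airflow.contrib.hooks.ftp_hook.FTPHook
    conn_type = "ftp"

class SFTPHook:    # stand-in for airflow.contrib.hooks.sftp_hook.SFTPHook
    conn_type = "sftp"

# Registry keyed by conn type; with a dedicated "sftp" type, FTP and
# SFTP connections dispatch to different hook implementations.
HOOKS = {hook.conn_type: hook for hook in (FTPHook, SFTPHook)}

def hook_for(conn_type):
    """Return the hook class registered for a connection type."""
    try:
        return HOOKS[conn_type.lower()]
    except KeyError:
        raise ValueError("no hook registered for conn_type %r" % conn_type)
```

Without a separate conn type, both kinds of connection would carry `ftp` and the factory could not tell which implementation to instantiate.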
[GitHub] kaxil edited a comment on issue #4265: [AIRFLOW-3406] Implement an Azure CosmosDB operator
kaxil edited a comment on issue #4265: [AIRFLOW-3406] Implement an Azure CosmosDB operator URL: https://github.com/apache/incubator-airflow/pull/4265#issuecomment-445011063 @feng-tao - Remember to close the Jira issues as well; since we moved to Gitbox, we need to manually close the Jira. I have closed this one :)
[GitHub] jj-ian commented on a change in pull request #4083: [AIRFLOW-3211] Reattach to GCP Dataproc jobs upon Airflow restart
jj-ian commented on a change in pull request #4083: [AIRFLOW-3211] Reattach to GCP Dataproc jobs upon Airflow restart URL: https://github.com/apache/incubator-airflow/pull/4083#discussion_r239595197 ## File path: airflow/contrib/hooks/gcp_dataproc_hook.py ## @@ -33,12 +33,82 @@ def __init__(self, dataproc_api, project_id, job, region='global', self.dataproc_api = dataproc_api self.project_id = project_id self.region = region + +# Check if the job to submit is already running on the cluster. +# If so, don't resubmit the job. +try: +cluster_name = job['job']['placement']['clusterName'] +except KeyError: +self.log.error('Job to submit is incorrectly configured.') +raise + +jobs_on_cluster_response = dataproc_api.projects().regions().jobs().list( +projectId=self.project_id, +region=self.region, +clusterName=cluster_name).execute() + +UUID_LENGTH = 9 Review comment: Hi @fenglu-g , I started working on the regex idea and I don't think it makes sense to have the user define the deduping logic. It seems more error prone to me to have the user define the job ID they want to use for deduping, since the job ID is not user defined, but is instead hardcoded by Airflow here as the task ID + first 8 characters of UUID: https://github.com/apache/incubator-airflow/blob/0e8394fd23d067b7e226c011bb1825ff734219c5/airflow/contrib/hooks/gcp_dataproc_hook.py#L94. Since Airflow sets the job ID, Airflow should also set the deduping logic. I understand your concern about not wanting to hardcode the deduping logic, though. A solution for this could be: instead of having Airflow hardcode the job ID as in the link above, to have it instead call a function that creates the job ID from the task ID. And then my deduping logic would reference an inverse of that function to extract the task ID for deduping instead of hardcoding the deduping logic. In other words, Airflow would create the job ID with a function that maps the task ID -> job ID. 
And then the dedupe mechanism would call a function that maps from job ID back to task ID to use in deduping. Tests will be added to make sure that these two functions are inverses of each other. How does that sound?
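The proposed pair of inverse mappings could be sketched like this. The shape of the suffix follows the hook code linked above (task ID plus the first 8 characters of a UUID, i.e. a 9-character `_xxxxxxxx` tail); the function names and the underscore separator are assumptions for illustration, not the actual Airflow implementation:

```python
import uuid

UUID_SUFFIX_LEN = 9  # "_" plus the first 8 characters of a UUID

def task_id_to_job_id(task_id):
    """Forward mapping: append a short random suffix for uniqueness."""
    return "{}_{}".format(task_id, str(uuid.uuid4())[:8])

def job_id_to_task_id(job_id):
    """Inverse mapping: strip the trailing "_xxxxxxxx" suffix, so any two
    submissions of the same task dedupe to the same key."""
    return job_id[:-UUID_SUFFIX_LEN]
```

Keeping both directions next to each other makes the promised round-trip test trivial: for any task ID, `job_id_to_task_id(task_id_to_job_id(t)) == t`.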
[jira] [Resolved] (AIRFLOW-3406) Implement an Azure CosmosDB operator
[ https://issues.apache.org/jira/browse/AIRFLOW-3406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kaxil Naik resolved AIRFLOW-3406. Resolution: Fixed Fix Version/s: 1.10.2 Resolved by https://github.com/apache/incubator-airflow/pull/4265
> Implement an Azure CosmosDB operator
> ------------------------------------
>
> Key: AIRFLOW-3406
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3406
> Project: Apache Airflow
> Issue Type: New Feature
> Reporter: Tom Miller
> Assignee: Tom Miller
> Priority: Major
> Fix For: 1.10.2
[GitHub] kaxil commented on issue #4265: [AIRFLOW-3406] Implement an Azure CosmosDB operator
kaxil commented on issue #4265: [AIRFLOW-3406] Implement an Azure CosmosDB operator URL: https://github.com/apache/incubator-airflow/pull/4265#issuecomment-445011063 @feng-tao - Remember to close the Jira issues as well; since we moved to Gitbox, we need to manually close the Jira.
[jira] [Commented] (AIRFLOW-3452) Cannot view dags at /home page
[ https://issues.apache.org/jira/browse/AIRFLOW-3452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711949#comment-16711949 ] Jinhui Zhang commented on AIRFLOW-3452: --- I managed to get it working again by replacing the frontend code ({{static}} and {{templates}} folders) with the code at the 1.10.1 version, and changing the {{task_runner}} in {{airflow.cfg}} from {{BashTaskRunner}} to {{StandardTaskRunner}}.
> Cannot view dags at /home page
> ------------------------------
>
> Key: AIRFLOW-3452
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3452
> Project: Apache Airflow
> Issue Type: Bug
> Affects Versions: 2.0.0
> Reporter: Jinhui Zhang
> Priority: Blocker
>
> I checked out the latest master branch (commit [9dce1f0|https://github.com/apache/incubator-airflow/commit/9dce1f0740f69af0ee86709a1a34a002b245aa3e]) and restarted my Airflow webserver. But I cannot view any dag at the home page. I inspected the frontend code and found there's a {{style="display:none;"}} on the {{main-content}}, and the source code says so at https://github.com/apache/incubator-airflow/blob/master/airflow/www_rbac/templates/airflow/dags.html#L31 . Is this a known issue? How should I fix it?
[GitHub] kaxil commented on issue #4274: [AIRFLOW-3438] Fix default values in BigQuery Hook & BigQueryOperator
kaxil commented on issue #4274: [AIRFLOW-3438] Fix default values in BigQuery Hook & BigQueryOperator URL: https://github.com/apache/incubator-airflow/pull/4274#issuecomment-445007385 Can you please review this @fenglu-g ?? @xnuinside - Can you please review ??
[jira] [Commented] (AIRFLOW-2524) Airflow integration with AWS Sagemaker
[ https://issues.apache.org/jira/browse/AIRFLOW-2524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711947#comment-16711947 ] ASF GitHub Bot commented on AIRFLOW-2524: - kaxil closed pull request #4278: [AIRFLOW-2524] Add SageMaker doc to AWS integration section URL: https://github.com/apache/incubator-airflow/pull/4278 This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance: As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic): diff --git a/docs/integration.rst b/docs/integration.rst index 7387fc25f4..be7e95bbea 100644 --- a/docs/integration.rst +++ b/docs/integration.rst @@ -398,6 +398,70 @@ AwsFirehoseHook .. autoclass:: airflow.contrib.hooks.aws_firehose_hook.AwsFirehoseHook +Amazon SageMaker + + +For more instructions on using Amazon SageMaker in Airflow, please see `the SageMaker Python SDK README`_. + +.. _the SageMaker Python SDK README: https://github.com/aws/sagemaker-python-sdk/blob/master/src/sagemaker/workflow/README.rst + +- :ref:`SageMakerHook` : Interact with Amazon SageMaker. +- :ref:`SageMakerTrainingOperator` : Create a SageMaker training job. +- :ref:`SageMakerTuningOperator` : Create a SageMaker tuning job. +- :ref:`SageMakerModelOperator` : Create a SageMaker model. +- :ref:`SageMakerTransformOperator` : Create a SageMaker transform job. +- :ref:`SageMakerEndpointConfigOperator` : Create a SageMaker endpoint config. +- :ref:`SageMakerEndpointOperator` : Create a SageMaker endpoint. + +.. _SageMakerHook: + +SageMakerHook +" + +.. autoclass:: airflow.contrib.hooks.sagemaker_hook.SageMakerHook + +.. _SageMakerTrainingOperator: + +SageMakerTrainingOperator +" + +.. autoclass:: airflow.contrib.operators.sagemaker_training_operator.SageMakerTrainingOperator + +.. _SageMakerTuningOperator: + +SageMakerTuningOperator +""" + +.. 
autoclass:: airflow.contrib.operators.sagemaker_tuning_operator.SageMakerTuningOperator + +.. _SageMakerModelOperator: + +SageMakerModelOperator +"" + +.. autoclass:: airflow.contrib.operators.sagemaker_model_operator.SageMakerModelOperator + +.. _SageMakerTransformOperator: + +SageMakerTransformOperator +"" + +.. autoclass:: airflow.contrib.operators.sagemaker_transform_operator.SageMakerTransformOperator + +.. _SageMakerEndpointConfigOperator: + +SageMakerEndpointConfigOperator +""" + +.. autoclass:: airflow.contrib.operators.sagemaker_endpoint_config_operator.SageMakerEndpointConfigOperator + +.. _SageMakerEndpointOperator: + +SageMakerEndpointOperator +" + +.. autoclass:: airflow.contrib.operators.sagemaker_endpoint_operator.SageMakerEndpointOperator + .. _Databricks: Databricks This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Airflow integration with AWS Sagemaker > -- > > Key: AIRFLOW-2524 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2524 > Project: Apache Airflow > Issue Type: Improvement > Components: aws, contrib >Reporter: Rajeev Srinivasan >Assignee: Yang Yu >Priority: Major > Labels: AWS > Fix For: 1.10.1 > > Time Spent: 10m > Remaining Estimate: 0h > > Would it be possible to orchestrate an end to end AWS Sagemaker job using > Airflow. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] kaxil closed pull request #4278: [AIRFLOW-2524] Add SageMaker doc to AWS integration section
kaxil closed pull request #4278: [AIRFLOW-2524] Add SageMaker doc to AWS integration section URL: https://github.com/apache/incubator-airflow/pull/4278

This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance. As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic):

diff --git a/docs/integration.rst b/docs/integration.rst
index 7387fc25f4..be7e95bbea 100644
--- a/docs/integration.rst
+++ b/docs/integration.rst
@@ -398,6 +398,70 @@ AwsFirehoseHook
 .. autoclass:: airflow.contrib.hooks.aws_firehose_hook.AwsFirehoseHook
 
+Amazon SageMaker
+''''''''''''''''
+
+For more instructions on using Amazon SageMaker in Airflow, please see `the SageMaker Python SDK README`_.
+
+.. _the SageMaker Python SDK README: https://github.com/aws/sagemaker-python-sdk/blob/master/src/sagemaker/workflow/README.rst
+
+- :ref:`SageMakerHook` : Interact with Amazon SageMaker.
+- :ref:`SageMakerTrainingOperator` : Create a SageMaker training job.
+- :ref:`SageMakerTuningOperator` : Create a SageMaker tuning job.
+- :ref:`SageMakerModelOperator` : Create a SageMaker model.
+- :ref:`SageMakerTransformOperator` : Create a SageMaker transform job.
+- :ref:`SageMakerEndpointConfigOperator` : Create a SageMaker endpoint config.
+- :ref:`SageMakerEndpointOperator` : Create a SageMaker endpoint.
+
+.. _SageMakerHook:
+
+SageMakerHook
+"""""""""""""
+
+.. autoclass:: airflow.contrib.hooks.sagemaker_hook.SageMakerHook
+
+.. _SageMakerTrainingOperator:
+
+SageMakerTrainingOperator
+"""""""""""""""""""""""""
+
+.. autoclass:: airflow.contrib.operators.sagemaker_training_operator.SageMakerTrainingOperator
+
+.. _SageMakerTuningOperator:
+
+SageMakerTuningOperator
+"""""""""""""""""""""""
+
+.. autoclass:: airflow.contrib.operators.sagemaker_tuning_operator.SageMakerTuningOperator
+
+.. _SageMakerModelOperator:
+
+SageMakerModelOperator
+""""""""""""""""""""""
+
+.. autoclass:: airflow.contrib.operators.sagemaker_model_operator.SageMakerModelOperator
+
+.. _SageMakerTransformOperator:
+
+SageMakerTransformOperator
+""""""""""""""""""""""""""
+
+.. autoclass:: airflow.contrib.operators.sagemaker_transform_operator.SageMakerTransformOperator
+
+.. _SageMakerEndpointConfigOperator:
+
+SageMakerEndpointConfigOperator
+"""""""""""""""""""""""""""""""
+
+.. autoclass:: airflow.contrib.operators.sagemaker_endpoint_config_operator.SageMakerEndpointConfigOperator
+
+.. _SageMakerEndpointOperator:
+
+SageMakerEndpointOperator
+"""""""""""""""""""""""""
+
+.. autoclass:: airflow.contrib.operators.sagemaker_endpoint_operator.SageMakerEndpointOperator
+
 .. _Databricks:
 
 Databricks

This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
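The operators documented in this diff all consume SageMaker job-description dicts. As a rough, standalone illustration (not taken from the PR; field names follow the public SageMaker CreateTrainingJob API, and the bucket, role, and image values are hypothetical placeholders), a training-job config for something like SageMakerTrainingOperator might be assembled as:

```python
# Sketch of a CreateTrainingJob-style config dict; values are placeholders.
def make_training_config(job_name, image, role_arn, input_s3, output_s3):
    """Build a minimal SageMaker training-job config dict."""
    return {
        'TrainingJobName': job_name,
        'AlgorithmSpecification': {
            'TrainingImage': image,
            'TrainingInputMode': 'File',
        },
        'RoleArn': role_arn,
        'InputDataConfig': [{
            'ChannelName': 'train',
            'DataSource': {
                'S3DataSource': {
                    'S3DataType': 'S3Prefix',
                    'S3Uri': input_s3,
                    'S3DataDistributionType': 'FullyReplicated',
                },
            },
        }],
        'OutputDataConfig': {'S3OutputPath': output_s3},
        'ResourceConfig': {
            'InstanceType': 'ml.m4.xlarge',
            'InstanceCount': 1,
            'VolumeSizeInGB': 30,
        },
        'StoppingCondition': {'MaxRuntimeInSeconds': 3600},
    }


# Hypothetical values, purely for illustration.
config = make_training_config(
    'example-job',
    '123456789012.dkr.ecr.us-east-1.amazonaws.com/algo:latest',
    'arn:aws:iam::123456789012:role/SageMakerRole',
    's3://example-bucket/train/',
    's3://example-bucket/output/')
print(sorted(config))
```

Such a dict would typically be passed as the operator's `config` argument; consult the autoclass docs above for the actual operator signatures.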
[jira] [Commented] (AIRFLOW-3282) Implement an Azure Kubernetes Service Operator
[ https://issues.apache.org/jira/browse/AIRFLOW-3282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711845#comment-16711845 ]

ASF GitHub Bot commented on AIRFLOW-3282:
-----------------------------------------

apraovjr closed pull request #4288: [AIRFLOW-3282] Implement an Azure Kubernetes Service Operator URL: https://github.com/apache/incubator-airflow/pull/4288

This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance. As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic):

diff --git a/airflow/contrib/example_dags/example_azure_kubernetes_container_operator.py b/airflow/contrib/example_dags/example_azure_kubernetes_container_operator.py
new file mode 100644
index 00..79fa5c5c16
--- /dev/null
+++ b/airflow/contrib/example_dags/example_azure_kubernetes_container_operator.py
@@ -0,0 +1,52 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from airflow import DAG
+from airflow.contrib.operators.aks_operator import AzureKubernetesOperator
+from datetime import datetime, timedelta
+
+seven_days_ago = datetime.combine(datetime.today() - timedelta(7),
+                                  datetime.min.time())
+default_args = {
+    'owner': 'airflow',
+    'depends_on_past': False,
+    'start_date': seven_days_ago,
+    'email': ['em...@microsoft.com'],
+    'email_on_failure': False,
+    'email_on_retry': False,
+    'retries': 1,
+    'retry_delay': timedelta(minutes=5),
+}
+
+dag = DAG(
+    dag_id='aks_container',
+    default_args=default_args,
+    schedule_interval=None,
+)
+
+start_aks_container = AzureKubernetesOperator(
+    task_id="start_aks_container",
+    ci_conn_id='azure_kubernetes_default',
+    resource_group="apraotest1",
+    name="akres1",
+    ssh_key_value=None,
+    dns_name_prefix=None,
+    location="eastus",
+    tags=None,
+    dag=dag)

diff --git a/airflow/contrib/hooks/azure_kubernetes_hook.py b/airflow/contrib/hooks/azure_kubernetes_hook.py
new file mode 100644
index 00..dfd9200f64
--- /dev/null
+++ b/airflow/contrib/hooks/azure_kubernetes_hook.py
@@ -0,0 +1,75 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+import os
+
+from airflow.hooks.base_hook import BaseHook
+from airflow.exceptions import AirflowException
+
+from azure.common.credentials import ServicePrincipalCredentials
+from azure.mgmt.containerservice import ContainerServiceClient
+from azure.mgmt.resource import ResourceManagementClient
+from airflow.contrib.utils.aks_utils import load_json
+
+
+class AzureKubernetesServiceHook(BaseHook):
+
+    def __init__(self, conn_id=None):
+        self.conn_id = conn_id
+        self.connection = self.get_conn()
+        self.configData = None
+        self.credentials = None
+        self.subscription_id = None
+        self.clientId = None
+        self.clientSecret = None
+
+    def get_conn(self):
+        if self.conn_id:
+            conn = self.get_connection(self.conn_id)
+            key_path = conn.extra_dejson.get('key_path', False)
+            if key_path:
+                if key_path.endswith('.json'):
+                    self.log.info('Getting connection using a JSON key file.')
+
+                    self.configData = load_json(self, key_path)
+                else:
+                    raise AirflowException('Unrecognised extension for key file.')
+
+        if os.environ.get('AZURE_AUTH_LOCATION'):
+            key_path = os.environ.get('AZURE_AUTH_LOCATION')
+            if key_path.endswith('.json'):
+                self.log.info('Getting
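The hook's credential lookup above — take a `key_path` from the connection extras, fall back to the `AZURE_AUTH_LOCATION` environment variable, and only accept `.json` files — can be sketched standalone without Airflow installed. This is a simplified illustration of that dispatch, not the PR's actual code; `resolve_key_path` and `load_key_file` are hypothetical names (the PR uses a `load_json` helper):

```python
import json
import os


def resolve_key_path(extras, environ=os.environ):
    """Return the key-file path from extras, falling back to the env var."""
    key_path = extras.get('key_path')
    if not key_path:
        key_path = environ.get('AZURE_AUTH_LOCATION')
    # Mirror the hook's check: anything other than a .json file is rejected.
    if key_path and not key_path.endswith('.json'):
        raise ValueError('Unrecognised extension for key file.')
    return key_path


def load_key_file(key_path):
    """Load the service-principal JSON once the path is resolved."""
    with open(key_path) as f:
        return json.load(f)


print(resolve_key_path({'key_path': 'creds.json'}))               # creds.json
print(resolve_key_path({}, {'AZURE_AUTH_LOCATION': 'sp.json'}))   # sp.json
```

The design point being mirrored: explicit connection configuration wins over the process environment, and the file-extension check happens before any I/O.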
[jira] [Commented] (AIRFLOW-3282) Implement an Azure Kubernetes Service Operator
[ https://issues.apache.org/jira/browse/AIRFLOW-3282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711851#comment-16711851 ]

ASF GitHub Bot commented on AIRFLOW-3282:
-----------------------------------------

apraovjr opened a new pull request #4288: [AIRFLOW-3282] Implement an Azure Kubernetes Service Operator URL: https://github.com/apache/incubator-airflow/pull/4288

Add an operator to spin up Azure Kubernetes Service. Azure Kubernetes Service is used to deploy a managed Kubernetes cluster in Azure. The operator supports creating different AKS clusters: it checks whether a cluster already exists and, if not, creates one.

### Jira

- [x] My PR addresses the following [Airflow Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
  - https://issues.apache.org/jira/browse/AIRFLOW-3282
  - In case you are fixing a typo in the documentation you can prepend your commit with \[AIRFLOW-XXX\], code changes always need a Jira issue.

### Description

- [x] Here are some details about my PR, including screenshots of any UI changes:

### Tests

- [x] My PR adds the following unit tests __OR__ does not need testing for this extremely good reason:

### Commits

- [x] My commits all reference Jira issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)":
  1. Subject is separated from body by a blank line
  1. Subject is limited to 50 characters (not including Jira issue reference)
  1. Subject does not end with a period
  1. Subject uses the imperative mood ("add", not "adding")
  1. Body wraps at 72 characters
  1. Body explains "what" and "why", not "how"

### Documentation

- [x] In case of new functionality, my PR adds documentation that describes how to use it.
  - When adding new operators/hooks/sensors, the autoclass documentation generation needs to be added.
  - All the public functions and the classes in the PR contain docstrings that explain what it does

### Code Quality

- [x] Passes `flake8`

> Implement an Azure Kubernetes Service Operator
> ----------------------------------------------
>
>                 Key: AIRFLOW-3282
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-3282
>             Project: Apache Airflow
>          Issue Type: New Feature
>            Reporter: Aparna
>            Assignee: Aparna
>           Priority: Major
>
> Add AKS Operator

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
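The commit-message guidelines in the checklist above are mechanical enough to express as a tiny checker. This is an unofficial, simplified sketch (for instance, it counts the whole subject line rather than excluding the Jira issue reference, and it does not attempt to detect imperative mood):

```python
def check_commit_message(message):
    """Return a list of guideline violations for a commit message."""
    lines = message.splitlines()
    problems = []
    subject = lines[0] if lines else ''
    # Guideline: subject limited to 50 characters (simplified: whole subject).
    if len(subject) > 50:
        problems.append('subject longer than 50 characters')
    # Guideline: subject does not end with a period.
    if subject.endswith('.'):
        problems.append('subject ends with a period')
    # Guideline: subject separated from body by a blank line.
    if len(lines) > 1 and lines[1].strip():
        problems.append('subject not separated from body by a blank line')
    # Guideline: body wraps at 72 characters.
    for body_line in lines[2:]:
        if len(body_line) > 72:
            problems.append('body line longer than 72 characters')
            break
    return problems


good = '[AIRFLOW-3282] Add AKS operator\n\nExplain what and why.'
print(check_commit_message(good))                  # []
print(check_commit_message('Adds an operator.'))   # ['subject ends with a period']
```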
[jira] [Commented] (AIRFLOW-3406) Implement an Azure CosmosDB operator
[ https://issues.apache.org/jira/browse/AIRFLOW-3406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711822#comment-16711822 ]

ASF GitHub Bot commented on AIRFLOW-3406:
-----------------------------------------

feng-tao closed pull request #4265: [AIRFLOW-3406] Implement an Azure CosmosDB operator URL: https://github.com/apache/incubator-airflow/pull/4265

This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance. As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic):

diff --git a/airflow/contrib/example_dags/example_cosmosdb_sensor.py b/airflow/contrib/example_dags/example_cosmosdb_sensor.py
new file mode 100644
index 00..a801d9f41b
--- /dev/null
+++ b/airflow/contrib/example_dags/example_cosmosdb_sensor.py
@@ -0,0 +1,64 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""
+This is only an example DAG to highlight usage of AzureCosmosDocumentSensor to detect
+if a document now exists.
+
+You can trigger this manually with `airflow trigger_dag example_cosmosdb_sensor`.
+
+*Note: Make sure that connection `azure_cosmos_default` is properly set before running
+this example.*
+"""
+
+from airflow import DAG
+from airflow.contrib.sensors.azure_cosmos_sensor import AzureCosmosDocumentSensor
+from airflow.contrib.operators.azure_cosmos_insertdocument_operator import AzureCosmosInsertDocumentOperator
+from airflow.utils import dates
+
+default_args = {
+    'owner': 'airflow',
+    'depends_on_past': False,
+    'start_date': dates.days_ago(2),
+    'email': ['airf...@example.com'],
+    'email_on_failure': False,
+    'email_on_retry': False
+}
+
+dag = DAG('example_cosmosdb_sensor', default_args=default_args)
+
+dag.doc_md = __doc__
+
+t1 = AzureCosmosDocumentSensor(
+    task_id='check_cosmos_file',
+    database_name='airflow_example_db',
+    collection_name='airflow_example_coll',
+    document_id='airflow_checkid',
+    azure_cosmos_conn_id='azure_cosmos_default',
+    dag=dag)
+
+t2 = AzureCosmosInsertDocumentOperator(
+    task_id='insert_cosmos_file',
+    dag=dag,
+    database_name='airflow_example_db',
+    collection_name='new-collection',
+    document={"id": "someuniqueid", "param1": "value1", "param2": "value2"},
+    azure_cosmos_conn_id='azure_cosmos_default')
+
+t2.set_upstream(t1)

diff --git a/airflow/contrib/hooks/azure_cosmos_hook.py b/airflow/contrib/hooks/azure_cosmos_hook.py
new file mode 100644
index 00..01b4007b03
--- /dev/null
+++ b/airflow/contrib/hooks/azure_cosmos_hook.py
@@ -0,0 +1,287 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import azure.cosmos.cosmos_client as cosmos_client
+from azure.cosmos.errors import HTTPFailure
+import uuid
+
+from airflow.exceptions import AirflowBadRequest
+from airflow.hooks.base_hook import BaseHook
+
+
+class AzureCosmosDBHook(BaseHook):
+    """
+    Interacts with Azure CosmosDB.
+
+    login should be the endpoint uri, password should be the master key
+    optionally, you can use the following extras to default these values
+    {"database_name": "", "collection_name": "COLLECTION_NAME"}.
+
+    :param azure_cosmos_conn_id: Reference to the Azure CosmosDB connection.
+    :type azure_cosmos_conn_id: str
+    """
+
+    def __init__(self,
[GitHub] dimberman commented on issue #3770: [AIRFLOW-3281] Fix Kubernetes operator with git-sync
dimberman commented on issue #3770: [AIRFLOW-3281] Fix Kubernetes operator with git-sync URL: https://github.com/apache/incubator-airflow/pull/3770#issuecomment-444977780 @odracci would it be OK for @jzucker2 and me to make some PRs against this to get it mergeable? This is a pretty critical bug and we'd really like to merge it ASAP so it can go out in 1.10.2. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] feng-tao commented on issue #4265: [AIRFLOW-3406] Implement an Azure CosmosDB operator
feng-tao commented on issue #4265: [AIRFLOW-3406] Implement an Azure CosmosDB operator URL: https://github.com/apache/incubator-airflow/pull/4265#issuecomment-444975208 thanks @tmiller-msft, merged. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] feng-tao commented on a change in pull request #4287: [AIRFLOW-3479] Keeps records in Log Table when delete DAG & refine related tests
feng-tao commented on a change in pull request #4287: [AIRFLOW-3479] Keeps records in Log Table when delete DAG & refine related tests URL: https://github.com/apache/incubator-airflow/pull/4287#discussion_r239559570 ## File path: airflow/api/common/experimental/delete_dag.py ## @@ -41,6 +49,8 @@ def delete_dag(dag_id): # noinspection PyUnresolvedReferences,PyProtectedMember for m in models.Base._decl_class_registry.values(): if hasattr(m, "dag_id"): +if keep_records_in_log and m.__name__ == 'Log': Review comment: does it work when the log is in remote storage (like S3)? And will it have two different behaviours (API vs webserver) when deleting the log? This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] feng-tao closed pull request #4265: [AIRFLOW-3406] Implement an Azure CosmosDB operator
feng-tao closed pull request #4265: [AIRFLOW-3406] Implement an Azure CosmosDB operator URL: https://github.com/apache/incubator-airflow/pull/4265 This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance: As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic): diff --git a/airflow/contrib/example_dags/example_cosmosdb_sensor.py b/airflow/contrib/example_dags/example_cosmosdb_sensor.py new file mode 100644 index 00..a801d9f41b --- /dev/null +++ b/airflow/contrib/example_dags/example_cosmosdb_sensor.py @@ -0,0 +1,64 @@ +# -*- coding: utf-8 -*- +# +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +""" +This is only an example DAG to highlight usage of AzureCosmosDocumentSensor to detect +if a document now exists. + +You can trigger this manually with `airflow trigger_dag example_cosmosdb_sensor`. 
+ +*Note: Make sure that connection `azure_cosmos_default` is properly set before running +this example.* +""" + +from airflow import DAG +from airflow.contrib.sensors.azure_cosmos_sensor import AzureCosmosDocumentSensor +from airflow.contrib.operators.azure_cosmos_insertdocument_operator import AzureCosmosInsertDocumentOperator +from airflow.utils import dates + +default_args = { +'owner': 'airflow', +'depends_on_past': False, +'start_date': dates.days_ago(2), +'email': ['airf...@example.com'], +'email_on_failure': False, +'email_on_retry': False +} + +dag = DAG('example_cosmosdb_sensor', default_args=default_args) + +dag.doc_md = __doc__ + +t1 = AzureCosmosDocumentSensor( +task_id='check_cosmos_file', +database_name='airflow_example_db', +collection_name='airflow_example_coll', +document_id='airflow_checkid', +azure_cosmos_conn_id='azure_cosmos_default', +dag=dag) + +t2 = AzureCosmosInsertDocumentOperator( +task_id='insert_cosmos_file', +dag=dag, +database_name='airflow_example_db', +collection_name='new-collection', +document={"id": "someuniqueid", "param1": "value1", "param2": "value2"}, +azure_cosmos_conn_id='azure_cosmos_default') + +t2.set_upstream(t1) diff --git a/airflow/contrib/hooks/azure_cosmos_hook.py b/airflow/contrib/hooks/azure_cosmos_hook.py new file mode 100644 index 00..01b4007b03 --- /dev/null +++ b/airflow/contrib/hooks/azure_cosmos_hook.py @@ -0,0 +1,287 @@ +# -*- coding: utf-8 -*- +# +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. 
You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. +import azure.cosmos.cosmos_client as cosmos_client +from azure.cosmos.errors import HTTPFailure +import uuid + +from airflow.exceptions import AirflowBadRequest +from airflow.hooks.base_hook import BaseHook + + +class AzureCosmosDBHook(BaseHook): +""" +Interacts with Azure CosmosDB. + +login should be the endpoint uri, password should be the master key +optionally, you can use the following extras to default these values +{"database_name": "", "collection_name": "COLLECTION_NAME"}. + +:param azure_cosmos_conn_id: Reference to the Azure CosmosDB connection. +:type azure_cosmos_conn_id: str +""" + +def __init__(self, azure_cosmos_conn_id='azure_cosmos_default'): +self.conn_id = azure_cosmos_conn_id +self.connection = self.get_connection(self.conn_id) +self.extras = self.connection.extra_dejson + +self.endpoint_uri = self.connection.login +self.master_key =
[GitHub] feng-tao commented on a change in pull request #4287: [AIRFLOW-3479] Keeps records in Log Table when delete DAG & refine related tests
feng-tao commented on a change in pull request #4287: [AIRFLOW-3479] Keeps records in Log Table when delete DAG & refine related tests URL: https://github.com/apache/incubator-airflow/pull/4287#discussion_r239559208 ## File path: tests/api/common/experimental/test_delete_dag.py ## @@ -0,0 +1,141 @@ +# -*- coding: utf-8 -*- +# +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +import unittest + +from airflow import models +from airflow import settings +from airflow.api.common.experimental.delete_dag import delete_dag +from airflow.exceptions import DagNotFound, DagFileExists +from airflow.operators.dummy_operator import DummyOperator +from airflow.utils.dates import days_ago +from airflow.utils.state import State + +DM = models.DagModel +DS = models.DagStat +DR = models.DagRun +TI = models.TaskInstance +LOG = models.Log + + +class TestDeleteDAG_catch_error(unittest.TestCase): Review comment: please use capwords for class name(https://www.python.org/dev/peps/pep-0008/#class-names). This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
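The review above asks for PEP 8 CapWords class names. A minimal sketch of what the rename could look like — the new names below are suggestions for illustration, not necessarily the names the PR finally adopted:

```python
import unittest

# Hypothetical CapWords renames for the classes flagged in review:
#   TestDeleteDAG_catch_error       -> TestDeleteDagCatchError
#   TestDeleteDAG_successful_delete -> TestDeleteDagSuccessfulDelete
class TestDeleteDagCatchError(unittest.TestCase):
    pass


class TestDeleteDagSuccessfulDelete(unittest.TestCase):
    pass
```

PEP 8's convention is that class names use the CapWords style with no underscores, which also keeps test discovery output readable.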
[GitHub] feng-tao commented on a change in pull request #4287: [AIRFLOW-3479] Keeps records in Log Table when delete DAG & refine related tests
feng-tao commented on a change in pull request #4287: [AIRFLOW-3479] Keeps records in Log Table when delete DAG & refine related tests URL: https://github.com/apache/incubator-airflow/pull/4287#discussion_r239559313 ## File path: tests/api/common/experimental/test_delete_dag.py ## @@ -0,0 +1,141 @@ +# -*- coding: utf-8 -*- +# +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. 
+ +import unittest + +from airflow import models +from airflow import settings +from airflow.api.common.experimental.delete_dag import delete_dag +from airflow.exceptions import DagNotFound, DagFileExists +from airflow.operators.dummy_operator import DummyOperator +from airflow.utils.dates import days_ago +from airflow.utils.state import State + +DM = models.DagModel +DS = models.DagStat +DR = models.DagRun +TI = models.TaskInstance +LOG = models.Log + + +class TestDeleteDAG_catch_error(unittest.TestCase): + +def setUp(self): +self.session = settings.Session() +self.dagbag = models.DagBag(include_examples=True) +self.dag_id = 'example_bash_operator' +self.dag = self.dagbag.dags[self.dag_id] + +def tearDown(self): +self.dag.clear() +self.session.close() + +def test_delete_dag_non_existent_dag(self): +with self.assertRaises(DagNotFound): +delete_dag("non-existent DAG") + +def test_delete_dag_dag_still_in_dagbag(self): +models_to_check = ['DagModel', 'DagStat', 'DagRun', 'TaskInstance'] +record_counts = {} + +for model_name in models_to_check: +m = getattr(models, model_name) +record_counts[model_name] = self.session.query(m).filter(m.dag_id == self.dag_id).count() + +with self.assertRaises(DagFileExists): +delete_dag(self.dag_id) + +# No change should happen in DB +for model_name in models_to_check: +m = getattr(models, model_name) +self.assertEqual( +self.session.query(m).filter( +m.dag_id == self.dag_id +).count(), +record_counts[model_name] +) + + +class TestDeleteDAG_successful_delete(unittest.TestCase): Review comment: same This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] codecov-io commented on issue #4287: [AIRFLOW-3479] Keeps records in Log Table when delete DAG & refine related tests
codecov-io commented on issue #4287: [AIRFLOW-3479] Keeps records in Log Table when delete DAG & refine related tests URL: https://github.com/apache/incubator-airflow/pull/4287#issuecomment-444939775

# [Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4287?src=pr=h1) Report

> Merging [#4287](https://codecov.io/gh/apache/incubator-airflow/pull/4287?src=pr=desc) into [master](https://codecov.io/gh/apache/incubator-airflow/commit/d9f036ef4d52be3dd75e426fa1b58f05d7ada235?src=pr=desc) will **increase** coverage by `<.01%`.
> The diff coverage is `100%`.

[![Impacted file tree graph](https://codecov.io/gh/apache/incubator-airflow/pull/4287/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/4287?src=pr=tree)

```diff
@@            Coverage Diff            @@
##           master    #4287     +/-  ##
=========================================
+ Coverage   78.08%   78.09%   +<.01%
=========================================
  Files         201      201
  Lines       16458    16460       +2
=========================================
+ Hits        12851    12854       +3
+ Misses       3607     3606       -1
```

| [Impacted Files](https://codecov.io/gh/apache/incubator-airflow/pull/4287?src=pr=tree) | Coverage Δ | |
|---|---|---|
| [airflow/api/common/experimental/delete\_dag.py](https://codecov.io/gh/apache/incubator-airflow/pull/4287/diff?src=pr=tree#diff-YWlyZmxvdy9hcGkvY29tbW9uL2V4cGVyaW1lbnRhbC9kZWxldGVfZGFnLnB5) | `88% <100%> (+5.39%)` | :arrow_up: |

[Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4287?src=pr=continue).

> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
> `Δ = absolute (impact)`, `ø = not affected`, `? = missing data`

> Powered by [Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4287?src=pr=footer). Last update [d9f036e...008237f](https://codecov.io/gh/apache/incubator-airflow/pull/4287?src=pr=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] potiuk commented on issue #4284: [AIRFLOW-3433] Create Google Cloud Spanner Hook
potiuk commented on issue #4284: [AIRFLOW-3433] Create Google Cloud Spanner Hook URL: https://github.com/apache/incubator-airflow/pull/4284#issuecomment-444927988 Hello @ryanyuan. Are you sure you want to work on this in parallel to our work :)? We (with @sprzedwojski) are just about to get all the create/update/delete/query operators for Cloud Spanner databases and instances merged. Those are the three issues affected: - https://issues.apache.org/jira/browse/AIRFLOW-3310 - https://issues.apache.org/jira/browse/AIRFLOW-3398 - https://issues.apache.org/jira/browse/AIRFLOW-3480 The first one is already in review and the other two are just about to get PRs (we have the code merged and tested, including system tests with real Spanner instances). That also includes all operators, example DAGs and full documentation in docstrings. I guess you can take a look at our code and see if it also works for you? This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Created] (AIRFLOW-3480) Google Cloud Spanner Instance Database Deploy/Update/Delete
Jarek Potiuk created AIRFLOW-3480: - Summary: Google Cloud Spanner Instance Database Deploy/Update/Delete Key: AIRFLOW-3480 URL: https://issues.apache.org/jira/browse/AIRFLOW-3480 Project: Apache Airflow Issue Type: New Feature Components: gcp Reporter: Jarek Potiuk Assignee: Jarek Potiuk We need to have operators to implement database management operations: * InstanceDeploy (create the database if it does not exist; succeed if already created) * Update (run the update_ddl method changing the database structure) * Delete (delete the database) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] XD-DENG removed a comment on issue #4287: [AIRFLOW-3479] Keeps records in Log Table when delete DAG & refine related tests
XD-DENG removed a comment on issue #4287: [AIRFLOW-3479] Keeps records in Log Table when delete DAG & refine related tests URL: https://github.com/apache/incubator-airflow/pull/4287#issuecomment-444920576 1 test (Py35-sqlite) failed due to transient error. May you help re-trigger it? @ashb Thanks! This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] XD-DENG commented on issue #4287: [AIRFLOW-3479] Keeps records in Log Table when delete DAG & refine related tests
XD-DENG commented on issue #4287: [AIRFLOW-3479] Keeps records in Log Table when delete DAG & refine related tests URL: https://github.com/apache/incubator-airflow/pull/4287#issuecomment-444920576 1 test (Py35-sqlite) failed due to transient error. May you help re-trigger it? @ashb Thanks! This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] XD-DENG commented on issue #4287: [AIRFLOW-3479] Keeps records in Log Table when delete DAG & refine related tests
XD-DENG commented on issue #4287: [AIRFLOW-3479] Keeps records in Log Table when delete DAG & refine related tests URL: https://github.com/apache/incubator-airflow/pull/4287#issuecomment-444920407 1 test (Py35-sqlite) failed due to transient error. May you help re-trigger it? @ashb Thanks! This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] ron819 commented on issue #2700: [AIRFLOW-1236] Slack Operator uses deprecated API, and should use Connection
ron819 commented on issue #2700: [AIRFLOW-1236] Slack Operator uses deprecated API, and should use Connection URL: https://github.com/apache/incubator-airflow/pull/2700#issuecomment-444912360 @zfanswer any progress with this? This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Commented] (AIRFLOW-3369) Un-pausing a DAG with catchup =False creates an extra DAG run (1.10)
[ https://issues.apache.org/jira/browse/AIRFLOW-3369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711580#comment-16711580 ] Andrew Harmon commented on AIRFLOW-3369: fyi, tested in 1.10.2 and the issue still exists > Un-pausing a DAG with catchup=False creates an extra DAG run (1.10) > > > Key: AIRFLOW-3369 > URL: https://issues.apache.org/jira/browse/AIRFLOW-3369 > Project: Apache Airflow > Issue Type: Bug >Affects Versions: 1.10.0 >Reporter: Andrew Harmon >Priority: Major > Attachments: image.png > > > If you create a DAG with catchup=False, when it is un-paused, it creates 2 > dag runs. One for the most recent scheduled interval (expected) and one for > the interval before that (unexpected). > *Sample DAG* > {code:java} > from airflow import DAG > from datetime import datetime > from airflow.operators.dummy_operator import DummyOperator > dag = DAG( > dag_id='DummyTest', > start_date=datetime(2018,1,1), > catchup=False > ) > do = DummyOperator( > task_id='dummy_task', > dag=dag > ) > {code} > *Result:* > 2 DAG runs are created. 2018-11-18 and 2018-11-17 > *Expected Result:* > Only 1 DAG run should have been created (2018-11-18) > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
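The expected behaviour the reporter describes can be sketched in plain Python — this is an illustrative model of the scheduling rule, not Airflow's scheduler code:

```python
from datetime import datetime, timedelta

def latest_execution_date(now, interval=timedelta(days=1)):
    """With catchup=False, only the most recent *completed* interval should
    get a DagRun: floor `now` to the interval boundary, then step back one
    full interval (execution_date labels the start of the interval)."""
    epoch = datetime(1970, 1, 1)
    floored = epoch + ((now - epoch) // interval) * interval
    return floored - interval

# Un-pausing the sample DAG during 2018-11-19 should therefore create exactly
# one run, for execution_date 2018-11-18; the 2018-11-17 run the reporter
# observed is the extra, unexpected one.
```

This makes the bug concrete: the second run corresponds to stepping back one interval too many.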
[GitHub] ron819 commented on issue #4287: [AIRFLOW-3479] Keeps records in Log Table when delete DAG & refine related tests
ron819 commented on issue #4287: [AIRFLOW-3479] Keeps records in Log Table when delete DAG & refine related tests URL: https://github.com/apache/incubator-airflow/pull/4287#issuecomment-444904587 I can say that for my use case, when I delete a DAG I have no need for the Log table. It would be awesome if this could be done from the UI. Maybe instead of an alert message, change it to a decision message where the user can confirm or deny deleting the Log table, and it would set the flag accordingly. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] XD-DENG commented on issue #4287: [AIRFLOW-3479] Keeps records in Log Table when delete DAG & refine related tests
XD-DENG commented on issue #4287: [AIRFLOW-3479] Keeps records in Log Table when delete DAG & refine related tests URL: https://github.com/apache/incubator-airflow/pull/4287#issuecomment-444902340 Hi @ashb @kaxil , PTAL. Earlier I asked for your inputs for this on Slack. Cheers. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Commented] (AIRFLOW-3479) Records of "Log" table should be kept (by default) when users delete a DAG.
[ https://issues.apache.org/jira/browse/AIRFLOW-3479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711565#comment-16711565 ] ASF GitHub Bot commented on AIRFLOW-3479: - XD-DENG opened a new pull request #4287: [AIRFLOW-3479] Keeps records in Log Table when delete DAG & refine related tests URL: https://github.com/apache/incubator-airflow/pull/4287 ### Jira - https://issues.apache.org/jira/browse/AIRFLOW-3479 ### Description Problem This PR Tries to Address Currently when we delete a DAG (using API or from the UI), it will delete all related records in all tables (all tables in which "dag_id" is available), including the "log" table. However, the records in the "log" table should be kept (by default). This would be ideal for multiple reasons, like auditing. What This PR Does - **When we delete a DAG using the API**: provide one boolean parameter to let users decide if they want to keep records in the Log table when they delete a DAG. Default value is True (to keep records in the Log table). - **When we delete a DAG from the UI**: will keep records in the Log table when deleting records for a specific DAG ID (pop-up message is updated accordingly). ### Tests Test cases are added to cover: 1. Delete a DAG which doesn't exist. 1. Delete a DAG which is still in DagBag. 1. Successfully delete a DAG + keep records in Log table. 1. Successfully delete a DAG + delete records in Log table as well. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Records of "Log" table should be kept (by default) when users delete a DAG.
> --- > > Key: AIRFLOW-3479 > URL: https://issues.apache.org/jira/browse/AIRFLOW-3479 > Project: Apache Airflow > Issue Type: Improvement >Affects Versions: 1.10.1 >Reporter: Xiaodong DENG >Assignee: Xiaodong DENG >Priority: Minor > > Currently when we delete a DAG (using API or from the UI), it will delete all > related records in all tables (all tables in which "dag_id" is available), > including "log" table. > However, the records in "log" table should be kept (by default). This would > be ideal for multiple reasons, like auditing. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] XD-DENG opened a new pull request #4287: [AIRFLOW-3479] Keeps records in Log Table when delete DAG & refine related tests
XD-DENG opened a new pull request #4287: [AIRFLOW-3479] Keeps records in Log Table when delete DAG & refine related tests URL: https://github.com/apache/incubator-airflow/pull/4287 ### Jira - https://issues.apache.org/jira/browse/AIRFLOW-3479 ### Description Problem This PR Tries to Address Currently when we delete a DAG (using API or from the UI), it will delete all related records in all tables (all tables in which "dag_id" is available), including the "log" table. However, the records in the "log" table should be kept (by default). This would be ideal for multiple reasons, like auditing. What This PR Does - **When we delete a DAG using the API**: provide one boolean parameter to let users decide if they want to keep records in the Log table when they delete a DAG. Default value is True (to keep records in the Log table). - **When we delete a DAG from the UI**: will keep records in the Log table when deleting records for a specific DAG ID (pop-up message is updated accordingly). ### Tests Test cases are added to cover: 1. Delete a DAG which doesn't exist. 1. Delete a DAG which is still in DagBag. 1. Successfully delete a DAG + keep records in Log table. 1. Successfully delete a DAG + delete records in Log table as well. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
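The core change the PR describes — skipping the Log model while iterating models that carry a `dag_id` — can be sketched in plain Python. The real code walks SQLAlchemy models in `models.Base._decl_class_registry`; the dummy classes here just stand in for those models, while the `keep_records_in_log` parameter and the `m.__name__ == 'Log'` check follow the diff quoted in this thread:

```python
class Log:
    """Stand-in for airflow.models.Log."""
    dag_id = None

class TaskInstance:
    """Stand-in for airflow.models.TaskInstance."""
    dag_id = None

def models_to_delete(registry, keep_records_in_log=True):
    """Return the model classes whose rows would be deleted for a DAG."""
    targets = []
    for m in registry:
        if hasattr(m, "dag_id"):
            # Key change from the PR: skip the Log model by default,
            # so audit records survive DAG deletion.
            if keep_records_in_log and m.__name__ == 'Log':
                continue
            targets.append(m)
    return targets
```

Passing `keep_records_in_log=False` restores the old delete-everything behaviour.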
[jira] [Commented] (AIRFLOW-3449) Airflow DAG parsing logs aren't written when using S3 logging
[ https://issues.apache.org/jira/browse/AIRFLOW-3449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711560#comment-16711560 ] Ash Berlin-Taylor commented on AIRFLOW-3449: I wonder if something else is going on - as I used and tested with default config + S3 logging in at least 1.10.0 > Airflow DAG parsing logs aren't written when using S3 logging > - > > Key: AIRFLOW-3449 > URL: https://issues.apache.org/jira/browse/AIRFLOW-3449 > Project: Apache Airflow > Issue Type: Bug > Components: logging, scheduler >Affects Versions: 1.10.0, 1.10.1 >Reporter: James Meickle >Priority: Critical > > The default Airflow logging class outputs provides some logs to stdout, some > to "task" folders, and some to "processor" folders (generated during DAG > parsing). The 1.10.0 logging update broke this, but only for users who are > also using S3 logging. This is because of this feature in the default logging > config file: > {code:python} > if REMOTE_LOGGING and REMOTE_BASE_LOG_FOLDER.startswith('s3://'): > DEFAULT_LOGGING_CONFIG['handlers'].update(REMOTE_HANDLERS['s3']) > {code} > That replaces this functioning handlers block: > {code:python} > 'task': { > 'class': 'airflow.utils.log.file_task_handler.FileTaskHandler', > 'formatter': 'airflow', > 'base_log_folder': os.path.expanduser(BASE_LOG_FOLDER), > 'filename_template': FILENAME_TEMPLATE, > }, > 'processor': { > 'class': > 'airflow.utils.log.file_processor_handler.FileProcessorHandler', > 'formatter': 'airflow', > 'base_log_folder': os.path.expanduser(PROCESSOR_LOG_FOLDER), > 'filename_template': PROCESSOR_FILENAME_TEMPLATE, > }, > {code} > With this non-functioning block: > {code:python} > 'task': { > 'class': 'airflow.utils.log.s3_task_handler.S3TaskHandler', > 'formatter': 'airflow', > 'base_log_folder': os.path.expanduser(BASE_LOG_FOLDER), > 's3_log_folder': REMOTE_BASE_LOG_FOLDER, > 'filename_template': FILENAME_TEMPLATE, > }, > 'processor': { > 'class': 
'airflow.utils.log.s3_task_handler.S3TaskHandler', > 'formatter': 'airflow', > 'base_log_folder': os.path.expanduser(PROCESSOR_LOG_FOLDER), > 's3_log_folder': REMOTE_BASE_LOG_FOLDER, > 'filename_template': PROCESSOR_FILENAME_TEMPLATE, > }, > {code} > The key issue here is that both "task" and "processor" are being given a > "S3TaskHandler" class to use for logging. But that is not a generic S3 class; > it's actually a subclass of FileTaskHandler! > https://github.com/apache/incubator-airflow/blob/1.10.1/airflow/utils/log/s3_task_handler.py#L26 > Since the template vars don't match the template string, the path to log to > evaluates to garbage. The handler then silently fails to log anything at all. > It is likely that anyone using a default-like logging config, plus the remote > S3 logging feature, stopped getting DAG parsing logs (either locally *or* in > S3) as of 1.10.0 > Commenting out the DAG parsing section of the S3 block fixed this on my > instance. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
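The workaround the reporter describes — commenting the DAG-parsing section out of the S3 override block — amounts to an override dict like the sketch below. This is an illustrative fragment, not Airflow's shipped config; the folder values are example assumptions:

```python
import os

BASE_LOG_FOLDER = '~/airflow/logs'                       # assumed example path
REMOTE_BASE_LOG_FOLDER = 's3://my-bucket/airflow/logs'   # assumed example bucket

S3_HANDLER_OVERRIDES = {
    # Only the 'task' handler is pointed at S3.
    'task': {
        'class': 'airflow.utils.log.s3_task_handler.S3TaskHandler',
        'formatter': 'airflow',
        'base_log_folder': os.path.expanduser(BASE_LOG_FOLDER),
        's3_log_folder': REMOTE_BASE_LOG_FOLDER,
    },
    # 'processor' is deliberately omitted: per the issue, S3TaskHandler's
    # filename template variables don't match the processor's, so leaving
    # DAG-parsing logs on the local FileProcessorHandler restores them.
}
```

Applying `DEFAULT_LOGGING_CONFIG['handlers'].update(S3_HANDLER_OVERRIDES)` would then leave the `processor` handler untouched.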
[jira] [Resolved] (AIRFLOW-1490) In order to get details on exceptions thrown by tasks, the onfailure callback needs an enhancement
[ https://issues.apache.org/jira/browse/AIRFLOW-1490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ash Berlin-Taylor resolved AIRFLOW-1490. Resolution: Duplicate AIRFLOW-843 merged and in 1.10.1 release > In order to get details on exceptions thrown by tasks, the onfailure callback > needs an enhancement > -- > > Key: AIRFLOW-1490 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1490 > Project: Apache Airflow > Issue Type: Improvement > Components: core >Affects Versions: 1.8.0, 2.0.0 >Reporter: Steen Manniche >Priority: Major > Attachments: > 0001-AIRFLOW-1490-carry-exceptions-through-to-the-on_fail.patch > > > The code called when an exception is thrown by a task receives information on > the exception thrown from the task, but fails to delegate this information to > the registered callbacks. > https://github.com/apache/incubator-airflow/blob/master/airflow/models.py#L1524 > sends the context to the registered failure callback, but this context does > not include the thrown exception. > The supplied patch proposes a non-api-breaking way of including the exception > in the context in order to provide clients with the full exception type and > traceback -- This message was sent by Atlassian JIRA (v7.6.3#76005)
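The behaviour this ticket asked for (delivered by AIRFLOW-843, per the resolution above) can be sketched as follows — an illustrative model, with hypothetical names, not Airflow's internal task-runner code:

```python
def run_with_failure_callback(task_fn, on_failure_callback, context):
    """Run a task; on failure, hand the callback a context that carries
    the raised exception -- the detail AIRFLOW-1490 said was missing."""
    try:
        return task_fn()
    except Exception as exc:
        context = dict(context, exception=exc)
        on_failure_callback(context)

def boom():
    raise ValueError("task failed")

seen = {}
run_with_failure_callback(boom, seen.update, {'task_id': 'demo'})
# `seen` now contains both the original context and the ValueError instance.
```

A callback can then inspect `context['exception']` for the full exception type and traceback instead of only knowing that the task failed.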
[jira] [Created] (AIRFLOW-3479) Records of "Log" table should be kept (by default) when users delete a DAG.
Xiaodong DENG created AIRFLOW-3479: -- Summary: Records of "Log" table should be kept (by default) when users delete a DAG. Key: AIRFLOW-3479 URL: https://issues.apache.org/jira/browse/AIRFLOW-3479 Project: Apache Airflow Issue Type: Improvement Affects Versions: 1.10.1 Reporter: Xiaodong DENG Assignee: Xiaodong DENG Currently when we delete a DAG (using API or from the UI), it will delete all related records in all tables (all tables in which "dag_id" is available), including "log" table. However, the records in "log" table should be kept (by default). This would be ideal for multiple reasons, like auditing. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
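The change proposed in PR #4287 (see the review comment above) amounts to skipping the `Log` model inside delete_dag's deletion loop when a `keep_records_in_log` flag is set. A standalone sketch of that loop; the model classes here are stand-ins, not Airflow's registry:

```python
def models_to_delete(models, keep_records_in_log=True):
    """Return the names of model classes whose rows would be deleted.

    Mirrors the loop over registered models in delete_dag: only models
    with a dag_id column are touched, and Log is skipped when
    keep_records_in_log is True so audit records survive.
    """
    deleted = []
    for m in models:
        if not hasattr(m, 'dag_id'):
            continue
        if keep_records_in_log and m.__name__ == 'Log':
            continue  # keep the audit trail
        deleted.append(m.__name__)
    return deleted


class Log:           # stand-in for airflow.models.Log
    dag_id = 'example_dag'


class TaskInstance:  # stand-in for airflow.models.TaskInstance
    dag_id = 'example_dag'
```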
[jira] [Commented] (AIRFLOW-2508) Handle non string types in render_template_from_field
[ https://issues.apache.org/jira/browse/AIRFLOW-2508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711502#comment-16711502 ] Galak commented on AIRFLOW-2508: This issue is partially resolved by AIRFLOW-2415, but only for numeric types... IMHO, it should be extended to any other type (date, datetime, UUID, even custom classes...) I would be glad to work on this issue and to submit a PR. Any objection ? > Handle non string types in render_template_from_field > - > > Key: AIRFLOW-2508 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2508 > Project: Apache Airflow > Issue Type: Bug > Components: models >Affects Versions: 2.0.0 >Reporter: Eugene Brown >Assignee: Eugene Brown >Priority: Minor > Labels: easyfix, newbie > Original Estimate: 2h > Remaining Estimate: 2h > > The render_template_from_field method of the BaseOperator class raises an > exception when it encounters content that is not a string_type, list, tuple > or dict. > Example exception: > {noformat} > airflow.exceptions.AirflowException: Type '' used for parameter > 'job_flow_overrides[Instances][InstanceGroups][InstanceCount]' is not > supported for templating{noformat} > I propose instead that when it encounters content of other types it returns > the content unchanged, rather than raising an exception. > Consider this case: I extended the EmrCreateJobFlowOperator to make the > job_flow_overrides argument a templatable field. job_flow_overrides is a > dictionary with a mix of strings, integers and booleans for values. 
> When I extended the class as such: > {code:java} > class EmrCreateJobFlowOperatorTemplateOverrides(EmrCreateJobFlowOperator): > template_fields = ['job_flow_overrides']{code} > And added a task to my dag with this format: > {code:java} > step_create_cluster = EmrCreateJobFlowOperatorTemplateOverrides( > task_id="create_cluster", > job_flow_overrides={ > "Name": "my-cluster {{ dag_run.conf['run_date'] }}", > "Instances": { > "InstanceGroups": [ > { > "Name": "Master nodes", > "InstanceType": "c3.4xlarge", > "InstanceCount": 1 > }, > { > "Name": "Slave nodes", > "InstanceType": "c3.4xlarge", > "InstanceCount": 4 > }, > "TerminationProtected": False > ] > }, > "BootstrapActions": [{ > "Name": "Custom action", > "ScriptBootstrapAction": { > "Path": "s3://repo/{{ dag_run.conf['branch'] > }}/requirements.txt" > } > }], >}, >aws_conn_id='aws_default', >emr_conn_id='aws_default', >dag=dag > ) > {code} > The exception I gave above was raised and the step failed. I think it would > be preferable for the method to instead pass over numeric and boolean values > as users may want to use template_fields in the way I have to template string > values in dictionaries or lists of mixed types. > Here is the render_template_from_field method from the BaseOperator: > {code:java} > def render_template_from_field(self, attr, content, context, jinja_env): > """ > Renders a template from a field. If the field is a string, it will > simply render the string and return the result. If it is a collection or > nested set of collections, it will traverse the structure and render > all strings in it. 
> """ > rt = self.render_template > if isinstance(content, six.string_types): > result = jinja_env.from_string(content).render(**context) > elif isinstance(content, (list, tuple)): > result = [rt(attr, e, context) for e in content] > elif isinstance(content, dict): > result = { > k: rt("{}[{}]".format(attr, k), v, context) > for k, v in list(content.items())} > else: > param_type = type(content) > msg = ( > "Type '{param_type}' used for parameter '{attr}' is " > "not supported for templating").format(**locals()) > raise AirflowException(msg) > return result{code} > I propose that the method returns content unchanged if the content is of one > of (int, float, complex, bool) types. So my solution would include an extra > elif in the form: > {code} > elif isinstance(content, (int, float, complex, bool)): > result = content > {code} > Are there any reasons this would be a bad idea? -- This message was sent by Atlassian JIRA (v7.6.3#76005)
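The proposed fall-through can be shown with a simplified stand-in for render_template_from_field: non-templatable content is returned unchanged instead of raising. The render step here is a trivial `str.format` substitute rather than Jinja, purely for illustration:

```python
def render(content, context):
    """Recursively render strings; pass everything else through unchanged."""
    if isinstance(content, str):
        return content.format(**context)
    if isinstance(content, (list, tuple)):
        return [render(e, context) for e in content]
    if isinstance(content, dict):
        return {k: render(v, context) for k, v in content.items()}
    # int, float, bool, datetime, UUID, custom classes... all fall through,
    # instead of raising AirflowException as the current code does.
    return content


# Mixed-type overrides like the EMR example in the issue:
overrides = {
    "Name": "my-cluster {run_date}",
    "Instances": {"InstanceCount": 4, "TerminationProtected": False},
}
rendered = render(overrides, {"run_date": "2018-12-06"})
```

Galak's comment above argues for this broader pass-through (any type, not just numerics), which is what the final `return content` does.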
[GitHub] codecov-io edited a comment on issue #4276: [AIRFLOW-1552] Airflow Filter_by_owner not working with password_auth
codecov-io edited a comment on issue #4276: [AIRFLOW-1552] Airflow Filter_by_owner not working with password_auth URL: https://github.com/apache/incubator-airflow/pull/4276#issuecomment-444138570 # [Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4276?src=pr=h1) Report > Merging [#4276](https://codecov.io/gh/apache/incubator-airflow/pull/4276?src=pr=desc) into [master](https://codecov.io/gh/apache/incubator-airflow/commit/9c04e8f339a6d84b2fff983e6584af2b81249652?src=pr=desc) will **not change** coverage. > The diff coverage is `100%`. [![Impacted file tree graph](https://codecov.io/gh/apache/incubator-airflow/pull/4276/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/4276?src=pr=tree) ```diff @@ Coverage Diff @@ ## master#4276 +/- ## === Coverage 78.08% 78.08% === Files 201 201 Lines 1645816458 === Hits1285112851 Misses 3607 3607 ``` | [Impacted Files](https://codecov.io/gh/apache/incubator-airflow/pull/4276?src=pr=tree) | Coverage Δ | | |---|---|---| | [airflow/models.py](https://codecov.io/gh/apache/incubator-airflow/pull/4276/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMucHk=) | `92.33% <100%> (ø)` | :arrow_up: | -- [Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4276?src=pr=continue). > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta) > `Δ = absolute (impact)`, `ø = not affected`, `? = missing data` > Powered by [Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4276?src=pr=footer). Last update [9c04e8f...1f36056](https://codecov.io/gh/apache/incubator-airflow/pull/4276?src=pr=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments). This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Commented] (AIRFLOW-1490) In order to get details on exceptions thrown by tasks, the onfailure callback needs an enhancement
[ https://issues.apache.org/jira/browse/AIRFLOW-1490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711477#comment-16711477 ] jack commented on AIRFLOW-1490: --- You can prepare PR with your code suggestions > In order to get details on exceptions thrown by tasks, the onfailure callback > needs an enhancement > -- > > Key: AIRFLOW-1490 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1490 > Project: Apache Airflow > Issue Type: Improvement > Components: core >Affects Versions: 1.8.0, 2.0.0 >Reporter: Steen Manniche >Priority: Major > Attachments: > 0001-AIRFLOW-1490-carry-exceptions-through-to-the-on_fail.patch > > > The code called when an exception is thrown by a task receives information on > the exception thrown from the task, but fails to delegate this information to > the registered callbacks. > https://github.com/apache/incubator-airflow/blob/master/airflow/models.py#L1524 > sends the context to the registered failure callback, but this context does > not include the thrown exception. > The supplied patch proposes a non-api-breaking way of including the exception > in the context in order to provide clients with the full exception type and > -traceback -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (AIRFLOW-3478) Make sure that database connections are closed
Fokko Driesprong created AIRFLOW-3478: - Summary: Make sure that database connections are closed Key: AIRFLOW-3478 URL: https://issues.apache.org/jira/browse/AIRFLOW-3478 Project: Apache Airflow Issue Type: Task Reporter: Fokko Driesprong Calling .close() will make sure that the connection is being given back to the pool. When using settings.Session() directly, it is easy to forget to commit and close the session. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
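The pattern AIRFLOW-3478 argues for is a context manager around session usage, so commit/close can never be forgotten. A minimal sketch; `Session` below is a stub standing in for `airflow.settings.Session`:

```python
from contextlib import contextmanager


class Session:  # stub for airflow.settings.Session
    def __init__(self):
        self.committed = False
        self.closed = False

    def commit(self):
        self.committed = True

    def rollback(self):
        pass

    def close(self):
        self.closed = True


@contextmanager
def create_session():
    """Yield a session, commit on success, roll back on error, always close."""
    session = Session()
    try:
        yield session
        session.commit()
    except Exception:
        session.rollback()
        raise
    finally:
        session.close()  # returns the connection to the pool


with create_session() as session:
    session_ref = session
# On exit the session has been committed and closed.
```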
[GitHub] mik-laj edited a comment on issue #4286: [AIRFLOW-3310] Google Cloud Spanner deploy / delete operators
mik-laj edited a comment on issue #4286: [AIRFLOW-3310] Google Cloud Spanner deploy / delete operators URL: https://github.com/apache/incubator-airflow/pull/4286#issuecomment-444860579 it may be interesting for you to look at: https://github.com/apache/incubator-airflow/pull/4284 This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Commented] (AIRFLOW-1405) Airflow v 1.8.1 unable to properly initialize with MySQL
[ https://issues.apache.org/jira/browse/AIRFLOW-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711385#comment-16711385 ] jack commented on AIRFLOW-1405: --- I think the best solution for this is for Airflow to catch this exception and replace it with message announcing the user that he must use a newer version of MySQL. > Airflow v 1.8.1 unable to properly initialize with MySQL > > > Key: AIRFLOW-1405 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1405 > Project: Apache Airflow > Issue Type: Bug > Components: db >Affects Versions: 1.8.1 > Environment: CentOS7 >Reporter: Aakash Bhardwaj >Priority: Major > Fix For: 1.8.1 > > Attachments: error_log.txt > > > While working on a CentOS7 system, I was trying to configure Airflow version > 1.8.1 to run with MySql in the backend. > I have installed Airflow in a Virtual Environment, and the MySQL has a > database named airflow (default). > But on running the command - > {code:none} > airflow initdb > {code} > the following error is reported > {noformat} > [2017-07-12 13:22:36,558] {__init__.py:57} INFO - Using executor LocalExecutor > DB: mysql://airflow:***@localhost/airflow > [2017-07-12 13:22:37,218] {db.py:287} INFO - Creating tables > INFO [alembic.runtime.migration] Context impl MySQLImpl. > INFO [alembic.runtime.migration] Will assume non-transactional DDL. 
> INFO [alembic.runtime.migration] Running upgrade f2ca10b85618 -> > 4addfa1236f1, Add fractional seconds to mysql tables > Traceback (most recent call last): > File "/opt/airflow_virtual_environment/airflow_venv/bin/airflow", line 28, > in > args.func(args) > File > "/opt/airflow_virtual_environment/airflow_venv/lib/python2.7/site-packages/airflow/bin/cli.py", > line 951, in initdb > db_utils.initdb() > File > "/opt/airflow_virtual_environment/airflow_venv/lib/python2.7/site-packages/airflow/utils/db.py", > line 106, in initdb > upgradedb() > File > "/opt/airflow_virtual_environment/airflow_venv/lib/python2.7/site-packages/airflow/utils/db.py", > line 294, in upgradedb > command.upgrade(config, 'heads') > File > "/opt/airflow_virtual_environment/airflow_venv/lib/python2.7/site-packages/alembic/command.py", > line 174, in upgrade > script.run_env() > File > "/opt/airflow_virtual_environment/airflow_venv/lib/python2.7/site-packages/alembic/script/base.py", > line 416, in run_env > util.load_python_file(self.dir, 'env.py') > File > "/opt/airflow_virtual_environment/airflow_venv/lib/python2.7/site-packages/alembic/util/pyfiles.py", > line 93, in load_python_file > module = load_module_py(module_id, path) > File > "/opt/airflow_virtual_environment/airflow_venv/lib/python2.7/site-packages/alembic/util/compat.py", > line 79, in load_module_py > mod = imp.load_source(module_id, path, fp) > File > "/opt/airflow_virtual_environment/airflow_venv/lib/python2.7/site-packages/airflow/migrations/env.py", > line 86, in > run_migrations_online() > File > "/opt/airflow_virtual_environment/airflow_venv/lib/python2.7/site-packages/airflow/migrations/env.py", > line 81, in run_migrations_online > context.run_migrations() > File "", line 8, in run_migrations > File > "/opt/airflow_virtual_environment/airflow_venv/lib/python2.7/site-packages/alembic/runtime/environment.py", > line 807, in run_migrations > self.get_context().run_migrations(**kw) > File > 
"/opt/airflow_virtual_environment/airflow_venv/lib/python2.7/site-packages/alembic/runtime/migration.py", > line 321, in run_migrations > step.migration_fn(**kw) > File > "/opt/airflow_virtual_environment/airflow_venv/lib/python2.7/site-packages/airflow/migrations/versions/4addfa1236f1_add_fractional_seconds_to_mysql_tables.py", > line 36, in upgrade > op.alter_column(table_name='dag', column_name='last_scheduler_run', > type_=mysql.DATETIME(fsp=6)) > File "", line 8, in alter_column > File "", line 3, in alter_column > File > "/opt/airflow_virtual_environment/airflow_venv/lib/python2.7/site-packages/alembic/operations/ops.py", > line 1420, in alter_column > return operations.invoke(alt) > File > "/opt/airflow_virtual_environment/airflow_venv/lib/python2.7/site-packages/alembic/operations/base.py", > line 318, in invoke > return fn(self, operation) > File > "/opt/airflow_virtual_environment/airflow_venv/lib/python2.7/site-packages/alembic/operations/toimpl.py", > line 53, in alter_column > **operation.kw > File > "/opt/airflow_virtual_environment/airflow_venv/lib/python2.7/site-packages/alembic/ddl/mysql.py", > line 67, in alter_column > else existing_autoincrement > File > "/opt/airflow_virtual_environment/airflow_venv/lib/python2.7/site-packages/alembic/ddl/impl.py", > line
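jack's suggestion above — catch the failure and tell the user to upgrade MySQL — can be sketched as a pre-flight check, since the failing migration adds `DATETIME(fsp=6)` columns and fractional-second precision requires MySQL 5.6.4 or newer. The version threshold is MySQL's documented minimum; the wrapper and its names are illustrative, not Airflow code:

```python
MIN_MYSQL_FOR_FSP = (5, 6, 4)  # first MySQL release with fractional seconds


def mysql_supports_fsp(version_string):
    """Return True if this MySQL version supports DATETIME(fsp=6)."""
    version = tuple(int(p) for p in version_string.split('.')[:3])
    return version >= MIN_MYSQL_FOR_FSP


def run_fsp_migration(upgrade, mysql_version):
    """Run the migration, or fail with an actionable message first."""
    if not mysql_supports_fsp(mysql_version):
        raise RuntimeError(
            "MySQL %s does not support fractional-second DATETIME columns; "
            "upgrade to MySQL 5.6.4 or newer before running "
            "'airflow initdb'" % mysql_version)
    upgrade()


ran = []
run_fsp_migration(lambda: ran.append(True), "5.7.24")
```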
[jira] [Created] (AIRFLOW-3474) Refactor: Move SlaMiss out of models.py
Fokko Driesprong created AIRFLOW-3474: - Summary: Refactor: Move SlaMiss out of models.py Key: AIRFLOW-3474 URL: https://issues.apache.org/jira/browse/AIRFLOW-3474 Project: Apache Airflow Issue Type: Task Components: models Affects Versions: 1.10.1 Reporter: Fokko Driesprong Fix For: 2.0.0 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (AIRFLOW-3477) Refactor: Move KubeWorkerIdentifier out of models.py
Fokko Driesprong created AIRFLOW-3477: - Summary: Refactor: Move KubeWorkerIdentifier out of models.py Key: AIRFLOW-3477 URL: https://issues.apache.org/jira/browse/AIRFLOW-3477 Project: Apache Airflow Issue Type: Task Components: models Affects Versions: 1.10.1 Reporter: Fokko Driesprong Fix For: 2.0.0 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (AIRFLOW-3476) Refactor: Move KubeResourceVersion out of models.py
Fokko Driesprong created AIRFLOW-3476: - Summary: Refactor: Move KubeResourceVersion out of models.py Key: AIRFLOW-3476 URL: https://issues.apache.org/jira/browse/AIRFLOW-3476 Project: Apache Airflow Issue Type: Task Components: models Affects Versions: 1.10.1 Reporter: Fokko Driesprong Fix For: 2.0.0 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (AIRFLOW-3475) Refactor: Move ImportError out of models.py
Fokko Driesprong created AIRFLOW-3475: - Summary: Refactor: Move ImportError out of models.py Key: AIRFLOW-3475 URL: https://issues.apache.org/jira/browse/AIRFLOW-3475 Project: Apache Airflow Issue Type: Task Components: models Affects Versions: 1.10.1 Reporter: Fokko Driesprong Fix For: 2.0.0 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (AIRFLOW-3472) Refactor: Move DagStat out of models.py
Fokko Driesprong created AIRFLOW-3472: - Summary: Refactor: Move DagStat out of models.py Key: AIRFLOW-3472 URL: https://issues.apache.org/jira/browse/AIRFLOW-3472 Project: Apache Airflow Issue Type: Task Components: models Affects Versions: 1.10.1 Reporter: Fokko Driesprong Fix For: 2.0.0 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (AIRFLOW-3473) Refactor: Move DagRun out of models.py
Fokko Driesprong created AIRFLOW-3473: - Summary: Refactor: Move DagRun out of models.py Key: AIRFLOW-3473 URL: https://issues.apache.org/jira/browse/AIRFLOW-3473 Project: Apache Airflow Issue Type: Task Components: models Affects Versions: 1.10.1 Reporter: Fokko Driesprong Fix For: 2.0.0 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (AIRFLOW-3470) Refactor: Move Variable out of models.py
Fokko Driesprong created AIRFLOW-3470: - Summary: Refactor: Move Variable out of models.py Key: AIRFLOW-3470 URL: https://issues.apache.org/jira/browse/AIRFLOW-3470 Project: Apache Airflow Issue Type: Task Components: models Affects Versions: 1.10.1 Reporter: Fokko Driesprong Fix For: 2.0.0 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (AIRFLOW-3468) Refactor: Move KnownEventType out of models.py
[ https://issues.apache.org/jira/browse/AIRFLOW-3468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711323#comment-16711323 ] Ash Berlin-Taylor commented on AIRFLOW-3468: KnownEventType could possible be deleted - I don't think anything actually uses it (anymore)? We should ask someone at AirBnB if they know what the idea was behind this model. > Refactor: Move KnownEventType out of models.py > -- > > Key: AIRFLOW-3468 > URL: https://issues.apache.org/jira/browse/AIRFLOW-3468 > Project: Apache Airflow > Issue Type: Task > Components: models >Affects Versions: 1.10.1 >Reporter: Fokko Driesprong >Priority: Major > Fix For: 2.0.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (AIRFLOW-3471) Refactor: Move XCom out of models.py
Fokko Driesprong created AIRFLOW-3471: - Summary: Refactor: Move XCom out of models.py Key: AIRFLOW-3471 URL: https://issues.apache.org/jira/browse/AIRFLOW-3471 Project: Apache Airflow Issue Type: Task Components: models Affects Versions: 1.10.1 Reporter: Fokko Driesprong Fix For: 2.0.0 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (AIRFLOW-3455) Snowflake connector needs region ID also
[ https://issues.apache.org/jira/browse/AIRFLOW-3455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] MOHAMMD SHAKEEL SHAIK updated AIRFLOW-3455: --- Description: While Connecting to snowflake. we have to use region ID also. Why? Snowflake their main region is US west region. if the warehouse exists in any other region than US west region. we have to give region ID also while connecting to snowflake. Thanks was: While Connecting to snowflake. we have to use region ID also. Why? Snowflake their main region is US west region. if the warehouse exists in any other region than US west region. we have to give region ID also while connecting to snowflake. !Screenshot 2018-12-06 at 3.21.25 PM.png! Thanks > Snowflake connector needs region ID also > > > Key: AIRFLOW-3455 > URL: https://issues.apache.org/jira/browse/AIRFLOW-3455 > Project: Apache Airflow > Issue Type: Improvement >Reporter: MOHAMMD SHAKEEL SHAIK >Assignee: MOHAMMD SHAKEEL SHAIK >Priority: Major > Labels: patch > Attachments: Screenshot 2018-12-06 at 2.47.50 PM.png, Screenshot > 2018-12-06 at 3.21.25 PM.png > > > While Connecting to snowflake. we have to use region ID also. > Why? > Snowflake their main region is US west region. if the warehouse exists in any > other region than US west region. we have to give region ID also while > connecting to snowflake. > > > Thanks -- This message was sent by Atlassian JIRA (v7.6.3#76005)
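The point of AIRFLOW-3455 is that outside Snowflake's default US West region, the account identifier handed to the connector must also carry the region. A minimal sketch of that composition; the account and region values are examples, not a real deployment:

```python
def snowflake_account_identifier(account, region=None):
    """Return the account string a Snowflake connector would dial.

    Accounts in the default US West region need no suffix; all other
    regions are appended to the account name.
    """
    return account if region is None else "%s.%s" % (account, region)


# Default-region account vs. one in eu-west-1:
default_acct = snowflake_account_identifier("xy12345")
eu_acct = snowflake_account_identifier("xy12345", "eu-west-1")
```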
[jira] [Created] (AIRFLOW-3469) Refactor: Move KnownEvent out of models.py
Fokko Driesprong created AIRFLOW-3469: - Summary: Refactor: Move KnownEvent out of models.py Key: AIRFLOW-3469 URL: https://issues.apache.org/jira/browse/AIRFLOW-3469 Project: Apache Airflow Issue Type: Task Components: models Affects Versions: 1.10.1 Reporter: Fokko Driesprong Fix For: 2.0.0 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (AIRFLOW-3468) Refactor: Move KnownEventType out of models.py
Fokko Driesprong created AIRFLOW-3468: - Summary: Refactor: Move KnownEventType out of models.py Key: AIRFLOW-3468 URL: https://issues.apache.org/jira/browse/AIRFLOW-3468 Project: Apache Airflow Issue Type: Task Components: models Affects Versions: 1.10.1 Reporter: Fokko Driesprong Fix For: 2.0.0 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (AIRFLOW-3466) Refactor: Move DAG out of models.py
Fokko Driesprong created AIRFLOW-3466: - Summary: Refactor: Move DAG out of models.py Key: AIRFLOW-3466 URL: https://issues.apache.org/jira/browse/AIRFLOW-3466 Project: Apache Airflow Issue Type: Task Components: models Affects Versions: 1.10.1 Reporter: Fokko Driesprong Fix For: 2.0.0 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (AIRFLOW-3465) Refactor: Move BaseOperator out of models.py
Fokko Driesprong created AIRFLOW-3465: - Summary: Refactor: Move BaseOperator out of models.py Key: AIRFLOW-3465 URL: https://issues.apache.org/jira/browse/AIRFLOW-3465 Project: Apache Airflow Issue Type: Task Components: models Affects Versions: 1.10.1 Reporter: Fokko Driesprong Fix For: 2.0.0 -- This message was sent by Atlassian JIRA (v7.6.3#76005)