[GitHub] [airflow] kaxil commented on pull request #7232: [AIRFLOW-6569] Flush pending Sentry exceptions before exiting forked process

2020-05-27 Thread GitBox


kaxil commented on pull request #7232:
URL: https://github.com/apache/airflow/pull/7232#issuecomment-634580331


   Yes, it can be included in 1.10.11.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[airflow] branch master updated (6fc555d -> 5a7a3d1)

2020-05-27 Thread potiuk
This is an automated email from the ASF dual-hosted git repository.

potiuk pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/airflow.git.


from 6fc555d  Add ADDITIONAL_PYTHON_DEPS (#9031)
 add 5a7a3d1  Add ADDITIONAL_AIRFLOW_EXTRAS (#9032)

No new revisions were added by this update.

Summary of changes:
 Dockerfile | 4 +++-
 IMAGES.rst | 6 +-
 2 files changed, 8 insertions(+), 2 deletions(-)



[GitHub] [airflow] mik-laj merged pull request #8897: Filter dags by clicking on tag

2020-05-27 Thread GitBox


mik-laj merged pull request #8897:
URL: https://github.com/apache/airflow/pull/8897


   







[GitHub] [airflow] ad-m commented on pull request #9023: Add snowflake to slack operator

2020-05-27 Thread GitBox


ad-m commented on pull request #9023:
URL: https://github.com/apache/airflow/pull/9023#issuecomment-634591770


   @potiuk , could you confirm that TravisCI fail is not related to that PR?







[airflow] branch master updated (5a7a3d1 -> 30b12a9)

2020-05-27 Thread kamilbregula
This is an automated email from the ASF dual-hosted git repository.

kamilbregula pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/airflow.git.


from 5a7a3d1  Add ADDITIONAL_AIRFLOW_EXTRAS (#9032)
 add 30b12a9  Filter dags by clicking on tag (#8897)

No new revisions were added by this update.

Summary of changes:
 airflow/www/templates/airflow/dags.html | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)



[GitHub] [airflow] mik-laj closed issue #8821: Add ability to filter dag list by clicking on specific tag

2020-05-27 Thread GitBox


mik-laj closed issue #8821:
URL: https://github.com/apache/airflow/issues/8821


   







[GitHub] [airflow] boring-cyborg[bot] commented on pull request #8897: Filter dags by clicking on tag

2020-05-27 Thread GitBox


boring-cyborg[bot] commented on pull request #8897:
URL: https://github.com/apache/airflow/pull/8897#issuecomment-634591805


   Awesome work, congrats on your first merged pull request!
   







[GitHub] [airflow] kaxil commented on a change in pull request #8721: Add OpenAPI specification (II)

2020-05-27 Thread GitBox


kaxil commented on a change in pull request #8721:
URL: https://github.com/apache/airflow/pull/8721#discussion_r431046930



##
File path: openapi.yaml
##
@@ -0,0 +1,2427 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+---
+openapi: 3.0.3
+
+info:
+  title: "Airflow API (Stable)"
+  description: Apache Airflow management API.
+  version: '1.0.0'
+  license:
+name: Apache 2.0
+url: http://www.apache.org/licenses/LICENSE-2.0.html
+  contact:
+name: Apache Foundation

Review comment:
   
   ```suggestion
   name: Apache Software Foundation
   ```
   
   Although maybe we should change it to Airflow PMC & Committers or something; I am not sure.









[GitHub] [airflow] ashb commented on a change in pull request #8227: Add run_type to DagRun

2020-05-27 Thread GitBox


ashb commented on a change in pull request #8227:
URL: https://github.com/apache/airflow/pull/8227#discussion_r431190655



##
File path: airflow/models/dagrun.py
##
@@ -54,25 +54,27 @@ class DagRun(Base, LoggingMixin):
 _state = Column('state', String(50), default=State.RUNNING)
 run_id = Column(String(ID_LEN))
 external_trigger = Column(Boolean, default=True)
+run_type = Column(String(50), nullable=True)
 conf = Column(PickleType)
 
 dag = None
 
 __table_args__ = (
-Index('dag_id_state', dag_id, _state),
+Index('dag_id_state_type', dag_id, _state, run_type),

Review comment:
   When do we filter by Dag id and run type?
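
   For background on the question: a composite index on (dag_id, state, run_type) serves any leftmost-prefix filter, so existing dag_id-only or dag_id + state lookups still use it, and run_type merely extends the same index. A quick stdlib sketch (simplified table, not Airflow code):

   ```python
   import sqlite3

   # Hypothetical miniature of the DagRun table: a composite index like the
   # proposed Index('dag_id_state_type', dag_id, _state, run_type) still
   # serves queries that filter only on the leading columns.
   conn = sqlite3.connect(":memory:")
   conn.execute("CREATE TABLE dag_run (dag_id TEXT, state TEXT, run_type TEXT)")
   conn.execute(
       "CREATE INDEX dag_id_state_type ON dag_run (dag_id, state, run_type)"
   )

   # Equality on the two leading columns only; the planner still picks the index.
   plan = conn.execute(
       "EXPLAIN QUERY PLAN SELECT * FROM dag_run WHERE dag_id = ? AND state = ?",
       ("example_dag", "running"),
   ).fetchall()
   print(plan[0][3])  # the detail column names dag_id_state_type
   ```

   So a run_type column at the end of the index does not regress the existing dag_id/state filters, whatever new queries it is meant to serve.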









[GitHub] [airflow] turbaszek commented on a change in pull request #9018: Improve SchedulerJob code style

2020-05-27 Thread GitBox


turbaszek commented on a change in pull request #9018:
URL: https://github.com/apache/airflow/pull/9018#discussion_r431013108



##
File path: airflow/jobs/scheduler_job.py
##
@@ -1440,94 +1436,89 @@ def _change_state_for_tasks_failed_to_execute(self, session=None):
 
 :param session: session for ORM operations
 """
-if self.executor.queued_tasks:
-TI = models.TaskInstance
-filter_for_ti_state_change = (
-[and_(
-TI.dag_id == dag_id,
-TI.task_id == task_id,
-TI.execution_date == execution_date,
-# The TI.try_number will return raw try_number+1 since the
-# ti is not running. And we need to -1 to match the DB record.
-TI._try_number == try_number - 1,  # pylint: disable=protected-access
-TI.state == State.QUEUED)
-for dag_id, task_id, execution_date, try_number
-in self.executor.queued_tasks.keys()])
-ti_query = (session.query(TI)
-.filter(or_(*filter_for_ti_state_change)))
-tis_to_set_to_scheduled = (ti_query
-   .with_for_update()
-   .all())
-if len(tis_to_set_to_scheduled) == 0:
-session.commit()
-return
-
-# set TIs to queued state
-filter_for_tis = TI.filter_for_tis(tis_to_set_to_scheduled)
-session.query(TI).filter(filter_for_tis).update(
-{TI.state: State.SCHEDULED, TI.queued_dttm: None}, synchronize_session=False
-)
+if not self.executor.queued_tasks:
+return
 
-for task_instance in tis_to_set_to_scheduled:
-self.executor.queued_tasks.pop(task_instance.key)
+filter_for_ti_state_change = (
+[and_(
+TI.dag_id == dag_id,
+TI.task_id == task_id,
+TI.execution_date == execution_date,
+# The TI.try_number will return raw try_number+1 since the
+# ti is not running. And we need to -1 to match the DB record.
+TI._try_number == try_number - 1,  # pylint: disable=protected-access
+TI.state == State.QUEUED)
+for dag_id, task_id, execution_date, try_number
+in self.executor.queued_tasks.keys()])
+ti_query = session.query(TI).filter(or_(*filter_for_ti_state_change))
+tis_to_set_to_scheduled = ti_query.with_for_update().all()
+if not tis_to_set_to_scheduled:
+return
 
-task_instance_str = "\n\t".join(
-[repr(x) for x in tis_to_set_to_scheduled])
+# set TIs to queued state
+filter_for_tis = TI.filter_for_tis(tis_to_set_to_scheduled)
+session.query(TI).filter(filter_for_tis).update(
+{TI.state: State.SCHEDULED, TI.queued_dttm: None}, synchronize_session=False
+)
 
-session.commit()

Review comment:
   You are correct, this method is restricted and called only in scheduler 
job.
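
The guard-clause style adopted in the diff above can be shown in miniature (names hypothetical, not the Airflow code):

```python
# Illustration only: the refactor replaces one big "if work: ..." block with
# early returns, flattening the happy path. The second guard mirrors the
# "if not tis_to_set_to_scheduled: return" check in the diff.
def process_queued(queued_tasks):
    if not queued_tasks:          # guard clause: nothing queued, bail out
        return []
    scheduled = [t for t in queued_tasks if t.endswith("_ready")]
    if not scheduled:             # second guard: nothing matched the filter
        return []
    return scheduled

print(process_queued([]))                      # []
print(process_queued(["a_ready", "b_wait"]))   # ['a_ready']
```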









[GitHub] [airflow] turbaszek commented on a change in pull request #9018: Improve SchedulerJob code style

2020-05-27 Thread GitBox


turbaszek commented on a change in pull request #9018:
URL: https://github.com/apache/airflow/pull/9018#discussion_r431013369



##
File path: airflow/jobs/scheduler_job.py
##
@@ -1440,94 +1436,89 @@ def _change_state_for_tasks_failed_to_execute(self, session=None):
 
 :param session: session for ORM operations
 """
-if self.executor.queued_tasks:
-TI = models.TaskInstance
-filter_for_ti_state_change = (
-[and_(
-TI.dag_id == dag_id,
-TI.task_id == task_id,
-TI.execution_date == execution_date,
-# The TI.try_number will return raw try_number+1 since the
-# ti is not running. And we need to -1 to match the DB record.
-TI._try_number == try_number - 1,  # pylint: disable=protected-access
-TI.state == State.QUEUED)
-for dag_id, task_id, execution_date, try_number
-in self.executor.queued_tasks.keys()])
-ti_query = (session.query(TI)
-.filter(or_(*filter_for_ti_state_change)))
-tis_to_set_to_scheduled = (ti_query
-   .with_for_update()
-   .all())
-if len(tis_to_set_to_scheduled) == 0:
-session.commit()
-return
-
-# set TIs to queued state
-filter_for_tis = TI.filter_for_tis(tis_to_set_to_scheduled)
-session.query(TI).filter(filter_for_tis).update(
-{TI.state: State.SCHEDULED, TI.queued_dttm: None}, synchronize_session=False
-)
+if not self.executor.queued_tasks:
+return
 
-for task_instance in tis_to_set_to_scheduled:
-self.executor.queued_tasks.pop(task_instance.key)
+filter_for_ti_state_change = (
+[and_(
+TI.dag_id == dag_id,
+TI.task_id == task_id,
+TI.execution_date == execution_date,
+# The TI.try_number will return raw try_number+1 since the
+# ti is not running. And we need to -1 to match the DB record.
+TI._try_number == try_number - 1,  # pylint: disable=protected-access
+TI.state == State.QUEUED)
+for dag_id, task_id, execution_date, try_number
+in self.executor.queued_tasks.keys()])
+ti_query = session.query(TI).filter(or_(*filter_for_ti_state_change))
+tis_to_set_to_scheduled = ti_query.with_for_update().all()
+if not tis_to_set_to_scheduled:
+return
 
-task_instance_str = "\n\t".join(
-[repr(x) for x in tis_to_set_to_scheduled])
+# set TIs to queued state
+filter_for_tis = TI.filter_for_tis(tis_to_set_to_scheduled)
+session.query(TI).filter(filter_for_tis).update(
+{TI.state: State.SCHEDULED, TI.queued_dttm: None}, synchronize_session=False
+)
 
-session.commit()
-self.log.info("Set the following tasks to scheduled state:\n\t%s", task_instance_str)
+task_instance_str = ""
+for task_instance in tis_to_set_to_scheduled:
+self.executor.queued_tasks.pop(task_instance.key)
+task_instance_str += f"\n\t{repr(task_instance)}"

Review comment:
   Thanks for your suggestion! However, I will revert this change 









[jira] [Commented] (AIRFLOW-5615) BaseJob subclasses shouldn't implement own heartbeat logic

2020-05-27 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17117664#comment-17117664
 ] 

ASF subversion and git services commented on AIRFLOW-5615:
--

Commit 8ac90b0c4fca8a89592f3939c76ecb922d93b3da in airflow's branch 
refs/heads/master from Ash Berlin-Taylor
[ https://gitbox.apache.org/repos/asf?p=airflow.git;h=8ac90b0 ]

[AIRFLOW-5615] Reduce duplicated logic around job heartbeating (#6311)

Both SchedulerJob and LocalTaskJob have their own timers and decide when
to call heartbeat based upon that. This makes those functions harder to
follow, (and the logs more confusing) so I've moved the logic to BaseJob

> BaseJob subclasses shouldn't implement own heartbeat logic
> --
>
> Key: AIRFLOW-5615
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5615
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.10.5
>Reporter: Ash Berlin-Taylor
>Assignee: Ash Berlin-Taylor
>Priority: Trivial
> Fix For: 2.0.0
>
>
> Both SchedulerJob and LocalTaskJob have their own timers and decide when to 
> call heartbeat based upon that.
> That logic should be removed and live in BaseJob to simplify the code.
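
The refactor described above follows a template-method shape: the base class owns the timer, and subclasses supply only the callback. A hedged sketch with hypothetical names (not the actual Airflow classes):

```python
import time

# Sketch only: BaseJob decides *when* to heartbeat; subclasses decide *what*
# a heartbeat does. This removes the duplicated timers from each subclass.
class BaseJob:
    def __init__(self, heartrate=5.0):
        self.heartrate = heartrate
        self._last_heartbeat = 0.0

    def heartbeat_callback(self):
        pass  # subclass hook

    def perform_heartbeat(self, now=None):
        """Fire the callback only if heartrate seconds have elapsed."""
        now = time.monotonic() if now is None else now
        if now - self._last_heartbeat >= self.heartrate:
            self._last_heartbeat = now
            self.heartbeat_callback()
            return True
        return False

class LocalTaskJob(BaseJob):
    def __init__(self):
        super().__init__()
        self.beats = 0

    def heartbeat_callback(self):
        self.beats += 1

job = LocalTaskJob()
job.perform_heartbeat(now=10.0)   # first call: interval elapsed, fires
job.perform_heartbeat(now=12.0)   # within heartrate window: skipped
job.perform_heartbeat(now=16.0)   # interval elapsed again: fires
print(job.beats)  # 2
```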



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [airflow] zikun commented on issue #8872: Add dependencies on build image

2020-05-27 Thread GitBox


zikun commented on issue #8872:
URL: https://github.com/apache/airflow/issues/8872#issuecomment-634616072


   This could be a nice feature but looks like there was an effort to remove 
`ONBUILD` from official images - 
https://github.com/docker-library/official-images/issues/2076
   
   As the above issue mentioned, different users might want different 
customizations. Initially I wanted to wait for this feature, but I realized 
`ONBUILD` does not solve my problem, e.g. installing packages like 
`docker-ce-cli` from third-party repositories. Therefore, I would rather build 
my own image `FROM apache/airflow`.
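
   The route described in the comment, building a derived image rather than relying on ONBUILD hooks, might look roughly like this sketch (the base tag, repository URL, and install steps are illustrative, not a tested recipe):

   ```dockerfile
   # Hypothetical sketch: extend the official image instead of relying on ONBUILD.
   FROM apache/airflow:1.10.10

   USER root
   # Example of installing a package from a third-party repository
   # (illustrative only; verify keys and pinning for real use).
   RUN curl -fsSL https://download.docker.com/linux/debian/gpg | apt-key add - \
       && echo "deb https://download.docker.com/linux/debian buster stable" \
           > /etc/apt/sources.list.d/docker.list \
       && apt-get update \
       && apt-get install -y --no-install-recommends docker-ce-cli \
       && rm -rf /var/lib/apt/lists/*
   USER airflow
   ```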







[GitHub] [airflow] boring-cyborg[bot] commented on issue #9036: Task instance log_filepath doesn't include try_number

2020-05-27 Thread GitBox


boring-cyborg[bot] commented on issue #9036:
URL: https://github.com/apache/airflow/issues/9036#issuecomment-634671177


   Thanks for opening your first issue here! Be sure to follow the issue 
template!
   







[GitHub] [airflow] Pymancer opened a new issue #9036: Task instance log_filepath doesn't include try_number

2020-05-27 Thread GitBox


Pymancer opened a new issue #9036:
URL: https://github.com/apache/airflow/issues/9036


   Is it correct that ti.log_filepath doesn't include try_number? How can I get a valid path with it?
   
https://github.com/apache/airflow/blob/8ac90b0c4fca8a89592f3939c76ecb922d93b3da/airflow/models/taskinstance.py#L385-L390
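
   For illustration of the gap the issue describes: a per-attempt log path has to incorporate the try number, otherwise retries collide on the same file. A hedged sketch (hypothetical helper, not Airflow's actual API):

   ```python
   import os

   # Hypothetical names throughout: one log file per attempt requires
   # try_number in the path; a property that omits it cannot name the
   # actual on-disk file for a given retry.
   def attempt_log_path(base, dag_id, task_id, execution_date, try_number):
       return os.path.join(base, dag_id, task_id, execution_date, f"{try_number}.log")

   print(attempt_log_path("/logs", "my_dag", "my_task", "2020-05-27T00:00:00", 1))
   # /logs/my_dag/my_task/2020-05-27T00:00:00/1.log
   ```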
 
   
   
   
   







[GitHub] [airflow] potiuk merged pull request #9032: Add ADDITIONAL_AIRFLOW_EXTRAS

2020-05-27 Thread GitBox


potiuk merged pull request #9032:
URL: https://github.com/apache/airflow/pull/9032


   







[GitHub] [airflow] BasPH commented on a change in pull request #8721: Add OpenAPI specification (II)

2020-05-27 Thread GitBox


BasPH commented on a change in pull request #8721:
URL: https://github.com/apache/airflow/pull/8721#discussion_r431044194



##
File path: openapi.yaml
##
@@ -0,0 +1,2427 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+---
+openapi: 3.0.3
+
+info:
+  title: "Airflow API (Stable)"
+  description: Apache Airflow management API.
+  version: '1.0.0'
+  license:
+name: Apache 2.0
+url: http://www.apache.org/licenses/LICENSE-2.0.html
+  contact:
+name: Apache Foundation
+url: https://airflow.apache.org
+email: d...@airflow.apache.org
+
+servers:
+  - url: /api/v1
+description: Airfow Stable API.

Review comment:
   ```suggestion
   description: Airflow Stable API.
   ```









[GitHub] [airflow] stijndehaes commented on issue #8963: SparkSubmitOperator could not get Exit Code after log stream interrupted by k8s old resource version exception

2020-05-27 Thread GitBox


stijndehaes commented on issue #8963:
URL: https://github.com/apache/airflow/issues/8963#issuecomment-634596428


   @ywan2017 I also have a PR open on airflow to work with spark 3.0 
https://github.com/apache/airflow/pull/8730







[GitHub] [airflow-site] mschickensoup commented on pull request #268: adding a logo for sift use case

2020-05-27 Thread GitBox


mschickensoup commented on pull request #268:
URL: https://github.com/apache/airflow-site/pull/268#issuecomment-634605066


   I should be available to take a look at it around Friday; would that work? 
@mik-laj 







[GitHub] [airflow] mik-laj commented on pull request #8942: #8525 Add SQL Branch Operator

2020-05-27 Thread GitBox


mik-laj commented on pull request #8942:
URL: https://github.com/apache/airflow/pull/8942#issuecomment-634582759


   @samuelkhtu we can do it in a separate PR.  This will allow us to build a 
better git history. 







[GitHub] [airflow] mik-laj commented on issue #8970: Improve guide for KubernetesPodOperator

2020-05-27 Thread GitBox


mik-laj commented on issue #8970:
URL: https://github.com/apache/airflow/issues/8970#issuecomment-634592478


   @tanjinP Fantastic! I look forward to your contribution.







[GitHub] [airflow-site] mik-laj commented on pull request #268: adding a logo for sift use case

2020-05-27 Thread GitBox


mik-laj commented on pull request #268:
URL: https://github.com/apache/airflow-site/pull/268#issuecomment-634597352


   @mschickensoup  Do you have time to look at it?







[GitHub] [airflow] potiuk commented on pull request #9035: Additional python extras and deps can be set in breeze

2020-05-27 Thread GitBox


potiuk commented on pull request #9035:
URL: https://github.com/apache/airflow/pull/9035#issuecomment-634665722


   CC @wittfabian 







[GitHub] [airflow] potiuk opened a new pull request #9035: Additional python extras and deps can be set in breeze

2020-05-27 Thread GitBox


potiuk opened a new pull request #9035:
URL: https://github.com/apache/airflow/pull/9035


   Closes #8604
   Closes #8866
   
   ---
   Make sure to mark the boxes below before creating PR: [x]
   
   - [x] Description above provides context of the change
   - [x] Unit tests coverage for changes (not needed for documentation changes)
   - [x] Target Github ISSUE in description if exists
   - [ ] Commits follow "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)"
   - [x] Relevant documentation is updated including usage instructions.
   - [x] I will engage committers as explained in [Contribution Workflow 
Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example).
   
   ---
   In case of fundamental code change, Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals))
 is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party 
License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in 
[UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
   Read the [Pull Request 
Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines)
 for more information.
   







[GitHub] [airflow] mik-laj commented on a change in pull request #9030: Allow using Airflow with Flask CLI

2020-05-27 Thread GitBox


mik-laj commented on a change in pull request #9030:
URL: https://github.com/apache/airflow/pull/9030#discussion_r431166489



##
File path: airflow/www/app.py
##
@@ -70,6 +73,31 @@ def create_app(config=None, testing=False, 
app_name="Airflow"):
 app.json_encoder = AirflowJsonEncoder
 
 csrf.init_app(app)
+
+def apply_middlewares(flask_app: Flask):
+# Apply DispatcherMiddleware
+base_url = urlparse(conf.get('webserver', 'base_url'))[2]
+if not base_url or base_url == '/':
+base_url = ""
+if base_url:
+flask_app.wsgi_app = DispatcherMiddleware(  # type: ignore

Review comment:
   This middleware is now optional. We use addresses of the form `blocked`, which do not work properly with this middleware; it expects addresses of the form `/blocked`. However, if we don't use this middleware, it doesn't matter.
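
What DispatcherMiddleware does can be sketched in plain WSGI (a hypothetical minimal dispatcher, not the Werkzeug implementation): it routes on PATH_INFO prefixes, which is why a rooted path like `/blocked` matches while a bare `blocked` cannot:

```python
# Minimal prefix dispatcher: requests under the mount prefix go to the
# wrapped app, everything else to the fallback. Names are illustrative.
def make_dispatcher(prefix, app, fallback):
    def dispatcher(environ, start_response):
        path = environ.get("PATH_INFO", "")
        if path == prefix or path.startswith(prefix + "/"):
            # Shift the prefix from PATH_INFO into SCRIPT_NAME, as
            # dispatching middleware conventionally does.
            environ["SCRIPT_NAME"] = environ.get("SCRIPT_NAME", "") + prefix
            environ["PATH_INFO"] = path[len(prefix):]
            return app(environ, start_response)
        return fallback(environ, start_response)
    return dispatcher

def app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]

def fallback(environ, start_response):
    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"missing"]

wsgi = make_dispatcher("/airflow", app, fallback)
print(wsgi({"PATH_INFO": "/airflow/blocked"}, lambda s, h: None))  # [b'hello']
print(wsgi({"PATH_INFO": "blocked"}, lambda s, h: None))           # [b'missing']
```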









[GitHub] [airflow] mik-laj commented on a change in pull request #8721: Add OpenAPI specification (II)

2020-05-27 Thread GitBox


mik-laj commented on a change in pull request #8721:
URL: https://github.com/apache/airflow/pull/8721#discussion_r431176549



##
File path: openapi.yaml
##
@@ -0,0 +1,2427 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+---
+openapi: 3.0.3
+
+info:
+  title: "Airflow API (Stable)"
+  description: Apache Airflow management API.
+  version: '1.0.0'
+  license:
+name: Apache 2.0
+url: http://www.apache.org/licenses/LICENSE-2.0.html
+  contact:
+name: Apache Foundation
+url: https://airflow.apache.org
+email: d...@airflow.apache.org
+
+servers:
+  - url: /api/v1
+description: Airfow Stable API.
+
+paths:
+  # Database entities
+  /connections:
+get:
+  summary: Get all connection entries
+  operationId: getConnections
+  tags: [Connection]
+  parameters:
+- $ref: '#/components/parameters/PageLimit'
+- $ref: '#/components/parameters/PageOffset'
+  responses:
+'200':
+  description: List of connection entry.
+  content:
+application/json:
+  schema:
+allOf:
+  - $ref: '#/components/schemas/ConnectionCollection'
+  - $ref: '#/components/schemas/CollectionInfo'
+'401':
+  $ref: '#/components/responses/Unauthenticated'
+'403':
+  $ref: '#/components/responses/PermissionDenied'
+
+post:
+  summary: Create connection entry
+  operationId: createConnection
+  tags: [Connection]
+  requestBody:
+required: true
+content:
+  application/json:
+schema:
+  $ref: '#/components/schemas/Connection'
+  responses:
+'200':
+  description: Successful response.
+  content:
+application/json:
+  schema:
+$ref: '#/components/schemas/Connection'
+'400':
+  $ref: '#/components/responses/BadRequest'
+'401':
+  $ref: '#/components/responses/Unauthenticated'
+'403':
+  $ref: '#/components/responses/PermissionDenied'
+
+  /connections/{connection_id}:
+parameters:
+  - $ref: '#/components/parameters/ConnectionID'
+
+get:
+  summary: Get a connection entry
+  operationId: getConnection
+  tags: [Connection]
+  responses:
+'200':
+  description: Successful response.
+  content:
+application/json:
+  schema:
+$ref: '#/components/schemas/Connection'
+'401':
+  $ref: '#/components/responses/Unauthenticated'
+'403':
+  $ref: '#/components/responses/PermissionDenied'
+'404':
+  $ref: '#/components/responses/NotFound'
+
+patch:
+  summary: Update a connection entry
+  operationId: updateConnection
+  tags: [Connection]
+  parameters:
+- $ref: '#/components/parameters/UpdateMask'
+  requestBody:
+required: true
+content:
+  application/json:
+schema:
+  $ref: '#/components/schemas/Connection'
+
+  responses:
+'200':
+  description: Successful response.
+  content:
+application/json:
+  schema:
+$ref: '#/components/schemas/Connection'
+'400':
+  $ref: '#/components/responses/BadRequest'
+'401':
+  $ref: '#/components/responses/Unauthenticated'
+'403':
+  $ref: '#/components/responses/PermissionDenied'
+'404':
+  $ref: '#/components/responses/NotFound'
+
+delete:
+  summary: Delete a connection entry
+  operationId: deleteConnection
+  tags: [Connection]
+  responses:
+'204':
+  description: No content.
+'400':
+  $ref: '#/components/responses/BadRequest'
+'401':
+  $ref: '#/components/responses/Unauthenticated'
+'403':
+  $ref: '#/components/responses/PermissionDenied'
+
+  /dags:
+get:
+  summary: Get all DAGs
+  operationId: getDags
+  tags: [DAG]
+  parameters:
+- $ref: '#/components/parameters/PageLimit'
+- $ref: '#/components/parameters/PageOffset'
+ 
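
A client paging through the `/connections` collection above might build request URLs like this (the host is illustrative; `limit`/`offset` follow the spec's PageLimit and PageOffset parameters):

```python
from urllib.parse import urlencode

# Sketch only: compose a paged request URL against the /api/v1 server
# declared in the spec. Host and defaults are assumptions.
def connections_url(base, limit=100, offset=0):
    return f"{base}/api/v1/connections?" + urlencode({"limit": limit, "offset": offset})

print(connections_url("http://localhost:8080", limit=25, offset=50))
# http://localhost:8080/api/v1/connections?limit=25&offset=50
```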

[GitHub] [airflow] olchas commented on pull request #9018: Improve SchedulerJob code style

2020-05-27 Thread GitBox


olchas commented on pull request #9018:
URL: https://github.com/apache/airflow/pull/9018#issuecomment-634583902


   LGTM







[GitHub] [airflow] ad-m commented on pull request #7232: [AIRFLOW-6569] Flush pending Sentry exceptions before exiting forked process

2020-05-27 Thread GitBox


ad-m commented on pull request #7232:
URL: https://github.com/apache/airflow/pull/7232#issuecomment-634583299


   @kaxil, thank you for the information and for adding this to the "Airflow 1.10.11" milestone.







[GitHub] [airflow] olchas removed a comment on pull request #9018: Improve SchedulerJob code style

2020-05-27 Thread GitBox


olchas removed a comment on pull request #9018:
URL: https://github.com/apache/airflow/pull/9018#issuecomment-634583902


   LGTM







[GitHub] [airflow] turbaszek opened a new issue #9034: Fix GoogleDiscoveryApiHook docs warnings

2020-05-27 Thread GitBox


turbaszek opened a new issue #9034:
URL: https://github.com/apache/airflow/issues/9034


   **Apache Airflow version**: master branch 
   
   
   **Kubernetes version (if you are using kubernetes)** (use `kubectl 
version`): N/A
   
   **Environment**:
   
   - breeze
   
   **What happened**:
   
   Running `./breeze build-docs` prints warnings related to 
`airflow.providers.google.common.hooks.discovery_api`:
   
   ```
   
/opt/airflow/docs/_api/airflow/providers/google/cloud/hooks/base/index.rst:2: 
WARNING: Title underline too short.
   
   :mod:`airflow.providers.google.common.hooks.base`
   
   
/opt/airflow/docs/_api/airflow/providers/google/cloud/hooks/discovery_api/index.rst:2:
 WARNING: Title underline too short.
   
   :mod:`airflow.providers.google.common.hooks.discovery_api`
   =
   
/opt/airflow/docs/_api/airflow/providers/google/cloud/hooks/discovery_api/index.rst:4:
 WARNING: duplicate object description of 
airflow.providers.google.common.hooks.discovery_api, other instance in 
_api/airflow/providers/google/common/hooks/discovery_api/index, use :noindex: 
for one of them
   
/opt/airflow/docs/_api/airflow/providers/google/cloud/hooks/discovery_api/index.rst:15:
 WARNING: duplicate object description of 
airflow.providers.google.common.hooks.discovery_api.GoogleDiscoveryApiHook, 
other instance in 
_api/airflow/providers/google/common/hooks/discovery_api/index, use :noindex: 
for one of them
   
/opt/airflow/docs/_api/airflow/providers/google/cloud/hooks/discovery_api/index.rst:33:
 WARNING: duplicate object description of 
airflow.providers.google.common.hooks.discovery_api.GoogleDiscoveryApiHook._conn,
 other instance in 
_api/airflow/providers/google/common/hooks/discovery_api/index, use :noindex: 
for one of them
   
/opt/airflow/docs/_api/airflow/providers/google/cloud/hooks/discovery_api/index.rst:39:
 WARNING: duplicate object description of 
airflow.providers.google.common.hooks.discovery_api.GoogleDiscoveryApiHook.get_conn,
 other instance in 
_api/airflow/providers/google/common/hooks/discovery_api/index, use :noindex: 
for one of them
   
/opt/airflow/docs/_api/airflow/providers/google/cloud/hooks/discovery_api/index.rst:49:
 WARNING: duplicate object description of 
airflow.providers.google.common.hooks.discovery_api.GoogleDiscoveryApiHook.query,
 other instance in 
_api/airflow/providers/google/common/hooks/discovery_api/index, use :noindex: 
for one of them
   
/opt/airflow/docs/_api/airflow/providers/google/cloud/hooks/discovery_api/index.rst:72:
 WARNING: duplicate object description of 
airflow.providers.google.common.hooks.discovery_api.GoogleDiscoveryApiHook._call_api_request,
 other instance in 
_api/airflow/providers/google/common/hooks/discovery_api/index, use :noindex: 
for one of them
   
/opt/airflow/docs/_api/airflow/providers/google/cloud/hooks/discovery_api/index.rst:77:
 WARNING: duplicate object description of 
airflow.providers.google.common.hooks.discovery_api.GoogleDiscoveryApiHook._build_api_request,
 other instance in 
_api/airflow/providers/google/common/hooks/discovery_api/index, use :noindex: 
for one of them
   
/opt/airflow/docs/_api/airflow/providers/google/cloud/hooks/discovery_api/index.rst:82:
 WARNING: duplicate object description of 
airflow.providers.google.common.hooks.discovery_api.GoogleDiscoveryApiHook._paginate_api,
 other instance in 
_api/airflow/providers/google/common/hooks/discovery_api/index, use :noindex: 
for one of them
   
/opt/airflow/docs/_api/airflow/providers/google/cloud/hooks/discovery_api/index.rst:87:
 WARNING: duplicate object description of 
airflow.providers.google.common.hooks.discovery_api.GoogleDiscoveryApiHook._build_next_api_request,
 other instance in 
_api/airflow/providers/google/common/hooks/discovery_api/index, use :noindex: 
for one of them
   
/opt/airflow/docs/_api/airflow/providers/google/common/hooks/discovery_api/index.rst:4:
 WARNING: duplicate object description of 
airflow.providers.google.common.hooks.discovery_api, other instance in 
_api/airflow/providers/google/cloud/hooks/discovery_api/index, use :noindex: 
for one of them
   
/opt/airflow/docs/_api/airflow/providers/google/common/hooks/discovery_api/index.rst:15:
 WARNING: duplicate object description of 
airflow.providers.google.common.hooks.discovery_api.GoogleDiscoveryApiHook, 
other instance in 
_api/airflow/providers/google/cloud/hooks/discovery_api/index, use :noindex: 
for one of them
   
/opt/airflow/docs/_api/airflow/providers/google/common/hooks/discovery_api/index.rst:33:
 WARNING: duplicate object description of 
airflow.providers.google.common.hooks.discovery_api.GoogleDiscoveryApiHook._conn,
 other instance in 
_api/airflow/providers/google/cloud/hooks/discovery_api/index, use :noindex: 
for one of them
   
/opt/airflow/docs/_api/airflow/providers/google/common/hooks/discovery_api/index.rst:39:
 WARNING: duplicate object 

[GitHub] [airflow] zikun edited a comment on issue #9033: entrypoint cannot enter bash shell

2020-05-27 Thread GitBox


zikun edited a comment on issue #9033:
URL: https://github.com/apache/airflow/issues/9033#issuecomment-634600902


   Before I make a PR to fix it, can someone confirm entering bash shell is the 
expected behaviour?







[GitHub] [airflow] mik-laj commented on issue #9034: Fix GoogleDiscoveryApiHook docs warnings

2020-05-27 Thread GitBox


mik-laj commented on issue #9034:
URL: https://github.com/apache/airflow/issues/9034#issuecomment-634606513


   Can you remove the _api and _build directories? That should solve this problem.
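
   The suggested cleanup can be scripted; a minimal sketch, assuming the
   generated directories live directly under the docs root:

   ```python
   import shutil
   from pathlib import Path

   def clean_generated_docs(docs_root: str) -> list:
       """Remove Sphinx output directories so stale autoapi pages
       (the source of the duplicate-object warnings) are regenerated."""
       removed = []
       for name in ("_api", "_build"):
           target = Path(docs_root) / name
           if target.exists():
               shutil.rmtree(target)
               removed.append(name)
       return removed
   ```

   After deleting both directories, a fresh `./breeze build-docs` run
   regenerates the pages from scratch.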







[GitHub] [airflow] feluelle commented on a change in pull request #8962: [AIRFLOW-8057] [AIP-31] Add @task decorator

2020-05-27 Thread GitBox


feluelle commented on a change in pull request #8962:
URL: https://github.com/apache/airflow/pull/8962#discussion_r431035377



##
File path: airflow/operators/python.py
##
@@ -145,6 +147,141 @@ def execute_callable(self):
 return self.python_callable(*self.op_args, **self.op_kwargs)
 
 
+class _PythonFunctionalOperator(BaseOperator):
+"""
+Wraps a Python callable and captures args/kwargs when called for execution.
+
+:param python_callable: A reference to an object that is callable
+:type python_callable: python callable
+:param multiple_outputs: if set, function return value will be
+unrolled to multiple XCom values. List/Tuples will unroll to xcom values
+with index as key. Dict will unroll to xcom values with keys as keys.
+Defaults to False.
+:type multiple_outputs: bool
+"""
+
+template_fields = ('_op_args', '_op_kwargs')
+ui_color = '#ffefeb'
+
+# since we won't mutate the arguments, we should just do the shallow copy
+# there are some cases we can't deepcopy the objects(e.g protobuf).
+shallow_copy_attrs = ('python_callable',)
+
+@apply_defaults
+def __init__(
+self,
+python_callable: Callable,
+multiple_outputs: bool = False,
+*args,
+**kwargs
+) -> None:
+# Check if we need to generate a new task_id
+task_id = kwargs.get('task_id', None)
+dag = kwargs.get('dag', None) or DagContext.get_current_dag()
+if task_id and dag and task_id in dag.task_ids:
+prefix = task_id.rsplit("__", 1)[0]
+task_id = sorted(
+filter(lambda x: x.startswith(prefix), dag.task_ids),
+reverse=True
+)[0]
+num = int(task_id[-1] if '__' in task_id else '0') + 1
+kwargs['task_id'] = f'{prefix}__{num}'
+
+if not kwargs.get('do_xcom_push', True) and not multiple_outputs:
+raise AirflowException('@task needs to have either do_xcom_push=True or '
+   'multiple_outputs=True.')
+if not callable(python_callable):
+raise AirflowException('`python_callable` param must be callable')
+self._fail_if_method(python_callable)
+super().__init__(*args, **kwargs)
+self.python_callable = python_callable
+self.multiple_outputs = multiple_outputs
+self._kwargs = kwargs

Review comment:
   Yes, I think that would be better. +1, for dropping it.
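
   The collision handling quoted above can be sketched standalone. This is an
   illustrative version, not the PR's code: it parses the numeric suffix with
   `rsplit` rather than the quoted last-character lookup, which would stop
   working past `__9`:

   ```python
   def dedupe_task_id(task_id: str, existing_ids) -> str:
       """Return task_id unchanged if free; otherwise append or bump a
       double-underscore numeric suffix (t -> t__1 -> t__2 ...)."""
       if task_id not in existing_ids:
           return task_id
       prefix = task_id.rsplit("__", 1)[0]
       # Highest existing id sharing the prefix sorts last lexicographically.
       latest = sorted(
           (t for t in existing_ids if t.startswith(prefix)),
           reverse=True,
       )[0]
       num = int(latest.rsplit("__", 1)[1]) if "__" in latest else 0
       return f"{prefix}__{num + 1}"
   ```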

##
File path: docs/concepts.rst
##
@@ -116,6 +116,47 @@ DAGs can be used as context managers to automatically 
assign new operators to th
 
 op.dag is dag # True
 
+.. _concepts:functional_dags:
+
+Functional DAGs
+---------------
+*Added in Airflow 1.10.11*
+
+DAGs can be defined using functional abstractions. Outputs and inputs are sent 
between tasks using
+:ref:`XComs ` values. In addition, you can wrap functions as 
tasks using the
+:ref:`task decorator `. Dependencies are 
automatically inferred from
+the message dependencies.
+
+Example DAG with functional abstraction
+
+.. code:: python
+
+  with DAG(
+  'send_server_ip', default_args=default_args, schedule_interval=None
+  ) as dag:
+
+# Using default connection as it's set to httpbin.org by default
+get_ip = SimpleHttpOperator(
+task_id='get_ip', endpoint='get', method='GET', xcom_push=True
+)
+
+@dag.task(multiple_outputs=True)
+def prepare_email(raw_json: str) -> str:
+  external_ip = json.loads(raw_json)['origin']
+  return {
+'subject':f'Server connected from {external_ip}',
+'body': f'Seems like today your server executing Airflow is connected from the external IP {external_ip}'

Review comment:
   So if we want this in 1.10.11 we need to be very careful :/

##
File path: airflow/operators/python.py
##
@@ -145,6 +147,141 @@ def execute_callable(self):
 return self.python_callable(*self.op_args, **self.op_kwargs)
 
 
+class _PythonFunctionalOperator(BaseOperator):
+"""
+Wraps a Python callable and captures args/kwargs when called for execution.
+
+:param python_callable: A reference to an object that is callable
+:type python_callable: python callable
+:param multiple_outputs: if set, function return value will be
+unrolled to multiple XCom values. List/Tuples will unroll to xcom values
+with index as key. Dict will unroll to xcom values with keys as keys.
+Defaults to False.
+:type multiple_outputs: bool
+"""
+
+template_fields = ('_op_args', '_op_kwargs')
+ui_color = '#ffefeb'
+
+# since we won't mutate the arguments, we should just do the shallow copy
+# there are some cases we can't deepcopy the objects(e.g protobuf).
+shallow_copy_attrs = ('python_callable',)
+
+@apply_defaults
+def __init__(
+self,
+python_callable: Callable,
+multiple_outputs: 

[GitHub] [airflow] joppevos opened a new pull request #9037: Create guide for Dataproc Operators

2020-05-27 Thread GitBox


joppevos opened a new pull request #9037:
URL: https://github.com/apache/airflow/pull/9037


   A simple guide based on the example_dataproc file.  Addressing this 
[issue](https://github.com/apache/airflow/issues/8203)
   
   # qs
   - @mik-laj All the examples we have of job configurations are added in the 
guide. Maybe it is too much, but I assumed people will probably just search 
through the page. 
   - Is an explanation of the arguments needed? The operators themselves are already richly documented, so I kept it out for now.
   ---
   Make sure to mark the boxes below before creating PR: [x]
   
   - [x] Description above provides context of the change
   - [x] Unit tests coverage for changes (not needed for documentation changes)
   - [x] Target Github ISSUE in description if exists
   - [x] Commits follow "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)"
   - [x] Relevant documentation is updated including usage instructions.
   - [x] I will engage committers as explained in [Contribution Workflow 
Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example).
   
   ---
   In case of fundamental code change, Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals))
 is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party 
License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in 
[UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
   Read the [Pull Request 
Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines)
 for more information.
   







[GitHub] [airflow] ashb merged pull request #6311: [AIRFLOW-5615] Reduce duplicated logic around job heartbeating

2020-05-27 Thread GitBox


ashb merged pull request #6311:
URL: https://github.com/apache/airflow/pull/6311


   







[airflow] branch master updated (30b12a9 -> 8ac90b0)

2020-05-27 Thread ash
This is an automated email from the ASF dual-hosted git repository.

ash pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/airflow.git.


from 30b12a9  Filter dags by clicking on tag (#8897)
 add 8ac90b0  [AIRFLOW-5615] Reduce duplicated logic around job 
heartbeating (#6311)

No new revisions were added by this update.

Summary of changes:
 airflow/jobs/base_job.py  | 15 ---
 airflow/jobs/scheduler_job.py | 10 +-
 tests/conftest.py | 38 ++
 tests/jobs/test_base_job.py   | 21 -
 4 files changed, 71 insertions(+), 13 deletions(-)



[jira] [Commented] (AIRFLOW-5615) BaseJob subclasses shouldn't implement own heartbeat logic

2020-05-27 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17117663#comment-17117663
 ] 

ASF GitHub Bot commented on AIRFLOW-5615:
-

ashb merged pull request #6311:
URL: https://github.com/apache/airflow/pull/6311


   





> BaseJob subclasses shouldn't implement own heartbeat logic
> --
>
> Key: AIRFLOW-5615
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5615
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.10.5
>Reporter: Ash Berlin-Taylor
>Assignee: Ash Berlin-Taylor
>Priority: Trivial
> Fix For: 2.0.0
>
>
> Both SchedulerJob and LocalTaskJob have their own timers and decide when to 
> call heartbeat based upon that.
> That logic should be removed and live in BaseJob to simplify the code.
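
An illustrative model of that consolidation (not the actual `BaseJob` API):
the base class owns the timer, so subclasses just call `heartbeat()` and
never track elapsed time themselves.

```python
class BaseJob:
    """Owns the heartbeat cadence so subclasses don't duplicate timers."""

    def __init__(self, heartrate: float):
        self.heartrate = heartrate
        self._last_heartbeat = 0.0
        self.heartbeats = 0

    def heartbeat(self, now: float) -> bool:
        """Record a heartbeat only if `heartrate` seconds have elapsed."""
        if now - self._last_heartbeat >= self.heartrate:
            self._last_heartbeat = now
            self.heartbeats += 1
            return True
        return False
```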



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [airflow] boring-cyborg[bot] commented on issue #9033: entrypoint cannot enter bash shell

2020-05-27 Thread GitBox


boring-cyborg[bot] commented on issue #9033:
URL: https://github.com/apache/airflow/issues/9033#issuecomment-634600571


   Thanks for opening your first issue here! Be sure to follow the issue 
template!
   







[GitHub] [airflow] zikun commented on issue #9033: entrypoint cannot enter bash shell

2020-05-27 Thread GitBox


zikun commented on issue #9033:
URL: https://github.com/apache/airflow/issues/9033#issuecomment-634600902


   Before I make a PR to fix it, can someone confirm this is the expected 
behaviour?







[GitHub] [airflow] zikun opened a new issue #9033: entrypoint cannot enter bash shell

2020-05-27 Thread GitBox


zikun opened a new issue #9033:
URL: https://github.com/apache/airflow/issues/9033


   
   
   
   
   **Apache Airflow version**: 1.10.10 / master
   
   
   **Kubernetes version (if you are using kubernetes)** (use `kubectl version`):
   
   **Environment**:
   Production Docker image
   
   **What happened**:
   
   `docker run --rm -it apache/airflow:1.10.10 bash`
   
   Output:
   
   airflow command error: argument subcommand: invalid choice: 'bash' (choose 
from 'backfill', 'list_dag_runs', 'list_tasks', 'clear', 'pause', 'unpause', 
'trigger_dag', 'delete_dag', 'show_dag', 'pool', 'variables', 'kerberos', 
'render', 'run', 'initdb', 'list_dags', 'dag_state', 'task_failed_deps', 
'task_state', 'serve_logs', 'test', 'webserver', 'resetdb', 'upgradedb', 
'checkdb', 'shell', 'scheduler', 'worker', 'flower', 'version', 'connections', 
'create_user', 'delete_user', 'list_users', 'sync_perm', 'next_execution', 
'rotate_fernet_key'), see help above.
   
   **What you expected to happen**:
   
   Enter bash shell
   `airflow@044b3772:/opt/airflow$`
   
   
   **How to reproduce it**:
   `docker run --rm -it apache/airflow:1.10.10 bash`
   
   
   
   **Anything else we need to know**:
   
   







[GitHub] [airflow] mik-laj commented on pull request #9023: Add snowflake to slack operator

2020-05-27 Thread GitBox


mik-laj commented on pull request #9023:
URL: https://github.com/apache/airflow/pull/9023#issuecomment-634600193


   @ad-m We have temporary problems with Kubernetes tests running on Travis CI. This change does not apply to Kubernetes, so it is safe. @dimberman is working on improvements for Kubernetes tests.
   







[GitHub] [airflow] potiuk commented on issue #9033: entrypoint cannot enter bash shell

2020-05-27 Thread GitBox


potiuk commented on issue #9033:
URL: https://github.com/apache/airflow/issues/9033#issuecomment-634630189


   For 1.10.10 it was like that but it's been fixed in master and will be 
cherry-picked for 1.10.11
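
   The fix amounts to dispatch logic in the image entrypoint: prepend
   `airflow` only for known subcommands and run anything else (such as
   `bash`) directly. A minimal Python model of that decision — the real
   entrypoint is a shell script, and the subcommand set here is an
   abbreviated, hypothetical sample:

   ```python
   # Abbreviated sample, not the full list the image actually accepts.
   AIRFLOW_SUBCOMMANDS = {"webserver", "scheduler", "worker", "initdb", "version"}

   def build_command(args):
       """Prepend `airflow` only for known subcommands; pass anything
       else (e.g. `bash`) through unchanged so it is exec'd directly."""
       if args and args[0] in AIRFLOW_SUBCOMMANDS:
           return ["airflow"] + list(args)
       return list(args)
   ```

   With this dispatch, `docker run ... bash` drops into a shell instead of
   being rejected by the `airflow` argument parser.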







[GitHub] [airflow] potiuk closed issue #9033: entrypoint cannot enter bash shell

2020-05-27 Thread GitBox


potiuk closed issue #9033:
URL: https://github.com/apache/airflow/issues/9033


   







[GitHub] [airflow] mik-laj commented on a change in pull request #8721: Add OpenAPI specification (II)

2020-05-27 Thread GitBox


mik-laj commented on a change in pull request #8721:
URL: https://github.com/apache/airflow/pull/8721#discussion_r431176549



##
File path: openapi.yaml
##
@@ -0,0 +1,2427 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+---
+openapi: 3.0.3
+
+info:
+  title: "Airflow API (Stable)"
+  description: Apache Airflow management API.
+  version: '1.0.0'
+  license:
+name: Apache 2.0
+url: http://www.apache.org/licenses/LICENSE-2.0.html
+  contact:
+name: Apache Foundation
+url: https://airflow.apache.org
+email: d...@airflow.apache.org
+
+servers:
+  - url: /api/v1
+description: Airfow Stable API.
+
+paths:
+  # Database entities
+  /connections:
+get:
+  summary: Get all connection entries
+  operationId: getConnections
+  tags: [Connection]
+  parameters:
+- $ref: '#/components/parameters/PageLimit'
+- $ref: '#/components/parameters/PageOffset'
+  responses:
+'200':
+  description: List of connection entry.
+  content:
+application/json:
+  schema:
+allOf:
+  - $ref: '#/components/schemas/ConnectionCollection'
+  - $ref: '#/components/schemas/CollectionInfo'
+'401':
+  $ref: '#/components/responses/Unauthenticated'
+'403':
+  $ref: '#/components/responses/PermissionDenied'
+
+post:
+  summary: Create connection entry
+  operationId: createConnection
+  tags: [Connection]
+  requestBody:
+required: true
+content:
+  application/json:
+schema:
+  $ref: '#/components/schemas/Connection'
+  responses:
+'200':
+  description: Successful response.
+  content:
+application/json:
+  schema:
+$ref: '#/components/schemas/Connection'
+'400':
+  $ref: '#/components/responses/BadRequest'
+'401':
+  $ref: '#/components/responses/Unauthenticated'
+'403':
+  $ref: '#/components/responses/PermissionDenied'
+
+  /connections/{connection_id}:
+parameters:
+  - $ref: '#/components/parameters/ConnectionID'
+
+get:
+  summary: Get a connection entry
+  operationId: getConnection
+  tags: [Connection]
+  responses:
+'200':
+  description: Successful response.
+  content:
+application/json:
+  schema:
+$ref: '#/components/schemas/Connection'
+'401':
+  $ref: '#/components/responses/Unauthenticated'
+'403':
+  $ref: '#/components/responses/PermissionDenied'
+'404':
+  $ref: '#/components/responses/NotFound'
+
+patch:
+  summary: Update a connection entry
+  operationId: updateConnection
+  tags: [Connection]
+  parameters:
+- $ref: '#/components/parameters/UpdateMask'
+  requestBody:
+required: true
+content:
+  application/json:
+schema:
+  $ref: '#/components/schemas/Connection'
+
+  responses:
+'200':
+  description: Successful response.
+  content:
+application/json:
+  schema:
+$ref: '#/components/schemas/Connection'
+'400':
+  $ref: '#/components/responses/BadRequest'
+'401':
+  $ref: '#/components/responses/Unauthenticated'
+'403':
+  $ref: '#/components/responses/PermissionDenied'
+'404':
+  $ref: '#/components/responses/NotFound'
+
+delete:
+  summary: Delete a connection entry
+  operationId: deleteConnection
+  tags: [Connection]
+  responses:
+'204':
+  description: No content.
+'400':
+  $ref: '#/components/responses/BadRequest'
+'401':
+  $ref: '#/components/responses/Unauthenticated'
+'403':
+  $ref: '#/components/responses/PermissionDenied'
+
+  /dags:
+get:
+  summary: Get all DAGs
+  operationId: getDags
+  tags: [DAG]
+  parameters:
+- $ref: '#/components/parameters/PageLimit'
+- $ref: '#/components/parameters/PageOffset'
+ 

[GitHub] [airflow] BasPH commented on a change in pull request #8721: Add OpenAPI specification (II)

2020-05-27 Thread GitBox


BasPH commented on a change in pull request #8721:
URL: https://github.com/apache/airflow/pull/8721#discussion_r431044194



##
File path: openapi.yaml
##
@@ -0,0 +1,2427 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+---
+openapi: 3.0.3
+
+info:
+  title: "Airflow API (Stable)"
+  description: Apache Airflow management API.
+  version: '1.0.0'
+  license:
+name: Apache 2.0
+url: http://www.apache.org/licenses/LICENSE-2.0.html
+  contact:
+name: Apache Foundation
+url: https://airflow.apache.org
+email: d...@airflow.apache.org
+
+servers:
+  - url: /api/v1
+description: Airfow Stable API.

Review comment:
   ```suggestion
   description: Airfow Stable API.
   ```
   ```suggestion
   description: Airflow Stable API.
   ```









[GitHub] [airflow] potiuk commented on issue #8872: Add dependencies on build image

2020-05-27 Thread GitBox


potiuk commented on issue #8872:
URL: https://github.com/apache/airflow/issues/8872#issuecomment-634664427


   Yeah, I see the point. I also think maybe we should not do onbuild if we can easily add build args to supplement that.







[GitHub] [airflow] mik-laj commented on a change in pull request #8721: Add OpenAPI specification (II)

2020-05-27 Thread GitBox


mik-laj commented on a change in pull request #8721:
URL: https://github.com/apache/airflow/pull/8721#discussion_r431205011



##
File path: openapi.yaml
##
@@ -0,0 +1,2411 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+---
+openapi: 3.0.3
+
+info:
+  title: "Airflow API (Stable)"
+  description: Apache Airflow management API.
+  version: '1.0.0'
+  license:
+name: Apache 2.0
+url: http://www.apache.org/licenses/LICENSE-2.0.html
+  contact:
+name: Apache Foundation
+url: https://airflow.apache.org
+email: d...@airflow.apache.org
+
+servers:
+  - url: /api/v1
+description: Airfow Stable API.
+
+paths:
+  # Database entities
+  /connections:
+get:
+  summary: Get all connection entries
+  operationId: getConnections
+  tags: [Connection]
+  parameters:
+- $ref: '#/components/parameters/PageLimit'
+- $ref: '#/components/parameters/PageOffset'
+  responses:
+'200':
+  description: List of connection entry.
+  content:
+application/json:
+  schema:
+allOf:
+  - $ref: '#/components/schemas/ConnectionCollection'
+  - $ref: '#/components/schemas/CollectionInfo'
+'401':
+  $ref: '#/components/responses/Unauthenticated'
+'403':
+  $ref: '#/components/responses/PermissionDenied'
+
+post:
+  summary: Create connection entry
+  operationId: createConnection
+  tags: [Connection]
+  requestBody:
+required: true
+content:
+  application/json:
+schema:
+  $ref: '#/components/schemas/Connection'
+  responses:
+'200':
+  description: Successful response.
+  content:
+application/json:
+  schema:
+$ref: '#/components/schemas/Connection'
+'400':
+  $ref: '#/components/responses/BadRequest'
+'401':
+  $ref: '#/components/responses/Unauthenticated'
+'403':
+  $ref: '#/components/responses/PermissionDenied'
+
+  /connections/{connection_id}:
+parameters:
+  - $ref: '#/components/parameters/ConnectionID'
+
+get:
+  summary: Get a connection entry
+  operationId: getConnection
+  tags: [Connection]
+  responses:
+'200':
+  description: Successful response.
+  content:
+application/json:
+  schema:
+$ref: '#/components/schemas/Connection'
+'401':
+  $ref: '#/components/responses/Unauthenticated'
+'403':
+  $ref: '#/components/responses/PermissionDenied'
+'404':
+  $ref: '#/components/responses/NotFound'
+
+patch:
+  summary: Update a connection entry
+  operationId: updaateConnection
+  tags: [Connection]
+  parameters:
+- $ref: '#/components/parameters/UpdateMask'
+  requestBody:
+required: true
+content:
+  application/json:
+schema:
+  $ref: '#/components/schemas/Connection'
+
+  responses:
+'200':
+  description: Successful response.
+  content:
+application/json:
+  schema:
+$ref: '#/components/schemas/Connection'
+'400':
+  $ref: '#/components/responses/BadRequest'
+'401':
+  $ref: '#/components/responses/Unauthenticated'
+'403':
+  $ref: '#/components/responses/PermissionDenied'
+'404':
+  $ref: '#/components/responses/NotFound'
+
+delete:
+  summary: Delete a connection entry
+  operationId: deleteConnection
+  tags: [Connection]
+  responses:
+'204':
+  description: No content.
+'400':
+  $ref: '#/components/responses/BadRequest'
+'401':
+  $ref: '#/components/responses/Unauthenticated'
+'403':
+  $ref: '#/components/responses/PermissionDenied'
+
+  /dags:
+get:
+  summary: Get all DAGs
+  operationId: getDags
+  tags: [DAG]
+  parameters:
+- $ref: '#/components/parameters/PageLimit'
+- $ref: '#/components/parameters/PageOffset'
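In the spec excerpt above, the list response uses `allOf` to merge the collection schema with `CollectionInfo`. A paginated body seen by a client could therefore look roughly like the sketch below; the concrete field names (`connections`, `total_entries`, the connection attributes) are assumptions for illustration, not taken from the excerpt.

```python
import json

# Hypothetical paginated response combining a ConnectionCollection with
# CollectionInfo-style metadata; all field names are illustrative assumptions.
response = {
    "connections": [
        {"connection_id": "postgres_default", "conn_type": "postgres"},
    ],
    "total_entries": 1,  # CollectionInfo-style pagination metadata
}

# Round-trip through JSON as an API client would see it.
print(json.dumps(response, sort_keys=True))
```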

[GitHub] [airflow] kaxil commented on a change in pull request #8227: Add run_type to DagRun

2020-05-27 Thread GitBox


kaxil commented on a change in pull request #8227:
URL: https://github.com/apache/airflow/pull/8227#discussion_r431137238



##
File path: airflow/models/dag.py
##
@@ -1468,14 +1472,26 @@ def create_dagrun(self,
 :param session: database session
 :type session: sqlalchemy.orm.session.Session
 """
+if run_id:
+if not isinstance(run_id, str):
+raise ValueError(f"`run_id` expected to be a str is {type(run_id)}")
+run_type: DagRunType = DagRunType.from_run_id(run_id)
+elif run_type and execution_date:
+run_id = DagRun.generate_run_id(run_type, execution_date)
+elif not run_id:
+raise AirflowException(
+"Creating DagRun needs either `run_id` or `run_type` and `execution_date`"
+)
+
 run = DagRun(
 dag_id=self.dag_id,
 run_id=run_id,
 execution_date=execution_date,
 start_date=start_date,
 external_trigger=external_trigger,
 conf=conf,
-state=state
+state=state,
+run_type=run_type.value,

Review comment:
   Does this mean `run_type` can only be passed an enum?
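For orientation, the dispatch in the diff above can be sketched as a standalone function. `DagRunType` here is a hypothetical two-member stand-in for the PR's enum, and the `<type>__<date>` run-id prefix convention is an assumption, not confirmed by the diff.

```python
from datetime import datetime
from enum import Enum


class DagRunType(Enum):
    # Illustrative subset of the PR's enum; values assumed.
    MANUAL = "manual"
    SCHEDULED = "scheduled"


def resolve_run(run_id=None, run_type=None, execution_date=None):
    # Mirrors the dispatch in the diff above: an explicit run_id wins and its
    # type is derived from the id's prefix; otherwise run_type plus
    # execution_date generate the id; anything else is an error.
    if run_id:
        if not isinstance(run_id, str):
            raise ValueError(f"`run_id` expected to be a str is {type(run_id)}")
        run_type = next(
            (t for t in DagRunType if run_id.startswith(f"{t.value}__")),
            DagRunType.MANUAL,
        )
    elif run_type and execution_date:
        run_id = f"{run_type.value}__{execution_date.isoformat()}"
    else:
        raise RuntimeError(
            "Creating DagRun needs either `run_id` or `run_type` and `execution_date`"
        )
    return run_id, run_type


print(resolve_run(run_type=DagRunType.SCHEDULED,
                  execution_date=datetime(2020, 5, 27)))
```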





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] mik-laj commented on a change in pull request #8721: Add OpenAPI specification (II)

2020-05-27 Thread GitBox


mik-laj commented on a change in pull request #8721:
URL: https://github.com/apache/airflow/pull/8721#discussion_r431187909



##
File path: openapi.yaml
##
@@ -0,0 +1,2427 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+---
+openapi: 3.0.3
+
+info:
+  title: "Airflow API (Stable)"
+  description: Apache Airflow management API.
+  version: '1.0.0'
+  license:
+name: Apache 2.0
+url: http://www.apache.org/licenses/LICENSE-2.0.html
+  contact:
+name: Apache Foundation

Review comment:
   Airflow PMC and committers are not a legal entity. It seems to me that it's better to provide the name of an entity that is easy to find. However, I am not sure either.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] mik-laj commented on a change in pull request #8721: Add OpenAPI specification (II)

2020-05-27 Thread GitBox


mik-laj commented on a change in pull request #8721:
URL: https://github.com/apache/airflow/pull/8721#discussion_r431235391



##
File path: openapi.yaml
##
@@ -0,0 +1,2411 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+---
+openapi: 3.0.3
+
+info:
+  title: "Airflow API (Stable)"
+  description: Apache Airflow management API.
+  version: '1.0.0'
+  license:
+name: Apache 2.0
+url: http://www.apache.org/licenses/LICENSE-2.0.html
+  contact:
+name: Apache Foundation
+url: https://airflow.apache.org
+email: d...@airflow.apache.org
+
+servers:
+  - url: /api/v1
+description: Airflow Stable API.
+
+paths:
+  # Database entities
+  /connections:
+get:
+  summary: Get all connection entries
+  operationId: getConnections
+  tags: [Connection]
+  parameters:
+- $ref: '#/components/parameters/PageLimit'
+- $ref: '#/components/parameters/PageOffset'
+  responses:
+'200':
+  description: List of connection entries.
+  content:
+application/json:
+  schema:
+allOf:
+  - $ref: '#/components/schemas/ConnectionCollection'
+  - $ref: '#/components/schemas/CollectionInfo'
+'401':
+  $ref: '#/components/responses/Unauthenticated'
+'403':
+  $ref: '#/components/responses/PermissionDenied'
+
+post:
+  summary: Create connection entry
+  operationId: createConnection
+  tags: [Connection]
+  requestBody:
+required: true
+content:
+  application/json:
+schema:
+  $ref: '#/components/schemas/Connection'
+  responses:
+'200':
+  description: Successful response.
+  content:
+application/json:
+  schema:
+$ref: '#/components/schemas/Connection'
+'400':
+  $ref: '#/components/responses/BadRequest'
+'401':
+  $ref: '#/components/responses/Unauthenticated'
+'403':
+  $ref: '#/components/responses/PermissionDenied'
+
+  /connections/{connection_id}:
+parameters:
+  - $ref: '#/components/parameters/ConnectionID'
+
+get:
+  summary: Get a connection entry
+  operationId: getConnection
+  tags: [Connection]
+  responses:
+'200':
+  description: Successful response.
+  content:
+application/json:
+  schema:
+$ref: '#/components/schemas/Connection'
+'401':
+  $ref: '#/components/responses/Unauthenticated'
+'403':
+  $ref: '#/components/responses/PermissionDenied'
+'404':
+  $ref: '#/components/responses/NotFound'
+
+patch:
+  summary: Update a connection entry
+  operationId: updateConnection
+  tags: [Connection]
+  parameters:
+- $ref: '#/components/parameters/UpdateMask'
+  requestBody:
+required: true
+content:
+  application/json:
+schema:
+  $ref: '#/components/schemas/Connection'
+
+  responses:
+'200':
+  description: Successful response.
+  content:
+application/json:
+  schema:
+$ref: '#/components/schemas/Connection'
+'400':
+  $ref: '#/components/responses/BadRequest'
+'401':
+  $ref: '#/components/responses/Unauthenticated'
+'403':
+  $ref: '#/components/responses/PermissionDenied'
+'404':
+  $ref: '#/components/responses/NotFound'
+
+delete:
+  summary: Delete a connection entry
+  operationId: deleteConnection
+  tags: [Connection]
+  responses:
+'204':
+  description: No content.
+'400':
+  $ref: '#/components/responses/BadRequest'
+'401':
+  $ref: '#/components/responses/Unauthenticated'
+'403':
+  $ref: '#/components/responses/PermissionDenied'
+
+  /dags:
+get:
+  summary: Get all DAGs
+  operationId: getDags
+  tags: [DAG]
+  parameters:
+- $ref: '#/components/parameters/PageLimit'
+- $ref: '#/components/parameters/PageOffset'

[GitHub] [airflow] mik-laj commented on a change in pull request #8721: Add OpenAPI specification (II)

2020-05-27 Thread GitBox


mik-laj commented on a change in pull request #8721:
URL: https://github.com/apache/airflow/pull/8721#discussion_r431234943



##
File path: openapi.yaml
##
@@ -0,0 +1,2427 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+---
+openapi: 3.0.3
+
+info:
+  title: "Airflow API (Stable)"
+  description: Apache Airflow management API.
+  version: '1.0.0'
+  license:
+name: Apache 2.0
+url: http://www.apache.org/licenses/LICENSE-2.0.html
+  contact:
+name: Apache Foundation
+url: https://airflow.apache.org
+email: d...@airflow.apache.org
+
+servers:
+  - url: /api/v1
+description: Airflow Stable API.
+
+paths:
+  # Database entities
+  /connections:
+get:
+  summary: Get all connection entries
+  operationId: getConnections
+  tags: [Connection]
+  parameters:
+- $ref: '#/components/parameters/PageLimit'
+- $ref: '#/components/parameters/PageOffset'
+  responses:
+'200':
+  description: List of connection entries.
+  content:
+application/json:
+  schema:
+allOf:
+  - $ref: '#/components/schemas/ConnectionCollection'
+  - $ref: '#/components/schemas/CollectionInfo'
+'401':
+  $ref: '#/components/responses/Unauthenticated'
+'403':
+  $ref: '#/components/responses/PermissionDenied'
+
+post:
+  summary: Create connection entry
+  operationId: createConnection
+  tags: [Connection]
+  requestBody:
+required: true
+content:
+  application/json:
+schema:
+  $ref: '#/components/schemas/Connection'
+  responses:
+'200':
+  description: Successful response.
+  content:
+application/json:
+  schema:
+$ref: '#/components/schemas/Connection'
+'400':
+  $ref: '#/components/responses/BadRequest'
+'401':
+  $ref: '#/components/responses/Unauthenticated'
+'403':
+  $ref: '#/components/responses/PermissionDenied'
+
+  /connections/{connection_id}:
+parameters:
+  - $ref: '#/components/parameters/ConnectionID'
+
+get:
+  summary: Get a connection entry
+  operationId: getConnection
+  tags: [Connection]
+  responses:
+'200':
+  description: Successful response.
+  content:
+application/json:
+  schema:
+$ref: '#/components/schemas/Connection'
+'401':
+  $ref: '#/components/responses/Unauthenticated'
+'403':
+  $ref: '#/components/responses/PermissionDenied'
+'404':
+  $ref: '#/components/responses/NotFound'
+
+patch:
+  summary: Update a connection entry
+  operationId: updateConnection
+  tags: [Connection]
+  parameters:
+- $ref: '#/components/parameters/UpdateMask'
+  requestBody:
+required: true
+content:
+  application/json:
+schema:
+  $ref: '#/components/schemas/Connection'
+
+  responses:
+'200':
+  description: Successful response.
+  content:
+application/json:
+  schema:
+$ref: '#/components/schemas/Connection'
+'400':
+  $ref: '#/components/responses/BadRequest'
+'401':
+  $ref: '#/components/responses/Unauthenticated'
+'403':
+  $ref: '#/components/responses/PermissionDenied'
+'404':
+  $ref: '#/components/responses/NotFound'
+
+delete:
+  summary: Delete a connection entry
+  operationId: deleteConnection
+  tags: [Connection]
+  responses:
+'204':
+  description: No content.
+'400':
+  $ref: '#/components/responses/BadRequest'
+'401':
+  $ref: '#/components/responses/Unauthenticated'
+'403':
+  $ref: '#/components/responses/PermissionDenied'
+
+  /dags:
+get:
+  summary: Get all DAGs
+  operationId: getDags
+  tags: [DAG]
+  parameters:
+- $ref: '#/components/parameters/PageLimit'
+- $ref: '#/components/parameters/PageOffset'
+ 

[GitHub] [airflow] dimberman commented on a change in pull request #9038: Use production image for k8s tests

2020-05-27 Thread GitBox


dimberman commented on a change in pull request #9038:
URL: https://github.com/apache/airflow/pull/9038#discussion_r431297732



##
File path: scripts/ci/in_container/kubernetes/docker/rebuild_airflow_image.sh
##
@@ -38,46 +38,25 @@ cp /entrypoint.sh scripts/docker/
 echo
 echo "Building image from ${AIRFLOW_CI_IMAGE} with latest sources"
 echo
-start_output_heartbeat "Rebuilding Kubernetes image" 3
-docker build \
---build-arg PYTHON_BASE_IMAGE="${PYTHON_BASE_IMAGE}" \
---build-arg PYTHON_MAJOR_MINOR_VERSION="${PYTHON_MAJOR_MINOR_VERSION}" \
---build-arg AIRFLOW_VERSION="${AIRFLOW_VERSION}" \
---build-arg AIRFLOW_EXTRAS="${AIRFLOW_EXTRAS}" \
---build-arg ADDITIONAL_AIRFLOW_EXTRAS="${ADDITIONAL_AIRFLOW_EXTRAS}" \
---build-arg ADDITIONAL_PYTHON_DEPS="${ADDITIONAL_PYTHON_DEPS}" \
---build-arg AIRFLOW_BRANCH="${AIRFLOW_BRANCH}" \
---build-arg AIRFLOW_CONTAINER_CI_OPTIMISED_BUILD="${AIRFLOW_CONTAINER_CI_OPTIMISED_BUILD}" \
---build-arg UPGRADE_TO_LATEST_REQUIREMENTS="${UPGRADE_TO_LATEST_REQUIREMENTS}" \
---build-arg HOME="${HOME}" \
---cache-from "${AIRFLOW_CI_IMAGE}" \
---tag="${AIRFLOW_CI_IMAGE}" \
---target="main" \
--f Dockerfile.ci . >> "${OUTPUT_LOG}"
-echo
+#export AIRFLOW_PROD_BASE_TAG="${DEFAULT_BRANCH}-python${PYTHON_MAJOR_MINOR_VERSION}"

Review comment:
   @potiuk can you please point me to how I can initialize these env 
variables. I can't find it and there are a lot of bash scripts to navigate.
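To make the commented-out export above concrete: the tag is composed from a branch name and a Python version taken from the environment. Where exactly CI initializes those variables is the open question in this comment; the sketch below only shows the composition itself, and the values are placeholders, not the project's real defaults.

```shell
# Sketch only: in CI these variables come from the ci/breeze scripts;
# placeholder values here, not the project's real defaults.
DEFAULT_BRANCH="master"
PYTHON_MAJOR_MINOR_VERSION="3.6"
export AIRFLOW_PROD_BASE_TAG="${DEFAULT_BRANCH}-python${PYTHON_MAJOR_MINOR_VERSION}"
echo "${AIRFLOW_PROD_BASE_TAG}"
```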





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] turbaszek commented on a change in pull request #9037: Create guide for Dataproc Operators

2020-05-27 Thread GitBox


turbaszek commented on a change in pull request #9037:
URL: https://github.com/apache/airflow/pull/9037#discussion_r431308938



##
File path: docs/howto/operator/gcp/dataproc.rst
##
@@ -0,0 +1,186 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+
+Google Cloud Dataproc Operators
+===
+
+Dataproc is a managed Apache Spark and Apache Hadoop service that lets you
+take advantage of open source data tools for batch processing, querying, streaming and machine learning.
+Dataproc automation helps you create clusters quickly, manage them easily, and
+save money by turning clusters off when you don't need them.

Review comment:
   Would you mind adding a link to the Dataproc website / API docs?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] turbaszek commented on a change in pull request #9037: Create guide for Dataproc Operators

2020-05-27 Thread GitBox


turbaszek commented on a change in pull request #9037:
URL: https://github.com/apache/airflow/pull/9037#discussion_r431310411



##
File path: docs/howto/operator/gcp/dataproc.rst
##
@@ -0,0 +1,186 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+
+Google Cloud Dataproc Operators
+===
+
+Dataproc is a managed Apache Spark and Apache Hadoop service that lets you
+take advantage of open source data tools for batch processing, querying, streaming and machine learning.
+Dataproc automation helps you create clusters quickly, manage them easily, and
+save money by turning clusters off when you don't need them.

Review comment:
   Oh, I see it's below in the references, but personally I would put it in both places.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] alexbegg commented on issue #8712: docker image does not include 'pymssql'

2020-05-27 Thread GitBox


alexbegg commented on issue #8712:
URL: https://github.com/apache/airflow/issues/8712#issuecomment-634841172


   pymssql is installed with the airflow extra "mssql". It is not pre-installed 
with the docker image apache/airflow:1.10.10. I am currently installing it as a 
separate pip install of `apache-airflow[mssql]==1.10.10` in my helm chart and I 
have no issue using the MsSqlHook.
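As a small sanity check (not from the thread itself), one can probe whether the `mssql` extra's dependency actually made it into an image before relying on the hook; this sketch does nothing Airflow-specific.

```python
import importlib.util


def has_pymssql() -> bool:
    # True only if the `mssql` extra's dependency is importable in this
    # environment; makes no database connection.
    return importlib.util.find_spec("pymssql") is not None


print(has_pymssql())
```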



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] kaxil commented on a change in pull request #8227: Add run_type to DagRun

2020-05-27 Thread GitBox


kaxil commented on a change in pull request #8227:
URL: https://github.com/apache/airflow/pull/8227#discussion_r431236903



##
File path: airflow/models/dagrun.py
##
@@ -54,25 +54,27 @@ class DagRun(Base, LoggingMixin):
 _state = Column('state', String(50), default=State.RUNNING)
 run_id = Column(String(ID_LEN))
 external_trigger = Column(Boolean, default=True)
+run_type = Column(String(50), nullable=True)
 conf = Column(PickleType)
 
 dag = None
 
 __table_args__ = (
-Index('dag_id_state', dag_id, _state),
+Index('dag_id_state_type', dag_id, _state, run_type),

Review comment:
   Maybe not directly but here:
   
   
https://github.com/apache/airflow/blob/738667082d32d3ef93ec2cd6c3735ff3691ba1cc/airflow/jobs/scheduler_job.py#L567-L575
   
   





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] optimuspaul commented on issue #8281: CLI trigger_dag with json_client enabled raises TypeError: Object of type Pendulum is not JSON serializable

2020-05-27 Thread GitBox


optimuspaul commented on issue #8281:
URL: https://github.com/apache/airflow/issues/8281#issuecomment-634746271


   https://github.com/apache/airflow/pull/8384



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] turbaszek commented on a change in pull request #8227: Add run_type to DagRun

2020-05-27 Thread GitBox


turbaszek commented on a change in pull request #8227:
URL: https://github.com/apache/airflow/pull/8227#discussion_r431311590



##
File path: airflow/models/dag.py
##
@@ -1468,14 +1472,26 @@ def create_dagrun(self,
 :param session: database session
 :type session: sqlalchemy.orm.session.Session
 """
+if run_id:
+if not isinstance(run_id, str):
+raise ValueError(f"`run_id` expected to be a str is {type(run_id)}")
+run_type: DagRunType = DagRunType.from_run_id(run_id)
+elif run_type and execution_date:
+run_id = DagRun.generate_run_id(run_type, execution_date)
+elif not run_id:
+raise AirflowException(
+"Creating DagRun needs either `run_id` or `run_type` and `execution_date`"
+)
+
 run = DagRun(
 dag_id=self.dag_id,
 run_id=run_id,
 execution_date=execution_date,
 start_date=start_date,
 external_trigger=external_trigger,
 conf=conf,
-state=state
+state=state,
+run_type=run_type.value,

Review comment:
   Yes. I think it's a good idea because the run types are limited to the 
`DagRunType`. Do you have any concerns about that? 
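A minimal sketch of the enum-only contract being discussed (illustrative names and values, not the PR's exact code): callers pass the enum, while the column stores its plain string value.

```python
from enum import Enum


class DagRunType(Enum):
    # Illustrative subset; the real enum lives in Airflow's codebase.
    MANUAL = "manual"
    SCHEDULED = "scheduled"


def store_run_type(run_type: DagRunType) -> str:
    # Callers pass the enum; the column stores run_type.value (a plain
    # string), so DB-side comparisons use e.g. DagRunType.SCHEDULED.value.
    if not isinstance(run_type, DagRunType):
        raise TypeError(f"run_type must be a DagRunType, got {type(run_type)}")
    return run_type.value


print(store_run_type(DagRunType.SCHEDULED))
```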





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (AIRFLOW-6569) Broken sentry integration

2020-05-27 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-6569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17117645#comment-17117645
 ] 

ASF GitHub Bot commented on AIRFLOW-6569:
-

ad-m commented on pull request #7232:
URL: https://github.com/apache/airflow/pull/7232#issuecomment-634583299


   @kaxil , thank you for information and add to milestone "Airflow 1.10.11".



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Broken sentry integration
> -
>
> Key: AIRFLOW-6569
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6569
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: configuration, hooks
>Affects Versions: 2.0.0, 1.10.7
>Reporter: Robin Edwards
>Priority: Minor
>
> I believe the new forking mechanism AIRFLOW-5931 has unintentionally broken 
> the sentry integration.
> Sentry relies on the atexit handler 
> (http://man7.org/linux/man-pages/man3/atexit.3.html) to flush collected 
> errors to their servers. Previously, as the task was executed in a new 
> process as opposed to being forked, this got invoked. However, now 
> os._exit() is called (which is semantically correct for child processes) 
> https://docs.python.org/3/library/os.html#os._exit
> Point os._exit is called in airflow:
> https://github.com/apache/airflow/pull/6627/files#diff-736081a3535ff0b9e60ada2f51154ca4R84
> Also related on sentry bug tracker: 
> https://github.com/getsentry/sentry-python/issues/291
> Unfortunately sentry doesn't provide (from what I can find) a public 
> interface for flushing errors to their system. Their init() function 
> returns an object containing a client, but the property is `_client`, so 
> it would be wrong to rely on it.
> I've side-stepped this in two ways: you can disable the forking feature 
> by patching CAN_FORK to False, but after seeing the performance 
> improvement on my workers I opted to monkey patch the whole _exec_by_fork() 
> and naughtily call sys.exit instead as a temporary fix.
> I personally don't find the actual sentry integration in airflow useful, as 
> it doesn't collect errors from the rest of the system, only tasks. I've been 
> wiring it in through my log config module since before the integration was 
> added; however, it's still affected by the above change.
> My personal vote (unless anyone has a better idea) would be to remove the 
> integration completely, document the way of setting it up through the 
> logging class, and provide a 'post_execute' hook of some form on the 
> StandardTaskRunner where people can flush errors or whatnot.
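The atexit-versus-os._exit behaviour described in the issue can be demonstrated with the standard library alone; the printing handler below is only a stand-in for Sentry's real flush.

```python
import subprocess
import sys

# A stand-in for Sentry's flush: an atexit handler that prints a marker.
CHILD_SYS_EXIT = (
    "import atexit, sys; "
    "atexit.register(lambda: print('flushed')); sys.exit(0)"
)
CHILD_OS_EXIT = (
    "import atexit, os; "
    "atexit.register(lambda: print('flushed')); os._exit(0)"
)


def runs_atexit(snippet: str) -> bool:
    # Run the snippet in a fresh interpreter and check whether the atexit
    # handler got a chance to run before the process died.
    proc = subprocess.run([sys.executable, "-c", snippet],
                          capture_output=True, text=True)
    return "flushed" in proc.stdout


print(runs_atexit(CHILD_SYS_EXIT))  # normal exit: handler runs
print(runs_atexit(CHILD_OS_EXIT))   # os._exit(): handler is skipped
```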



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (AIRFLOW-6569) Broken sentry integration

2020-05-27 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-6569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17117639#comment-17117639
 ] 

ASF GitHub Bot commented on AIRFLOW-6569:
-

kaxil commented on pull request #7232:
URL: https://github.com/apache/airflow/pull/7232#issuecomment-634580331


   Yes can be included in 1.10.11 



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Broken sentry integration
> -
>
> Key: AIRFLOW-6569
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6569
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: configuration, hooks
>Affects Versions: 2.0.0, 1.10.7
>Reporter: Robin Edwards
>Priority: Minor
>
> I believe the new forking mechanism AIRFLOW-5931 has unintentionally broken 
> the sentry integration.
> Sentry relies on the atexit handler 
> (http://man7.org/linux/man-pages/man3/atexit.3.html) to flush collected 
> errors to their servers. Previously, as the task was executed in a new 
> process as opposed to being forked, this got invoked. However, now 
> os._exit() is called (which is semantically correct for child processes) 
> https://docs.python.org/3/library/os.html#os._exit
> Point os._exit is called in airflow:
> https://github.com/apache/airflow/pull/6627/files#diff-736081a3535ff0b9e60ada2f51154ca4R84
> Also related on sentry bug tracker: 
> https://github.com/getsentry/sentry-python/issues/291
> Unfortunately sentry doesn't provide (from what I can find) a public 
> interface for flushing errors to their system. Their init() function 
> returns an object containing a client, but the property is `_client`, so 
> it would be wrong to rely on it.
> I've side-stepped this in two ways: you can disable the forking feature 
> by patching CAN_FORK to False, but after seeing the performance 
> improvement on my workers I opted to monkey patch the whole _exec_by_fork() 
> and naughtily call sys.exit instead as a temporary fix.
> I personally don't find the actual sentry integration in airflow useful, as 
> it doesn't collect errors from the rest of the system, only tasks. I've been 
> wiring it in through my log config module since before the integration was 
> added; however, it's still affected by the above change.
> My personal vote (unless anyone has a better idea) would be to remove the 
> integration completely, document the way of setting it up through the 
> logging class, and provide a 'post_execute' hook of some form on the 
> StandardTaskRunner where people can flush errors or whatnot.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [airflow] kaxil commented on a change in pull request #8227: Add run_type to DagRun

2020-05-27 Thread GitBox


kaxil commented on a change in pull request #8227:
URL: https://github.com/apache/airflow/pull/8227#discussion_r431140537



##
File path: airflow/models/dagrun.py
##
@@ -54,25 +54,27 @@ class DagRun(Base, LoggingMixin):
 _state = Column('state', String(50), default=State.RUNNING)
 run_id = Column(String(ID_LEN))
 external_trigger = Column(Boolean, default=True)
+run_type = Column(String(50), nullable=True)
 conf = Column(PickleType)
 
 dag = None
 
 __table_args__ = (
-Index('dag_id_state', dag_id, _state),
+Index('dag_id_state_type', dag_id, _state, run_type),

Review comment:
   ```suggestion
   Index('dag_id_state', dag_id, _state),
   Index('dag_run_type', run_type),
   ```
   
   OR
   
   ```
   Index('dag_id_state', dag_id, _state),
   Index('dag_id_run_type', dag_id, run_type),
   ```
   
   Maybe two indexes, as most of the time we are filtering based on DagRunType and DagId.
   
   WDYT? @turbaszek @ashb @mik-laj 
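The two-index layout suggested above can be sketched in plain SQL; sqlite is used here purely for illustration (the real change would go through a SQLAlchemy model and an Alembic migration, and the column set is trimmed down).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE dag_run (
        run_id   TEXT PRIMARY KEY,
        dag_id   TEXT,
        state    TEXT,
        run_type TEXT
    )
""")
# Two narrower indexes, per the suggestion, instead of a single
# (dag_id, state, run_type) composite index:
conn.execute("CREATE INDEX dag_id_state ON dag_run (dag_id, state)")
conn.execute("CREATE INDEX dag_id_run_type ON dag_run (dag_id, run_type)")

# List the user-created indexes (sqlite auto-indexes are filtered out).
names = [row[1] for row in conn.execute("PRAGMA index_list('dag_run')")]
print(sorted(n for n in names if not n.startswith("sqlite_")))
```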





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] mik-laj commented on a change in pull request #8721: Add OpenAPI specification (II)

2020-05-27 Thread GitBox


mik-laj commented on a change in pull request #8721:
URL: https://github.com/apache/airflow/pull/8721#discussion_r431196294



##
File path: openapi.yaml
##
@@ -0,0 +1,2427 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+---
+openapi: 3.0.3
+
+info:
+  title: "Airflow API (Stable)"
+  description: Apache Airflow management API.
+  version: '1.0.0'
+  license:
+name: Apache 2.0
+url: http://www.apache.org/licenses/LICENSE-2.0.html
+  contact:
+name: Apache Foundation
+url: https://airflow.apache.org
+email: d...@airflow.apache.org
+
+servers:
+  - url: /api/v1
+description: Airflow Stable API.
+
+paths:
+  # Database entities
+  /connections:
+get:
+  summary: Get all connection entries
+  operationId: getConnections
+  tags: [Connection]
+  parameters:
+- $ref: '#/components/parameters/PageLimit'
+- $ref: '#/components/parameters/PageOffset'
+  responses:
+'200':
+  description: List of connection entries.
+  content:
+application/json:
+  schema:
+allOf:
+  - $ref: '#/components/schemas/ConnectionCollection'
+  - $ref: '#/components/schemas/CollectionInfo'
+'401':
+  $ref: '#/components/responses/Unauthenticated'
+'403':
+  $ref: '#/components/responses/PermissionDenied'
+
+post:
+  summary: Create connection entry
+  operationId: createConnection
+  tags: [Connection]
+  requestBody:
+required: true
+content:
+  application/json:
+schema:
+  $ref: '#/components/schemas/Connection'
+  responses:
+'200':
+  description: Successful response.
+  content:
+application/json:
+  schema:
+$ref: '#/components/schemas/Connection'
+'400':
+  $ref: '#/components/responses/BadRequest'
+'401':
+  $ref: '#/components/responses/Unauthenticated'
+'403':
+  $ref: '#/components/responses/PermissionDenied'
+
+  /connections/{connection_id}:
+parameters:
+  - $ref: '#/components/parameters/ConnectionID'
+
+get:
+  summary: Get a connection entry
+  operationId: getConnection
+  tags: [Connection]
+  responses:
+'200':
+  description: Successful response.
+  content:
+application/json:
+  schema:
+$ref: '#/components/schemas/Connection'
+'401':
+  $ref: '#/components/responses/Unauthenticated'
+'403':
+  $ref: '#/components/responses/PermissionDenied'
+'404':
+  $ref: '#/components/responses/NotFound'
+
+patch:
+  summary: Update a connection entry
+  operationId: updateConnection
+  tags: [Connection]
+  parameters:
+- $ref: '#/components/parameters/UpdateMask'
+  requestBody:
+required: true
+content:
+  application/json:
+schema:
+  $ref: '#/components/schemas/Connection'
+
+  responses:
+'200':
+  description: Successful response.
+  content:
+application/json:
+  schema:
+$ref: '#/components/schemas/Connection'
+'400':
+  $ref: '#/components/responses/BadRequest'
+'401':
+  $ref: '#/components/responses/Unauthenticated'
+'403':
+  $ref: '#/components/responses/PermissionDenied'
+'404':
+  $ref: '#/components/responses/NotFound'
+
+delete:
+  summary: Delete a connection entry
+  operationId: deleteConnection
+  tags: [Connection]
+  responses:
+'204':
+  description: No content.
+'400':
+  $ref: '#/components/responses/BadRequest'
+'401':
+  $ref: '#/components/responses/Unauthenticated'
+'403':
+  $ref: '#/components/responses/PermissionDenied'
+
+  /dags:
+get:
+  summary: Get all DAGs
+  operationId: getDags
+  tags: [DAG]
+  parameters:
+- $ref: '#/components/parameters/PageLimit'
+- $ref: '#/components/parameters/PageOffset'
+ 
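
The `PageLimit`/`PageOffset` parameters quoted above describe plain offset pagination, with `CollectionInfo` carrying collection metadata alongside the page. A rough illustrative sketch of that contract (the helper and field names here are hypothetical, not the API implementation):

```python
# Sketch only: offset pagination as implied by PageLimit/PageOffset,
# returning CollectionInfo-style metadata next to the page items.
def paginate(items, limit=25, offset=0):
    """Return one page of items plus the total count of all entries."""
    return {
        "items": items[offset:offset + limit],
        "total_entries": len(items),
    }

page = paginate(list(range(100)), limit=10, offset=20)
print(page["items"][0], page["items"][-1], page["total_entries"])  # 20 29 100
```

An out-of-range offset simply yields an empty page, which is the usual convention for this style of API.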

[GitHub] [airflow] mik-laj commented on a change in pull request #8721: Add OpenAPI specification (II)

2020-05-27 Thread GitBox


mik-laj commented on a change in pull request #8721:
URL: https://github.com/apache/airflow/pull/8721#discussion_r431230679



##
File path: openapi.yaml
##
@@ -0,0 +1,2427 @@

[GitHub] [airflow] turbaszek commented on issue #8903: BigQueryHook refactor + deterministic BQ Job ID

2020-05-27 Thread GitBox


turbaszek commented on issue #8903:
URL: https://github.com/apache/airflow/issues/8903#issuecomment-634816078


   Summoning @edejong to hear his opinion :)







[GitHub] [airflow] turbaszek commented on a change in pull request #8962: [AIRFLOW-8057] [AIP-31] Add @task decorator

2020-05-27 Thread GitBox


turbaszek commented on a change in pull request #8962:
URL: https://github.com/apache/airflow/pull/8962#discussion_r431318258



##
File path: docs/concepts.rst
##
@@ -116,6 +116,47 @@ DAGs can be used as context managers to automatically assign new operators to th
 
 op.dag is dag # True
 
+.. _concepts:functional_dags:
+
+Functional DAGs
+---
+*Added in Airflow 1.10.11*
+
+DAGs can be defined using functional abstractions. Outputs and inputs are sent between tasks using
+:ref:`XComs ` values. In addition, you can wrap functions as tasks using the
+:ref:`task decorator `. Dependencies are automatically inferred from
+the message dependencies.
+
+Example DAG with functional abstraction
+
+.. code:: python
+
+  with DAG(
+  'send_server_ip', default_args=default_args, schedule_interval=None
+  ) as dag:
+
+# Using default connection as it's set to httpbin.org by default
+get_ip = SimpleHttpOperator(
+task_id='get_ip', endpoint='get', method='GET', xcom_push=True
+)
+
+@dag.task(multiple_outputs=True)
+def prepare_email(raw_json: str) -> str:
+  external_ip = json.loads(raw_json)['origin']
+  return {
+'subject':f'Server connected from {external_ip}',
+'body': f'Seems like today your server executing Airflow is connected from the external IP {external_ip}'

Review comment:
   The question is do we want to backport it?









[GitHub] [airflow] potiuk commented on pull request #8974: Vault has now VaultHook not only SecretBackend

2020-05-27 Thread GitBox


potiuk commented on pull request #8974:
URL: https://github.com/apache/airflow/pull/8974#issuecomment-634822700


   Hey @kaxil -> I split the tests into separate client_vault (more detailed) and hook_vault tests (focused on "green-path" + reading parameters from connections). I think this way it's much better, as the client gets tested separately from the hook.
   
   Pls take a look.







[GitHub] [airflow] kaxil commented on a change in pull request #8227: Add run_type to DagRun

2020-05-27 Thread GitBox


kaxil commented on a change in pull request #8227:
URL: https://github.com/apache/airflow/pull/8227#discussion_r431236903



##
File path: airflow/models/dagrun.py
##
@@ -54,25 +54,27 @@ class DagRun(Base, LoggingMixin):
 _state = Column('state', String(50), default=State.RUNNING)
 run_id = Column(String(ID_LEN))
 external_trigger = Column(Boolean, default=True)
+run_type = Column(String(50), nullable=True)
 conf = Column(PickleType)
 
 dag = None
 
 __table_args__ = (
-Index('dag_id_state', dag_id, _state),
+Index('dag_id_state_type', dag_id, _state, run_type),

Review comment:
   Maybe not using an `and_`, but at the following places:
   
   
https://github.com/apache/airflow/blob/738667082d32d3ef93ec2cd6c3735ff3691ba1cc/airflow/jobs/scheduler_job.py#L567-L575
   
   
https://github.com/apache/airflow/blob/738667082d32d3ef93ec2cd6c3735ff3691ba1cc/airflow/models/dagrun.py#L159-L162









[GitHub] [airflow] mdediana commented on issue #8649: Add support for more than 1 cron exp per DAG

2020-05-27 Thread GitBox


mdediana commented on issue #8649:
URL: https://github.com/apache/airflow/issues/8649#issuecomment-634780827


   @mik-laj Sure, I will do that, thanks.







[GitHub] [airflow] potiuk commented on pull request #9039: Updated missing parameters for docker image building

2020-05-27 Thread GitBox


potiuk commented on pull request #9039:
URL: https://github.com/apache/airflow/pull/9039#issuecomment-634834608


   cc @wittfabian 







[GitHub] [airflow] potiuk opened a new pull request #9039: Updated missing parameters for docker image building

2020-05-27 Thread GitBox


potiuk opened a new pull request #9039:
URL: https://github.com/apache/airflow/pull/9039


   ---
   Make sure to mark the boxes below before creating PR: [x]
   
   - [x] Description above provides context of the change
   - [x] Unit tests coverage for changes (not needed for documentation changes)
   - [x] Target Github ISSUE in description if exists
   - [x] Commits follow "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)"
   - [x] Relevant documentation is updated including usage instructions.
   - [x] I will engage committers as explained in [Contribution Workflow Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example).
   
   ---
   In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
   Read the [Pull Request Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines) for more information.
   







[airflow] branch master updated (8ac90b0 -> 7386670)

2020-05-27 Thread potiuk
This is an automated email from the ASF dual-hosted git repository.

potiuk pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/airflow.git.


from 8ac90b0  [AIRFLOW-5615] Reduce duplicated logic around job heartbeating (#6311)
 add 7386670  Additional python extras and deps can be set in breeze (#9035)

No new revisions were added by this update.

Summary of changes:
 BREEZE.rst | 12 
 Dockerfile |  4 ++--
 Dockerfile.ci  |  3 ++-
 breeze | 14 ++
 breeze-complete|  1 +
 scripts/ci/_utils.sh   | 12 +++-
 .../kubernetes/docker/rebuild_airflow_image.sh |  2 ++
 7 files changed, 44 insertions(+), 4 deletions(-)



[GitHub] [airflow] potiuk merged pull request #9035: Additional python extras and deps can be set in breeze

2020-05-27 Thread GitBox


potiuk merged pull request #9035:
URL: https://github.com/apache/airflow/pull/9035


   







[GitHub] [airflow] potiuk closed issue #8604: Add ADDITIONAL_PYTHON_DEPS to the image

2020-05-27 Thread GitBox


potiuk closed issue #8604:
URL: https://github.com/apache/airflow/issues/8604


   







[GitHub] [airflow] potiuk closed issue #8866: Additional extras

2020-05-27 Thread GitBox


potiuk closed issue #8866:
URL: https://github.com/apache/airflow/issues/8866


   







[GitHub] [airflow] kaxil commented on a change in pull request #8227: Add run_type to DagRun

2020-05-27 Thread GitBox


kaxil commented on a change in pull request #8227:
URL: https://github.com/apache/airflow/pull/8227#discussion_r431236903



##
File path: airflow/models/dagrun.py
##
@@ -54,25 +54,27 @@ class DagRun(Base, LoggingMixin):
 _state = Column('state', String(50), default=State.RUNNING)
 run_id = Column(String(ID_LEN))
 external_trigger = Column(Boolean, default=True)
+run_type = Column(String(50), nullable=True)
 conf = Column(PickleType)
 
 dag = None
 
 __table_args__ = (
-Index('dag_id_state', dag_id, _state),
+Index('dag_id_state_type', dag_id, _state, run_type),

Review comment:
   
https://github.com/apache/airflow/blob/738667082d32d3ef93ec2cd6c3735ff3691ba1cc/airflow/jobs/scheduler_job.py#L567-L575









[GitHub] [airflow] samuelkhtu commented on pull request #8942: #8525 Add SQL Branch Operator

2020-05-27 Thread GitBox


samuelkhtu commented on pull request #8942:
URL: https://github.com/apache/airflow/pull/8942#issuecomment-634753579


   > @samuelkhtu we can do it in a separate PR. This will allow us to build a better git history.
   
   Thanks @mik-laj and @eladkal, can someone help and approve this PR? I am ready to move on to the next issue. Thanks.
   
   







[GitHub] [airflow] kaxil commented on a change in pull request #8227: Add run_type to DagRun

2020-05-27 Thread GitBox


kaxil commented on a change in pull request #8227:
URL: https://github.com/apache/airflow/pull/8227#discussion_r431247683



##
File path: airflow/models/dagrun.py
##
@@ -54,25 +54,27 @@ class DagRun(Base, LoggingMixin):
 _state = Column('state', String(50), default=State.RUNNING)
 run_id = Column(String(ID_LEN))
 external_trigger = Column(Boolean, default=True)
+run_type = Column(String(50), nullable=True)
 conf = Column(PickleType)
 
 dag = None
 
 __table_args__ = (
-Index('dag_id_state', dag_id, _state),
+Index('dag_id_state_type', dag_id, _state, run_type),

Review comment:
   ```
   Index('dag_run_type', run_type),
   ```
   
   This alone should help
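
A quick way to see why the extra single-column index helps: a composite index can only serve queries that constrain its leading columns, so filters on `run_type` alone need their own index. A sketch using SQLite's query planner (the table and index names mirror the discussion; everything else is illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dag_run (dag_id TEXT, state TEXT, run_type TEXT)")
# Composite index: usable only when the leading column(s) are filtered.
conn.execute("CREATE INDEX dag_id_state_type ON dag_run (dag_id, state, run_type)")
# Single-column index: serves queries filtering on run_type alone.
conn.execute("CREATE INDEX dag_run_type ON dag_run (run_type)")

plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM dag_run WHERE run_type = 'backfill'"
).fetchall()
print(plan)  # the plan mentions index dag_run_type, not the composite one
```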









[GitHub] [airflow] themantalope commented on issue #8649: Add support for more than 1 cron exp per DAG

2020-05-27 Thread GitBox


themantalope commented on issue #8649:
URL: https://github.com/apache/airflow/issues/8649#issuecomment-634766517


   @mik-laj 
   
   I would recommend that the user be allowed to supply a list of cron strings, or cron strings with comma separation. I would then implement an object that has internal logic like [this](https://github.com/kiorky/croniter/pull/23#issuecomment-555828306) implementation of scheduling with multiple `croniter` objects. The object should also have a `get_next()` function similar to the one [currently used by the `DAG` object (see `following` implementation)](https://github.com/apache/airflow/blob/738667082d32d3ef93ec2cd6c3735ff3691ba1cc/airflow/models/dag.py#L488). If just one cron string is supplied, then the DAG uses the `croniter` object as is currently implemented.
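
The core of that suggestion — merging several monotone schedules behind a single `get_next()`-style interface — can be sketched with stdlib generators. Toy interval schedules stand in for real `croniter` iterators here; all names are hypothetical:

```python
import heapq
from datetime import datetime, timedelta

def merge_schedules(*schedules):
    """Lazily merge monotonically increasing schedule iterators,
    dropping duplicate fire times (two cron expressions may match
    the same instant)."""
    heap = [(next(it), idx, it) for idx, it in enumerate(schedules)]
    heapq.heapify(heap)
    last = None
    while heap:
        when, idx, it = heapq.heappop(heap)
        heapq.heappush(heap, (next(it), idx, it))
        if when != last:
            yield when
            last = when

def every(hours, start):
    """Toy stand-in for a croniter: fire every N hours after start."""
    t = start
    while True:
        t += timedelta(hours=hours)
        yield t

start = datetime(2020, 5, 27)
merged = merge_schedules(every(2, start), every(3, start))
print([next(merged).hour for _ in range(5)])  # [2, 3, 4, 6, 8]
```

The 06:00 slot, matched by both schedules, is emitted only once — the behavior a multi-cron DAG schedule would need.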







[GitHub] [airflow] kaxil commented on a change in pull request #8227: Add run_type to DagRun

2020-05-27 Thread GitBox


kaxil commented on a change in pull request #8227:
URL: https://github.com/apache/airflow/pull/8227#discussion_r431313499



##
File path: airflow/models/dag.py
##
@@ -1468,14 +1472,26 @@ def create_dagrun(self,
 :param session: database session
 :type session: sqlalchemy.orm.session.Session
 """
+if run_id:
+if not isinstance(run_id, str):
+raise ValueError(f"`run_id` expected to be a str is 
{type(run_id)}")
+run_type: DagRunType = DagRunType.from_run_id(run_id)
+elif run_type and execution_date:
+run_id = DagRun.generate_run_id(run_type, execution_date)
+elif not run_id:
+raise AirflowException(
+"Creating DagRun needs either `run_id` or `run_type` and 
`execution_date`"
+)
+
 run = DagRun(
 dag_id=self.dag_id,
 run_id=run_id,
 execution_date=execution_date,
 start_date=start_date,
 external_trigger=external_trigger,
 conf=conf,
-state=state
+state=state,
+run_type=run_type.value,

Review comment:
   If a string is passed to `run_type`, this will fail with a "not so helpful" error, e.g. "string type has no attribute value".
   
   We should probably add a check to ensure that `run_type` is of the Enum type.
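
One way such a check could look (a sketch only — the enum values and helper name are illustrative, not the PR's actual code):

```python
from enum import Enum

class DagRunType(Enum):
    SCHEDULED = "scheduled"
    MANUAL = "manual"

def coerce_run_type(run_type):
    """Accept a DagRunType member or its string value; raise a helpful
    error for anything else, instead of the opaque attribute error a
    plain `run_type.value` access would produce on a str."""
    if isinstance(run_type, DagRunType):
        return run_type
    if isinstance(run_type, str):
        return DagRunType(run_type)  # raises ValueError for unknown values
    raise TypeError(
        f"run_type must be DagRunType or str, got {type(run_type).__name__}"
    )

print(coerce_run_type("manual"))  # DagRunType.MANUAL
```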
   









[GitHub] [airflow] mik-laj commented on a change in pull request #8721: Add OpenAPI specification (II)

2020-05-27 Thread GitBox


mik-laj commented on a change in pull request #8721:
URL: https://github.com/apache/airflow/pull/8721#discussion_r431234581



##
File path: openapi.yaml
##
@@ -0,0 +1,2427 @@

[GitHub] [airflow] mik-laj commented on a change in pull request #8721: Add OpenAPI specification (II)

2020-05-27 Thread GitBox


mik-laj commented on a change in pull request #8721:
URL: https://github.com/apache/airflow/pull/8721#discussion_r431233852



##
File path: openapi.yaml
##
@@ -0,0 +1,2427 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+---
+openapi: 3.0.3
+
+info:
+  title: "Airflow API (Stable)"
+  description: Apache Airflow management API.
+  version: '1.0.0'
+  license:
+name: Apache 2.0
+url: http://www.apache.org/licenses/LICENSE-2.0.html
+  contact:
+name: Apache Foundation

Review comment:
   Related commit: https://github.com/apache/airflow/pull/8721/commits/abef5b999da5e75d81e488228bda51bdd5827ccf









[GitHub] [airflow] mik-laj commented on a change in pull request #8721: Add OpenAPI specification (II)

2020-05-27 Thread GitBox


mik-laj commented on a change in pull request #8721:
URL: https://github.com/apache/airflow/pull/8721#discussion_r431234192



##
File path: openapi.yaml
##
@@ -0,0 +1,2427 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+---
+openapi: 3.0.3
+
+info:
+  title: "Airflow API (Stable)"
+  description: Apache Airflow management API.
+  version: '1.0.0'
+  license:
+name: Apache 2.0
+url: http://www.apache.org/licenses/LICENSE-2.0.html
+  contact:
+name: Apache Foundation
+url: https://airflow.apache.org
+email: d...@airflow.apache.org
+
+servers:
+  - url: /api/v1
+description: Airfow Stable API.

Review comment:
   Fixed. Thanks.
   
https://github.com/apache/airflow/pull/8721/commits/abef5b999da5e75d81e488228bda51bdd5827ccf









[GitHub] [airflow] kaxil commented on a change in pull request #8227: Add run_type to DagRun

2020-05-27 Thread GitBox


kaxil commented on a change in pull request #8227:
URL: https://github.com/apache/airflow/pull/8227#discussion_r431236903



##
File path: airflow/models/dagrun.py
##
@@ -54,25 +54,27 @@ class DagRun(Base, LoggingMixin):
 _state = Column('state', String(50), default=State.RUNNING)
 run_id = Column(String(ID_LEN))
 external_trigger = Column(Boolean, default=True)
+run_type = Column(String(50), nullable=True)
 conf = Column(PickleType)
 
 dag = None
 
 __table_args__ = (
-Index('dag_id_state', dag_id, _state),
+Index('dag_id_state_type', dag_id, _state, run_type),

Review comment:
   Maybe not using an `and_` but here:
   
   
https://github.com/apache/airflow/blob/738667082d32d3ef93ec2cd6c3735ff3691ba1cc/airflow/jobs/scheduler_job.py#L567-L575
   
   

##
File path: airflow/models/dagrun.py
##
@@ -54,25 +54,27 @@ class DagRun(Base, LoggingMixin):
 _state = Column('state', String(50), default=State.RUNNING)
 run_id = Column(String(ID_LEN))
 external_trigger = Column(Boolean, default=True)
+run_type = Column(String(50), nullable=True)
 conf = Column(PickleType)
 
 dag = None
 
 __table_args__ = (
-Index('dag_id_state', dag_id, _state),
+Index('dag_id_state_type', dag_id, _state, run_type),

Review comment:
   Maybe not using an `and_` but here:
   
   
https://github.com/apache/airflow/blob/738667082d32d3ef93ec2cd6c3735ff3691ba1cc/airflow/jobs/scheduler_job.py#L567-L575
   
   









[GitHub] [airflow] turbaszek commented on a change in pull request #9037: Create guide for Dataproc Operators

2020-05-27 Thread GitBox


turbaszek commented on a change in pull request #9037:
URL: https://github.com/apache/airflow/pull/9037#discussion_r431309657



##
File path: docs/howto/operator/gcp/dataproc.rst
##
@@ -0,0 +1,186 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+
+Google Cloud Dataproc Operators
+===
+
+Dataproc is a managed Apache Spark and Apache Hadoop service that lets you
+take advantage of open source data tools for batch processing, querying, 
streaming and machine learning.
+Dataproc automation helps you create clusters quickly, manage them easily, and
+save money by turning clusters off when you don't need them.
+
+
+.. contents::
+  :depth: 1
+  :local:
+
+Prerequisite Tasks
+--
+
+.. include:: _partials/prerequisite_tasks.rst
+
+
+.. _howto/operator:DataprocCreateClusterOperator:
+
+Create a Cluster
+
+
+Before you create a dataproc cluster you need to define the cluster.
+It describes the identifying information, config, and status of a cluster of 
Compute Engine instances.
+
+A cluster configuration can look as follows:
+
+.. exampleinclude:: 
../../../../airflow/providers/google/cloud/example_dags/example_dataproc.py
+:language: python
+:dedent: 4
+:start-after: [START how_to_cloud_dataproc_create_cluster]
+:end-before: [END how_to_cloud_dataproc_create_cluster]
+
+With this configuration we can create the cluster:
+:class:`~airflow.providers.google.cloud.operators.dataproc.DataprocCreateClusterOperator`
+
+.. exampleinclude:: 
../../../../airflow/providers/google/cloud/example_dags/example_dataproc.py
+:language: python
+:dedent: 4
+:start-after: [START how_to_cloud_dataproc_create_cluster_operator]
+:end-before: [END how_to_cloud_dataproc_create_cluster_operator]
+
+Update a cluster
+
+You can scale the cluster up or down by providing a cluster config and an 
updateMask.
+In the updateMask argument you specify the path, relative to Cluster, of the 
field to update.
+For more information on updateMask and other parameters take a look at 
`Dataproc update cluster API. 
`__

Review comment:
   Can you add link API docs with  cluster object definition in create 
cluster section?
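
   As a rough illustration of the config/updateMask pairing described in the quoted section (a hedged sketch; the field paths follow the Dataproc v1 `Cluster` schema and are my assumption, not content taken from the guide under review):

   ```python
   # Illustrative only: a cluster fragment and an updateMask for scaling the
   # worker pools; field paths are assumptions based on the Dataproc v1 API.
   cluster_update = {
       "config": {
           "worker_config": {"num_instances": 5},
           "secondary_worker_config": {"num_instances": 3},
       }
   }
   # updateMask paths are relative to Cluster, one per field being changed
   update_mask = {
       "paths": [
           "config.worker_config.num_instances",
           "config.secondary_worker_config.num_instances",
       ]
   }
   # every masked path must resolve to a value inside the cluster fragment
   for path in update_mask["paths"]:
       node = cluster_update
       for key in path.split("."):
           node = node[key]
       print(path, "->", node)
   ```

   Only the fields named in the mask are changed; everything else in the cluster is left as-is.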





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (AIRFLOW-6395) [AIP-28] Add AsyncExecutor option

2020-05-27 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-6395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17118237#comment-17118237
 ] 

ASF GitHub Bot commented on AIRFLOW-6395:
-

dazza-codes commented on pull request #6984:
URL: https://github.com/apache/airflow/pull/6984#issuecomment-635054207


   For various reasons beyond my control, the complexity of Airflow seemed to 
be a hurdle for various ops and other personnel on a team at work.  In short, I 
don't have time to follow through on this in a timely manner, since alternative 
simpler options have proved to be useful.  See various details in the Apache2 
code at
   - https://gitlab.com/dazza-codes/aio-aws
   
   +1 for anyone who wants to beg-borrow-or-steal anything to move this along, 
if it's a worthy direction for Airflow.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> [AIP-28] Add AsyncExecutor option
> -
>
> Key: AIRFLOW-6395
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6395
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: core, executors, operators, scheduler
>Affects Versions: 1.10.7
>Reporter: Darren Weber
>Priority: Minor
>
> Add an AsyncExecutor that is similar to LocalExecutor but designed to 
> optimize for high concurrency with async behavior for any blocking 
> operations.  It requires an async ecosystem and general flags for async 
> operations on hooks, operators, and sensors.
> Further details can be developed in an AIP and this description can be 
> updated with links to relevant resources and discussion(s).
> - 
> [https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-28%3A+Add+AsyncExecutor+option]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [airflow] dazza-codes commented on pull request #6984: [WIP][AIRFLOW-6395] Add an asyncio compatible dask executor [AIP-28]

2020-05-27 Thread GitBox


dazza-codes commented on pull request #6984:
URL: https://github.com/apache/airflow/pull/6984#issuecomment-635054207


   For various reasons beyond my control, the complexity of Airflow seemed to 
be a hurdle for various ops and other personnel on a team at work.  In short, I 
don't have time to follow through on this in a timely manner, since alternative 
simpler options have proved to be useful.  See various details in the Apache2 
code at
   - https://gitlab.com/dazza-codes/aio-aws
   
   +1 for anyone who wants to beg-borrow-or-steal anything to move this along, 
if it's a worthy direction for Airflow.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] casassg opened a new issue #9041: [AIP-31] Update Airflow tutorial to use functional DAGs

2020-05-27 Thread GitBox


casassg opened a new issue #9041:
URL: https://github.com/apache/airflow/issues/9041


   
   **Description**
   
   Change Airflow tutorial (or create new tutorial) using the new functional 
layer.
   
   **Use case / motivation**
   
   Help users onboard Airflow by reducing the complexity of the tutorial by 
adopting the new functional DAG definition layer.
   
   **Related Issues**
   
   Blocked by #8057
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] houqp opened a new pull request #9042: detect incompatible docker server version in breeze

2020-05-27 Thread GitBox


houqp opened a new pull request #9042:
URL: https://github.com/apache/airflow/pull/9042


   Without this detection code, breeze just fails with the following error:
   
   ```
   
===
Checking backend: mysql
   
===
   
   Checking if MySQL is ready for connections (double restarts in the logs)
   
   
   ERROR! Maximum number of retries while waiting for MySQL. Exiting
   
   Last check: 0 connection ready messages (expected >=2)
   ```
   
   With the change, it will fail with a more actionable error:
   
   ```
   Starting docker-compose_mysql_1 ... done
   Client: Docker Engine - Community
Version:   19.03.9
API version:   1.40
Go version:go1.13.10
Git commit:9d988398e7
Built: Fri May 15 00:24:16 2020
OS/Arch:   linux/amd64
Experimental:  false
   Error response from daemon: client version 1.40 is too new. Maximum 
supported API version is 1.39
   ```
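   
   A hedged sketch of how such an early check might look (`docker version --format` is a real docker CLI flag; the comparison logic and the hard-coded values below are illustrative, not breeze's actual implementation):
   
   ```shell
   # Sketch: fail fast when the docker client API is newer than the daemon
   # supports, instead of letting later steps fail obscurely.
   check_docker_api() {
       client_api="$1"    # in practice: docker version --format '{{.Client.APIVersion}}'
       server_api="$2"    # in practice: docker version --format '{{.Server.APIVersion}}'
       # highest of the two under version sort; if that is the client (and the
       # versions differ), the client is too new for the daemon
       newest="$(printf '%s\n' "$client_api" "$server_api" | sort -V | tail -n 1)"
       if [ "$newest" = "$client_api" ] && [ "$client_api" != "$server_api" ]; then
           echo "TOO_NEW"
       else
           echo "OK"
       fi
   }

   check_docker_api "1.40" "1.39"   # the failing Travis combination -> TOO_NEW
   check_docker_api "1.39" "1.40"   # -> OK
   ```
   
   A real check would run `docker version` once, capture both API versions, and exit non-zero with an actionable message when the client is too new.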
   
   ---
   Make sure to mark the boxes below before creating PR: [x]
   
   - [x] Description above provides context of the change
   - [x] Unit tests coverage for changes (not needed for documentation changes)
   - [x] Target Github ISSUE in description if exists
   - [x] Commits follow "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)"
   - [x] Relevant documentation is updated including usage instructions.
   - [x] I will engage committers as explained in [Contribution Workflow 
Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example).
   
   ---
   In case of fundamental code change, Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals))
 is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party 
License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in 
[UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
   Read the [Pull Request 
Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines)
 for more information.
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[airflow] tag nightly-master updated (0b0e4f7 -> 7386670)

2020-05-27 Thread github-bot
This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a change to tag nightly-master
in repository https://gitbox.apache.org/repos/asf/airflow.git.


*** WARNING: tag nightly-master was modified! ***

from 0b0e4f7  (commit)
  to 7386670  (commit)
from 0b0e4f7  Preparing for RC3 relase of backports (#9026)
 add 6fc555d  Add ADDITIONAL_PYTHON_DEPS (#9031)
 add 5a7a3d1  Add ADDITIONAL_AIRFLOW_EXTRAS (#9032)
 add 30b12a9  Filter dags by clicking on tag (#8897)
 add 8ac90b0  [AIRFLOW-5615] Reduce duplicated logic around job 
heartbeating (#6311)
 add 7386670  Additional python extras and deps can be set in breeze (#9035)

No new revisions were added by this update.

Summary of changes:
 BREEZE.rst | 12 +++
 Dockerfile | 10 +-
 Dockerfile.ci  |  3 +-
 IMAGES.rst | 23 +
 airflow/jobs/base_job.py   | 15 +++--
 airflow/jobs/scheduler_job.py  | 10 +-
 airflow/www/templates/airflow/dags.html|  6 +++-
 breeze | 14 
 breeze-complete|  1 +
 scripts/ci/_utils.sh   | 12 ++-
 .../kubernetes/docker/rebuild_airflow_image.sh |  2 ++
 tests/conftest.py  | 38 ++
 tests/jobs/test_base_job.py| 21 +++-
 13 files changed, 150 insertions(+), 17 deletions(-)



[GitHub] [airflow] casassg commented on a change in pull request #8805: Resolve upstream tasks when template field is XComArg

2020-05-27 Thread GitBox


casassg commented on a change in pull request #8805:
URL: https://github.com/apache/airflow/pull/8805#discussion_r431484377



##
File path: tests/models/test_baseoperator.py
##
@@ -347,3 +350,66 @@ def test_lineage_composition(self):
 task4 = DummyOperator(task_id="op4", dag=dag)
 task4 > [inlet, outlet, extra]
 self.assertEqual(task4.get_outlet_defs(), [inlet, outlet, extra])
+
+
+@pytest.fixture(scope="class")
+def provide_test_dag_bag():
+yield DagBag(dag_folder=TEST_DAGS_FOLDER, include_examples=False)
+
+
+class TestXComArgsRelationsAreResolved:
+def test_upstream_is_set_when_template_field_is_xcomarg(
+self, provide_test_dag_bag  # pylint: disable=redefined-outer-name
+):
+dag: DAG = provide_test_dag_bag.get_dag("xcomargs_test_1")
+op1, op2 = sorted(dag.tasks, key=lambda t: t.task_id)
+
+assert op1 in op2.upstream_list
+assert op2 in op1.downstream_list
+
+def test_set_xcomargs_dependencies_works_recursively(
+self, provide_test_dag_bag  # pylint: disable=redefined-outer-name
+):
+dag: DAG = provide_test_dag_bag.get_dag("xcomargs_test_2")
+op1, op2, op3, op4 = sorted(dag.tasks, key=lambda t: t.task_id)
+
+assert op1 in op3.upstream_list
+assert op2 in op3.upstream_list
+assert op1 in op4.upstream_list
+assert op2 in op4.upstream_list
+
+def test_set_xcomargs_dependencies_works_when_set_after_init(
+self, provide_test_dag_bag  # pylint: disable=redefined-outer-name
+):
+dag: DAG = provide_test_dag_bag.get_dag("xcomargs_test_3")
+op1, op2, _ = sorted(dag.tasks, key=lambda t: t.task_id)
+
+assert op1 in op2.upstream_list
+
+def test_set_xcomargs_dependencies_no_error_when_outside_dag(self):
+class CustomOp(DummyOperator):
+template_fields = ("field",)
+
+@apply_defaults
+def __init__(self, field, *args, **kwargs):
+super().__init__(*args, **kwargs)
+self.field = field
+
+op1 = DummyOperator(task_id="op1")
+CustomOp(task_id="op2", field=op1.output)
+
+def 
test_set_xcomargs_dependencies_when_creating_dagbag_with_serialization(self):
+# Persist DAG
+dag_id = "xcomargs_test_3"
+dagbag = DagBag(TEST_DAGS_FOLDER, include_examples=False)

Review comment:
   nit: Why not use `provide_test_dag_bag` here as well?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] casassg commented on a change in pull request #8962: [AIRFLOW-8057] [AIP-31] Add @task decorator

2020-05-27 Thread GitBox


casassg commented on a change in pull request #8962:
URL: https://github.com/apache/airflow/pull/8962#discussion_r431476613



##
File path: docs/concepts.rst
##
@@ -116,6 +116,47 @@ DAGs can be used as context managers to automatically 
assign new operators to th
 
 op.dag is dag # True
 
+.. _concepts:functional_dags:
+
+Functional DAGs
+---
+*Added in Airflow 1.10.11*
+
+DAGs can be defined using functional abstractions. Outputs and inputs are sent 
between tasks using
+:ref:`XComs ` values. In addition, you can wrap functions as 
tasks using the
+:ref:`task decorator `. Dependencies are 
automatically inferred from
+the message dependencies.
+
+Example DAG with functional abstraction
+
+.. code:: python
+
+  with DAG(
+  'send_server_ip', default_args=default_args, schedule_interval=None
+  ) as dag:
+
+# Using default connection as it's set to httpbin.org by default
+get_ip = SimpleHttpOperator(
+task_id='get_ip', endpoint='get', method='GET', xcom_push=True
+)
+
+@dag.task(multiple_outputs=True)

Review comment:
   created #9041 as a follow up task





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] ephraimbuddy opened a new pull request #9043: Add example dag and system test for LocalFilesystemToGCSOperator

2020-05-27 Thread GitBox


ephraimbuddy opened a new pull request #9043:
URL: https://github.com/apache/airflow/pull/9043


   ---
   This PR closes one of the issues in #8280 
   Make sure to mark the boxes below before creating PR: [x]
   
   - [x] Description above provides context of the change
   - [x] Unit tests coverage for changes (not needed for documentation changes)
   - [x] Target Github ISSUE in description if exists
   - [x] Commits follow "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)"
   - [x] Relevant documentation is updated including usage instructions.
   - [x] I will engage committers as explained in [Contribution Workflow 
Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example).
   
   ---
   In case of fundamental code change, Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals))
 is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party 
License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in 
[UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
   Read the [Pull Request 
Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines)
 for more information.
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] houqp commented on pull request #9042: detect incompatible docker server version in breeze

2020-05-27 Thread GitBox


houqp commented on pull request #9042:
URL: https://github.com/apache/airflow/pull/9042#issuecomment-635027681


   @potiuk I noticed travis build is already failing with this check:
   
   ```
   Client: Docker Engine - Community
Version:   19.03.9
API version:   1.40
Go version:go1.13.10
Git commit:9d988398e7
Built: Fri May 15 00:24:16 2020
OS/Arch:   linux/amd64
Experimental:  false
   Error response from daemon: client version 1.40 is too new. Maximum 
supported API version is 1.38
   
###
  EXITING 
/opt/airflow/scripts/ci/in_container/check_environment.sh WITH STATUS CODE 1
   
###
   ```
   
   Is it expected for the docker client to not be able to communicate with the 
daemon in Travis?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] feluelle commented on a change in pull request #8962: [AIRFLOW-8057] [AIP-31] Add @task decorator

2020-05-27 Thread GitBox


feluelle commented on a change in pull request #8962:
URL: https://github.com/apache/airflow/pull/8962#discussion_r431366969



##
File path: docs/concepts.rst
##
@@ -116,6 +116,47 @@ DAGs can be used as context managers to automatically 
assign new operators to th
 
 op.dag is dag # True
 
+.. _concepts:functional_dags:
+
+Functional DAGs
+---
+*Added in Airflow 1.10.11*
+
+DAGs can be defined using functional abstractions. Outputs and inputs are sent 
between tasks using
+:ref:`XComs ` values. In addition, you can wrap functions as 
tasks using the
+:ref:`task decorator `. Dependencies are 
automatically inferred from
+the message dependencies.
+
+Example DAG with functional abstraction
+
+.. code:: python
+
+  with DAG(
+  'send_server_ip', default_args=default_args, schedule_interval=None
+  ) as dag:
+
+# Using default connection as it's set to httpbin.org by default
+get_ip = SimpleHttpOperator(
+task_id='get_ip', endpoint='get', method='GET', xcom_push=True
+)
+
+@dag.task(multiple_outputs=True)
+def prepare_email(raw_json: str) -> str:
+  external_ip = json.loads(raw_json)['origin']
+  return {
+'subject':f'Server connected from {external_ip}',
+'body': f'Seems like today your server executing Airflow is connected 
from the external IP {external_ip}'

Review comment:
   Or we could use 
[future-fstrings](https://github.com/asottile/future-fstrings). We already have 
it as dep of a 
[dep](https://github.com/apache/airflow/search?q=future-fstrings_q=future-fstrings)
 and add `# -*- coding: future_fstrings -*-`.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] potiuk commented on a change in pull request #9038: Use production image for k8s tests

2020-05-27 Thread GitBox


potiuk commented on a change in pull request #9038:
URL: https://github.com/apache/airflow/pull/9038#discussion_r431367442



##
File path: scripts/ci/in_container/kubernetes/docker/rebuild_airflow_image.sh
##
@@ -38,46 +38,25 @@ cp /entrypoint.sh scripts/docker/
 echo
 echo "Building image from ${AIRFLOW_CI_IMAGE} with latest sources"
 echo
-start_output_heartbeat "Rebuilding Kubernetes image" 3
-docker build \
---build-arg PYTHON_BASE_IMAGE="${PYTHON_BASE_IMAGE}" \
---build-arg PYTHON_MAJOR_MINOR_VERSION="${PYTHON_MAJOR_MINOR_VERSION}" \
---build-arg AIRFLOW_VERSION="${AIRFLOW_VERSION}" \
---build-arg AIRFLOW_EXTRAS="${AIRFLOW_EXTRAS}" \
---build-arg ADDITIONAL_AIRFLOW_EXTRAS="${ADDITIONAL_AIRFLOW_EXTRAS}" \
---build-arg ADDITIONAL_PYTHON_DEPS="${ADDITIONAL_PYTHON_DEPS}" \
---build-arg AIRFLOW_BRANCH="${AIRFLOW_BRANCH}" \
---build-arg 
AIRFLOW_CONTAINER_CI_OPTIMISED_BUILD="${AIRFLOW_CONTAINER_CI_OPTIMISED_BUILD}" \
---build-arg 
UPGRADE_TO_LATEST_REQUIREMENTS="${UPGRADE_TO_LATEST_REQUIREMENTS}" \
---build-arg HOME="${HOME}" \
---cache-from "${AIRFLOW_CI_IMAGE}" \
---tag="${AIRFLOW_CI_IMAGE}" \
---target="main" \
--f Dockerfile.ci . >> "${OUTPUT_LOG}"
-echo
+#export 
AIRFLOW_PROD_BASE_TAG="${DEFAULT_BRANCH}-python${PYTHON_MAJOR_MINOR_VERSION}"

Review comment:
   Those variables are all set on the HOST and you need to pass them to the 
container. But I think this is not really needed. I just rebased #8265  which 
moves setting up the kubernetes kind cluster and deploying the production image 
to the host rather than to the CI image, and I am going to finish it up this 
week, so I think it's better to do it this way.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] casassg commented on a change in pull request #8962: [AIRFLOW-8057] [AIP-31] Add @task decorator

2020-05-27 Thread GitBox


casassg commented on a change in pull request #8962:
URL: https://github.com/apache/airflow/pull/8962#discussion_r431464263



##
File path: tests/operators/test_python.py
##
@@ -311,6 +315,350 @@ def func(**context):
 python_operator.run(start_date=DEFAULT_DATE, end_date=DEFAULT_DATE)
 
 
+class TestAirflowTask(unittest.TestCase):
+
+@classmethod
+def setUpClass(cls):
+super().setUpClass()
+
+with create_session() as session:
+session.query(DagRun).delete()
+session.query(TI).delete()
+
+def setUp(self):
+super().setUp()
+self.dag = DAG(
+'test_dag',
+default_args={
+'owner': 'airflow',
+'start_date': DEFAULT_DATE})
+self.addCleanup(self.dag.clear)
+
+def tearDown(self):
+super().tearDown()
+
+with create_session() as session:
+session.query(DagRun).delete()
+session.query(TI).delete()
+
+def _assert_calls_equal(self, first, second):
+assert isinstance(first, Call)
+assert isinstance(second, Call)
+assert first.args == second.args
+# eliminate context (conf, dag_run, task_instance, etc.)
+test_args = ["an_int", "a_date", "a_templated_string"]
+first.kwargs = {
+key: value
+for (key, value) in first.kwargs.items()
+if key in test_args
+}
+second.kwargs = {
+key: value
+for (key, value) in second.kwargs.items()
+if key in test_args
+}
+assert first.kwargs == second.kwargs
+
+def test_python_operator_python_callable_is_callable(self):
+"""Tests that @task will only instantiate if
+the python_callable argument is callable."""
+not_callable = {}
+with pytest.raises(AirflowException):
+task_decorator(not_callable, dag=self.dag)
+
+def test_python_callable_arguments_are_templatized(self):
+"""Test @task op_args are templatized"""
+recorded_calls = []
+
+# Create a named tuple and ensure it is still preserved
+# after the rendering is done
+Named = namedtuple('Named', ['var1', 'var2'])
+named_tuple = Named('{{ ds }}', 'unchanged')
+
+task = task_decorator(
+# a Mock instance cannot be used as a callable function or test 
fails with a
+# TypeError: Object of type Mock is not JSON serializable
+build_recording_function(recorded_calls),
+dag=self.dag)
+task(4, date(2019, 1, 1), "dag {{dag.dag_id}} ran on {{ds}}.", 
named_tuple)
+
+self.dag.create_dagrun(
+run_id=DagRunType.MANUAL.value,
+execution_date=DEFAULT_DATE,
+start_date=DEFAULT_DATE,
+state=State.RUNNING
+)
+task.run(start_date=DEFAULT_DATE, end_date=DEFAULT_DATE)
+
+ds_templated = DEFAULT_DATE.date().isoformat()
+assert len(recorded_calls) == 1
+self._assert_calls_equal(
+recorded_calls[0],
+Call(4,
+ date(2019, 1, 1),
+ "dag {} ran on {}.".format(self.dag.dag_id, ds_templated),
+ Named(ds_templated, 'unchanged'))
+)
+
+def test_python_callable_keyword_arguments_are_templatized(self):
+"""Test PythonOperator op_kwargs are templatized"""
+recorded_calls = []
+
+task = task_decorator(
+# a Mock instance cannot be used as a callable function or test 
fails with a
+# TypeError: Object of type Mock is not JSON serializable
+build_recording_function(recorded_calls),
+dag=self.dag
+)
+task(an_int=4, a_date=date(2019, 1, 1), a_templated_string="dag 
{{dag.dag_id}} ran on {{ds}}.")
+self.dag.create_dagrun(
+run_id=DagRunType.MANUAL.value,
+execution_date=DEFAULT_DATE,
+start_date=DEFAULT_DATE,
+state=State.RUNNING
+)
+task.run(start_date=DEFAULT_DATE, end_date=DEFAULT_DATE)
+
+assert len(recorded_calls) == 1
+self._assert_calls_equal(
+recorded_calls[0],
+Call(an_int=4,
+ a_date=date(2019, 1, 1),
+ a_templated_string="dag {} ran on {}.".format(
+ self.dag.dag_id, DEFAULT_DATE.date().isoformat()))
+)
+
+def test_copy_in_dag(self):
+"""Test copy method to reuse tasks in a DAG"""
+
+@task_decorator
+def do_run():
+return 4
+with self.dag:
+do_run()
+assert ['do_run'] == self.dag.task_ids
+do_run_1 = do_run.copy()
+do_run_2 = do_run.copy()
+assert do_run_1.task_id == 'do_run__1'
+assert do_run_2.task_id == 'do_run__2'
+
+def test_copy(self):
+"""Test copy method outside of a DAG"""
+

[jira] [Commented] (AIRFLOW-3964) Consolidate and de-duplicate sensor tasks

2020-05-27 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17117987#comment-17117987
 ] 

ASF GitHub Bot commented on AIRFLOW-3964:
-

YingboWang commented on pull request #5499:
URL: https://github.com/apache/airflow/pull/5499#issuecomment-634854013


   Thanks @KevinYang21 for the comment. This PR description was updated with a 
high-level architecture diagram and some charts showing smart sensor impact 
after deployment. 
   
   To cover operators not included in this PR:
   1. Define "poke_context_fields" as a class attribute in the operator. 
"poke_context_fields" includes all key names used for initializing a sensor 
object.
   2. In airflow.cfg, add the new operator classname to [smart_sensor] 
sensors_enabled. All supported sensor classnames should be comma separated. 
   3. Optional: if there are any corner cases/restrictions on sensors that 
should be dumped to the smart sensor service, overwrite 
"is_smart_sensor_compatible()" in the operator class with those rules (see 
the named hive partition sensor for an example). 
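   
   A rough sketch of steps 1 and 3 above (a hypothetical sensor; only the names `poke_context_fields` and `is_smart_sensor_compatible()` come from the comment, everything else is illustrative and the real base class is stubbed so the sketch is self-contained):
   
   ```python
   # Hypothetical sensor following the steps above; airflow's real
   # BaseSensorOperator is stubbed so the sketch runs on its own.
   class BaseSensorOperator:
       def __init__(self, **kwargs):
           for key, value in kwargs.items():
               setattr(self, key, value)

   class MyPartitionSensor(BaseSensorOperator):
       # Step 1: key names needed to re-initialize this sensor inside the
       # smart sensor service
       poke_context_fields = ("schema", "table", "partition", "conn_id")

       # Step 3 (optional): opt out of smart sensor for corner cases, e.g.
       # when a per-task callback cannot be replayed by the service
       def is_smart_sensor_compatible(self):
           return getattr(self, "on_failure_callback", None) is None

   # Step 2 would be configuration, e.g. in airflow.cfg:
   #   [smart_sensor]
   #   sensors_enabled = NamedHivePartitionSensor,MyPartitionSensor

   sensor = MyPartitionSensor(
       schema="default", table="events", partition="ds=2020-05-27", conn_id="metastore"
   )
   # The poke context the service would persist instead of re-parsing the DAG
   poke_context = {f: getattr(sensor, f) for f in sensor.poke_context_fields}
   print(poke_context["table"], sensor.is_smart_sensor_compatible())  # -> events True
   ```
   
   Two sensors with identical poke contexts would then deduplicate to a single poking job in the service.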
   
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Consolidate and de-duplicate sensor tasks 
> --
>
> Key: AIRFLOW-3964
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3964
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: dependencies, operators, scheduler
>Affects Versions: 1.10.0
>Reporter: Yingbo Wang
>Assignee: Yingbo Wang
>Priority: Critical
>
> h2. Problem
> h3. Airflow Sensor:
> Sensors are a certain type of operator that will keep running until a certain 
> criterion is met. Examples include a specific file landing in HDFS or S3, a 
> partition appearing in Hive, or a specific time of the day. Sensors are 
> derived from BaseSensorOperator and run a poke method at a specified 
> poke_interval until it returns True.
> Airflow sensor duplication is a common problem for large scale Airflow 
> projects. There are duplicated partitions needing to be detected from 
> the same or different DAGs. At Airbnb there are 88 boxes running four 
> different types of sensors every day. The number of running sensor tasks 
> ranges from 8k to 16k, which takes a great amount of resources. Although the 
> Airflow team had redirected all sensors to a specific queue to allocate 
> relatively minor resources, there is still large room to reduce the number of 
> workers and relieve DB pressure by optimizing the sensor mechanism.
> The existing sensor implementation creates an identical task for any sensor 
> task with a specific dag_id, task_id and execution_date. This task is 
> responsible for querying the DB until the specified partitions exist. Even if 
> two tasks are waiting for the same partition in the DB, they create two 
> connections to the DB and check the status in two separate processes. On one 
> hand, the DB needs to run duplicate jobs in multiple processes, which takes 
> both cpu and memory resources. At the same time, Airflow needs to maintain a 
> process for each sensor to query and wait for the partition/table to be created.
> h1. *Design*
> There are several issues that need to be resolved for our smart sensor. 
> h2. Persist sensor info in DB and avoid file parsing before running
> The current Airflow implementation needs to parse the DAG python file before 
> running a task. Parsing multiple python files in a smart sensor makes this 
> inefficient and overloaded. Since sensor tasks need relatively “light 
> weight” executing information -- a smaller number of properties with a simple 
> structure (most are built in types instead of functions or objects) -- we 
> propose to skip the parsing for smart sensors. The easiest way is to persist 
> all task instance information in the airflow metaDB. 
> h3. Solution:
> It will be hard to dump the whole task instance object dictionary. And we do 
> not really need that much information. 
> We add two sets to the base sensor class as “persist_fields” and 
> “execute_fields”. 
> h4. “persist_fields”, dumped to the airflow.task_instance column “attr_dict”,
> saves the attribute names that should be used to accomplish a sensor poking 
> job. For example:
>  #  the “NamedHivePartitionSensor” defines its persist_fields as  
> ('partition_names', 'metastore_conn_id', 'hook') since these properties are 
> enough for its poking function. 
>  # While the HivePartitionSensor can be slightly different, using 
> persist_fields as ('schema', 'table', 'partition', 'metastore_conn_id')
> If we have two tasks that have the same property value for every field in 
> persist_fields. That 

[jira] [Commented] (AIRFLOW-5500) Bug in trigger api endpoint

2020-05-27 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17118034#comment-17118034
 ] 

ASF GitHub Bot commented on AIRFLOW-5500:
-

kylegao91 commented on pull request #8081:
URL: https://github.com/apache/airflow/pull/8081#issuecomment-634905164


   Really hope this can be merged into 1.10.11, very important for our tasks.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Bug in trigger api endpoint 
> 
>
> Key: AIRFLOW-5500
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5500
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: api
>Affects Versions: 1.10.1
>Reporter: Deavarajegowda M T
>Priority: Critical
> Attachments: 3level.py
>
>
> Unable to trigger workflow with nested sub dags, getting following error:
>  sqlalchemy.exc.IntegrityError: (psycopg2.IntegrityError) duplicate key value 
> (dag_id,execution_date)=('dummy.task1.task_level1.task_level2','2019-09-10 
> 13:00:27+00:00') violates unique constraint 
> "dag_run_dag_id_execution_date_key"
>  trigger_dag for nested sub_dags is called twice.
>  
> fix:
> in airflow/api/common/experimental/trigger_dag.py -
> while populating subdags for a dag, each subdag's subdags are also populated 
> into the main dag,
> so there is no need to repopulate subdags for each subdag separately.
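> 
> The described fix can be illustrated with a toy model (hypothetical helper 
> names; in Airflow, dag.subdags is already flattened to include subdags of 
> subdags, which is what makes the extra recursion trigger the leaf twice):
> 
> ```python
> # Toy model of the duplicate-trigger bug; names here are illustrative, not
> # the real code in airflow/api/common/experimental/trigger_dag.py.
> class Dag:
>     def __init__(self, dag_id, subdags=None):
>         self.dag_id = dag_id
>         # In Airflow, dag.subdags is already flattened: it contains the
>         # subdags of its subdags too.
>         self.subdags = subdags or []
> 
> leaf = Dag("dummy.task1.task_level1.task_level2")
> mid = Dag("dummy.task1.task_level1", subdags=[leaf])
> root = Dag("dummy", subdags=[mid, leaf])  # flattened list of descendants
> 
> def trigger_buggy(dag):
>     # Descends into each subdag's own subdags even though the parent's
>     # list already contains them -> the leaf is triggered twice
>     triggered = [dag.dag_id]
>     for sub in dag.subdags:
>         triggered += trigger_buggy(sub)
>     return triggered
> 
> def trigger_fixed(dag):
>     # Fix: trigger the dag plus its (already flattened) subdags exactly once
>     return [dag.dag_id] + [sub.dag_id for sub in dag.subdags]
> 
> print(len(trigger_buggy(root)), len(trigger_fixed(root)))  # -> 4 3
> ```
> 
> Triggering the leaf dag run twice for the same execution_date is what 
> violates the dag_run unique constraint in the reported error.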



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [airflow] kylegao91 commented on pull request #8081: [AIRFLOW-5500] Fix the trigger_dag api in the case of nested subdags

2020-05-27 Thread GitBox


kylegao91 commented on pull request #8081:
URL: https://github.com/apache/airflow/pull/8081#issuecomment-634905164


   Really hope this can be merged into 1.10.11, very important for our tasks.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



