[GitHub] [airflow] j-y-matsubara commented on issue #8696: Skip task itself instead of all downstream tasks

2020-05-09 Thread GitBox


j-y-matsubara commented on issue #8696:
URL: https://github.com/apache/airflow/issues/8696#issuecomment-626278731


   Thank you for your comments.
   If my fix seems slow and you want to fix it in a hurry, please feel free
   to take over.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] mik-laj commented on issue #8202: Create guide for Dataflow operators

2020-05-09 Thread GitBox


mik-laj commented on issue #8202:
URL: https://github.com/apache/airflow/issues/8202#issuecomment-626277797


   @tanjinP I would be happy if you added information that asynchronous 
execution is recommended.
   https://cloud.google.com/dataflow/docs/guides/specifying-exec-params#python_8
   I would like to create sensors that will allow more efficient use of 
resources.
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] mik-laj commented on issue #8202: Create guide for Dataflow operators

2020-05-09 Thread GitBox


mik-laj commented on issue #8202:
URL: https://github.com/apache/airflow/issues/8202#issuecomment-626277095


   @tanjinP Fantastic. I am working on this integration with the Dataflow
team right now, so this guide would be very helpful. I saw that you signed up
for 3 services, but this one is the most important. I will be happy to share
my thoughts on this integration and to develop this guide together with you.
I am sure that the Dataflow team will be willing to review this guide as well.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] mik-laj edited a comment on issue #8202: Create guide for Dataflow operators

2020-05-09 Thread GitBox


mik-laj edited a comment on issue #8202:
URL: https://github.com/apache/airflow/issues/8202#issuecomment-626277095


   @tanjinP Fantastic. I am working on this integration with the Dataflow
team right now, so this guide would be very helpful. I saw that you signed up
for 3 services, but this one is the most important. I will be happy to share
my thoughts on this integration and to develop this guide together with you.
I am sure that the Dataflow team will be willing to review this guide as well.
Dataflow is a hot topic right now, so I'm very happy.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] gdevanla commented on issue #8696: Skip task itself instead of all downstream tasks

2020-05-09 Thread GitBox


gdevanla commented on issue #8696:
URL: https://github.com/apache/airflow/issues/8696#issuecomment-626276892


   Nope, you can go ahead and work on it. I hit the same issue in our
internal project and was looking through the code to understand what was
happening. Then I came across this bug submission. 



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] gdevanla edited a comment on issue #8696: Skip task itself instead of all downstream tasks

2020-05-09 Thread GitBox


gdevanla edited a comment on issue #8696:
URL: https://github.com/apache/airflow/issues/8696#issuecomment-626276892


   Nope, you can go ahead and work on it. I hit the same issue in our
internal project and was looking through the code to understand what was
happening. Then I came across this bug submission. 



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] j-y-matsubara edited a comment on issue #8696: Skip task itself instead of all downstream tasks

2020-05-09 Thread GitBox


j-y-matsubara edited a comment on issue #8696:
URL: https://github.com/apache/airflow/issues/8696#issuecomment-626276776


   @gdevanla 
   Sorry for cutting in.
   I agree with your idea.
   I was having trouble with this myself, so I have started trying to fix it.
   Are you planning to fix it?
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] j-y-matsubara commented on issue #8696: Skip task itself instead of all downstream tasks

2020-05-09 Thread GitBox


j-y-matsubara commented on issue #8696:
URL: https://github.com/apache/airflow/issues/8696#issuecomment-626276776


   @gdevanla 
   Sorry for cutting in, and I agree with your idea.
   I was having trouble with this myself, so I have started trying to fix it.
   Are you planning to fix it?
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] tanjinP commented on issue #8205: Create guide for Cloud Key Management Service (KMS) operators

2020-05-09 Thread GitBox


tanjinP commented on issue #8205:
URL: https://github.com/apache/airflow/issues/8205#issuecomment-626276202


   I can take this on - this is a service I've worked heavily in.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] tanjinP commented on issue #8206: Create guide for Kubernetes Engine operators

2020-05-09 Thread GitBox


tanjinP commented on issue #8206:
URL: https://github.com/apache/airflow/issues/8206#issuecomment-626276214


   I can take this on - this is a service I've worked heavily in.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] tanjinP commented on issue #8202: Create guide for Dataflow operators

2020-05-09 Thread GitBox


tanjinP commented on issue #8202:
URL: https://github.com/apache/airflow/issues/8202#issuecomment-626276187


   I can take this on - this is a service I've worked heavily in.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] j-y-matsubara commented on a change in pull request #8757: Fix the incorrect description of pod_mutation_hook in kubernetes.rst

2020-05-09 Thread GitBox


j-y-matsubara commented on a change in pull request #8757:
URL: https://github.com/apache/airflow/pull/8757#discussion_r422455491



##
File path: docs/kubernetes.rst
##
@@ -40,11 +40,44 @@ that has the ability to mutate pod objects before sending 
them to the Kubernetes
 for scheduling. It receives a single argument as a reference to pod objects, 
and
 is expected to alter its attributes.
 
-This could be used, for instance, to add sidecar or init containers
+This could be used, for instance, to add init containers

Review comment:
   How is it used for sidecars?
   We can mutate the pod object by using pod_mutation_hook, but I think we
   cannot add any containers (sidecars) other than the init containers.
   
   When kubernetes_request_factory.py (airflow/contrib/kubernetes/) is used,
   in version 1.X.X, the pod object passed to pod_mutation_hook is not a
   V1Pod.
   **Part of the contents of** the req object created in
   pod_request_factory.py is overwritten with this pod object in
   kubernetes_request_factory.py, and the "Pod" is launched based on the
   **req object.**
   In this overwriting, I think there is no way to add new containers
   (sidecars) other than the init containers.
   
   In version 2.0 (airflow/kubernetes/), we can directly edit the pod object
   used to launch the "Pod" (the pod object passed to pod_mutation_hook is a
   V1Pod). Therefore, we can add any containers (sidecars).
   
   If my understanding is mistaken, please correct me.
   Thank you in advance.
   
   ※Supplementary information (version 1.10.10)
   If we set up the operator as follows,
   ```
   k2 = KubernetesPodOperator(namespace='airflow01',
                              task_id='test-kwsk',
                              name='test-kwsk',
                              image="airflowworker:1.0.0",
                              image_pull_policy="IfNotPresent",
                              cmds=["bash", "-cx"],
                              arguments=['sleep 1'],
                              do_xcom_push=True,
                              ..
   ```
   the pod object passed to pod_mutation_hook already has the following
   attributes set:
   ```
   affinity:          {}
   annotations:       {}
   args:              ['sleep 1']
   cmds:              ['bash', '-cx']
   configmaps:        []
   dnspolicy:         None
   envs:              {}
   hostnetwork:       False
   image:             airflowworker:1.0.0
   image_pull_policy: IfNotPresent
   ...
   ..
   ```
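   
   For reference, a minimal sketch of such a mutation in
airflow_local_settings.py, assuming the 1.X.X behavior described above (the
hook receives airflow.contrib.kubernetes.pod.Pod, not a V1Pod; the annotation
key and init-container values are placeholders):
   ```
   def pod_mutation_hook(pod):
       # Simple attributes of the pod object can be mutated in place.
       pod.annotations['example.com/mutated'] = 'true'
       # Init containers are copied into the request by the request factory,
       # so appending one works; plain sidecar containers would not be
       # picked up, per the discussion above.
       pod.init_containers = pod.init_containers or []
       pod.init_containers.append({
           'name': 'init-example',
           'image': 'busybox:1.31',
           'command': ['sh', '-c', 'echo init'],
       })
   ```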
   
   
   





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] maganaluis commented on pull request #8256: updated _write_args on PythonVirtualenvOperator

2020-05-09 Thread GitBox


maganaluis commented on pull request #8256:
URL: https://github.com/apache/airflow/pull/8256#issuecomment-626267914


   > Small issues, apart from that; LGTM
   
   Thank you @Fokko, I switched those to the create_session approach using
the with statement.
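   
   For reference, a minimal sketch of that pattern (the TaskInstance query is
purely illustrative, not the PR's actual code):
   ```
   from airflow.models import TaskInstance
   from airflow.utils.session import create_session
   
   with create_session() as session:
       # create_session commits on success, rolls back on failure,
       # and always closes the session.
       count = session.query(TaskInstance).count()
   ```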



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] kaxil edited a comment on pull request #8776: [AIRFLOW-1156] BugFix: Unpausing a DAG with catchup=False creates an extra DAG run

2020-05-09 Thread GitBox


kaxil edited a comment on pull request #8776:
URL: https://github.com/apache/airflow/pull/8776#issuecomment-626267001


   > Lol, one line fixes are the Best*
   > 
   > Tests please :) (I'm sure you'd get around to it)
   > 
   > * The worst.
   
   😄  Yeah - looks like this bug has been around for a long time (possibly
from the very start)!!! Found some old JIRAs.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (AIRFLOW-1156) Using a timedelta object as a Schedule Interval with catchup=False causes the start_date to no longer be honored.

2020-05-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-1156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17103625#comment-17103625
 ] 

ASF GitHub Bot commented on AIRFLOW-1156:
-

kaxil edited a comment on pull request #8776:
URL: https://github.com/apache/airflow/pull/8776#issuecomment-626267001


   > Lol, one line fixes are the Best*
   > 
   > Tests please :) (I'm sure you'd get around to it)
   > 
   > * The worst.
   
   😄  Yeah - looks like this bug has been around for a long time (possibly
from the very start)!!! Found some old JIRAs.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Using a timedelta object as a Schedule Interval with catchup=False causes the 
> start_date to no longer be honored.
> -
>
> Key: AIRFLOW-1156
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1156
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: scheduler
>Affects Versions: 1.8.0
>Reporter: Zachary Lawson
>Assignee: Kaxil Naik
>Priority: Minor
>
> Currently, in Airflow v1.8, if you set your schedule_interval to a timedelta
> object and set catchup=False, the start_date is no longer honored and the DAG
> is scheduled immediately upon unpausing the DAG. It is then scheduled on the
> schedule interval from that point onward. Example below:
> {code}
> from airflow import DAG
> from datetime import datetime, timedelta
> import logging
> from airflow.operators.python_operator import PythonOperator
>
> default_args = {
>     'owner': 'airflow',
>     'depends_on_past': False,
>     'start_date': datetime(2015, 6, 1),
> }
>
> dag = DAG('test', default_args=default_args,
>           schedule_interval=timedelta(seconds=5), catchup=False)
>
> def context_test(ds, **context):
>     logging.info('testing')
>
> test_context = PythonOperator(
>     task_id='test_context',
>     provide_context=True,
>     python_callable=context_test,
>     dag=dag
> )
> {code}
> If you switch the above over to a CRON expression, the scheduling behaves as
> expected.
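
A minimal sketch of that cron-based variant (timedelta(seconds=5) has no exact
cron equivalent, so a per-minute schedule stands in here):
{code}
dag = DAG('test', default_args=default_args,
          schedule_interval='* * * * *', catchup=False)
{code}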



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work started] (AIRFLOW-1156) Using a timedelta object as a Schedule Interval with catchup=False causes the start_date to no longer be honored.

2020-05-09 Thread Kaxil Naik (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-1156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on AIRFLOW-1156 started by Kaxil Naik.
---
> Using a timedelta object as a Schedule Interval with catchup=False causes the 
> start_date to no longer be honored.
> -
>
> Key: AIRFLOW-1156
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1156
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: scheduler
>Affects Versions: 1.8.0
>Reporter: Zachary Lawson
>Assignee: Kaxil Naik
>Priority: Minor
>
> Currently, in Airflow v1.8, if you set your schedule_interval to a timedelta
> object and set catchup=False, the start_date is no longer honored and the DAG
> is scheduled immediately upon unpausing the DAG. It is then scheduled on the
> schedule interval from that point onward. Example below:
> {code}
> from airflow import DAG
> from datetime import datetime, timedelta
> import logging
> from airflow.operators.python_operator import PythonOperator
>
> default_args = {
>     'owner': 'airflow',
>     'depends_on_past': False,
>     'start_date': datetime(2015, 6, 1),
> }
>
> dag = DAG('test', default_args=default_args,
>           schedule_interval=timedelta(seconds=5), catchup=False)
>
> def context_test(ds, **context):
>     logging.info('testing')
>
> test_context = PythonOperator(
>     task_id='test_context',
>     provide_context=True,
>     python_callable=context_test,
>     dag=dag
> )
> {code}
> If you switch the above over to a CRON expression, the scheduling behaves as
> expected.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (AIRFLOW-3369) Un-pausing a DAG with catchup =False creates an extra DAG run (1.10)

2020-05-09 Thread Kaxil Naik (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kaxil Naik closed AIRFLOW-3369.
---
Resolution: Duplicate

PR to fix this: https://github.com/apache/airflow/pull/8776
Closing the issue as https://issues.apache.org/jira/browse/AIRFLOW-1156 
describes the same issue 

> Un-pausing a DAG with catchup =False creates an extra DAG run (1.10)
> 
>
> Key: AIRFLOW-3369
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3369
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: scheduler
>Affects Versions: 1.10.0
>Reporter: Andrew Harmon
>Assignee: Kaxil Naik
>Priority: Major
> Attachments: image.png
>
>
> If you create a DAG with catchup=False, when it is un-paused it creates 2
> DAG runs: one for the most recent scheduled interval (expected) and one for
> the interval before that (unexpected).
> *Sample DAG*
> {code:python}
> from airflow import DAG
> from datetime import datetime
> from airflow.operators.dummy_operator import DummyOperator
>
> dag = DAG(
>     dag_id='DummyTest',
>     start_date=datetime(2018, 1, 1),
>     catchup=False
> )
>
> do = DummyOperator(
>     task_id='dummy_task',
>     dag=dag
> )
> {code}
> *Result:*
> 2 DAG runs are created: 2018-11-18 and 2018-11-17
> *Expected Result:*
> Only 1 DAG run should have been created (2018-11-18)
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (AIRFLOW-6577) DAG Backfill with timedelta runs twice

2020-05-09 Thread Kaxil Naik (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-6577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17103622#comment-17103622
 ] 

Kaxil Naik commented on AIRFLOW-6577:
-

PR to fix this: https://github.com/apache/airflow/pull/8776
Closing the issue as https://issues.apache.org/jira/browse/AIRFLOW-1156 
describes the same issue 

> DAG Backfill with timedelta runs twice
> --
>
> Key: AIRFLOW-6577
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6577
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: DAG, DagRun
>Affects Versions: 1.10.7
> Environment: ProductName: Mac OS X
> ProductVersion:   10.14.6
> BuildVersion: 18G2022
> Client: Docker Engine - Community
>  Version:   19.03.5
>  API version:   1.40
>  Go version:go1.12.12
>  Git commit:633a0ea
>  Built: Wed Nov 13 07:22:34 2019
>  OS/Arch:   darwin/amd64
>  Experimental:  false
>Reporter: Nick Benthem
>Priority: Minor
>
> If you use {{timedelta=__anything__}} and have {{catchup=False}}, it will
> cause a DOUBLE run of your DAG! The only workaround I found was to use a cron
> timer, i.e.,
> schedule_interval='@daily',
> rather than
> schedule_interval=timedelta(days=1),
> It almost definitely exists in
> def create_dag_run(self, dag, session=None):
> in
> {{/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/airflow/jobs/scheduler_job.py
>  }}
> around line {{643}}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (AIRFLOW-1056) Single dag run triggered when un-pausing job with catchup=False

2020-05-09 Thread Kaxil Naik (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-1056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17103621#comment-17103621
 ] 

Kaxil Naik commented on AIRFLOW-1056:
-

PR to fix this: https://github.com/apache/airflow/pull/8776
Closing the issue as https://issues.apache.org/jira/browse/AIRFLOW-1156 
describes the same issue 

> Single dag run triggered when un-pausing job with catchup=False
> ---
>
> Key: AIRFLOW-1056
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1056
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: scheduler
>Affects Versions: 1.8.0
>Reporter: Andrew Heuermann
>Priority: Major
>
> When "catchup=False" a single job run is still triggered when un-pausing a 
> dag when there are missed run windows. 
> In airflow/jobs.py:create_dag_run(): When catchup is disabled it updates the 
> dag.start_date here to prevent the backfill: 
> https://github.com/apache/incubator-airflow/blob/bb39078a35cf2bceea58d7831d7a2028c8ef849f/airflow/jobs.py#L770.
> But it looks like the function schedules dags based on a window (using 
> sequential run times as lower and upper bounds) so it will always schedule a 
> single dag run if there is a missed run between the last run and the time 
> which it was unpaused. Even if it was un-paused AFTER those missed runs.
> Some ideas on solutions:
> * Pass in the time when the scheduler last ran and use that as the lower 
> bound of the window, but not sure how easy that is to get to. 
> * Update the start_date when a dag with catchup=False is unpaused. Or add a 
> new "unpaused_date" field that would serve the same purpose.
> * If paused have the scheduler insert a skipped Job record when the job would 
> have run.
> There might be a simpler solution I'm missing.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (AIRFLOW-6577) DAG Backfill with timedelta runs twice

2020-05-09 Thread Kaxil Naik (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kaxil Naik closed AIRFLOW-6577.
---
Resolution: Duplicate

> DAG Backfill with timedelta runs twice
> --
>
> Key: AIRFLOW-6577
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6577
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: DAG, DagRun
>Affects Versions: 1.10.7
> Environment: ProductName: Mac OS X
> ProductVersion:   10.14.6
> BuildVersion: 18G2022
> Client: Docker Engine - Community
>  Version:   19.03.5
>  API version:   1.40
>  Go version:go1.12.12
>  Git commit:633a0ea
>  Built: Wed Nov 13 07:22:34 2019
>  OS/Arch:   darwin/amd64
>  Experimental:  false
>Reporter: Nick Benthem
>Priority: Minor
>
> If you use {{timedelta=__anything__}} and have {{catchup=False}}, it will
> cause a DOUBLE run of your DAG! The only workaround I found was to use a cron
> timer, i.e.,
> schedule_interval='@daily',
> rather than
> schedule_interval=timedelta(days=1),
> It almost definitely exists in
> def create_dag_run(self, dag, session=None):
> in
> {{/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/airflow/jobs/scheduler_job.py
>  }}
> around line {{643}}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (AIRFLOW-1056) Single dag run triggered when un-pausing job with catchup=False

2020-05-09 Thread Kaxil Naik (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-1056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kaxil Naik closed AIRFLOW-1056.
---
Resolution: Duplicate

> Single dag run triggered when un-pausing job with catchup=False
> ---
>
> Key: AIRFLOW-1056
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1056
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: scheduler
>Affects Versions: 1.8.0
>Reporter: Andrew Heuermann
>Priority: Major
>
> When "catchup=False" a single job run is still triggered when un-pausing a 
> dag when there are missed run windows. 
> In airflow/jobs.py:create_dag_run(): When catchup is disabled it updates the 
> dag.start_date here to prevent the backfill: 
> https://github.com/apache/incubator-airflow/blob/bb39078a35cf2bceea58d7831d7a2028c8ef849f/airflow/jobs.py#L770.
> But it looks like the function schedules dags based on a window (using 
> sequential run times as lower and upper bounds) so it will always schedule a 
> single dag run if there is a missed run between the last run and the time 
> which it was unpaused. Even if it was un-paused AFTER those missed runs.
> Some ideas on solutions:
> * Pass in the time when the scheduler last ran and use that as the lower 
> bound of the window, but not sure how easy that is to get to. 
> * Update the start_date when a dag with catchup=False is unpaused. Or add a 
> new "unpaused_date" field that would serve the same purpose.
> * If paused have the scheduler insert a skipped Job record when the job would 
> have run.
> There might be a simpler solution I'm missing.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (AIRFLOW-1156) Using a timedelta object as a Schedule Interval with catchup=False causes the start_date to no longer be honored.

2020-05-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-1156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17103619#comment-17103619
 ] 

ASF GitHub Bot commented on AIRFLOW-1156:
-

kaxil commented on pull request #8776:
URL: https://github.com/apache/airflow/pull/8776#issuecomment-626267001


   > Lol, one line fixes are the Best*
   > 
   > Tests please :) (I'm sure you'd get around to it)
   > 
   > * The worst.
   
   :D Yeah - looks like this bug has been around for a long time (possibly
from the very start)!!! Found some old JIRAs.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Using a timedelta object as a Schedule Interval with catchup=False causes the 
> start_date to no longer be honored.
> -
>
> Key: AIRFLOW-1156
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1156
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: scheduler
>Affects Versions: 1.8.0
>Reporter: Zachary Lawson
>Priority: Minor
>
> Currently, in Airflow v1.8, if you set your schedule_interval to a timedelta 
> object and set catchup=False, the start_date is no longer honored and the DAG 
> is scheduled immediately upon unpausing the DAG. It is then schedule on the 
> schedule interval from that point onward. Example below:
> {code}
> from airflow import DAG
> from datetime import datetime, timedelta
> import logging
> from airflow.operators.python_operator import PythonOperator
> default_args = {
> 'owner': 'airflow',
> 'depends_on_past': False,
> 'start_date': datetime(2015, 6, 1),
> }
> dag = DAG('test', default_args=default_args, 
> schedule_interval=timedelta(seconds=5), catchup=False)
> def context_test(ds, **context):
> logging.info('testing')
> test_context = PythonOperator(
> task_id='test_context',
> provide_context=True,
> python_callable=context_test,
> dag=dag
> )
> {code}
> If you switch the above over to a CRON expression, the behavior of the 
> scheduling is returned to the expected.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [airflow] kaxil commented on pull request #8776: [AIRFLOW-1156] BugFix: Unpausing a DAG with catchup=False creates an extra DAG run

2020-05-09 Thread GitBox


kaxil commented on pull request #8776:
URL: https://github.com/apache/airflow/pull/8776#issuecomment-626267001


   > Lol, one line fixes are the Best*
   > 
   > Tests please :) (I'm sure you'd get around to it)
   > 
   > * The worst.
   
   :D Yeah - looks like this bug has been around for a long time (possibly
from the very start)!!! Found some old JIRAs.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] amithmathew opened a new issue #8803: Impersonate service accounts while running GCP Operators

2020-05-09 Thread GitBox


amithmathew opened a new issue #8803:
URL: https://github.com/apache/airflow/issues/8803


   **Description**
   Allow running Google Cloud operators using Service Accounts without having
to provide key material while running on GCP. If the Compute instance Service
Account on which Airflow is running has been granted the "Service Account
Token Creator" role on the target Service Account with which I want to run my
operator, I do not need to download or provide any key material for the
impersonation to happen. This is a much more secure way to impersonate service
accounts.
   
   **Use case / motivation**
   
   Allow running Google Cloud operators using Service Accounts without having
to provide key material while running on GCP. If the Compute instance Service
Account on which Airflow is running has been granted the "Service Account
Token Creator" role on the target Service Account with which I want to run my
operator, I do not need to download or provide any key material for the
impersonation to happen. This is a much more secure way to impersonate service
accounts.
   
   
https://github.com/googleapis/google-auth-library-python/blob/master/docs/user-guide.rst#impersonated-credentials
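   
   A minimal sketch of the mechanism, following the linked google-auth user
guide (the target service account and scopes here are placeholders):
   ```
   import google.auth
   from google.auth import impersonated_credentials
   
   # On GCP, default() picks up the instance's service account credentials
   # without any key material.
   source_credentials, _ = google.auth.default()
   
   target_credentials = impersonated_credentials.Credentials(
       source_credentials=source_credentials,
       target_principal='target-sa@my-project.iam.gserviceaccount.com',
       target_scopes=['https://www.googleapis.com/auth/cloud-platform'],
       lifetime=500,
   )
   ```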
   
   **Related Issues**
   
   None
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] boring-cyborg[bot] commented on issue #8803: Impersonate service accounts while running GCP Operators

2020-05-09 Thread GitBox


boring-cyborg[bot] commented on issue #8803:
URL: https://github.com/apache/airflow/issues/8803#issuecomment-626266912


   Thanks for opening your first issue here! Be sure to follow the issue 
template!
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[airflow] tag nightly-master updated (42c5975 -> c7788a6)

2020-05-09 Thread github-bot
This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a change to tag nightly-master
in repository https://gitbox.apache.org/repos/asf/airflow.git.


*** WARNING: tag nightly-master was modified! ***

from 42c5975  (commit)
  to c7788a6  (commit)
from 42c5975  Update example SingularityOperator DAG (#8790)
 add 791d1a7  Backport packages are renamed to include backport in their 
name (#8767)
 add 100f530  Fixed test-target command (#8795)
 add db1b51d  Make celery worker_prefetch_multiplier configurable (#8695)
 add bc19778  [AIP-31] Implement XComArg to pass output from one operator 
to the next (#8652)
 add 7506c73  Add default `conf` parameter to Spark JDBC Hook (#8787)
 add 5e1c33a  Fix docs on creating CustomOperator (#8678)
 add 21cc7d7  Document default timeout value for SSHOperator (#8744)
 add cd635dd  [AIRFLOW-5906] Add authenticator parameter to snowflake_hook 
(#8642)
 add c7788a6  Add imap_attachment_to_s3 example dag and system test (#8669)

No new revisions were added by this update.

Summary of changes:
 CONTRIBUTING.rst   |  10 +-
 TESTING.rst|   8 +-
 airflow/config_templates/config.yml|  13 ++
 airflow/config_templates/default_airflow.cfg   |  10 ++
 airflow/config_templates/default_celery.py |   2 +-
 airflow/models/baseoperator.py |  32 -
 airflow/models/xcom_arg.py | 149 +++
 ...r_basic.py => example_imap_attachment_to_s3.py} |  39 +++--
 .../apache/spark/example_dags/example_spark_dag.py |   2 -
 airflow/providers/apache/spark/hooks/spark_jdbc.py |   2 +-
 airflow/providers/snowflake/hooks/snowflake.py |   8 +-
 airflow/providers/snowflake/operators/snowflake.py |  13 +-
 airflow/providers/ssh/operators/ssh.py |   2 +-
 backport_packages/setup_backport_packages.py   |   8 +-
 breeze |   2 +-
 docs/howto/custom-operator.rst |   6 +-
 .../operator/amazon/aws/imap_attachment_to_s3.rst  |  70 +
 docs/operators-and-hooks-ref.rst   |   4 +-
 scripts/ci/in_container/run_ci_tests.sh|   2 +-
 .../in_container/run_test_package_installation.sh  |   4 +-
 tests/models/test_xcom_arg.py  | 157 +
 .../test_imap_attachment_to_s3_system.py}  |  29 ++--
 tests/providers/snowflake/hooks/test_snowflake.py  |   6 +-
 23 files changed, 508 insertions(+), 70 deletions(-)
 create mode 100644 airflow/models/xcom_arg.py
 copy 
airflow/providers/amazon/aws/example_dags/{example_google_api_to_s3_transfer_basic.py
 => example_imap_attachment_to_s3.py} (50%)
 create mode 100644 docs/howto/operator/amazon/aws/imap_attachment_to_s3.rst
 create mode 100644 tests/models/test_xcom_arg.py
 copy tests/providers/{http/operators/test_http_system.py => 
amazon/aws/operators/test_imap_attachment_to_s3_system.py} (53%)



[airflow] branch master updated: Add imap_attachment_to_s3 example dag and system test (#8669)

2020-05-09 Thread kaxilnaik
This is an automated email from the ASF dual-hosted git repository.

kaxilnaik pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/airflow.git


The following commit(s) were added to refs/heads/master by this push:
 new c7788a6  Add imap_attachment_to_s3 example dag and system test (#8669)
c7788a6 is described below

commit c7788a6894cb79c22153434dd9b977393b8236be
Author: Felix Uellendall 
AuthorDate: Sun May 10 03:06:35 2020 +0200

Add imap_attachment_to_s3 example dag and system test (#8669)
---
 .../example_dags/example_imap_attachment_to_s3.py  | 53 
 .../operator/amazon/aws/imap_attachment_to_s3.rst  | 70 ++
 docs/operators-and-hooks-ref.rst   |  4 +-
 .../operators/test_imap_attachment_to_s3_system.py | 42 +
 4 files changed, 167 insertions(+), 2 deletions(-)

diff --git 
a/airflow/providers/amazon/aws/example_dags/example_imap_attachment_to_s3.py 
b/airflow/providers/amazon/aws/example_dags/example_imap_attachment_to_s3.py
new file mode 100644
index 000..7a0d86c
--- /dev/null
+++ b/airflow/providers/amazon/aws/example_dags/example_imap_attachment_to_s3.py
@@ -0,0 +1,53 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""
+This is an example dag for using `ImapAttachmentToS3Operator` to transfer an 
email attachment via IMAP
+protocol from a mail server to S3 Bucket.
+"""
+
+from os import getenv
+
+from airflow import DAG
+from airflow.providers.amazon.aws.operators.imap_attachment_to_s3 import 
ImapAttachmentToS3Operator
+from airflow.utils.dates import days_ago
+
+# [START howto_operator_imap_attachment_to_s3_env_variables]
+IMAP_ATTACHMENT_NAME = getenv("IMAP_ATTACHMENT_NAME", "test.txt")
+IMAP_MAIL_FOLDER = getenv("IMAP_MAIL_FOLDER", "INBOX")
+IMAP_MAIL_FILTER = getenv("IMAP_MAIL_FILTER", "All")
+S3_DESTINATION_KEY = getenv("S3_DESTINATION_KEY", "s3://bucket/key.json")
+# [END howto_operator_imap_attachment_to_s3_env_variables]
+
+default_args = {"start_date": days_ago(1)}
+
+with DAG(
+dag_id="example_imap_attachment_to_s3",
+default_args=default_args,
+schedule_interval=None,
+tags=['example']
+) as dag:
+# [START howto_operator_imap_attachment_to_s3_task_1]
+task_transfer_imap_attachment_to_s3 = ImapAttachmentToS3Operator(
+imap_attachment_name=IMAP_ATTACHMENT_NAME,
+s3_key=S3_DESTINATION_KEY,
+imap_mail_folder=IMAP_MAIL_FOLDER,
+imap_mail_filter=IMAP_MAIL_FILTER,
+task_id='transfer_imap_attachment_to_s3',
+dag=dag
+)
+# [END howto_operator_imap_attachment_to_s3_task_1]
diff --git a/docs/howto/operator/amazon/aws/imap_attachment_to_s3.rst 
b/docs/howto/operator/amazon/aws/imap_attachment_to_s3.rst
new file mode 100644
index 000..94cfea3
--- /dev/null
+++ b/docs/howto/operator/amazon/aws/imap_attachment_to_s3.rst
@@ -0,0 +1,70 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+
+
+.. _howto/operator:ImapAttachmentToS3Operator:
+
+Imap Attachment To S3 Operator
+==
+
+.. contents::
+  :depth: 1
+  :local:
+
+Overview
+
+
+The ``ImapAttachmentToS3Operator`` can transfer an email attachment via IMAP
+protocol from a mail server to S3 Bucket.
+
+An example dag ``example_imap_attachment_to_s3.py`` is provided which showcase 
the
+:class:`~airflow.providers.amazon.aws.operators.imap_attachment_to_s3.ImapAttachmentToS3Operator`
+in action.
+
+example_imap_attachment_to_

[jira] [Commented] (AIRFLOW-2310) Enable AWS Glue Job Integration

2020-05-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17103575#comment-17103575
 ] 

ASF GitHub Bot commented on AIRFLOW-2310:
-

abdulbasitds edited a comment on pull request #6007:
URL: https://github.com/apache/airflow/pull/6007#issuecomment-626255328


   > > Apparently, some unit tests are failing. I ran pytest this morning and
   > > reworked things; every file was passing all tests. I haven't changed
   > > anything, but now there is an error in the base AWS class.
   > > It is now failing in my local Breeze as well. I am trying to see if I
   > > can find the reason, but I can't find any change that I have made.
   > 
   > Your PR was not in sync with master, so most likely a related change was
   > merged too. Make sure to rebase onto master frequently.
   
   Yes, I just checked; after rebasing, there is a problem with a commit made
   to base_aws on 4th April by @baolsen involving the following lines:
   
   
   ```
   if not (self.client_type or self.resource_type):
       raise AirflowException(
           'Either client_type or resource_type'
           ' must be provided.')
   ```
   
   In the base AWS hook, client_type and resource_type are optional:
   
   ```
   client_type: Optional[str] = None,
   resource_type: Optional[str] = None,
   ```
   
   but it raises an exception if neither is provided. Maybe I need to pass
   client_type somehow, but as per my (vague) understanding, if they are not
   required, this if check should not be there.
   
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Enable AWS Glue Job Integration
> ---
>
> Key: AIRFLOW-2310
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2310
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: contrib
>Reporter: Olalekan Elesin
>Assignee: Olalekan Elesin
>Priority: Major
>  Labels: AWS
>
> Would it be possible to integrate AWS Glue into Airflow, such that Glue jobs 
> and ETL pipelines can be orchestrated with Airflow



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (AIRFLOW-2310) Enable AWS Glue Job Integration

2020-05-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17103576#comment-17103576
 ] 

ASF GitHub Bot commented on AIRFLOW-2310:
-

abdulbasitds commented on pull request #6007:
URL: https://github.com/apache/airflow/pull/6007#issuecomment-626256604


   > > > Apparently, some unit tests are failing. I ran pytest this morning
   > > > and reworked things; every file was passing all tests. I haven't
   > > > changed anything, but now there is an error in the base AWS class.
   > > > It is now failing in my local Breeze as well. I am trying to see if
   > > > I can find the reason, but I can't find any change that I have made.
   > > 
   > > 
   > > Your PR was not in sync with master, so most likely a related change
   > > was merged too. Make sure to rebase onto master frequently.
   > 
   > Yes, I just checked; after rebasing, there is a problem with a commit
   > made to base_aws on 4th April by @baolsen involving the following lines:
   > 
   > ```
   > if not (self.client_type or self.resource_type):
   >     raise AirflowException(
   >         'Either client_type or resource_type'
   >         ' must be provided.')
   > ```
   > 
   > In the base AWS hook, client_type and resource_type are optional:
   > 
   > ```
   > client_type: Optional[str] = None,
   > resource_type: Optional[str] = None,
   > ```
   > 
   > but it raises an exception if neither is provided. Maybe I need to pass
   > client_type somehow, but as per my (vague) understanding, if they are
   > not required, this if check should not be there.
   
   
   Okay, I needed to pass client_type='glue' from the hook to super().
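   
   A minimal sketch of that fix (the hook name and base-class import are
assumptions for illustration, not the PR's exact code):
   ```
   from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook
   
   class AwsGlueJobHook(AwsBaseHook):
       def __init__(self, *args, **kwargs):
           # AwsBaseHook raises unless client_type or resource_type is given,
           # so pass client_type explicitly up to the parent constructor.
           kwargs['client_type'] = 'glue'
           super().__init__(*args, **kwargs)
   ```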



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Enable AWS Glue Job Integration
> ---
>
> Key: AIRFLOW-2310
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2310
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: contrib
>Reporter: Olalekan Elesin
>Assignee: Olalekan Elesin
>Priority: Major
>  Labels: AWS
>
> Would it be possible to integrate AWS Glue into Airflow, such that Glue jobs 
> and ETL pipelines can be orchestrated with Airflow



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (AIRFLOW-5906) Add authenticator parameter to snowflake_hook

2020-05-09 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17103574#comment-17103574
 ] 

ASF subversion and git services commented on AIRFLOW-5906:
--

Commit cd635dd7d57cab2f41efac2d3d94e8f80a6c96d6 in airflow's branch 
refs/heads/master from Peter Kosztolanyi
[ https://gitbox.apache.org/repos/asf?p=airflow.git;h=cd635dd ]

[AIRFLOW-5906] Add authenticator parameter to snowflake_hook (#8642)



> Add authenticator parameter to snowflake_hook
> -
>
> Key: AIRFLOW-5906
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5906
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: authentication
>Affects Versions: 1.10.6
>Reporter: Salvador RIbolzi
>Assignee: Salvador RIbolzi
>Priority: Major
>  Labels: features
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> We are currently migrating to using SAML to log in to Snowflake; to do so, a
> parameter `authenticator=externalbrowser` must be set. Currently the hook for
> snowflake does not check for that parameter.
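
A minimal sketch of a connection carrying that parameter (the conn_id, login,
and account values are placeholders):
{code}
from airflow.models import Connection

conn = Connection(
    conn_id='snowflake_saml',
    conn_type='snowflake',
    login='user',
    extra='{"account": "my_account", "authenticator": "externalbrowser"}',
)
{code}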



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (AIRFLOW-5906) Add authenticator parameter to snowflake_hook

2020-05-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17103573#comment-17103573
 ] 

ASF GitHub Bot commented on AIRFLOW-5906:
-

boring-cyborg[bot] commented on pull request #8642:
URL: https://github.com/apache/airflow/pull/8642#issuecomment-626256549


   Awesome work, congrats on your first merged pull request!
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add authenticator parameter to snowflake_hook
> -
>
> Key: AIRFLOW-5906
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5906
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: authentication
>Affects Versions: 1.10.6
>Reporter: Salvador RIbolzi
>Assignee: Salvador RIbolzi
>Priority: Major
>  Labels: features
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> We are currently migrating to using SAML to log in to Snowflake; to do so, a
> parameter `authenticator=externalbrowser` must be set. Currently the hook for
> snowflake does not check for that parameter.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (AIRFLOW-2310) Enable AWS Glue Job Integration

2020-05-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17103572#comment-17103572
 ] 

ASF GitHub Bot commented on AIRFLOW-2310:
-

abdulbasitds edited a comment on pull request #6007:
URL: https://github.com/apache/airflow/pull/6007#issuecomment-626255328


   > > Apparently, some unit tests are failing. I ran pytest this morning and
   > > reworked things; every file was passing all tests. I haven't changed
   > > anything, but now there is an error in the base AWS class.
   > > It is now failing in my local Breeze as well. I am trying to see if I
   > > can find the reason, but I can't find any change that I have made.
   > 
   > Your PR was not in sync with master, so most likely a related change was
   > merged too. Make sure to rebase onto master frequently.
   
   Yes, I just checked; after rebasing, there is a problem with a commit made
   to base_aws on 4th April by @baolsen involving the following lines:
   
   
   ```
   if not (self.client_type or self.resource_type):
       raise AirflowException(
           'Either client_type or resource_type'
           ' must be provided.')
   ```
   
   In the base AWS hook, client_type and resource_type are optional:
   
   ```
   client_type: Optional[str] = None,
   resource_type: Optional[str] = None,
   ```
   
   but it raises an exception if neither is provided. Maybe I need to pass
   client_type somehow, but as per my (vague) understanding, if they are not
   required, this if check should not be there.
   
   Okay, I needed to pass client_type='glue' from the hook to super().
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Enable AWS Glue Job Integration
> ---
>
> Key: AIRFLOW-2310
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2310
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: contrib
>Reporter: Olalekan Elesin
>Assignee: Olalekan Elesin
>Priority: Major
>  Labels: AWS
>
> Would it be possible to integrate AWS Glue into Airflow, such that Glue jobs 
> and ETL pipelines can be orchestrated with Airflow



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [airflow] abdulbasitds commented on pull request #6007: [AIRFLOW-2310] Enable AWS Glue Job Integration

2020-05-09 Thread GitBox


abdulbasitds commented on pull request #6007:
URL: https://github.com/apache/airflow/pull/6007#issuecomment-626256604


   > > > Apparently, some unit tests are failing. I ran pytest this morning
   > > > and reworked things; every file was passing all tests. I haven't
   > > > changed anything, but now there is an error in the base AWS class.
   > > > It is now failing in my local Breeze as well. I am trying to see if
   > > > I can find the reason, but I can't find any change that I have made.
   > > 
   > > 
   > > Your PR was not in sync with master, so most likely a related change
   > > was merged too. Make sure to rebase onto master frequently.
   > 
   > Yes, I just checked; after rebasing, there is a problem with a commit
   > made to base_aws on 4th April by @baolsen involving the following lines:
   > 
   > ```
   > if not (self.client_type or self.resource_type):
   >     raise AirflowException(
   >         'Either client_type or resource_type'
   >         ' must be provided.')
   > ```
   > 
   > In the base AWS hook, client_type and resource_type are optional:
   > 
   > ```
   > client_type: Optional[str] = None,
   > resource_type: Optional[str] = None,
   > ```
   > 
   > but it raises an exception if neither is provided. Maybe I need to pass
   > client_type somehow, but as per my (vague) understanding, if they are
   > not required, this if check should not be there.
   
   
   Okay, I needed to pass client_type='glue' from the hook to super().



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] abdulbasitds edited a comment on pull request #6007: [AIRFLOW-2310] Enable AWS Glue Job Integration

2020-05-09 Thread GitBox


abdulbasitds edited a comment on pull request #6007:
URL: https://github.com/apache/airflow/pull/6007#issuecomment-626255328


   > > Apparently, some unit tests are failing. I ran pytest this morning and
   > > reworked things; every file was passing all tests. I haven't changed
   > > anything, but now there is an error in the base AWS class.
   > > It is now failing in my local Breeze as well. I am trying to see if I
   > > can find the reason, but I can't find any change that I have made.
   > 
   > Your PR was not in sync with master, so most likely a related change was
   > merged too. Make sure to rebase onto master frequently.
   
   Yes, I just checked; after rebasing, there is a problem with a commit made
   to base_aws on 4th April by @baolsen involving the following lines:
   
   
   ```
   if not (self.client_type or self.resource_type):
       raise AirflowException(
           'Either client_type or resource_type'
           ' must be provided.')
   ```
   
   In the base AWS hook, client_type and resource_type are optional:
   
   ```
   client_type: Optional[str] = None,
   resource_type: Optional[str] = None,
   ```
   
   but it raises an exception if neither is provided. Maybe I need to pass
   client_type somehow, but as per my (vague) understanding, if they are not
   required, this if check should not be there.
   
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[airflow] branch master updated (21cc7d7 -> cd635dd)

2020-05-09 Thread kaxilnaik
This is an automated email from the ASF dual-hosted git repository.

kaxilnaik pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/airflow.git.


from 21cc7d7  Document default timeout value for SSHOperator (#8744)
 add cd635dd  [AIRFLOW-5906] Add authenticator parameter to snowflake_hook 
(#8642)

No new revisions were added by this update.

Summary of changes:
 airflow/providers/snowflake/hooks/snowflake.py |  8 +---
 airflow/providers/snowflake/operators/snowflake.py | 13 +++--
 tests/providers/snowflake/hooks/test_snowflake.py  |  6 --
 3 files changed, 20 insertions(+), 7 deletions(-)



[GitHub] [airflow] boring-cyborg[bot] commented on pull request #8642: [AIRFLOW-5906] Add authenticator parameter to snowflake_hook

2020-05-09 Thread GitBox


boring-cyborg[bot] commented on pull request #8642:
URL: https://github.com/apache/airflow/pull/8642#issuecomment-626256549


   Awesome work, congrats on your first merged pull request!
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (AIRFLOW-5906) Add authenticator parameter to snowflake_hook

2020-05-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17103569#comment-17103569
 ] 

ASF GitHub Bot commented on AIRFLOW-5906:
-

kaxil commented on a change in pull request #8642:
URL: https://github.com/apache/airflow/pull/8642#discussion_r422566889



##
File path: tests/providers/snowflake/hooks/test_snowflake.py
##
@@ -134,3 +136,15 @@ def test_key_pair_auth_not_encrypted(self):
         self.conn.password = None
         params = self.db_hook._get_conn_params()
         self.assertTrue('private_key' in params)
+
+    def test_authenticator(self):
+        self.conn.extra_dejson = {'database': 'db',
+                                  'account': 'airflow',
+                                  'warehouse': 'af_wh',
+                                  'region': 'af_region',
+                                  'role': 'af_role',
+                                  'authenticator': 'externalbrowser'}
+
+        uri_shouldbe = 'snowflake://user:pw@airflow/db/public?warehouse=af_wh&role=af_role' \
+                       '&authenticator=externalbrowser'
+        self.assertEqual(uri_shouldbe, self.db_hook.get_uri())

Review comment:
   ```suggestion
   ```





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add authenticator parameter to snowflake_hook
> -
>
> Key: AIRFLOW-5906
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5906
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: authentication
>Affects Versions: 1.10.6
>Reporter: Salvador RIbolzi
>Assignee: Salvador RIbolzi
>Priority: Major
>  Labels: features
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> We are currently migrating to using SAML to log in to Snowflake; to do so, a 
> parameter `authenticator=externalbrowser` must be set. Currently the hook for 
> Snowflake does not check for that parameter.
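A hedged sketch of a connection carrying that extra (values illustrative; the 
Connection model import is standard Airflow):

```
# Illustrative Snowflake connection whose extras include the new
# 'authenticator' key that the hook reads.
from airflow.models.connection import Connection

sf_conn = Connection(
    conn_id='snowflake_default',
    conn_type='snowflake',
    login='user',
    password='pw',
    extra='{"account": "airflow", "warehouse": "af_wh", '
          '"authenticator": "externalbrowser"}',
)
```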



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [airflow] abdulbasitds edited a comment on pull request #6007: [AIRFLOW-2310] Enable AWS Glue Job Integration

2020-05-09 Thread GitBox


abdulbasitds edited a comment on pull request #6007:
URL: https://github.com/apache/airflow/pull/6007#issuecomment-626255328


   > > Apparently, some unit tests are failing. I did a pytest run in the 
   > > morning and reworked; every file was passing all tests. I haven't changed 
   > > anything, but now there is an error in the base AWS class.
   > > It is now failing on my local Breeze as well. Trying to see if I can 
   > > find the reason, but I can't find any change that I have made.
   > 
   > Your PR was not in sync with the master, so most likely a related change 
   > might have been merged too. So make sure to rebase onto master frequently.
   
   Yes, I just checked that after rebasing there is a problem with a commit that 
   was made to base_aws on 4th April by @baolsen, involving the following lines:
   
   
   ```
   if not (self.client_type or self.resource_type):
       raise AirflowException(
           'Either client_type or resource_type'
           ' must be provided.')
   ```
   
   In the base AWS hook, client_type and resource_type are optional:
   
   ```
   client_type: Optional[str] = None,
   resource_type: Optional[str] = None,
   ```
   
   but it raises an exception if neither is provided. Maybe I need to pass 
   client_type somehow. But as per my (vague) understanding, if they are not 
   required, this if check should not be there.
   
   Okay, I needed to pass client_type='glue' to super() in the hook.
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] kaxil commented on a change in pull request #8642: [AIRFLOW-5906] Add authenticator parameter to snowflake_hook

2020-05-09 Thread GitBox


kaxil commented on a change in pull request #8642:
URL: https://github.com/apache/airflow/pull/8642#discussion_r422566889



##
File path: tests/providers/snowflake/hooks/test_snowflake.py
##
@@ -134,3 +136,15 @@ def test_key_pair_auth_not_encrypted(self):
         self.conn.password = None
         params = self.db_hook._get_conn_params()
         self.assertTrue('private_key' in params)
+
+    def test_authenticator(self):
+        self.conn.extra_dejson = {'database': 'db',
+                                  'account': 'airflow',
+                                  'warehouse': 'af_wh',
+                                  'region': 'af_region',
+                                  'role': 'af_role',
+                                  'authenticator': 'externalbrowser'}
+
+        uri_shouldbe = 'snowflake://user:pw@airflow/db/public?warehouse=af_wh&role=af_role' \
+                       '&authenticator=externalbrowser'
+        self.assertEqual(uri_shouldbe, self.db_hook.get_uri())

Review comment:
   ```suggestion
   ```





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] kaxil commented on a change in pull request #8642: [AIRFLOW-5906] Add authenticator parameter to snowflake_hook

2020-05-09 Thread GitBox


kaxil commented on a change in pull request #8642:
URL: https://github.com/apache/airflow/pull/8642#discussion_r422566593



##
File path: tests/providers/snowflake/hooks/test_snowflake.py
##
@@ -134,3 +136,15 @@ def test_key_pair_auth_not_encrypted(self):
         self.conn.password = None
         params = self.db_hook._get_conn_params()
         self.assertTrue('private_key' in params)
+
+    def test_authenticator(self):
+        self.conn.extra_dejson = {'database': 'db',
+                                  'account': 'airflow',
+                                  'warehouse': 'af_wh',
+                                  'region': 'af_region',
+                                  'role': 'af_role',
+                                  'authenticator': 'externalbrowser'}
+
+        uri_shouldbe = 'snowflake://user:pw@airflow/db/public?warehouse=af_wh&role=af_role' \
+                       '&authenticator=externalbrowser'
+        self.assertEqual(uri_shouldbe, self.db_hook.get_uri())

Review comment:
   This test is not related to the changes in this PR. It just checks that if 
the `extra_dejson` of any Connection object is changed, the changes are 
reflected and can be retrieved with `get_uri()`.
   
   





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (AIRFLOW-5906) Add authenticator parameter to snowflake_hook

2020-05-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17103568#comment-17103568
 ] 

ASF GitHub Bot commented on AIRFLOW-5906:
-

kaxil commented on a change in pull request #8642:
URL: https://github.com/apache/airflow/pull/8642#discussion_r422566593



##
File path: tests/providers/snowflake/hooks/test_snowflake.py
##
@@ -134,3 +136,15 @@ def test_key_pair_auth_not_encrypted(self):
         self.conn.password = None
         params = self.db_hook._get_conn_params()
         self.assertTrue('private_key' in params)
+
+    def test_authenticator(self):
+        self.conn.extra_dejson = {'database': 'db',
+                                  'account': 'airflow',
+                                  'warehouse': 'af_wh',
+                                  'region': 'af_region',
+                                  'role': 'af_role',
+                                  'authenticator': 'externalbrowser'}
+
+        uri_shouldbe = 'snowflake://user:pw@airflow/db/public?warehouse=af_wh&role=af_role' \
+                       '&authenticator=externalbrowser'
+        self.assertEqual(uri_shouldbe, self.db_hook.get_uri())

Review comment:
   This test is not related to the changes in this PR. It just checks that if 
the `extra_dejson` of any Connection object is changed, the changes are 
reflected and can be retrieved with `get_uri()`.
   
   





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add authenticator parameter to snowflake_hook
> -
>
> Key: AIRFLOW-5906
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5906
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: authentication
>Affects Versions: 1.10.6
>Reporter: Salvador RIbolzi
>Assignee: Salvador RIbolzi
>Priority: Major
>  Labels: features
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> We are currently migrating to using SAML to log in to Snowflake; to do so, a 
> parameter `authenticator=externalbrowser` must be set. Currently the hook for 
> Snowflake does not check for that parameter.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (AIRFLOW-2310) Enable AWS Glue Job Integration

2020-05-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17103564#comment-17103564
 ] 

ASF GitHub Bot commented on AIRFLOW-2310:
-

abdulbasitds edited a comment on pull request #6007:
URL: https://github.com/apache/airflow/pull/6007#issuecomment-626255328


   > > Apparently, some unit tests are failing. I did a pytest run in the 
   > > morning and reworked; every file was passing all tests. I haven't changed 
   > > anything, but now there is an error in the base AWS class.
   > > It is now failing on my local Breeze as well. Trying to see if I can 
   > > find the reason, but I can't find any change that I have made.
   > 
   > Your PR was not in sync with the master, so most likely a related change 
   > might have been merged too. So make sure to rebase onto master frequently.
   
   Yes, I just checked that after rebasing there is a problem with a commit that 
   was made to base_aws on 4th April by @baolsen, involving the following lines:
   
   
   ```
   if not (self.client_type or self.resource_type):
       raise AirflowException(
           'Either client_type or resource_type'
           ' must be provided.')
   ```
   
   In the base AWS hook, client_type and resource_type are optional:
   
   ```
   client_type: Optional[str] = None,
   resource_type: Optional[str] = None,
   ```
   
   but it raises an exception if neither is provided. Maybe I need to pass 
   client_type somehow. But as per my (vague) understanding, if they are not 
   required, this if check should not be there.
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Enable AWS Glue Job Integration
> ---
>
> Key: AIRFLOW-2310
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2310
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: contrib
>Reporter: Olalekan Elesin
>Assignee: Olalekan Elesin
>Priority: Major
>  Labels: AWS
>
> Would it be possible to integrate AWS Glue into Airflow, such that Glue jobs 
> and ETL pipelines can be orchestrated with Airflow?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [airflow] abdulbasitds edited a comment on pull request #6007: [AIRFLOW-2310] Enable AWS Glue Job Integration

2020-05-09 Thread GitBox


abdulbasitds edited a comment on pull request #6007:
URL: https://github.com/apache/airflow/pull/6007#issuecomment-626255328


   > > Apparently, some unit tests are failing. I did a pytest run in the 
   > > morning and reworked; every file was passing all tests. I haven't changed 
   > > anything, but now there is an error in the base AWS class.
   > > It is now failing on my local Breeze as well. Trying to see if I can 
   > > find the reason, but I can't find any change that I have made.
   > 
   > Your PR was not in sync with the master, so most likely a related change 
   > might have been merged too. So make sure to rebase onto master frequently.
   
   Yes, I just checked that after rebasing there is a problem with a commit that 
   was made to base_aws on 4th April by @baolsen, involving the following lines:
   
   
   ```
   if not (self.client_type or self.resource_type):
       raise AirflowException(
           'Either client_type or resource_type'
           ' must be provided.')
   ```
   
   In the base AWS hook, client_type and resource_type are optional:
   
   ```
   client_type: Optional[str] = None,
   resource_type: Optional[str] = None,
   ```
   
   but it raises an exception if neither is provided. Maybe I need to pass 
   client_type somehow. But as per my (vague) understanding, if they are not 
   required, this if check should not be there.
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] abdulbasitds edited a comment on pull request #6007: [AIRFLOW-2310] Enable AWS Glue Job Integration

2020-05-09 Thread GitBox


abdulbasitds edited a comment on pull request #6007:
URL: https://github.com/apache/airflow/pull/6007#issuecomment-626255328


   > > Apparently, some unit tests are failing. I did a pytest run in the 
   > > morning and reworked; every file was passing all tests. I haven't changed 
   > > anything, but now there is an error in the base AWS class.
   > > It is now failing on my local Breeze as well. Trying to see if I can 
   > > find the reason, but I can't find any change that I have made.
   > 
   > Your PR was not in sync with the master, so most likely a related change 
   > might have been merged too. So make sure to rebase onto master frequently.
   
   Yes, I just checked that after rebasing there is a problem with a commit that 
   was made to base_aws on 4th April by @baolsen:
   
   
   ```
   if not (self.client_type or self.resource_type):
       raise AirflowException(
           'Either client_type or resource_type'
           ' must be provided.')
   ```
   
   In the base AWS hook, client and resource are optional:
   
   ```
   client_type: Optional[str] = None,
   resource_type: Optional[str] = None,
   ```
   
   but it raises an exception if neither is provided. Maybe I need to pass 
   client_type somehow. But as per my (vague) understanding, if they are not 
   required, this if check should not be there.
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (AIRFLOW-2310) Enable AWS Glue Job Integration

2020-05-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17103562#comment-17103562
 ] 

ASF GitHub Bot commented on AIRFLOW-2310:
-

abdulbasitds edited a comment on pull request #6007:
URL: https://github.com/apache/airflow/pull/6007#issuecomment-626255328


   > > Apparently, some unit tests are failing. I did a pytest run in the 
   > > morning and reworked; every file was passing all tests. I haven't changed 
   > > anything, but now there is an error in the base AWS class.
   > > It is now failing on my local Breeze as well. Trying to see if I can 
   > > find the reason, but I can't find any change that I have made.
   > 
   > Your PR was not in sync with the master, so most likely a related change 
   > might have been merged too. So make sure to rebase onto master frequently.
   
   Yes, I just checked that after rebasing there is a problem with a commit that 
   was made to base_aws on 4th April by @baolsen:
   
   
   ```
   if not (self.client_type or self.resource_type):
       raise AirflowException(
           'Either client_type or resource_type'
           ' must be provided.')
   ```
   
   In the base AWS hook, client and resource are optional:
   
   ```
   client_type: Optional[str] = None,
   resource_type: Optional[str] = None,
   ```
   
   but it raises an exception if neither is provided. Maybe I need to pass 
   client_type somehow. But as per my (vague) understanding, if they are not 
   required, this if check should not be there.
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Enable AWS Glue Job Integration
> ---
>
> Key: AIRFLOW-2310
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2310
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: contrib
>Reporter: Olalekan Elesin
>Assignee: Olalekan Elesin
>Priority: Major
>  Labels: AWS
>
> Would it be possible to integrate AWS Glue into Airflow, such that Glue jobs 
> and ETL pipelines can be orchestrated with Airflow?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (AIRFLOW-2310) Enable AWS Glue Job Integration

2020-05-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17103561#comment-17103561
 ] 

ASF GitHub Bot commented on AIRFLOW-2310:
-

abdulbasitds commented on pull request #6007:
URL: https://github.com/apache/airflow/pull/6007#issuecomment-626255328


   > > Apparently, some unit tests are failing. I did a pytest run in the 
   > > morning and reworked; every file was passing all tests. I haven't changed 
   > > anything, but now there is an error in the base AWS class.
   > > It is now failing on my local Breeze as well. Trying to see if I can 
   > > find the reason, but I can't find any change that I have made.
   > 
   > Your PR was not in sync with the master, so most likely a related change 
   > might have been merged too. So make sure to rebase onto master frequently.
   
   Yes, I just checked that after rebasing there is a problem with a commit that 
   was made to base_aws on 4th April by @baolsen:
   
   
   ```
   if not (self.client_type or self.resource_type):
       raise AirflowException(
           'Either client_type or resource_type'
           ' must be provided.')
   ```
   
   In the base AWS hook, client and resource are optional:
   
   ```
   client_type: Optional[str] = None,
   resource_type: Optional[str] = None,
   ```
   
   but it raises an exception if neither is provided. Maybe I need to pass 
   client_type somehow. But as per my (vague) understanding, if they are not 
   required, this if check should not be there.
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Enable AWS Glue Job Integration
> ---
>
> Key: AIRFLOW-2310
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2310
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: contrib
>Reporter: Olalekan Elesin
>Assignee: Olalekan Elesin
>Priority: Major
>  Labels: AWS
>
> Would it be possible to integrate AWS Glue into Airflow, such that Glue jobs 
> and ETL pipelines can be orchestrated with Airflow?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [airflow] abdulbasitds commented on pull request #6007: [AIRFLOW-2310] Enable AWS Glue Job Integration

2020-05-09 Thread GitBox


abdulbasitds commented on pull request #6007:
URL: https://github.com/apache/airflow/pull/6007#issuecomment-626255328


   > > Apparently, some unit tests are failing. I did a pytest run in the 
   > > morning and reworked; every file was passing all tests. I haven't changed 
   > > anything, but now there is an error in the base AWS class.
   > > It is now failing on my local Breeze as well. Trying to see if I can 
   > > find the reason, but I can't find any change that I have made.
   > 
   > Your PR was not in sync with the master, so most likely a related change 
   > might have been merged too. So make sure to rebase onto master frequently.
   
   Yes, I just checked that after rebasing there is a problem with a commit that 
   was made to base_aws on 4th April by @baolsen:
   
   
   ```
   if not (self.client_type or self.resource_type):
       raise AirflowException(
           'Either client_type or resource_type'
           ' must be provided.')
   ```
   
   In the base AWS hook, client and resource are optional:
   
   ```
   client_type: Optional[str] = None,
   resource_type: Optional[str] = None,
   ```
   
   but it raises an exception if neither is provided. Maybe I need to pass 
   client_type somehow. But as per my (vague) understanding, if they are not 
   required, this if check should not be there.
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] kaxil commented on pull request #8749: add AWS StepFunctions integrations to the aws provider

2020-05-09 Thread GitBox


kaxil commented on pull request #8749:
URL: https://github.com/apache/airflow/pull/8749#issuecomment-626255210


   cc @baolsen 



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] boring-cyborg[bot] commented on pull request #8744: Document default timeout value for SSHOperator

2020-05-09 Thread GitBox


boring-cyborg[bot] commented on pull request #8744:
URL: https://github.com/apache/airflow/pull/8744#issuecomment-626254623


   Awesome work, congrats on your first merged pull request!
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[airflow] branch master updated: Document default timeout value for SSHOperator (#8744)

2020-05-09 Thread kaxilnaik
This is an automated email from the ASF dual-hosted git repository.

kaxilnaik pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/airflow.git


The following commit(s) were added to refs/heads/master by this push:
 new 21cc7d7  Document default timeout value for SSHOperator (#8744)
21cc7d7 is described below

commit 21cc7d729827e9f3af0698bf647b2d41fc87b11c
Author: Abhilash Kishore 
AuthorDate: Sat May 9 17:35:41 2020 -0700

Document default timeout value for SSHOperator (#8744)
---
 airflow/providers/ssh/operators/ssh.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/airflow/providers/ssh/operators/ssh.py b/airflow/providers/ssh/operators/ssh.py
index 532a4a2..04c996f 100644
--- a/airflow/providers/ssh/operators/ssh.py
+++ b/airflow/providers/ssh/operators/ssh.py
@@ -42,7 +42,7 @@ class SSHOperator(BaseOperator):
     :type remote_host: str
     :param command: command to execute on remote host. (templated)
     :type command: str
-    :param timeout: timeout (in seconds) for executing the command.
+    :param timeout: timeout (in seconds) for executing the command. The default is 10 seconds.
     :type timeout: int
     :param environment: a dict of shell environment variables. Note that the
         server will reject them silently if `AcceptEnv` is not set in SSH config.
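A minimal usage sketch of the documented parameter (import path taken from the
patch above; the connection id and command are hypothetical):

```
# Overriding the 10-second default documented in this change.
from airflow.providers.ssh.operators.ssh import SSHOperator

long_task = SSHOperator(
    task_id='run_backup',
    ssh_conn_id='my_ssh',   # hypothetical connection id
    command='backup.sh',    # hypothetical command
    timeout=300,            # seconds; the default would be 10
)
```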



[airflow] branch master updated: Fix docs on creating CustomOperator (#8678)

2020-05-09 Thread kaxilnaik
This is an automated email from the ASF dual-hosted git repository.

kaxilnaik pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/airflow.git


The following commit(s) were added to refs/heads/master by this push:
 new 5e1c33a  Fix docs on creating CustomOperator (#8678)
5e1c33a is described below

commit 5e1c33a1baf0725eeb695a96b29ddd9585df51e4
Author: Jonny Fuller 
AuthorDate: Sat May 9 20:33:45 2020 -0400

Fix docs on creating CustomOperator (#8678)
---
 docs/howto/custom-operator.rst | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/docs/howto/custom-operator.rst b/docs/howto/custom-operator.rst
index 0951b18..6cd8510 100644
--- a/docs/howto/custom-operator.rst
+++ b/docs/howto/custom-operator.rst
@@ -162,7 +162,7 @@ the operator.
             self.name = name
 
         def execute(self, context):
-            message = "Hello from {}".format(name)
+            message = "Hello from {}".format(self.name)
             print(message)
             return message
 
@@ -171,9 +171,9 @@ You can use the template as follows:
 .. code:: python
 
     with dag:
-        hello_task = HelloOperator(task_id='task_id_1', dag=dag, name='{{ task_id }}')
+        hello_task = HelloOperator(task_id='task_id_1', dag=dag, name='{{ task_instance.task_id }}')
 
-In this example, Jinja looks for the ``name`` parameter and substitutes ``{{ task_id }}`` with
+In this example, Jinja looks for the ``name`` parameter and substitutes ``{{ task_instance.task_id }}`` with
 ``task_id_1``.
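The substitution only works because the guide's operator declares ``name`` as a
templated field; a sketch of the operator, abridged to the relevant parts (the
full version lives in docs/howto/custom-operator.rst):

```
# HelloOperator as in the how-to guide; template_fields is what makes the
# `name` argument Jinja-rendered at runtime.
from airflow.models.baseoperator import BaseOperator
from airflow.utils.decorators import apply_defaults


class HelloOperator(BaseOperator):
    template_fields = ('name',)

    @apply_defaults
    def __init__(self, name: str, **kwargs) -> None:
        super().__init__(**kwargs)
        self.name = name

    def execute(self, context):
        message = "Hello from {}".format(self.name)
        print(message)
        return message
```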
 
 



[GitHub] [airflow] boring-cyborg[bot] commented on pull request #8678: Updated docs so demo code runs

2020-05-09 Thread GitBox


boring-cyborg[bot] commented on pull request #8678:
URL: https://github.com/apache/airflow/pull/8678#issuecomment-626254500


   Awesome work, congrats on your first merged pull request!
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] kaxil commented on pull request #6007: [AIRFLOW-2310] Enable AWS Glue Job Integration

2020-05-09 Thread GitBox


kaxil commented on pull request #6007:
URL: https://github.com/apache/airflow/pull/6007#issuecomment-626254185


   > Apparently, some unit tests are failing. I did a pytest run in the morning 
   > and reworked; every file was passing all tests. I haven't changed anything, 
   > but now there is an error in the base AWS class.
   > 
   > It is now failing on my local Breeze as well. Trying to see if I can find 
   > the reason, but I can't find any change that I have made.
   
   Your PR was not in sync with the master, so most likely a related change 
   might have been merged too. So make sure to rebase onto master frequently.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (AIRFLOW-2310) Enable AWS Glue Job Integration

2020-05-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17103556#comment-17103556
 ] 

ASF GitHub Bot commented on AIRFLOW-2310:
-

kaxil commented on pull request #6007:
URL: https://github.com/apache/airflow/pull/6007#issuecomment-626254185


   > Apparently, some unit tests are failing. I did a pytest run in the morning 
   > and reworked; every file was passing all tests. I haven't changed anything, 
   > but now there is an error in the base AWS class.
   > 
   > It is now failing on my local Breeze as well. Trying to see if I can find 
   > the reason, but I can't find any change that I have made.
   
   Your PR was not in sync with the master, so most likely a related change 
   might have been merged too. So make sure to rebase onto master frequently.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Enable AWS Glue Job Integration
> ---
>
> Key: AIRFLOW-2310
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2310
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: contrib
>Reporter: Olalekan Elesin
>Assignee: Olalekan Elesin
>Priority: Major
>  Labels: AWS
>
> Would it be possible to integrate AWS Glue into Airflow, such that Glue jobs 
> and ETL pipelines can be orchestrated with Airflow?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [airflow] kaxil opened a new pull request #8802: Allow passing secrets backend_kwargs to AWS SSM client

2020-05-09 Thread GitBox


kaxil opened a new pull request #8802:
URL: https://github.com/apache/airflow/pull/8802


   Allow passing secrets backend_kwargs to AWS SSM client
   
   We were already doing this for all other Secret backends.
   
   ---
   Make sure to mark the boxes below before creating PR: [x]
   
   - [x] Description above provides context of the change
   - [x] Unit tests coverage for changes (not needed for documentation changes)
   - [x] Target Github ISSUE in description if exists
   - [x] Commits follow "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)"
   - [x] Relevant documentation is updated including usage instructions.
   - [x] I will engage committers as explained in [Contribution Workflow 
Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example).
   
   ---
   In case of fundamental code change, Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals))
 is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party 
License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in 
[UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
   Read the [Pull Request 
Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines)
 for more information.
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (AIRFLOW-2310) Enable AWS Glue Job Integration

2020-05-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17103552#comment-17103552
 ] 

ASF GitHub Bot commented on AIRFLOW-2310:
-

abdulbasitds commented on pull request #6007:
URL: https://github.com/apache/airflow/pull/6007#issuecomment-626253136


   Apparently, some unit tests are failing. I did a pytest run in the morning and 
   reworked; every file was passing all tests. I haven't changed anything, but now 
   there is an error in the base AWS class.
   
   It is now failing on my local Breeze as well. Trying to see if I can find 
   the reason, but I can't find any change that I have made.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Enable AWS Glue Job Integration
> ---
>
> Key: AIRFLOW-2310
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2310
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: contrib
>Reporter: Olalekan Elesin
>Assignee: Olalekan Elesin
>Priority: Major
>  Labels: AWS
>
> Would it be possible to integrate AWS Glue into Airflow, such that Glue jobs 
> and ETL pipelines can be orchestrated with Airflow?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [airflow] abdulbasitds commented on pull request #6007: [AIRFLOW-2310] Enable AWS Glue Job Integration

2020-05-09 Thread GitBox


abdulbasitds commented on pull request #6007:
URL: https://github.com/apache/airflow/pull/6007#issuecomment-626253136


   Apparently, some unit tests are failing. I did a pytest run in the morning and 
   reworked; every file was passing all tests. I haven't changed anything, but now 
   there is an error in the base AWS class.
   
   It is now failing on my local Breeze as well. Trying to see if I can find 
   the reason, but I can't find any change that I have made.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] gdevanla commented on issue #8696: Skip task itself instead of all downstream tasks

2020-05-09 Thread GitBox


gdevanla commented on issue #8696:
URL: https://github.com/apache/airflow/issues/8696#issuecomment-626252453


   @yuqian90 I believe you are pointing to this line of code, which should not 
be performing this task here:
   
   
https://github.com/apache/airflow/blob/7506c73f1721151e9c50ef8bdb70d2136a16190b/airflow/sensors/base_sensor_operator.py#L121
   
   As I understand it, when the tasks are not skipped here, the existing logic 
in TriggerRuleDep._evaluate_trigger_rule() takes care of skipping them under the 
appropriate conditions:
   
   
https://github.com/apache/airflow/blob/7506c73f1721151e9c50ef8bdb70d2136a16190b/airflow/ti_deps/deps/trigger_rule_dep.py#L128
   
   Any thoughts on a change along these lines?
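   For context, the trigger-rule mechanism referenced above is the one users 
   reach through an operator argument; a small hedged sketch (task id illustrative):
   
   ```
   # A downstream task that still runs when an upstream sensor is skipped,
   # relying on TriggerRuleDep's evaluation instead of cascading skips.
   from airflow.operators.dummy_operator import DummyOperator
   from airflow.utils.trigger_rule import TriggerRule
   
   cleanup = DummyOperator(
       task_id='cleanup',
       trigger_rule=TriggerRule.NONE_FAILED,  # skipped upstreams do not block it
   )
   ```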



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (AIRFLOW-2310) Enable AWS Glue Job Integration

2020-05-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17103533#comment-17103533
 ] 

ASF GitHub Bot commented on AIRFLOW-2310:
-

abdulbasitds commented on pull request #6007:
URL: https://github.com/apache/airflow/pull/6007#issuecomment-626248933


   > > > doesn't conform to snake_case naming style
   > > 
   > > 
   > > @kaxil I have solved the other issues (haven't pushed yet), but for the last 
   > > one (reducing the number of arguments) I will have to change all files (for 
   > > example, if I convert some arguments into a dictionary). Should I do it like this?
   > 
   > You can disable it as in
   > 
   > 
https://github.com/apache/airflow/blob/7506c73f1721151e9c50ef8bdb70d2136a16190b/airflow/providers/google/cloud/operators/stackdriver.py#L82-L96
   
   Thanks a lot, done and pushed. No errors now with Breeze.
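   For reference, the disable pattern being pointed at looks roughly like this 
   (a sketch; the operator and argument names are illustrative, and the rule 
   name for the argument-count warning is assumed to be too-many-arguments):
   
   ```
   # Locally silencing pylint's argument-count check, in the style of the
   # stackdriver operators linked above.
   class ExampleGlueOperator:
       # pylint: disable=too-many-arguments
       def __init__(self, job_name, script_location, concurrent_run_limit,
                    script_args, retry_limit, num_of_dpus, region_name):
           self.job_name = job_name
           self.script_location = script_location
           self.concurrent_run_limit = concurrent_run_limit
           self.script_args = script_args
           self.retry_limit = retry_limit
           self.num_of_dpus = num_of_dpus
           self.region_name = region_name
   ```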



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Enable AWS Glue Job Integration
> ---
>
> Key: AIRFLOW-2310
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2310
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: contrib
>Reporter: Olalekan Elesin
>Assignee: Olalekan Elesin
>Priority: Major
>  Labels: AWS
>
> Would it be possible to integrate AWS Glue into Airflow, such that Glue jobs 
> and ETL pipelines can be orchestrated with Airflow?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [airflow] abdulbasitds commented on pull request #6007: [AIRFLOW-2310] Enable AWS Glue Job Integration

2020-05-09 Thread GitBox


abdulbasitds commented on pull request #6007:
URL: https://github.com/apache/airflow/pull/6007#issuecomment-626248933


   > > > doesn't conform to snake_case naming style
   > > 
   > > 
   > > @kaxil I have solved the other issues (haven't pushed yet), but for the last 
   > > one (reducing the number of arguments) I will have to change all files (for 
   > > example, if I convert some arguments into a dictionary). Should I do it like this?
   > 
   > You can disable it as in
   > 
   > 
https://github.com/apache/airflow/blob/7506c73f1721151e9c50ef8bdb70d2136a16190b/airflow/providers/google/cloud/operators/stackdriver.py#L82-L96
   
   Thanks a lot, done and pushed. No errors now with Breeze.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (AIRFLOW-2310) Enable AWS Glue Job Integration

2020-05-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17103530#comment-17103530
 ] 

ASF GitHub Bot commented on AIRFLOW-2310:
-

kaxil edited a comment on pull request #6007:
URL: https://github.com/apache/airflow/pull/6007#issuecomment-626248652


   > > doesn't conform to snake_case naming style
   > 
   > @kaxil I have solved the other issues (haven't pushed yet), but for the last 
   > one (reducing the number of arguments) I will have to change all files (for 
   > example, if I convert some arguments into a dictionary). Should I do it like this?
   
   You can disable it as in 
https://github.com/apache/airflow/blob/7506c73f1721151e9c50ef8bdb70d2136a16190b/airflow/providers/google/cloud/operators/stackdriver.py#L82
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Enable AWS Glue Job Integration
> ---
>
> Key: AIRFLOW-2310
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2310
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: contrib
>Reporter: Olalekan Elesin
>Assignee: Olalekan Elesin
>Priority: Major
>  Labels: AWS
>
> Would it be possible to integrate AWS Glue into Airflow, such that Glue jobs 
> and ETL pipelines can be orchestrated with Airflow?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [airflow] kaxil edited a comment on pull request #6007: [AIRFLOW-2310] Enable AWS Glue Job Integration

2020-05-09 Thread GitBox


kaxil edited a comment on pull request #6007:
URL: https://github.com/apache/airflow/pull/6007#issuecomment-626248652


   > > doesn't conform to snake_case naming style
   > 
   > @kaxil I have solved the other issues (haven't pushed yet), but for the last 
   > one (reducing the number of arguments) I will have to change all files (for 
   > example, if I convert some arguments into a dictionary). Should I do it like this?
   
   You can disable it as in 
https://github.com/apache/airflow/blob/7506c73f1721151e9c50ef8bdb70d2136a16190b/airflow/providers/google/cloud/operators/stackdriver.py#L82
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] kaxil edited a comment on pull request #6007: [AIRFLOW-2310] Enable AWS Glue Job Integration

2020-05-09 Thread GitBox


kaxil edited a comment on pull request #6007:
URL: https://github.com/apache/airflow/pull/6007#issuecomment-626248652


   > > doesn't conform to snake_case naming style
   > 
   > @kaxil I have solved the other issues (haven't pushed yet), but for the last 
   > one (reducing the number of arguments) I will have to change all files (for 
   > example, if I convert some arguments into a dictionary). Should I do it like this?
   
   You can disable it as in 
https://github.com/apache/airflow/blob/7506c73f1721151e9c50ef8bdb70d2136a16190b/airflow/providers/google/cloud/operators/stackdriver.py#L82-L96
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (AIRFLOW-2310) Enable AWS Glue Job Integration

2020-05-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17103529#comment-17103529
 ] 

ASF GitHub Bot commented on AIRFLOW-2310:
-

kaxil commented on pull request #6007:
URL: https://github.com/apache/airflow/pull/6007#issuecomment-626248652


   > > doesn't conform to snake_case naming style
   > 
   > @kaxil I have solved the other issues (haven't pushed yet), but for the last 
   > one (reducing the number of arguments) I will have to change all files (for 
   > example, if I convert some arguments into a dictionary). Should I do it like this?
   
   You can disable it as in 
https://github.com/apache/airflow/blob/master/airflow/providers/google/cloud/operators/stackdriver.py#L82



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Enable AWS Glue Job Integration
> ---
>
> Key: AIRFLOW-2310
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2310
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: contrib
>Reporter: Olalekan Elesin
>Assignee: Olalekan Elesin
>Priority: Major
>  Labels: AWS
>
> Would it be possible to integrate AWS Glue into Airflow, such that Glue jobs 
> and ETL pipelines can be orchestrated with Airflow?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [airflow] kaxil commented on pull request #6007: [AIRFLOW-2310] Enable AWS Glue Job Integration

2020-05-09 Thread GitBox


kaxil commented on pull request #6007:
URL: https://github.com/apache/airflow/pull/6007#issuecomment-626248652


   > > doesn't conform to snake_case naming style
   > 
   > @kaxil I have solved the other issues (haven't pushed yet), but for the last 
   > one (reducing the number of arguments) I will have to change all files (for 
   > example, if I convert some arguments into a dictionary). Should I do it like this?
   
   You can disable it as in 
https://github.com/apache/airflow/blob/master/airflow/providers/google/cloud/operators/stackdriver.py#L82



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (AIRFLOW-2310) Enable AWS Glue Job Integration

2020-05-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17103531#comment-17103531
 ] 

ASF GitHub Bot commented on AIRFLOW-2310:
-

kaxil edited a comment on pull request #6007:
URL: https://github.com/apache/airflow/pull/6007#issuecomment-626248652


   > > doesn't conform to snake_case naming style
   > 
   > @kaxil I have solved the other issues (haven't pushed yet), but for the last 
   > one (reducing the number of arguments) I will have to change all files (for 
   > example, if I convert some arguments into a dictionary). Should I do it like this?
   
   You can disable it as in 
https://github.com/apache/airflow/blob/7506c73f1721151e9c50ef8bdb70d2136a16190b/airflow/providers/google/cloud/operators/stackdriver.py#L82-L96
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Enable AWS Glue Job Integration
> ---
>
> Key: AIRFLOW-2310
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2310
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: contrib
>Reporter: Olalekan Elesin
>Assignee: Olalekan Elesin
>Priority: Major
>  Labels: AWS
>
> Would it be possible to integrate AWS Glue into Airflow, such that Glue jobs 
> and ETL pipelines can be orchestrated with Airflow?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (AIRFLOW-2310) Enable AWS Glue Job Integration

2020-05-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17103528#comment-17103528
 ] 

ASF GitHub Bot commented on AIRFLOW-2310:
-

abdulbasitds removed a comment on pull request #6007:
URL: https://github.com/apache/airflow/pull/6007#issuecomment-626245782


   > doesn't conform to snake_case naming style
   
   @kaxil I have solved the other issues (haven't pushed yet), but for the last 
   one (reducing the number of arguments) I will have to change all files (for 
   example, if I convert some arguments into a dictionary). Should I do it like this?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Enable AWS Glue Job Integration
> ---
>
> Key: AIRFLOW-2310
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2310
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: contrib
>Reporter: Olalekan Elesin
>Assignee: Olalekan Elesin
>Priority: Major
>  Labels: AWS
>
> Would it be possible to integrate AWS Glue into Airflow, such that Glue jobs 
> and ETL pipelines can be orchestrated with Airflow?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [airflow] abdulbasitds removed a comment on pull request #6007: [AIRFLOW-2310] Enable AWS Glue Job Integration

2020-05-09 Thread GitBox


abdulbasitds removed a comment on pull request #6007:
URL: https://github.com/apache/airflow/pull/6007#issuecomment-626245782


   > doesn't conform to snake_case naming style
   
   @kaxil I have solved the other issues (haven't pushed yet), but for the last 
   one (reducing the number of arguments) I will have to change all files (for 
   example, if I convert some arguments into a dictionary). Should I do it like this?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[airflow] branch master updated (bc19778 -> 7506c73)

2020-05-09 Thread ash
This is an automated email from the ASF dual-hosted git repository.

ash pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/airflow.git.


from bc19778  [AIP-31] Implement XComArg to pass output from one operator 
to the next (#8652)
 add 7506c73  Add default `conf` parameter to Spark JDBC Hook (#8787)

No new revisions were added by this update.

Summary of changes:
 airflow/providers/apache/spark/example_dags/example_spark_dag.py | 2 --
 airflow/providers/apache/spark/hooks/spark_jdbc.py   | 2 +-
 2 files changed, 1 insertion(+), 3 deletions(-)



[airflow] branch master updated (db1b51d -> bc19778)

2020-05-09 Thread kaxilnaik
This is an automated email from the ASF dual-hosted git repository.

kaxilnaik pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/airflow.git.


from db1b51d  Make celery worker_prefetch_multiplier configurable (#8695)
 add bc19778  [AIP-31] Implement XComArg to pass output from one operator 
to the next (#8652)

No new revisions were added by this update.

Summary of changes:
 airflow/models/baseoperator.py |  32 +++--
 airflow/models/xcom_arg.py | 149 ++
 tests/models/test_xcom_arg.py  | 157 +
 3 files changed, 334 insertions(+), 4 deletions(-)
 create mode 100644 airflow/models/xcom_arg.py
 create mode 100644 tests/models/test_xcom_arg.py
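A brief sketch of what the new XComArg enables (the PythonOperator import path
is assumed for master at the time; dependency auto-wiring is not assumed here):

```
# AIP-31 XComArg usage: op.output references the XCom pushed by the operator
# and resolves to its value when templates are rendered.
from airflow.operators.python import PythonOperator  # import path assumed


def _extract():
    return {'rows': 42}  # return value is pushed to XCom


def _load(payload):
    print(payload['rows'])


extract = PythonOperator(task_id='extract', python_callable=_extract)
load = PythonOperator(
    task_id='load',
    python_callable=_load,
    op_args=[extract.output],  # XComArg, resolved at runtime
)
extract >> load  # dependency set explicitly in this sketch
```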



[GitHub] [airflow] boring-cyborg[bot] commented on pull request #8652: [AIP-31] Implement XComArg model to functionally pass output from one operator to the next

2020-05-09 Thread GitBox


boring-cyborg[bot] commented on pull request #8652:
URL: https://github.com/apache/airflow/pull/8652#issuecomment-626245939


   Awesome work, congrats on your first merged pull request!
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] kaxil commented on a change in pull request #8652: [AIP-31] Implement XComArg model to functionally pass output from one operator to the next

2020-05-09 Thread GitBox


kaxil commented on a change in pull request #8652:
URL: https://github.com/apache/airflow/pull/8652#discussion_r422556125



##
File path: airflow/models/baseoperator.py
##
@@ -1121,6 +1139,12 @@ def set_upstream(self, task_or_task_list: Union['BaseOperator', List['BaseOperat
         """
         self._set_relatives(task_or_task_list, upstream=True)
 
+    @property
+    def output(self):
+        """Returns default XComArg for the operator"""

Review comment:
   Merging this PR ignoring this change (we can update it in a related 
follow-up PR)





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (AIRFLOW-2310) Enable AWS Glue Job Integration

2020-05-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17103515#comment-17103515
 ] 

ASF GitHub Bot commented on AIRFLOW-2310:
-

abdulbasitds commented on pull request #6007:
URL: https://github.com/apache/airflow/pull/6007#issuecomment-626245782


   > doesn't conform to snake_case naming style
   
   @kaxil I have solved the other issues (haven't pushed yet), but for the last 
   one (reducing the number of arguments) I will have to change all files (for 
   example, if I convert some arguments into a dictionary). Should I do it like this?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Enable AWS Glue Job Integration
> ---
>
> Key: AIRFLOW-2310
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2310
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: contrib
>Reporter: Olalekan Elesin
>Assignee: Olalekan Elesin
>Priority: Major
>  Labels: AWS
>
> Would it be possible to integrate AWS Glue into Airflow, such that Glue jobs 
> and ETL pipelines can be orchestrated with Airflow?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [airflow] abdulbasitds commented on pull request #6007: [AIRFLOW-2310] Enable AWS Glue Job Integration

2020-05-09 Thread GitBox


abdulbasitds commented on pull request #6007:
URL: https://github.com/apache/airflow/pull/6007#issuecomment-626245782


   > doesn't conform to snake_case naming style
   
   @kaxil I have solved the other issues (haven't pushed yet), but for the last 
   one (reducing the number of arguments) I will have to change all files (for 
   example, if I convert some arguments into a dictionary). Should I do it like this?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] kaxil commented on a change in pull request #8652: [AIP-31] Implement XComArg model to functionally pass output from one operator to the next

2020-05-09 Thread GitBox


kaxil commented on a change in pull request #8652:
URL: https://github.com/apache/airflow/pull/8652#discussion_r422555955



##
File path: airflow/models/baseoperator.py
##
@@ -1121,6 +1139,12 @@ def set_upstream(self, task_or_task_list: Union['BaseOperator', List['BaseOperat
         """
         self._set_relatives(task_or_task_list, upstream=True)
 
+    @property
+    def output(self):
+        """Returns default XComArg for the operator"""

Review comment:
   ```suggestion
   """Returns reference to XCom pushed by current operator"""
   ```

##
File path: airflow/models/baseoperator.py
##
@@ -1121,6 +1139,12 @@ def set_upstream(self, task_or_task_list: Union['BaseOperator', List['BaseOperat
         """
         self._set_relatives(task_or_task_list, upstream=True)
 
+    @property
+    def output(self):
+        """Returns default XComArg for the operator"""

Review comment:
   ```suggestion
   """Returns reference to XCom pushed by current operator"""
   ```

##
File path: airflow/models/baseoperator.py
##
@@ -1121,6 +1139,12 @@ def set_upstream(self, task_or_task_list: Union['BaseOperator', List['BaseOperat
         """
         self._set_relatives(task_or_task_list, upstream=True)
 
+    @property
+    def output(self):
+        """Returns default XComArg for the operator"""

Review comment:
   ```suggestion
   """Returns reference to XCom pushed by current operator"""
   ```





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] kaxil commented on a change in pull request #8652: [AIP-31] Implement XComArg model to functionally pass output from one operator to the next

2020-05-09 Thread GitBox


kaxil commented on a change in pull request #8652:
URL: https://github.com/apache/airflow/pull/8652#discussion_r422555955



##
File path: airflow/models/baseoperator.py
##
@@ -1121,6 +1139,12 @@ def set_upstream(self, task_or_task_list: Union['BaseOperator', List['BaseOperat
         """
         self._set_relatives(task_or_task_list, upstream=True)
 
+    @property
+    def output(self):
+        """Returns default XComArg for the operator"""

Review comment:
   ```suggestion
   """Returns reference to XCom pushed by current operator"""
   ```





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] kaxil commented on a change in pull request #8652: [AIP-31] Implement XComArg model to functionally pass output from one operator to the next

2020-05-09 Thread GitBox


kaxil commented on a change in pull request #8652:
URL: https://github.com/apache/airflow/pull/8652#discussion_r422555817



##
File path: airflow/models/baseoperator.py
##
@@ -1121,6 +1139,12 @@ def set_upstream(self, task_or_task_list: Union['BaseOperator', List['BaseOperat
         """
         self._set_relatives(task_or_task_list, upstream=True)
 
+    @property
+    def output(self):
+        """Returns default XComArg for the operator"""

Review comment:
   We haven't updated this yet :)





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (AIRFLOW-6786) Adding KafkaConsumerHook, KafkaProducerHook, and KafkaSensor

2020-05-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-6786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17103513#comment-17103513
 ] 

ASF GitHub Bot commented on AIRFLOW-6786:
-

kaxil commented on a change in pull request #7407:
URL: https://github.com/apache/airflow/pull/7407#discussion_r422555361



##
File path: airflow/providers/apache/kafka/sensors/kafka_sensor.py
##
@@ -0,0 +1,81 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+import logging
+
+from cached_property import cached_property
+
+from airflow.providers.apache.kafka.hooks.kafka_consumer_hook import KafkaConsumerHook
+from airflow.sensors.base_sensor_operator import BaseSensorOperator
+from airflow.utils.decorators import apply_defaults
+
+
+class KafkaSensor(BaseSensorOperator):
+"""
+Consumes the Kafka message with the specific topic
+"""
+DEFAULT_HOST = 'kafka1'
+DEFAULT_PORT = 9092
+templated_fields = ('topic',
+'host',
+'port',
+)
+
+@apply_defaults
+def __init__(self, topic, host=DEFAULT_HOST, port=DEFAULT_PORT, *args, **kwargs):
+"""
+Initialize the sensor; establishing the connection
+is deferred until its first use.
+
+:param topic:
+:param host:
+:param port:
+:param args:
+:param kwargs:
+"""
+self.topic = topic
+self.host = host
+self.port = port
+super(KafkaSensor, self).__init__(*args, **kwargs)
+
+@cached_property
+def hook(self):
+"""
+Returns a Kafka Consumer Hook
+"""
+return KafkaConsumerHook(self.topic, self.host, self.port)
+
+def poke(self, context):
+"""
+Checks to see if messages exist on this topic/partition.
+
+:param context:
+:return:
+"""
+logging.info(
+'Poking topic: %s, using hook: %s',
+str(self.topic), str(self.hook))

Review comment:
   ```suggestion
   self.log.info('Poking topic: %s, using hook: %s', str(self.topic), str(self.hook))
   ```
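
   As a usage illustration only (the import path comes from this PR and the
   trigger pattern from the linked Jira description; both are assumptions until
   merged), the sensor could gate a downstream DAG like this:

   ```python
   from datetime import datetime

   from airflow import DAG
   from airflow.operators.dagrun_operator import TriggerDagRunOperator
   from airflow.providers.apache.kafka.sensors.kafka_sensor import KafkaSensor

   with DAG("kafka_listener", start_date=datetime(2020, 5, 1), schedule_interval=None) as dag:
       # Block until at least one message shows up on the topic.
       wait_for_messages = KafkaSensor(task_id="wait_for_messages", topic="events")
       # Then kick off the DAG that processes the polled batch.
       trigger = TriggerDagRunOperator(task_id="trigger_processing", trigger_dag_id="process_events")
       wait_for_messages >> trigger
   ```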





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Adding KafkaConsumerHook, KafkaProducerHook, and KafkaSensor
> 
>
> Key: AIRFLOW-6786
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6786
> Project: Apache Airflow
>  Issue Type: New Feature
>  Components: contrib, hooks
>Affects Versions: 1.10.9
>Reporter: Daniel Ferguson
>Assignee: Daniel Ferguson
>Priority: Minor
>
> Add the KafkaProducerHook.
>  Add the KafkaConsumerHook.
>  Add the KafkaSensor which listens to messages with a specific topic.
>  Related Issue:
>  #1311 (Pre-dates Jira Migration)
> Reminder to contributors:
> You must add an Apache License header to all new files
>  Please squash your commits when possible and follow the 7 rules of good Git 
> commits
>  I am new to the community, I am not sure the files are at the right place or 
> missing anything.
> The sensor could be used as the first node of a dag where the second node can 
> be a TriggerDagRunOperator. The messages are polled in a batch and the dag 
> runs are dynamically generated.
> Thanks!
> Note, as per denied PR [#1415|https://github.com/apache/airflow/pull/1415], 
> it is important to mention these integrations are not suitable for 
> low-latency/high-throughput/streaming. For reference, [#1415 
> (comment)|https://github.com/apache/airflow/pull/1415#issuecomment-484429806].
> Co-authored-by: Dan Ferguson 
> [dferguson...@gmail.com|mailto:dferguson...@gmail.com]
>  Co-authored-by: YuanfΞi Zhu



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (AIRFLOW-6786) Adding KafkaConsumerHook, KafkaProducerHook, and KafkaSensor

2020-05-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-6786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17103512#comment-17103512
 ] 

ASF GitHub Bot commented on AIRFLOW-6786:
-

kaxil commented on a change in pull request #7407:
URL: https://github.com/apache/airflow/pull/7407#discussion_r422555319



##
File path: airflow/providers/apache/kafka/sensors/kafka_sensor.py
##
@@ -0,0 +1,81 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+import logging
+
+from cached_property import cached_property
+
+from airflow.providers.apache.kafka.hooks.kafka_consumer_hook import KafkaConsumerHook
+from airflow.sensors.base_sensor_operator import BaseSensorOperator
+from airflow.utils.decorators import apply_defaults
+
+
+class KafkaSensor(BaseSensorOperator):
+"""
+Consumes the Kafka message with the specific topic
+"""
+DEFAULT_HOST = 'kafka1'
+DEFAULT_PORT = 9092
+templated_fields = ('topic',
+'host',
+'port',
+)
+
+@apply_defaults
+def __init__(self, topic, host=DEFAULT_HOST, port=DEFAULT_PORT, *args, **kwargs):
+"""
+Initialize the sensor; establishing the connection
+is deferred until its first use.
+
+:param topic:
+:param host:
+:param port:
+:param args:
+:param kwargs:
+"""
+self.topic = topic
+self.host = host
+self.port = port
+super(KafkaSensor, self).__init__(*args, **kwargs)
+
+@cached_property
+def hook(self):
+"""
+Returns a Kafka Consumer Hook
+"""
+return KafkaConsumerHook(self.topic, self.host, self.port)
+
+def poke(self, context):
+"""
+Checks to see if messages exist on this topic/partition.
+
+:param context:
+:return:
+"""
+logging.info(
+'Poking topic: %s, using hook: %s',
+str(self.topic), str(self.hook))
+
+messages = self.hook.get_messages()
+
+if messages:
+logging.info(
+'Got messages during poking: %s', str(messages))
+return messages

Review comment:
   ```suggestion
   self.log.info('Got messages during poking: %s', str(messages))
   return messages
   ```
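
   For context on the `self.log` suggestions: operators and sensors inherit
   `LoggingMixin`, so `self.log` goes through the task's configured log handlers
   rather than the root logger. A minimal sketch:

   ```python
   from airflow.utils.log.logging_mixin import LoggingMixin

   class Poker(LoggingMixin):
       def poke(self):
           # Lazy %-style arguments: the string is only formatted if the
           # record is emitted, via the Airflow-configured class logger.
           self.log.info("Poking topic: %s", "events")

   Poker().poke()
   ```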





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Adding KafkaConsumerHook, KafkaProducerHook, and KafkaSensor
> 
>
> Key: AIRFLOW-6786
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6786
> Project: Apache Airflow
>  Issue Type: New Feature
>  Components: contrib, hooks
>Affects Versions: 1.10.9
>Reporter: Daniel Ferguson
>Assignee: Daniel Ferguson
>Priority: Minor
>
> Add the KafkaProducerHook.
>  Add the KafkaConsumerHook.
>  Add the KafkaSensor which listens to messages with a specific topic.
>  Related Issue:
>  #1311 (Pre-dates Jira Migration)
> Reminder to contributors:
> You must add an Apache License header to all new files
>  Please squash your commits when possible and follow the 7 rules of good Git 
> commits
>  I am new to the community, I am not sure the files are at the right place or 
> missing anything.
> The sensor could be used as the first node of a dag where the second node can 
> be a TriggerDagRunOperator. The messages are polled in a batch and the dag 
> runs are dynamically generated.
> Thanks!
> Note, as per denied PR [#1415|https://github.com/apache/airflow/pull/1415], 
> it is important to mention these integrations are not suitable for 
> low-latency/high-throughput/streaming. For reference, [#1415 
> (comment)|https://github.com/apache/airflow/pull/1415#issuecomment-484429806].
> Co-authored-by: Dan Ferguson 
> [df

[GitHub] [airflow] kaxil commented on a change in pull request #7407: [AIRFLOW-6786] Add KafkaConsumerHook, KafkaProduerHook and KafkaSensor

2020-05-09 Thread GitBox


kaxil commented on a change in pull request #7407:
URL: https://github.com/apache/airflow/pull/7407#discussion_r422555361



##
File path: airflow/providers/apache/kafka/sensors/kafka_sensor.py
##
@@ -0,0 +1,81 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+import logging
+
+from cached_property import cached_property
+
+from airflow.providers.apache.kafka.hooks.kafka_consumer_hook import KafkaConsumerHook
+from airflow.sensors.base_sensor_operator import BaseSensorOperator
+from airflow.utils.decorators import apply_defaults
+
+
+class KafkaSensor(BaseSensorOperator):
+"""
+Consumes the Kafka message with the specific topic
+"""
+DEFAULT_HOST = 'kafka1'
+DEFAULT_PORT = 9092
+templated_fields = ('topic',
+'host',
+'port',
+)
+
+@apply_defaults
+def __init__(self, topic, host=DEFAULT_HOST, port=DEFAULT_PORT, *args, **kwargs):
+"""
+Initialize the sensor; establishing the connection
+is deferred until its first use.
+
+:param topic:
+:param host:
+:param port:
+:param args:
+:param kwargs:
+"""
+self.topic = topic
+self.host = host
+self.port = port
+super(KafkaSensor, self).__init__(*args, **kwargs)
+
+@cached_property
+def hook(self):
+"""
+Returns a Kafka Consumer Hook
+"""
+return KafkaConsumerHook(self.topic, self.host, self.port)
+
+def poke(self, context):
+"""
+Checks to see if messages exist on this topic/partition.
+
+:param context:
+:return:
+"""
+logging.info(
+'Poking topic: %s, using hook: %s',
+str(self.topic), str(self.hook))

Review comment:
   ```suggestion
   self.log.info('Poking topic: %s, using hook: %s', str(self.topic), str(self.hook))
   ```





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] kaxil commented on a change in pull request #7407: [AIRFLOW-6786] Add KafkaConsumerHook, KafkaProduerHook and KafkaSensor

2020-05-09 Thread GitBox


kaxil commented on a change in pull request #7407:
URL: https://github.com/apache/airflow/pull/7407#discussion_r422555319



##
File path: airflow/providers/apache/kafka/sensors/kafka_sensor.py
##
@@ -0,0 +1,81 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+import logging
+
+from cached_property import cached_property
+
+from airflow.providers.apache.kafka.hooks.kafka_consumer_hook import KafkaConsumerHook
+from airflow.sensors.base_sensor_operator import BaseSensorOperator
+from airflow.utils.decorators import apply_defaults
+
+
+class KafkaSensor(BaseSensorOperator):
+"""
+Consumes the Kafka message with the specific topic
+"""
+DEFAULT_HOST = 'kafka1'
+DEFAULT_PORT = 9092
+templated_fields = ('topic',
+'host',
+'port',
+)
+
+@apply_defaults
+def __init__(self, topic, host=DEFAULT_HOST, port=DEFAULT_PORT, *args, **kwargs):
+"""
+Initialize the sensor; establishing the connection
+is deferred until its first use.
+
+:param topic:
+:param host:
+:param port:
+:param args:
+:param kwargs:
+"""
+self.topic = topic
+self.host = host
+self.port = port
+super(KafkaSensor, self).__init__(*args, **kwargs)
+
+@cached_property
+def hook(self):
+"""
+Returns a Kafka Consumer Hook
+"""
+return KafkaConsumerHook(self.topic, self.host, self.port)
+
+def poke(self, context):
+"""
+Checks to see if messages exist on this topic/partition.
+
+:param context:
+:return:
+"""
+logging.info(
+'Poking topic: %s, using hook: %s',
+str(self.topic), str(self.hook))
+
+messages = self.hook.get_messages()
+
+if messages:
+logging.info(
+'Got messages during poking: %s', str(messages))
+return messages

Review comment:
   ```suggestion
   self.log.info('Got messages during poking: %s', str(messages))
   return messages
   ```





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (AIRFLOW-6786) Adding KafkaConsumerHook, KafkaProducerHook, and KafkaSensor

2020-05-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-6786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17103511#comment-17103511
 ] 

ASF GitHub Bot commented on AIRFLOW-6786:
-

kaxil commented on a change in pull request #7407:
URL: https://github.com/apache/airflow/pull/7407#discussion_r422555228



##
File path: airflow/providers/apache/kafka/sensors/kafka_sensor.py
##
@@ -0,0 +1,84 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+import logging
+
+from cached_property import cached_property
+
+from airflow.providers.apache.kafka.hooks.kafka_consumer_hook import KafkaConsumerHook
+from airflow.sensors.base_sensor_operator import BaseSensorOperator
+from airflow.utils.decorators import apply_defaults
+
+
+class KafkaSensor(BaseSensorOperator):
+"""
+Consumes the Kafka message with the specific topic
+"""
+DEFAULT_HOST = 'kafka1'
+DEFAULT_PORT = 9092
+templated_fields = ('topic',
+'host',
+'port',
+)
+
+@apply_defaults
+def __init__(self, topic, host=DEFAULT_HOST, port=DEFAULT_PORT, *args, **kwargs):
+"""
+Initialize the sensor; establishing the connection
+is deferred until its first use.
+
+:param topic:
+:param host:
+:param port:
+:param args:
+:param kwargs:
+"""
+self.topic = topic
+self.host = host
+self.port = port
+super(KafkaSensor, self).__init__(*args, **kwargs)
+
+@cached_property
+def hook(self):
+"""
+Returns a Kafka Consumer Hook
+
+:return:
+KafkaConsumerHook

Review comment:
   ```suggestion
   """
   Returns a Kafka Consumer Hook
   ```





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Adding KafkaConsumerHook, KafkaProducerHook, and KafkaSensor
> 
>
> Key: AIRFLOW-6786
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6786
> Project: Apache Airflow
>  Issue Type: New Feature
>  Components: contrib, hooks
>Affects Versions: 1.10.9
>Reporter: Daniel Ferguson
>Assignee: Daniel Ferguson
>Priority: Minor
>
> Add the KafkaProducerHook.
>  Add the KafkaConsumerHook.
>  Add the KafkaSensor which listens to messages with a specific topic.
>  Related Issue:
>  #1311 (Pre-dates Jira Migration)
> Reminder to contributors:
> You must add an Apache License header to all new files
>  Please squash your commits when possible and follow the 7 rules of good Git 
> commits
>  I am new to the community, I am not sure the files are at the right place or 
> missing anything.
> The sensor could be used as the first node of a dag where the second node can 
> be a TriggerDagRunOperator. The messages are polled in a batch and the dag 
> runs are dynamically generated.
> Thanks!
> Note, as per denied PR [#1415|https://github.com/apache/airflow/pull/1415], 
> it is important to mention these integrations are not suitable for 
> low-latency/high-throughput/streaming. For reference, [#1415 
> (comment)|https://github.com/apache/airflow/pull/1415#issuecomment-484429806].
> Co-authored-by: Dan Ferguson 
> [dferguson...@gmail.com|mailto:dferguson...@gmail.com]
>  Co-authored-by: YuanfΞi Zhu



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [airflow] kaxil commented on a change in pull request #7407: [AIRFLOW-6786] Add KafkaConsumerHook, KafkaProduerHook and KafkaSensor

2020-05-09 Thread GitBox


kaxil commented on a change in pull request #7407:
URL: https://github.com/apache/airflow/pull/7407#discussion_r422555228



##
File path: airflow/providers/apache/kafka/sensors/kafka_sensor.py
##
@@ -0,0 +1,84 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+import logging
+
+from cached_property import cached_property
+
+from airflow.providers.apache.kafka.hooks.kafka_consumer_hook import KafkaConsumerHook
+from airflow.sensors.base_sensor_operator import BaseSensorOperator
+from airflow.utils.decorators import apply_defaults
+
+
+class KafkaSensor(BaseSensorOperator):
+"""
+Consumes the Kafka message with the specific topic
+"""
+DEFAULT_HOST = 'kafka1'
+DEFAULT_PORT = 9092
+templated_fields = ('topic',
+'host',
+'port',
+)
+
+@apply_defaults
+def __init__(self, topic, host=DEFAULT_HOST, port=DEFAULT_PORT, *args, **kwargs):
+"""
+Initialize the sensor; establishing the connection
+is deferred until its first use.
+
+:param topic:
+:param host:
+:param port:
+:param args:
+:param kwargs:
+"""
+self.topic = topic
+self.host = host
+self.port = port
+super(KafkaSensor, self).__init__(*args, **kwargs)
+
+@cached_property
+def hook(self):
+"""
+Returns a Kafka Consumer Hook
+
+:return:
+KafkaConsumerHook

Review comment:
   ```suggestion
   """
   Returns a Kafka Consumer Hook
   ```
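
   On the `@cached_property` pattern kept above: it defers hook construction to
   first access and reuses the same instance afterwards. A generic sketch with a
   stand-in hook (no Kafka dependency assumed):

   ```python
   from cached_property import cached_property

   class LazySensor:
       def __init__(self, conn_id):
           self.conn_id = conn_id

       @cached_property
       def hook(self):
           # Runs once; the result is cached on the instance.
           print("creating hook for", self.conn_id)
           return object()  # stand-in for a real hook object

   sensor = LazySensor("kafka_default")
   assert sensor.hook is sensor.hook  # same cached object on every access
   ```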





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] ashb commented on pull request #8772: Correctly store non-default Nones in serialized tasks/dags

2020-05-09 Thread GitBox


ashb commented on pull request #8772:
URL: https://github.com/apache/airflow/pull/8772#issuecomment-626244825


   (We've got that code in this diff comment; we can bring it back when we want it, but that will need more changes elsewhere to serialize more fields.)



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] ashb commented on pull request #8772: Correctly store non-default Nones in serialized tasks/dags

2020-05-09 Thread GitBox


ashb commented on pull request #8772:
URL: https://github.com/apache/airflow/pull/8772#issuecomment-626244699


   I was wrong about where that fn was used, so I've removed that complex code and replaced it with this in the test instead:
   
   ```python
   if serialized_task.resources is None:
   assert task.resources is None or task.resources == []
   else:
   assert serialized_task.resources == task.resources
   ```
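
   For context, a sketch of the round trip such assertions sit in, assuming the
   serialization helpers of that era (`SerializedBaseOperator.serialize_operator`
   and `deserialize_operator`); treat the exact names as assumptions:

   ```python
   from datetime import datetime

   from airflow import DAG
   from airflow.operators.dummy_operator import DummyOperator
   from airflow.serialization.serialized_objects import SerializedBaseOperator

   with DAG("serde_demo", start_date=datetime(2020, 5, 1)) as dag:
       task = DummyOperator(task_id="noop")

   # Serialize and deserialize, then compare fields on both sides.
   encoded = SerializedBaseOperator.serialize_operator(task)
   serialized_task = SerializedBaseOperator.deserialize_operator(encoded)
   assert serialized_task.task_id == task.task_id
   ```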



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (AIRFLOW-2310) Enable AWS Glue Job Integration

2020-05-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17103510#comment-17103510
 ] 

ASF GitHub Bot commented on AIRFLOW-2310:
-

kaxil edited a comment on pull request #6007:
URL: https://github.com/apache/airflow/pull/6007#issuecomment-626244214


   Please fix the following tests too:
   
   ```
   
   * Module airflow.providers.amazon.aws.hooks.glue
   airflow/providers/amazon/aws/hooks/glue.py:70:8: C0103: Attribute name "S3_GLUE_LOGS" doesn't conform to snake_case naming style (invalid-name)
   airflow/providers/amazon/aws/hooks/glue.py:98:12: W1202: Use % formatting in logging functions and pass the % parameters as arguments (logging-format-interpolation)
   airflow/providers/amazon/aws/hooks/glue.py:117:12: W1202: Use % formatting in logging functions and pass the % parameters as arguments (logging-format-interpolation)
   airflow/providers/amazon/aws/hooks/glue.py:179:16: W1202: Use % formatting in logging functions and pass the % parameters as arguments (logging-format-interpolation)
   * Module airflow.providers.amazon.aws.operators.glue
   airflow/providers/amazon/aws/operators/glue.py:86:8: C0103: Attribute name "S3_PROTOCOL" doesn't conform to snake_case naming style (invalid-name)
   airflow/providers/amazon/aws/operators/glue.py:87:8: C0103: Attribute name "S3_ARTIFACTS_PREFIX" doesn't conform to snake_case naming style (invalid-name)
   airflow/providers/amazon/aws/operators/glue.py:60:4: R0913: Too many arguments (12/10) (too-many-arguments)
   ```
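
   For anyone picking up the W1202 items: pylint wants lazy %-style arguments
   instead of pre-formatted strings, e.g.:

   ```python
   import logging

   log = logging.getLogger(__name__)
   job_name = "glue-job"

   # Flagged by W1202: the message is formatted eagerly, even if INFO is disabled.
   log.info("Running {}".format(job_name))
   # Preferred: formatting is deferred to the logging framework.
   log.info("Running %s", job_name)
   ```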



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Enable AWS Glue Job Integration
> ---
>
> Key: AIRFLOW-2310
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2310
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: contrib
>Reporter: Olalekan Elesin
>Assignee: Olalekan Elesin
>Priority: Major
>  Labels: AWS
>
> Would it be possible to integrate AWS Glue into Airflow, such that Glue jobs 
> and ETL pipelines can be orchestrated with Airflow



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (AIRFLOW-2310) Enable AWS Glue Job Integration

2020-05-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17103509#comment-17103509
 ] 

ASF GitHub Bot commented on AIRFLOW-2310:
-

kaxil commented on pull request #6007:
URL: https://github.com/apache/airflow/pull/6007#issuecomment-626244214


   Please fix the following tests too:
   
   ```
   Permissions already fixed
   
   * Module airflow.providers.amazon.aws.hooks.glue
   airflow/providers/amazon/aws/hooks/glue.py:70:8: C0103: Attribute name "S3_GLUE_LOGS" doesn't conform to snake_case naming style (invalid-name)
   airflow/providers/amazon/aws/hooks/glue.py:98:12: W1202: Use % formatting in logging functions and pass the % parameters as arguments (logging-format-interpolation)
   airflow/providers/amazon/aws/hooks/glue.py:117:12: W1202: Use % formatting in logging functions and pass the % parameters as arguments (logging-format-interpolation)
   airflow/providers/amazon/aws/hooks/glue.py:179:16: W1202: Use % formatting in logging functions and pass the % parameters as arguments (logging-format-interpolation)
   * Module airflow.providers.amazon.aws.operators.glue
   airflow/providers/amazon/aws/operators/glue.py:86:8: C0103: Attribute name "S3_PROTOCOL" doesn't conform to snake_case naming style (invalid-name)
   airflow/providers/amazon/aws/operators/glue.py:87:8: C0103: Attribute name "S3_ARTIFACTS_PREFIX" doesn't conform to snake_case naming style (invalid-name)
   airflow/providers/amazon/aws/operators/glue.py:60:4: R0913: Too many arguments (12/10) (too-many-arguments)
   ```



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Enable AWS Glue Job Integration
> ---
>
> Key: AIRFLOW-2310
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2310
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: contrib
>Reporter: Olalekan Elesin
>Assignee: Olalekan Elesin
>Priority: Major
>  Labels: AWS
>
> Would it be possible to integrate AWS Glue into Airflow, such that Glue jobs 
> and ETL pipelines can be orchestrated with Airflow



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [airflow] kaxil commented on pull request #6007: [AIRFLOW-2310] Enable AWS Glue Job Integration

2020-05-09 Thread GitBox


kaxil commented on pull request #6007:
URL: https://github.com/apache/airflow/pull/6007#issuecomment-626244214


   Please fix the following tests too:
   
   ```
   Permissions already fixed
   
   * Module airflow.providers.amazon.aws.hooks.glue
   airflow/providers/amazon/aws/hooks/glue.py:70:8: C0103: Attribute name "S3_GLUE_LOGS" doesn't conform to snake_case naming style (invalid-name)
   airflow/providers/amazon/aws/hooks/glue.py:98:12: W1202: Use % formatting in logging functions and pass the % parameters as arguments (logging-format-interpolation)
   airflow/providers/amazon/aws/hooks/glue.py:117:12: W1202: Use % formatting in logging functions and pass the % parameters as arguments (logging-format-interpolation)
   airflow/providers/amazon/aws/hooks/glue.py:179:16: W1202: Use % formatting in logging functions and pass the % parameters as arguments (logging-format-interpolation)
   * Module airflow.providers.amazon.aws.operators.glue
   airflow/providers/amazon/aws/operators/glue.py:86:8: C0103: Attribute name "S3_PROTOCOL" doesn't conform to snake_case naming style (invalid-name)
   airflow/providers/amazon/aws/operators/glue.py:87:8: C0103: Attribute name "S3_ARTIFACTS_PREFIX" doesn't conform to snake_case naming style (invalid-name)
   airflow/providers/amazon/aws/operators/glue.py:60:4: R0913: Too many arguments (12/10) (too-many-arguments)
   ```



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] kaxil edited a comment on pull request #6007: [AIRFLOW-2310] Enable AWS Glue Job Integration

2020-05-09 Thread GitBox


kaxil edited a comment on pull request #6007:
URL: https://github.com/apache/airflow/pull/6007#issuecomment-626244214


   Please fix the following tests too:
   
   ```
   
   * Module airflow.providers.amazon.aws.hooks.glue
   airflow/providers/amazon/aws/hooks/glue.py:70:8: C0103: Attribute name "S3_GLUE_LOGS" doesn't conform to snake_case naming style (invalid-name)
   airflow/providers/amazon/aws/hooks/glue.py:98:12: W1202: Use % formatting in logging functions and pass the % parameters as arguments (logging-format-interpolation)
   airflow/providers/amazon/aws/hooks/glue.py:117:12: W1202: Use % formatting in logging functions and pass the % parameters as arguments (logging-format-interpolation)
   airflow/providers/amazon/aws/hooks/glue.py:179:16: W1202: Use % formatting in logging functions and pass the % parameters as arguments (logging-format-interpolation)
   * Module airflow.providers.amazon.aws.operators.glue
   airflow/providers/amazon/aws/operators/glue.py:86:8: C0103: Attribute name "S3_PROTOCOL" doesn't conform to snake_case naming style (invalid-name)
   airflow/providers/amazon/aws/operators/glue.py:87:8: C0103: Attribute name "S3_ARTIFACTS_PREFIX" doesn't conform to snake_case naming style (invalid-name)
   airflow/providers/amazon/aws/operators/glue.py:60:4: R0913: Too many arguments (12/10) (too-many-arguments)
   ```



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] gdevanla edited a comment on pull request #8699: Scheduler just checks for task instances in 'running' state in execution.

2020-05-09 Thread GitBox


gdevanla edited a comment on pull request #8699:
URL: https://github.com/apache/airflow/pull/8699#issuecomment-626243867


   @ashb My first thought was also to handle this at the executor level. But, I believe, we need to allow the scheduler to reset the state of the task to `scheduled` if it was not enqueued properly (that is, switched to running status) in the first attempt. This would give the scheduler more control over how to enqueue the task again. Don't you think this is important?
   
   If this assumption is incorrect, could you please elaborate on the alternate approach a little more?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] gdevanla edited a comment on pull request #8699: Scheduler just checks for task instances in 'running' state in execution.

2020-05-09 Thread GitBox


gdevanla edited a comment on pull request #8699:
URL: https://github.com/apache/airflow/pull/8699#issuecomment-626243867


   @ashb My first thought was also to handle this at the executor level. But, I believe, we need to allow the scheduler to reset the state of the task to `scheduled` if it was not enqueued properly (that is, switched to running status) in the first attempt. This would give the scheduler more control over how to enqueue the task again. Don't you think this is important?
   
   If this assumption is incorrect, could you please elaborate on this approach a little more?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] gdevanla commented on pull request #8699: Scheduler just checks for task instances in 'running' state in execution.

2020-05-09 Thread GitBox


gdevanla commented on pull request #8699:
URL: https://github.com/apache/airflow/pull/8699#issuecomment-626243867


   @ashb My first thought was also to handle this at the executor level. But, I believe, we need to allow the scheduler to reset the state of the task to `scheduled` if it was not enqueued properly (that is, switched to running status) in the first attempt. This would give the scheduler more control over how to enqueue the task again. Don't you think this is important?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] gdevanla commented on a change in pull request #8699: Scheduler just checks for task instances in 'running' state in execution.

2020-05-09 Thread GitBox


gdevanla commented on a change in pull request #8699:
URL: https://github.com/apache/airflow/pull/8699#discussion_r422554030



##
File path: airflow/jobs/scheduler_job.py
##
@@ -1275,7 +1275,7 @@ def _find_executable_task_instances(self, simple_dag_bag, session=None):
   " this task has been reached.", task_instance)
 continue
 
-if self.executor.has_task(task_instance):
+if self.executor.is_task_running(task_instance):

Review comment:
   @ashb Thanks for the comment.
   
   The `base_executor.queue_command` that is eventually called when this condition is `False` checks whether `self.queued_tasks` already has an entry for this task. Therefore, I do not see it being added to this queue again at this point.
   
   Are there other places in the code where you believe this could happen? Could you please point me to them, if any?
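
   To make the guard under discussion concrete, a toy model of the
   duplicate-enqueue check (a simplified paraphrase; the real
   `BaseExecutor.queue_command` keys on the task-instance key and also consults
   `self.running`):

   ```python
   class MiniExecutor:
       def __init__(self):
           self.queued_tasks = {}
           self.running = set()

       def queue_command(self, key, command):
           # Refuse to enqueue a task that is already queued or running,
           # mirroring the check referred to above.
           if key not in self.queued_tasks and key not in self.running:
               self.queued_tasks[key] = command
           else:
               print("could not queue task %s" % (key,))
   ```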





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] boring-cyborg[bot] commented on pull request #8801: Add optional network profile configuration parameter

2020-05-09 Thread GitBox


boring-cyborg[bot] commented on pull request #8801:
URL: https://github.com/apache/airflow/pull/8801#issuecomment-626241147


   Congratulations on your first Pull Request and welcome to the Apache Airflow 
community! If you have any issues or are unsure about anything, please check 
our Contribution Guide 
(https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst)
   Here are some useful points:
   - Pay attention to the quality of your code (flake8, pylint and type 
annotations). Our [pre-commits]( 
https://github.com/apache/airflow/blob/master/STATIC_CODE_CHECKS.rst#prerequisites-for-pre-commit-hooks)
 will help you with that.
   - In case of a new feature add useful documentation (in docstrings or in 
`docs/` directory). Adding a new operator? Check this short 
[guide](https://github.com/apache/airflow/blob/master/docs/howto/custom-operator.rst)
 Consider adding an example DAG that shows how users should use it.
   - Consider using [Breeze 
environment](https://github.com/apache/airflow/blob/master/BREEZE.rst) for 
testing locally, it’s a heavy docker but it ships with a working Airflow and a 
lot of integrations.
   - Be patient and persistent. It might take some time to get a review or get 
the final approval from Committers.
   - Be sure to read the [Airflow Coding style]( 
https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#coding-style-and-best-practices).
   Apache Airflow is a community-driven project and together we are making it 
better 🚀.
   In case of doubts contact the developers at:
   Mailing List: d...@airflow.apache.org
   Slack: https://apache-airflow-slack.herokuapp.com/
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] lutzkuen opened a new pull request #8801: Add optional network profile configuration parameter

2020-05-09 Thread GitBox


lutzkuen opened a new pull request #8801:
URL: https://github.com/apache/airflow/pull/8801


   This PR adds optional parameters for the AzureContainerInstanceGroupOperator to forward the IP address configuration, network profile, and restart policy to the Azure container group. This is useful if your container instance needs to access resources that are protected by VPN settings.
   
   ---
   Make sure to mark the boxes below before creating PR: [x]
   
   - [x] Description above provides context of the change
   - [x] Unit tests coverage for changes (not needed for documentation changes)
   - [x] Target Github ISSUE in description if exists
   - [x] Commits follow "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)"
   - [x] Relevant documentation is updated including usage instructions.
   - [x] I will engage committers as explained in [Contribution Workflow 
Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example).
   
   ---
   In case of fundamental code change, Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals))
 is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party 
License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in 
[UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
   Read the [Pull Request 
Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines)
 for more information.
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] kaxil commented on a change in pull request #6007: [AIRFLOW-2310] Enable AWS Glue Job Integration

2020-05-09 Thread GitBox


kaxil commented on a change in pull request #6007:
URL: https://github.com/apache/airflow/pull/6007#discussion_r422551368



##
File path: docs/operators-and-hooks-ref.rst
##
@@ -387,6 +387,14 @@ These integrations allow you to perform various operations within the Amazon Web
  -
  - :mod:`airflow.providers.amazon.aws.sensors.glue_catalog_partition`
 
+   * - `AWS Glue `__
+ -
+ - :mod:`airflow.providers.amazon.aws.hooks.glue`
+ - :mod:`airflow.providers.amazon.aws.operators.glue`
+ - :mod:`airflow.providers.amazon.aws.sensors.glue`
+
+   * - `AWS Lambda `__
+

Review comment:
   ```suggestion
   ```





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (AIRFLOW-2310) Enable AWS Glue Job Integration

2020-05-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17103500#comment-17103500
 ] 

ASF GitHub Bot commented on AIRFLOW-2310:
-

kaxil commented on a change in pull request #6007:
URL: https://github.com/apache/airflow/pull/6007#discussion_r422551368



##
File path: docs/operators-and-hooks-ref.rst
##
@@ -387,6 +387,14 @@ These integrations allow you to perform various operations within the Amazon Web
  -
  - :mod:`airflow.providers.amazon.aws.sensors.glue_catalog_partition`
 
+   * - `AWS Glue `__
+ -
+ - :mod:`airflow.providers.amazon.aws.hooks.glue`
+ - :mod:`airflow.providers.amazon.aws.operators.glue`
+ - :mod:`airflow.providers.amazon.aws.sensors.glue`
+
+   * - `AWS Lambda `__
+

Review comment:
   ```suggestion
   ```





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Enable AWS Glue Job Integration
> ---
>
> Key: AIRFLOW-2310
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2310
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: contrib
>Reporter: Olalekan Elesin
>Assignee: Olalekan Elesin
>Priority: Major
>  Labels: AWS
>
> Would it be possible to integrate AWS Glue into Airflow, such that Glue jobs 
> and ETL pipelines can be orchestrated with Airflow



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [airflow] nadflinn commented on issue #8480: Celery autoscaling overrides normal worker_concurrency setting

2020-05-09 Thread GitBox


nadflinn commented on issue #8480:
URL: https://github.com/apache/airflow/issues/8480#issuecomment-626240649


   @turbaszek I think this can be closed, correct?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



