[jira] [Updated] (AIRFLOW-3293) Rename TimeDeltaSensor to ScheduleTimeDeltaSensor

2018-11-02 Thread Darren Weber (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Darren Weber updated AIRFLOW-3293:
--
Description: 
The TimeDeltaSensor has baked-in lookups for the schedule and schedule_interval 
lurking in the class init; it is not a pure time delta.  It would be ideal to 
have a TimeDelta sensor that is purely relative to the time at which an upstream 
task triggers it.  If there is a way to do this, please note it here or suggest 
an alternative implementation that could achieve this easily.

The implementation below using a PythonOperator works, but it needlessly consumes 
a worker for about 5 minutes.  It would be much better to have a TimeDelta sensor 
that accepts the time at which an upstream sensor triggers it and then waits for 
a timedelta, with the base sensor's options for poke interval (and timeout).  With 
the reschedule option it could wait without tying up a worker.  Something like 
this can help add jitter to downstream tasks that could otherwise hit an HTTP 
endpoint too hard all at once.

{code:python}
def wait5(*args, **kwargs):
    import random
    import time as t
    minutes = random.randint(3, 6)
    t.sleep(minutes * 60)
    return True

wait5_task = PythonOperator(
    task_id="python_op_wait_5min",
    python_callable=wait5,
    dag=a_dag)

upstream_http_sensor >> wait5_task
{code}
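As a rough illustration of the request (the class name and details below are hypothetical, not an existing Airflow sensor), a sensor that waits relative to its own start rather than the schedule might look roughly like this:

{code:python}
from datetime import timedelta

from airflow.sensors.base_sensor_operator import BaseSensorOperator
from airflow.utils import timezone
from airflow.utils.decorators import apply_defaults


class PureTimeDeltaSensor(BaseSensorOperator):
    """Hypothetical sensor: succeeds once `delta` has passed since this task
    instance started, i.e. relative to when the upstream task triggered it,
    not relative to schedule/schedule_interval."""

    @apply_defaults
    def __init__(self, delta, *args, **kwargs):
        super(PureTimeDeltaSensor, self).__init__(*args, **kwargs)
        self.delta = delta

    def poke(self, context):
        # start_date is set when this task instance begins running, so the
        # wait is measured from the moment the upstream task handed over.
        # Note: with mode="reschedule" the start_date is reset on each
        # reschedule, so a real implementation would need to remember the
        # first poke time (e.g. via XCom).
        return timezone.utcnow() >= context["ti"].start_date + self.delta


wait_task = PureTimeDeltaSensor(
    task_id="wait_timedelta",
    delta=timedelta(minutes=5),
    poke_interval=60,
    dag=a_dag)

upstream_http_sensor >> wait_task
{code}

Jitter could then be added simply by randomizing the delta passed to each downstream task.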


  was:
The TimeDeltaSensor has baked-in lookups for the schedule and schedule_interval 
lurking in the class init, it's not a pure time delta.  It would be ideal to 
have a TimeDelta that is purely relative to the time that an upstream task 
triggers it.  If there is a way to do this, please note it here or suggest some 
implementation alternative that could achieve this easily.

The implementation below using a PythonOperator works, but it consumes a worker 
for 5min needlessly.  It would be much better to have a TimeDelta that accepts 
the time when an upstream sensor triggers it and then waits for a timedelta, 
with options from the base sensor for poke interval (and timeout).  This could 
be used without consuming a worker as much with the reschedule option.  
Something like this can help with adding jitter to downstream tasks that could 
otherwise hit an HTTP endpoint too hard all at once.

{code:python}
def wait5(*args, **kwargs):
    import time as t
    t.sleep(5 * 60)
    return True

wait5_task = PythonOperator(
    task_id="python_op_wait_5min",
    python_callable=wait5,
    dag=a_dag)

upstream_http_sensor >> wait5_task
{code}



> Rename TimeDeltaSensor to ScheduleTimeDeltaSensor
> -
>
> Key: AIRFLOW-3293
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3293
> Project: Apache Airflow
>  Issue Type: Wish
>Reporter: Darren Weber
>Priority: Major
>
> The TimeDeltaSensor has baked-in lookups for the schedule and 
> schedule_interval lurking in the class init, it's not a pure time delta.  It 
> would be ideal to have a TimeDelta that is purely relative to the time that 
> an upstream task triggers it.  If there is a way to do this, please note it 
> here or suggest some implementation alternative that could achieve this 
> easily.
> The implementation below using a PythonOperator works, but it consumes a 
> worker for 5min needlessly.  It would be much better to have a TimeDelta that 
> accepts the time when an upstream sensor triggers it and then waits for a 
> timedelta, with options from the base sensor for poke interval (and timeout). 
>  This could be used without consuming a worker as much with the reschedule 
> option.  Something like this can help with adding jitter to downstream tasks 
> that could otherwise hit an HTTP endpoint too hard all at once.
> {code:python}
> def wait5(*args, **kwargs):
>     import random
>     import time as t
>     minutes = random.randint(3, 6)
>     t.sleep(minutes * 60)
>     return True
> wait5_task = PythonOperator(
>     task_id="python_op_wait_5min",
>     python_callable=wait5,
>     dag=a_dag)
> upstream_http_sensor >> wait5_task
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-3293) Rename TimeDeltaSensor to ScheduleTimeDeltaSensor

2018-11-02 Thread Darren Weber (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Darren Weber updated AIRFLOW-3293:
--
Description: 
The TimeDeltaSensor has baked-in lookups for the schedule and schedule_interval 
lurking in the class init, it's not a pure time delta.  It would be ideal to 
have a TimeDelta that is purely relative to the time that an upstream task 
triggers it.  If there is a way to do this, please note it here or suggest some 
implementation alternative that could achieve this easily.

The implementation below using a PythonOperator works, but it consumes a worker 
for 5min needlessly.  It would be much better to have a TimeDelta that accepts 
the time when an upstream sensor triggers it and then waits for a timedelta, 
with options from the base sensor for poke interval (and timeout).  This could 
be used without consuming a worker as much with the reschedule option.  
Something like this can help with adding jitter to downstream tasks that could 
otherwise hit an HTTP endpoint too hard all at once.

{code:python}
def wait5(*args, **kwargs):
    import time as t
    t.sleep(5 * 60)
    return True

wait5_task = PythonOperator(
    task_id="python_op_wait_5min",
    python_callable=wait5,
    dag=a_dag)

upstream_http_sensor >> wait5_task
{code}


  was:
The TimeDeltaSensor has baked-in lookups for the schedule and schedule_interval 
lurking in the class init, it's not a pure time delta.  It would be ideal to 
have a TimeDelta that is purely relative to the time that an upstream task 
triggers it.  If there is a way to do this, please note it here or suggest some 
implementation alternative that could achieve this easily.

The implementation below using a PythonOperator works, but it consumes a worker 
for 5min needlessly.  It would be much better to have a TimeDelta that accepts 
the time when an upstream sensor triggers it and then waits for a timedelta, 
with options from the base sensor for poke interval (and timeout).  This could 
be used without consuming a worker as much with the reschedule option.

{code:python}
def wait5(*args, **kwargs):
    import time as t
    t.sleep(5 * 60)
    return True

wait5_task = PythonOperator(
    task_id="python_op_wait_5min",
    python_callable=wait5,
    dag=a_dag)

upstream_http_sensor >> wait5_task
{code}



> Rename TimeDeltaSensor to ScheduleTimeDeltaSensor
> -
>
> Key: AIRFLOW-3293
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3293
> Project: Apache Airflow
>  Issue Type: Wish
>Reporter: Darren Weber
>Priority: Major
>
> The TimeDeltaSensor has baked-in lookups for the schedule and 
> schedule_interval lurking in the class init, it's not a pure time delta.  It 
> would be ideal to have a TimeDelta that is purely relative to the time that 
> an upstream task triggers it.  If there is a way to do this, please note it 
> here or suggest some implementation alternative that could achieve this 
> easily.
> The implementation below using a PythonOperator works, but it consumes a 
> worker for 5min needlessly.  It would be much better to have a TimeDelta that 
> accepts the time when an upstream sensor triggers it and then waits for a 
> timedelta, with options from the base sensor for poke interval (and timeout). 
>  This could be used without consuming a worker as much with the reschedule 
> option.  Something like this can help with adding jitter to downstream tasks 
> that could otherwise hit an HTTP endpoint too hard all at once.
> {code:python}
> def wait5(*args, **kwargs):
>     import time as t
>     t.sleep(5 * 60)
>     return True
> wait5_task = PythonOperator(
>     task_id="python_op_wait_5min",
>     python_callable=wait5,
>     dag=a_dag)
> upstream_http_sensor >> wait5_task
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-3293) Rename TimeDeltaSensor to ScheduleTimeDeltaSensor

2018-11-02 Thread Darren Weber (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Darren Weber updated AIRFLOW-3293:
--
Description: 
The TimeDeltaSensor has baked-in lookups for the schedule and schedule_interval 
lurking in the class init, it's not a pure time delta.  It would be ideal to 
have a TimeDelta that is purely relative to the time that an upstream task 
triggers it.  If there is a way to do this, please note it here or suggest some 
implementation alternative that could achieve this easily.

The implementation below using a PythonOperator works, but it consumes a worker 
for 5min needlessly.  It would be much better to have a TimeDelta that accepts 
the time when an upstream sensor triggers it and then waits for a timedelta, 
with options from the base sensor for poke interval (and timeout).  This could 
be used without consuming a worker as much with the reschedule option.

{code:python}
def wait5(*args, **kwargs):
    import time as t
    t.sleep(5 * 60)
    return True

wait5_task = PythonOperator(
    task_id="python_op_wait_5min",
    python_callable=wait5,
    dag=a_dag)

upstream_http_sensor >> wait5_task
{code}


  was:The TimeDeltaSensor has baked-in lookups for the schedule and 
schedule_interval lurking in the class init, it's not a pure time delta.  It 
would be ideal to have a TimeDelta that is purely relative to the time that an 
upstream task triggers it.  If there is a way to do this, please note it here 
or suggest some implementation alternative that could achieve this easily.


> Rename TimeDeltaSensor to ScheduleTimeDeltaSensor
> -
>
> Key: AIRFLOW-3293
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3293
> Project: Apache Airflow
>  Issue Type: Wish
>Reporter: Darren Weber
>Priority: Major
>
> The TimeDeltaSensor has baked-in lookups for the schedule and 
> schedule_interval lurking in the class init, it's not a pure time delta.  It 
> would be ideal to have a TimeDelta that is purely relative to the time that 
> an upstream task triggers it.  If there is a way to do this, please note it 
> here or suggest some implementation alternative that could achieve this 
> easily.
> The implementation below using a PythonOperator works, but it consumes a 
> worker for 5min needlessly.  It would be much better to have a TimeDelta that 
> accepts the time when an upstream sensor triggers it and then waits for a 
> timedelta, with options from the base sensor for poke interval (and timeout). 
>  This could be used without consuming a worker as much with the reschedule 
> option.
> {code:python}
> def wait5(*args, **kwargs):
>     import time as t
>     t.sleep(5 * 60)
>     return True
> wait5_task = PythonOperator(
>     task_id="python_op_wait_5min",
>     python_callable=wait5,
>     dag=a_dag)
> upstream_http_sensor >> wait5_task
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (AIRFLOW-3293) Rename TimeDeltaSensor to ScheduleTimeDeltaSensor

2018-11-02 Thread Darren Weber (JIRA)
Darren Weber created AIRFLOW-3293:
-

 Summary: Rename TimeDeltaSensor to ScheduleTimeDeltaSensor
 Key: AIRFLOW-3293
 URL: https://issues.apache.org/jira/browse/AIRFLOW-3293
 Project: Apache Airflow
  Issue Type: Wish
Reporter: Darren Weber


The TimeDeltaSensor has baked-in lookups for the schedule and schedule_interval 
lurking in the class init, it's not a pure time delta.  It would be ideal to 
have a TimeDelta that is purely relative to the time that an upstream task 
triggers it.  If there is a way to do this, please note it here or suggest some 
implementation alternative that could achieve this easily.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] feluelle commented on a change in pull request #4120: [AIRFLOW-XXX] Update Contributing Guide - Git Hooks

2018-11-02 Thread GitBox
feluelle commented on a change in pull request #4120: [AIRFLOW-XXX] Update 
Contributing Guide - Git Hooks
URL: https://github.com/apache/incubator-airflow/pull/4120#discussion_r230516992
 
 

 ##
 File path: CONTRIBUTING.md
 ##
 @@ -183,6 +183,24 @@ docker-compose -f scripts/ci/docker-compose.yml run 
airflow-testing /app/scripts
 Alternatively can also set up [Travis CI](https://travis-ci.org/) on your repo 
to automate this.
 It is free for open source projects.
 
+Another great way of automating linting and testing is to use [Git 
Hooks](https://git-scm.com/book/uz/v2/Customizing-Git-Git-Hooks). For example, 
you could create a `pre-commit` file based on the Travis CI Pipeline so that 
before each commit a local pipeline is executed, and if this pipeline 
fails (returns an exit code other than `0`) the commit does not go through.
+In theory this has the advantage that you cannot commit any code that fails, 
which in turn reduces the errors in the Travis CI Pipelines.
+
+Since there are a lot of tests, the script would take very long, so you should 
probably only test your new feature locally.
+
+The following example of a `pre-commit` file allows you:
+- to lint your code
+- test your code in a docker container based on python 2
+- test your code in a docker container based on python 3
+
+NOTE: Change the `airflow-py2` and `airflow-py3` to your docker containers or 
remove the `docker exec` if you have set up your environment directly on your 
host system.
+```
 
 Review comment:
   Didn't know that existed. Thank you. I will also change this.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] uesenthi edited a comment on issue #4127: Bug Fix: Secrets object and key separated by ":"

2018-11-02 Thread GitBox
uesenthi edited a comment on issue #4127: Bug Fix: Secrets object and key 
separated by ":"
URL: 
https://github.com/apache/incubator-airflow/pull/4127#issuecomment-435478501
 
 
   I was trying to pass Kubernetes secrets to a worker pod and realized the 
config example was wrong. I changed it in the airflow project rather than the 
config documentation because I felt it made sense to follow the structure 
   env_variable = <secret_object>:<secret_key>


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] uesenthi edited a comment on issue #4127: Bug Fix: Secrets object and key separated by ":"

2018-11-02 Thread GitBox
uesenthi edited a comment on issue #4127: Bug Fix: Secrets object and key 
separated by ":"
URL: 
https://github.com/apache/incubator-airflow/pull/4127#issuecomment-435478501
 
 
   I was trying to pass Kubernetes secrets to a worker pod and realized the 
config example was wrong. I changed it in the airflow project rather than the 
config documentation because I felt it made sense to follow the structure 
   "" = :


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] uesenthi edited a comment on issue #4127: Bug Fix: Secrets object and key separated by ":"

2018-11-02 Thread GitBox
uesenthi edited a comment on issue #4127: Bug Fix: Secrets object and key 
separated by ":"
URL: 
https://github.com/apache/incubator-airflow/pull/4127#issuecomment-435478501
 
 
   I was trying to pass Kubernetes secrets to a worker pod and realized the 
config example was wrong. I changed it in the airflow project rather than the 
config documentation because I felt it made sense to follow the structure 
<env_variable> = <secret_object>:<secret_key>


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] uesenthi commented on issue #4127: Bug Fix: Secrets object and key separated by ":"

2018-11-02 Thread GitBox
uesenthi commented on issue #4127: Bug Fix: Secrets object and key separated by 
":"
URL: 
https://github.com/apache/incubator-airflow/pull/4127#issuecomment-435478501
 
 
   I was trying to pass Kubernetes secrets to a worker pod and realized the 
config example was wrong. I changed it in the airflow project rather than the 
config documentation because I felt it made sense to follow the structure 
<env_variable> = <secret_object>:<secret_key>


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] uesenthi opened a new pull request #4127: Bug Fix: Secrets object and key separated by ":"

2018-11-02 Thread GitBox
uesenthi opened a new pull request #4127: Bug Fix: Secrets object and key 
separated by ":"
URL: https://github.com/apache/incubator-airflow/pull/4127
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [ ] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-XXX
 - In case you are fixing a typo in the documentation you can prepend your 
commit with \[AIRFLOW-XXX\], code changes always need a Jira issue.
   
   ### Description
   
   - [ ] Here are some details about my PR, including screenshots of any UI 
changes:
   
   ### Tests
   
   - [ ] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   ### Commits
   
   - [ ] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [ ] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - When adding new operators/hooks/sensors, the autoclass documentation 
generation needs to be added.
   
   ### Code Quality
   
   - [ ] Passes `flake8`
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Assigned] (AIRFLOW-3253) KubernetesPodOperator Unauthorized Code 401

2018-11-02 Thread Trevor Edwards (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Trevor Edwards reassigned AIRFLOW-3253:
---

Assignee: Trevor Edwards

> KubernetesPodOperator Unauthorized Code 401
> ---
>
> Key: AIRFLOW-3253
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3253
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: authentication, gcp, kubernetes
>Affects Versions: 1.10.0
>Reporter: Sunny Gupta
>Assignee: Trevor Edwards
>Priority: Minor
> Attachments: Screenshot from 2018-10-25 02-08-28.png
>
>
> apache-airflow==1.10.0
> kubernetes==7.0.0 (Tried)
>  kubernetes==8.0.0b1 (Tried)
>  
> Every time, after a couple of successful scheduled runs, some runs fail and 
> throw the error below.
> The error looks related to k8s authorization, and it seems like a pattern in my 
> case: every time the token expiry comes near, the job fails, and after the new 
> expiry is updated it runs for a while and then fails again.
> !Screenshot from 2018-10-25 02-08-28.png!
> The above speculation could be wrong; I need help to fix this issue. I am running 
> one sample python hello DAG and planning to move a production workload, but this 
> is a blocker for me.
> Tried:
>  * Clearing the ~/.kube folder and regenerating the token with `gcloud container clusters 
> get-credentials ***`; even tried setting a cron job to force-update the tokens.
>  * kubernetes==7.0.0 up to the latest beta version.
> Below is my kubectl config. When I run the *kubectl* CLI to do GET operations on pod and 
> node resources, there are no issues.
>  
> {code:java}
> $ kubectl config view 
> apiVersion: v1
> clusters:
> - cluster:
>     certificate-authority-data: DATA+OMITTED
>     server: https://XX.XX.XX.XX
>   name: gke_us-central1-b_dev-kube-cluster
> contexts:
> - context:
>     cluster: gke_us-central1-b_dev-kube-cluster
>     user: gke_us-central1-b_dev-kube-cluster
>   name: gke_us-central1-b_dev-kube-cluster
> current-context: gke_us-central1-b_dev-kube-cluster
> kind: Config
> preferences: {}
> users:
> - name: gke_us-central1-b_dev-kube-cluster
>   user:
>     auth-provider:
>   config:
>     access-token: ya29.c.TOKEN5EREdigv
>     cmd-args: config config-helper --format=json
>     cmd-path: /usr/lib/google-cloud-sdk/bin/gcloud
>     expiry: 2018-10-24T20:54:37Z
>     expiry-key: '{.credential.token_expiry}'
>     token-key: '{.credential.access_token}'
>   name: gcp
> {code}
>  
>  
> In an hour, running every */5 min, 2-3 jobs fail with the error below.
>  
> {code:java}
> kubernetes.client.rest.ApiException: (401)
> Reason: Unauthorized
> HTTP response headers: HTTPHeaderDict({'Date': 'Wed, 24 Oct 2018 06:20:04 
> GMT', 'Content-Length': '129', 'Audit-Id': 
> '89dcda61-a60f-4b23-85d6-9d28a6bfeed0', 'Www-Authenticate': 'Basic 
> realm="kubernetes-master"', 'Content-Type': 'application/json'})
> HTTP response body: 
> {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"Unauthorized","reason":"Unauthorized","code":401}{code}
>  
>  
> {code:java}
> // complete logs
> 
> *** Log file does not exist: 
> /root/airflow/logs/pyk8s.v3/python-hello/2018-10-24T06:16:00+00:00/1.log
> *** Fetching from: 
> http://aflow-worker.internal:8793/log/pyk8s.v3/python-hello/2018-10-24T06:16:00+00:00/1.log
> [2018-10-24 06:20:02,947] {models.py:1335} INFO - Dependencies all met for 
> 
> [2018-10-24 06:20:02,952] {models.py:1335} INFO - Dependencies all met for 
> 
> [2018-10-24 06:20:02,952] {models.py:1547} INFO -
> 
> Starting attempt 1 of 1
> 
> [2018-10-24 06:20:02,966] {models.py:1569} INFO - Executing 
>  on 2018-10-24T06:16:00+00:00
> [2018-10-24 06:20:02,967] {base_task_runner.py:124} INFO - Running: ['bash', 
> '-c', 'airflow run pyk8s.v3 python-hello 2018-10-24T06:16:00+00:00 --job_id 
> 354 --raw -sd DAGS_FOLDER/pyk8s.v3.py --cfg_path /tmp/tmpf0saygt7']
> [2018-10-24 06:20:03,405] {base_task_runner.py:107} INFO - Job 354: Subtask 
> python-hello [2018-10-24 06:20:03,404] {settings.py:174} INFO - 
> setting.configure_orm(): Using pool settings. pool_size=5, pool_recycle=1800
> [2018-10-24 06:20:03,808] {base_task_runner.py:107} INFO - Job 354: Subtask 
> python-hello [2018-10-24 06:20:03,807] {__init__.py:51} INFO - Using executor 
> CeleryExecutor
> [2018-10-24 06:20:03,970] {base_task_runner.py:107} INFO - Job 354: Subtask 
> python-hello [2018-10-24 06:20:03,970] {models.py:258} INFO - Filling up the 
> DagBag from /root/airflow/dags/pyk8s.v3.py
> [2018-10-24 06:20:04,255] {base_task_runner.py:107} INFO - Job 354: Subtask 
> python-hello [2018-10-24 06:20:04,255] {cli.py:492} INFO - Running 
>  on 
> host 

[jira] [Commented] (AIRFLOW-3270) Apache airflow 1.10.0 integration with LDAP anonmyously

2018-11-02 Thread Hari Krishna ADDEPALLI LN (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16673534#comment-16673534
 ] 

Hari Krishna ADDEPALLI LN commented on AIRFLOW-3270:


[~ashb] : please expedite 

> Apache airflow 1.10.0 integration with LDAP anonmyously
> ---
>
> Key: AIRFLOW-3270
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3270
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: authentication
>Affects Versions: 1.10.0
>Reporter: Hari Krishna ADDEPALLI LN
>Priority: Blocker
>
> Please advise what to include in airflow.cfg when going to integrate with 
> LDAP anonymously ? We are using DS389 as LDAP server vendor name. 
>  
> {noformat}
> [webserver] 
> authenticate = True 
> auth_backend = airflow.contrib.auth.backends.ldap_auth  
> {noformat}
>  
> And 
>  
> {noformat}
> [ldap] 
> uri = ldap://nsp-daf178e8.ad1.prd.us-phx.odc.im:389 
> user_filter = memberOf=cn=rvs-all-prd_usphx,ou=groups,dc=odc,dc=im
> user_name_attr = uid 
> group_member_attr =
> superuser_filter = memberOf=cn=rvd-sudo_all-prd_usphx,ou=groups,dc=odc,dc=im 
> data_profiler_filter = 
> bind_user = 
> bind_password = 
> basedn = ou=people,dc=odc,dc=im 
> cacert = /opt/orchestration/airflow/ldap_ca.crt 
> search_scope = LEVEL
> {noformat}
> I am hitting below exception:
> {noformat}
>   File "/usr/local/lib/python3.5/site-packages/ldap3/operation/search.py", 
> line 215, in parse_filter     
> raise LDAPInvalidFilterError('malformed filter') 
> ldap3.core.exceptions.LDAPInvalidFilterError: malformed filter
> {noformat}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] codecov-io edited a comment on issue #3530: [AIRFLOW-2609] Fixed behavior of BranchPythonOperator

2018-11-02 Thread GitBox
codecov-io edited a comment on issue #3530: [AIRFLOW-2609] Fixed behavior of 
BranchPythonOperator
URL: 
https://github.com/apache/incubator-airflow/pull/3530#issuecomment-399166103
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3530?src=pr=h1)
 Report
   > Merging 
[#3530](https://codecov.io/gh/apache/incubator-airflow/pull/3530?src=pr=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/e703d6beeb379ee88ef5e7df495e8a785666f8af?src=pr=desc)
 will **increase** coverage by `0.79%`.
   > The diff coverage is `100%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/3530/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/3530?src=pr=tree)
   
   ```diff
   @@Coverage Diff @@
   ##   master#3530  +/-   ##
   ==
   + Coverage   76.67%   77.46%   +0.79% 
   ==
 Files 199  204   +5 
 Lines   1618615228 -958 
   ==
   - Hits1241011796 -614 
   + Misses   3776 3432 -344
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/incubator-airflow/pull/3530?src=pr=tree) 
| Coverage Δ | |
   |---|---|---|
   | 
[airflow/operators/python\_operator.py](https://codecov.io/gh/apache/incubator-airflow/pull/3530/diff?src=pr=tree#diff-YWlyZmxvdy9vcGVyYXRvcnMvcHl0aG9uX29wZXJhdG9yLnB5)
 | `94.96% <100%> (-0.07%)` | :arrow_down: |
   | 
[airflow/hooks/pig\_hook.py](https://codecov.io/gh/apache/incubator-airflow/pull/3530/diff?src=pr=tree#diff-YWlyZmxvdy9ob29rcy9waWdfaG9vay5weQ==)
 | `0% <0%> (-100%)` | :arrow_down: |
   | 
[airflow/operators/slack\_operator.py](https://codecov.io/gh/apache/incubator-airflow/pull/3530/diff?src=pr=tree#diff-YWlyZmxvdy9vcGVyYXRvcnMvc2xhY2tfb3BlcmF0b3IucHk=)
 | `0% <0%> (-97.37%)` | :arrow_down: |
   | 
[airflow/sensors/s3\_key\_sensor.py](https://codecov.io/gh/apache/incubator-airflow/pull/3530/diff?src=pr=tree#diff-YWlyZmxvdy9zZW5zb3JzL3MzX2tleV9zZW5zb3IucHk=)
 | `31.03% <0%> (-68.97%)` | :arrow_down: |
   | 
[airflow/sensors/s3\_prefix\_sensor.py](https://codecov.io/gh/apache/incubator-airflow/pull/3530/diff?src=pr=tree#diff-YWlyZmxvdy9zZW5zb3JzL3MzX3ByZWZpeF9zZW5zb3IucHk=)
 | `41.17% <0%> (-58.83%)` | :arrow_down: |
   | 
[airflow/utils/helpers.py](https://codecov.io/gh/apache/incubator-airflow/pull/3530/diff?src=pr=tree#diff-YWlyZmxvdy91dGlscy9oZWxwZXJzLnB5)
 | `67.07% <0%> (-17.31%)` | :arrow_down: |
   | 
[airflow/example\_dags/example\_python\_operator.py](https://codecov.io/gh/apache/incubator-airflow/pull/3530/diff?src=pr=tree#diff-YWlyZmxvdy9leGFtcGxlX2RhZ3MvZXhhbXBsZV9weXRob25fb3BlcmF0b3IucHk=)
 | `78.94% <0%> (-15.79%)` | :arrow_down: |
   | 
[airflow/hooks/oracle\_hook.py](https://codecov.io/gh/apache/incubator-airflow/pull/3530/diff?src=pr=tree#diff-YWlyZmxvdy9ob29rcy9vcmFjbGVfaG9vay5weQ==)
 | `0% <0%> (-15.48%)` | :arrow_down: |
   | 
[airflow/hooks/mysql\_hook.py](https://codecov.io/gh/apache/incubator-airflow/pull/3530/diff?src=pr=tree#diff-YWlyZmxvdy9ob29rcy9teXNxbF9ob29rLnB5)
 | `78% <0%> (-12%)` | :arrow_down: |
   | 
[airflow/utils/sqlalchemy.py](https://codecov.io/gh/apache/incubator-airflow/pull/3530/diff?src=pr=tree#diff-YWlyZmxvdy91dGlscy9zcWxhbGNoZW15LnB5)
 | `69.56% <0%> (-11.87%)` | :arrow_down: |
   | ... and [108 
more](https://codecov.io/gh/apache/incubator-airflow/pull/3530/diff?src=pr=tree-more)
 | |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3530?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3530?src=pr=footer).
 Last update 
[e703d6b...cfd3646](https://codecov.io/gh/apache/incubator-airflow/pull/3530?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] evizitei commented on issue #4082: [AIRFLOW-2865] Call success_callback before updating task state

2018-11-02 Thread GitBox
evizitei commented on issue #4082: [AIRFLOW-2865] Call success_callback before 
updating task state
URL: 
https://github.com/apache/incubator-airflow/pull/4082#issuecomment-435466198
 
 
   @ashb or @Fokko hey y'all, I don't want to be pushy or anything, I just want 
to know what the right way is to get feedback on a PR; I've had this one open 
for about 10 days, and because there are no comments on it I wonder if I've done 
something wrong to make it de-prioritized?  Should I have been making noise in 
a mailing list somewhere or something like that?  Thanks, ~Ethan


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] codecov-io edited a comment on issue #3725: [AIRFLOW-2877] Make docs site URL consistent everywhere

2018-11-02 Thread GitBox
codecov-io edited a comment on issue #3725: [AIRFLOW-2877] Make docs site URL 
consistent everywhere
URL: 
https://github.com/apache/incubator-airflow/pull/3725#issuecomment-412940914
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3725?src=pr=h1)
 Report
   > Merging 
[#3725](https://codecov.io/gh/apache/incubator-airflow/pull/3725?src=pr=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/e703d6beeb379ee88ef5e7df495e8a785666f8af?src=pr=desc)
 will **increase** coverage by `0.99%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/3725/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/3725?src=pr=tree)
   
   ```diff
   @@Coverage Diff @@
   ##   master#3725  +/-   ##
   ==
   + Coverage   76.67%   77.66%   +0.99% 
   ==
 Files 199  204   +5 
 Lines   1618615849 -337 
   ==
   - Hits1241012309 -101 
   + Misses   3776 3540 -236
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/incubator-airflow/pull/3725?src=pr=tree) 
| Coverage Δ | |
   |---|---|---|
   | 
[airflow/example\_dags/tutorial.py](https://codecov.io/gh/apache/incubator-airflow/pull/3725/diff?src=pr=tree#diff-YWlyZmxvdy9leGFtcGxlX2RhZ3MvdHV0b3JpYWwucHk=)
 | `100% <ø> (ø)` | :arrow_up: |
   | 
[airflow/www/app.py](https://codecov.io/gh/apache/incubator-airflow/pull/3725/diff?src=pr=tree#diff-YWlyZmxvdy93d3cvYXBwLnB5)
 | `99.01% <ø> (+0.06%)` | :arrow_up: |
   | 
[airflow/www\_rbac/app.py](https://codecov.io/gh/apache/incubator-airflow/pull/3725/diff?src=pr=tree#diff-YWlyZmxvdy93d3dfcmJhYy9hcHAucHk=)
 | `96.77% <ø> (-0.29%)` | :arrow_down: |
   | 
[airflow/operators/slack\_operator.py](https://codecov.io/gh/apache/incubator-airflow/pull/3725/diff?src=pr=tree#diff-YWlyZmxvdy9vcGVyYXRvcnMvc2xhY2tfb3BlcmF0b3IucHk=)
 | `0% <0%> (-97.37%)` | :arrow_down: |
   | 
[airflow/sensors/s3\_key\_sensor.py](https://codecov.io/gh/apache/incubator-airflow/pull/3725/diff?src=pr=tree#diff-YWlyZmxvdy9zZW5zb3JzL3MzX2tleV9zZW5zb3IucHk=)
 | `31.03% <0%> (-68.97%)` | :arrow_down: |
   | 
[airflow/sensors/s3\_prefix\_sensor.py](https://codecov.io/gh/apache/incubator-airflow/pull/3725/diff?src=pr=tree#diff-YWlyZmxvdy9zZW5zb3JzL3MzX3ByZWZpeF9zZW5zb3IucHk=)
 | `41.17% <0%> (-58.83%)` | :arrow_down: |
   | 
[airflow/example\_dags/example\_python\_operator.py](https://codecov.io/gh/apache/incubator-airflow/pull/3725/diff?src=pr=tree#diff-YWlyZmxvdy9leGFtcGxlX2RhZ3MvZXhhbXBsZV9weXRob25fb3BlcmF0b3IucHk=)
 | `78.94% <0%> (-15.79%)` | :arrow_down: |
   | 
[airflow/utils/helpers.py](https://codecov.io/gh/apache/incubator-airflow/pull/3725/diff?src=pr=tree#diff-YWlyZmxvdy91dGlscy9oZWxwZXJzLnB5)
 | `71.34% <0%> (-13.04%)` | :arrow_down: |
   | 
[airflow/hooks/mysql\_hook.py](https://codecov.io/gh/apache/incubator-airflow/pull/3725/diff?src=pr=tree#diff-YWlyZmxvdy9ob29rcy9teXNxbF9ob29rLnB5)
 | `78% <0%> (-12%)` | :arrow_down: |
   | 
[airflow/sensors/sql\_sensor.py](https://codecov.io/gh/apache/incubator-airflow/pull/3725/diff?src=pr=tree#diff-YWlyZmxvdy9zZW5zb3JzL3NxbF9zZW5zb3IucHk=)
 | `90.47% <0%> (-9.53%)` | :arrow_down: |
   | ... and [89 
more](https://codecov.io/gh/apache/incubator-airflow/pull/3725/diff?src=pr=tree-more)
 | |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3725?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3725?src=pr=footer).
 Last update 
[e703d6b...e524245](https://codecov.io/gh/apache/incubator-airflow/pull/3725?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] Eronarn commented on issue #3584: [AIRFLOW-249] Refactor the SLA mechanism

2018-11-02 Thread GitBox
Eronarn commented on issue #3584: [AIRFLOW-249] Refactor the SLA mechanism
URL: 
https://github.com/apache/incubator-airflow/pull/3584#issuecomment-435461415
 
 
   This has gotten conflicts again from going stale... I'd love to see this 
feature in Airflow and have multiple internal use cases for it already; are 
there any steps in the process that I'm missing here?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-3270) Apache airflow 1.10.0 integration with LDAP anonmyously

2018-11-02 Thread Hari Krishna ADDEPALLI LN (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16673458#comment-16673458
 ] 

Hari Krishna ADDEPALLI LN commented on AIRFLOW-3270:


After adding those missing lines before {{tls_configuration = None}}, I am hitting 
another issue; below is the complete exception stack. Please help to fix 
"_*ldap3.core.exceptions.LDAPAttributeError: invalid attribute 
type*_". I suspect "user_name_attr = uid". Please advise.

 

_*[2018-11-02 17:32:04,632] \{{ldap_auth.py:303}} INFO - User user1234 
successfully authenticated*_

[2018-11-02 17:32:05,056] ERROR in app: Exception on /admin/airflow/login [POST]

Traceback (most recent call last):

  File "/usr/local/lib/python3.5/site-packages/flask/app.py", line 1982, in 
wsgi_app

    response = self.full_dispatch_request()

  File "/usr/local/lib/python3.5/site-packages/flask/app.py", line 1614, in 
full_dispatch_request

    rv = self.handle_user_exception(e)

  File "/usr/local/lib/python3.5/site-packages/flask/app.py", line 1517, in 
handle_user_exception

    reraise(exc_type, exc_value, tb)

  File "/usr/local/lib/python3.5/site-packages/flask/_compat.py", line 33, in 
reraise

    raise value

  File "/usr/local/lib/python3.5/site-packages/flask/app.py", line 1612, in 
full_dispatch_request

    rv = self.dispatch_request()

  File "/usr/local/lib/python3.5/site-packages/flask/app.py", line 1598, in 
dispatch_request

    return self.view_functions[rule.endpoint](**req.view_args)

  File "/usr/local/lib/python3.5/site-packages/flask_admin/base.py", line 69, 
in inner

    return self._run_view(f, *args, **kwargs)

  File "/usr/local/lib/python3.5/site-packages/flask_admin/base.py", line 368, 
in _run_view

    return fn(self, *args, **kwargs)

  File "/usr/local/lib/python3.5/site-packages/airflow/www/views.py", line 735, 
in login

    return airflow.login.login(self, request)

  File "/usr/local/lib/python3.5/site-packages/airflow/utils/db.py", line 74, 
in wrapper

    return func(*args, **kwargs)

  File 
"/usr/local/lib/python3.5/site-packages/airflow/contrib/auth/backends/ldap_auth.py",
 line 316, in login

    flask_login.login_user(LdapUser(user))

  File "", line 4, in __init__

  File "/usr/local/lib/python3.5/site-packages/sqlalchemy/orm/state.py", line 
414, in _initialize_instance

    manager.dispatch.init_failure(self, args, kwargs)

  File "/usr/local/lib/python3.5/site-packages/sqlalchemy/util/langhelpers.py", 
line 66, in __exit__

    compat.reraise(exc_type, exc_value, exc_tb)

  File "/usr/local/lib/python3.5/site-packages/sqlalchemy/util/compat.py", line 
187, in reraise

    raise value

  File "/usr/local/lib/python3.5/site-packages/sqlalchemy/orm/state.py", line 
411, in _initialize_instance

    return manager.original_init(*mixed[1:], **kwargs)

  File 
"/usr/local/lib/python3.5/site-packages/airflow/contrib/auth/backends/ldap_auth.py",
 line 185, in __init__

    user.username

  File 
"/usr/local/lib/python3.5/site-packages/airflow/contrib/auth/backends/ldap_auth.py",
 line 106, in groups_user

    attributes=[native(memberof_attr)])

  *_File "/usr/local/lib/python3.5/site-packages/ldap3/core/connection.py", 
line 765, in search_*

    _*raise LDAPAttributeError('invalid attribute type ' + 
attribute_name_to_check)*_

_*ldap3.core.exceptions.LDAPAttributeError: invalid attribute type*_

> Apache airflow 1.10.0 integration with LDAP anonmyously
> ---
>
> Key: AIRFLOW-3270
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3270
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: authentication
>Affects Versions: 1.10.0
>Reporter: Hari Krishna ADDEPALLI LN
>Priority: Blocker
>
> Please advise what to include in airflow.cfg when going to integrate with 
> LDAP anonymously ? We are using DS389 as LDAP server vendor name. 
>  
> {noformat}
> [webserver] 
> authenticate = True 
> auth_backend = airflow.contrib.auth.backends.ldap_auth  
> {noformat}
>  
> And 
>  
> {noformat}
> [ldap] 
> uri = ldap://nsp-daf178e8.ad1.prd.us-phx.odc.im:389 
> user_filter = memberOf=cn=rvs-all-prd_usphx,ou=groups,dc=odc,dc=im
> user_name_attr = uid 
> group_member_attr =
> superuser_filter = memberOf=cn=rvd-sudo_all-prd_usphx,ou=groups,dc=odc,dc=im 
> data_profiler_filter = 
> bind_user = 
> bind_password = 
> basedn = ou=people,dc=odc,dc=im 
> cacert = /opt/orchestration/airflow/ldap_ca.crt 
> search_scope = LEVEL
> {noformat}
> I am hitting below exception:
> {noformat}
>   File "/usr/local/lib/python3.5/site-packages/ldap3/operation/search.py", 
> line 215, in parse_filter     
> raise LDAPInvalidFilterError('malformed filter') 
> ldap3.core.exceptions.LDAPInvalidFilterError: malformed filter
> {noformat}
>  



--
This message was sent by Atlassian JIRA

[jira] [Commented] (AIRFLOW-3270) Apache airflow 1.10.0 integration with LDAP anonmyously

2018-11-02 Thread Hari Krishna ADDEPALLI LN (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16673377#comment-16673377
 ] 

Hari Krishna ADDEPALLI LN commented on AIRFLOW-3270:


Sure, let me try.

> Apache airflow 1.10.0 integration with LDAP anonmyously
> ---
>
> Key: AIRFLOW-3270
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3270
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: authentication
>Affects Versions: 1.10.0
>Reporter: Hari Krishna ADDEPALLI LN
>Priority: Blocker
>
> Please advise what to include in airflow.cfg when going to integrate with 
> LDAP anonymously ? We are using DS389 as LDAP server vendor name. 
>  
> {noformat}
> [webserver] 
> authenticate = True 
> auth_backend = airflow.contrib.auth.backends.ldap_auth  
> {noformat}
>  
> And 
>  
> {noformat}
> [ldap] 
> uri = ldap://nsp-daf178e8.ad1.prd.us-phx.odc.im:389 
> user_filter = memberOf=cn=rvs-all-prd_usphx,ou=groups,dc=odc,dc=im
> user_name_attr = uid 
> group_member_attr =
> superuser_filter = memberOf=cn=rvd-sudo_all-prd_usphx,ou=groups,dc=odc,dc=im 
> data_profiler_filter = 
> bind_user = 
> bind_password = 
> basedn = ou=people,dc=odc,dc=im 
> cacert = /opt/orchestration/airflow/ldap_ca.crt 
> search_scope = LEVEL
> {noformat}
> I am hitting below exception:
> {noformat}
>   File "/usr/local/lib/python3.5/site-packages/ldap3/operation/search.py", 
> line 215, in parse_filter     
> raise LDAPInvalidFilterError('malformed filter') 
> ldap3.core.exceptions.LDAPInvalidFilterError: malformed filter
> {noformat}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-3270) Apache airflow 1.10.0 integration with LDAP anonmyously

2018-11-02 Thread Ash Berlin-Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16673370#comment-16673370
 ] 

Ash Berlin-Taylor commented on AIRFLOW-3270:


Yup, add those lines just before {{tls_configuration = None}}

> Apache airflow 1.10.0 integration with LDAP anonmyously
> ---
>
> Key: AIRFLOW-3270
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3270
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: authentication
>Affects Versions: 1.10.0
>Reporter: Hari Krishna ADDEPALLI LN
>Priority: Blocker
>
> Please advise what to include in airflow.cfg when going to integrate with 
> LDAP anonymously ? We are using DS389 as LDAP server vendor name. 
>  
> {noformat}
> [webserver] 
> authenticate = True 
> auth_backend = airflow.contrib.auth.backends.ldap_auth  
> {noformat}
>  
> And 
>  
> {noformat}
> [ldap] 
> uri = ldap://nsp-daf178e8.ad1.prd.us-phx.odc.im:389 
> user_filter = memberOf=cn=rvs-all-prd_usphx,ou=groups,dc=odc,dc=im
> user_name_attr = uid 
> group_member_attr =
> superuser_filter = memberOf=cn=rvd-sudo_all-prd_usphx,ou=groups,dc=odc,dc=im 
> data_profiler_filter = 
> bind_user = 
> bind_password = 
> basedn = ou=people,dc=odc,dc=im 
> cacert = /opt/orchestration/airflow/ldap_ca.crt 
> search_scope = LEVEL
> {noformat}
> I am hitting below exception:
> {noformat}
>   File "/usr/local/lib/python3.5/site-packages/ldap3/operation/search.py", 
> line 215, in parse_filter     
> raise LDAPInvalidFilterError('malformed filter') 
> ldap3.core.exceptions.LDAPInvalidFilterError: malformed filter
> {noformat}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-3270) Apache airflow 1.10.0 integration with LDAP anonmyously

2018-11-02 Thread Hari Krishna ADDEPALLI LN (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16673368#comment-16673368
 ] 

Hari Krishna ADDEPALLI LN commented on AIRFLOW-3270:


Existing  ldap_auth.py, in {{get_ldap_connection}} function:
{code:java}
def get_ldap_connection(dn=None, password=None): 
   tls_configuration = None 
   use_ssl = False 
   try:
{code}
 

 

> Apache airflow 1.10.0 integration with LDAP anonmyously
> ---
>
> Key: AIRFLOW-3270
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3270
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: authentication
>Affects Versions: 1.10.0
>Reporter: Hari Krishna ADDEPALLI LN
>Priority: Blocker
>
> Please advise what to include in airflow.cfg when going to integrate with 
> LDAP anonymously ? We are using DS389 as LDAP server vendor name. 
>  
> {noformat}
> [webserver] 
> authenticate = True 
> auth_backend = airflow.contrib.auth.backends.ldap_auth  
> {noformat}
>  
> And 
>  
> {noformat}
> [ldap] 
> uri = ldap://nsp-daf178e8.ad1.prd.us-phx.odc.im:389 
> user_filter = memberOf=cn=rvs-all-prd_usphx,ou=groups,dc=odc,dc=im
> user_name_attr = uid 
> group_member_attr =
> superuser_filter = memberOf=cn=rvd-sudo_all-prd_usphx,ou=groups,dc=odc,dc=im 
> data_profiler_filter = 
> bind_user = 
> bind_password = 
> basedn = ou=people,dc=odc,dc=im 
> cacert = /opt/orchestration/airflow/ldap_ca.crt 
> search_scope = LEVEL
> {noformat}
> I am hitting below exception:
> {noformat}
>   File "/usr/local/lib/python3.5/site-packages/ldap3/operation/search.py", 
> line 215, in parse_filter     
> raise LDAPInvalidFilterError('malformed filter') 
> ldap3.core.exceptions.LDAPInvalidFilterError: malformed filter
> {noformat}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-3270) Apache airflow 1.10.0 integration with LDAP anonmyously

2018-11-02 Thread Hari Krishna ADDEPALLI LN (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16673362#comment-16673362
 ] 

Hari Krishna ADDEPALLI LN commented on AIRFLOW-3270:


I checked the GitHub file you mentioned; it shows the same as what I see. However, 
regarding the possible two versions, I checked using the pip command; below is the 
result. Please confirm whether this is an issue?
{code:java}
airflow@web-579c6f666d-9hxgj:/usr/local/lib/python3.5/site-packages/airflow/contrib/auth/backends$
 pip list |grep ldap 
ldap3             2.5.1       
pyldap            3.0.0.post1 
python-ldap       3.1.0      
{code}

> Apache airflow 1.10.0 integration with LDAP anonmyously
> ---
>
> Key: AIRFLOW-3270
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3270
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: authentication
>Affects Versions: 1.10.0
>Reporter: Hari Krishna ADDEPALLI LN
>Priority: Blocker
>
> Please advise what to include in airflow.cfg when going to integrate with 
> LDAP anonymously ? We are using DS389 as LDAP server vendor name. 
>  
> {noformat}
> [webserver] 
> authenticate = True 
> auth_backend = airflow.contrib.auth.backends.ldap_auth  
> {noformat}
>  
> And 
>  
> {noformat}
> [ldap] 
> uri = ldap://nsp-daf178e8.ad1.prd.us-phx.odc.im:389 
> user_filter = memberOf=cn=rvs-all-prd_usphx,ou=groups,dc=odc,dc=im
> user_name_attr = uid 
> group_member_attr =
> superuser_filter = memberOf=cn=rvd-sudo_all-prd_usphx,ou=groups,dc=odc,dc=im 
> data_profiler_filter = 
> bind_user = 
> bind_password = 
> basedn = ou=people,dc=odc,dc=im 
> cacert = /opt/orchestration/airflow/ldap_ca.crt 
> search_scope = LEVEL
> {noformat}
> I am hitting below exception:
> {noformat}
>   File "/usr/local/lib/python3.5/site-packages/ldap3/operation/search.py", 
> line 215, in parse_filter     
> raise LDAPInvalidFilterError('malformed filter') 
> ldap3.core.exceptions.LDAPInvalidFilterError: malformed filter
> {noformat}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-3270) Apache airflow 1.10.0 integration with LDAP anonmyously

2018-11-02 Thread Ash Berlin-Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16673351#comment-16673351
 ] 

Ash Berlin-Taylor commented on AIRFLOW-3270:


I think the same issue would still apply on 1.10.0 so it doesn't matter, but the 
lines don't match up with the stack trace. For instance

{code}
File 
"/usr/local/lib/python3.5/site-packages/airflow/contrib/auth/backends/ldap_auth.py",
 line 268, in login
LdapUser.try_login(username, password)
{code}

https://github.com/apache/incubator-airflow/blob/1.10.0/airflow/contrib/auth/backends/ldap_auth.py#L268
 -- that line is not inside a login function.

So you might want to see if you have two versions installed (just so that you 
edit the right one).
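A quick, generic way to check which copy of the module is actually being imported (just a standard Python check, not specific to this issue):

{code:python}
# Prints the file path of the ldap_auth module that this interpreter loads,
# so you can confirm which installation you need to edit.
import airflow.contrib.auth.backends.ldap_auth as ldap_auth
print(ldap_auth.__file__)
{code}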

> Apache airflow 1.10.0 integration with LDAP anonmyously
> ---
>
> Key: AIRFLOW-3270
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3270
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: authentication
>Affects Versions: 1.10.0
>Reporter: Hari Krishna ADDEPALLI LN
>Priority: Blocker
>
> Please advise what to include in airflow.cfg when going to integrate with 
> LDAP anonymously ? We are using DS389 as LDAP server vendor name. 
>  
> {noformat}
> [webserver] 
> authenticate = True 
> auth_backend = airflow.contrib.auth.backends.ldap_auth  
> {noformat}
>  
> And 
>  
> {noformat}
> [ldap] 
> uri = ldap://nsp-daf178e8.ad1.prd.us-phx.odc.im:389 
> user_filter = memberOf=cn=rvs-all-prd_usphx,ou=groups,dc=odc,dc=im
> user_name_attr = uid 
> group_member_attr =
> superuser_filter = memberOf=cn=rvd-sudo_all-prd_usphx,ou=groups,dc=odc,dc=im 
> data_profiler_filter = 
> bind_user = 
> bind_password = 
> basedn = ou=people,dc=odc,dc=im 
> cacert = /opt/orchestration/airflow/ldap_ca.crt 
> search_scope = LEVEL
> {noformat}
> I am hitting below exception:
> {noformat}
>   File "/usr/local/lib/python3.5/site-packages/ldap3/operation/search.py", 
> line 215, in parse_filter     
> raise LDAPInvalidFilterError('malformed filter') 
> ldap3.core.exceptions.LDAPInvalidFilterError: malformed filter
> {noformat}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-3270) Apache airflow 1.10.0 integration with LDAP anonmyously

2018-11-02 Thread Hari Krishna ADDEPALLI LN (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16673341#comment-16673341
 ] 

Hari Krishna ADDEPALLI LN commented on AIRFLOW-3270:


Honestly, this is Airflow 1.10.0 only; below is the detail from the pip list 
command: 

 
{code:java}
airflow@web-579c6f666d-9hxgj:/usr/local/lib/python3.5/site-packages/airflow/contrib/auth/backends$
 pip  list|grep airflow
 apache-airflow    1.10.0
{code}
So, I should add the lines below in the {{get_ldap_connection}} function and try, 
correct? 

 
{code:java}
if dn == "":
    dn = None
if password == "":
    password = None
{code}
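For clarity, a minimal sketch of how the top of the patched function could look (assuming the checks go right before {{tls_configuration = None}}, as discussed in this thread):

{code:python}
def get_ldap_connection(dn=None, password=None):
    # Treat empty strings coming from airflow.cfg as an anonymous bind
    if dn == "":
        dn = None
    if password == "":
        password = None
    tls_configuration = None
    use_ssl = False
    # ... rest of the existing function unchanged ...
{code}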
 

 

 

> Apache airflow 1.10.0 integration with LDAP anonmyously
> ---
>
> Key: AIRFLOW-3270
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3270
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: authentication
>Affects Versions: 1.10.0
>Reporter: Hari Krishna ADDEPALLI LN
>Priority: Blocker
>
> Please advise what to include in airflow.cfg when going to integrate with 
> LDAP anonymously ? We are using DS389 as LDAP server vendor name. 
>  
> {noformat}
> [webserver] 
> authenticate = True 
> auth_backend = airflow.contrib.auth.backends.ldap_auth  
> {noformat}
>  
> And 
>  
> {noformat}
> [ldap] 
> uri = ldap://nsp-daf178e8.ad1.prd.us-phx.odc.im:389 
> user_filter = memberOf=cn=rvs-all-prd_usphx,ou=groups,dc=odc,dc=im
> user_name_attr = uid 
> group_member_attr =
> superuser_filter = memberOf=cn=rvd-sudo_all-prd_usphx,ou=groups,dc=odc,dc=im 
> data_profiler_filter = 
> bind_user = 
> bind_password = 
> basedn = ou=people,dc=odc,dc=im 
> cacert = /opt/orchestration/airflow/ldap_ca.crt 
> search_scope = LEVEL
> {noformat}
> I am hitting below exception:
> {noformat}
>   File "/usr/local/lib/python3.5/site-packages/ldap3/operation/search.py", 
> line 215, in parse_filter     
> raise LDAPInvalidFilterError('malformed filter') 
> ldap3.core.exceptions.LDAPInvalidFilterError: malformed filter
> {noformat}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] ashb commented on a change in pull request #3684: [AIRFLOW-2840] - add update connections cli option

2018-11-02 Thread GitBox
ashb commented on a change in pull request #3684: [AIRFLOW-2840] - add update 
connections cli option
URL: https://github.com/apache/incubator-airflow/pull/3684#discussion_r230378800
 
 

 ##
 File path: airflow/api/client/local_client.py
 ##
 @@ -51,3 +51,83 @@ def create_pool(self, name, slots, description):
 def delete_pool(self, name):
 p = pool.delete_pool(name=name)
 return p.pool, p.slots, p.description
+
+def add_connection(self, conn_id,
+   conn_uri=None,
+   conn_type=None,
+   conn_host=None,
+   conn_login=None,
+   conn_password=None,
+   conn_schema=None,
+   conn_port=None,
+   conn_extra=None):
+"""
+
+:param conn_id:
+:param conn_uri:
+:param conn_type:
+:param conn_host:
+:param conn_login:
+:param conn_password:
+:param conn_schema:
+:param conn_port:
+:param conn_extra:
+:return: The new Connection
+"""
+return connections.add_connection(conn_id,
 
 Review comment:
   We probably shouldn't return objects over the API (as only the "Local" API 
can do this - we couldn't return an object over JSON, for instance) - this 
should be the conn_id that we return, I think.
   
   It is possibly worth returning the numeric ID (the `id` column, not to be 
confused with the `conn_id` column!) as this would let us delete individual 
connections if there are multiple ones with the same conn_id.
   
   Similarly for the other APIs: ids or JSON representations of the conn.
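   For illustration only (the keyword names of `connections.add_connection` are assumed from the diff, and the `id`/`conn_id` attributes follow the columns mentioned above), returning plain identifiers from the method in `local_client.py` could look something like:

   ```python
    def add_connection(self, conn_id, **kwargs):
        # kwargs: conn_uri, conn_type, conn_host, ... as in this PR
        conn = connections.add_connection(conn_id, **kwargs)
        # Return plain values rather than the ORM object so the same contract
        # can also be served over a JSON API; the numeric `id` lets callers
        # address a single row even when several share the same conn_id.
        return conn.id, conn.conn_id
   ```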


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] ashb commented on a change in pull request #4126: [AIRFLOW-2524] More AWS SageMaker operators, sensors for model, endpoint-config and endpoint

2018-11-02 Thread GitBox
ashb commented on a change in pull request #4126: [AIRFLOW-2524] More AWS 
SageMaker operators, sensors for model, endpoint-config and endpoint
URL: https://github.com/apache/incubator-airflow/pull/4126#discussion_r230426159
 
 

 ##
 File path: airflow/contrib/operators/sagemaker_endpoint_operator.py
 ##
 @@ -0,0 +1,149 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from airflow.contrib.hooks.aws_hook import AwsHook
+from airflow.contrib.operators.sagemaker_base_operator import 
SageMakerBaseOperator
+from airflow.utils.decorators import apply_defaults
+from airflow.exceptions import AirflowException
+
+
+class SageMakerEndpointOperator(SageMakerBaseOperator):
+
+"""
+Create a SageMaker endpoint.
+
+This operator returns the ARN of the endpoint created in Amazon SageMaker.
+
+:param config:
+The configuration necessary to create an endpoint.
+
+If you need to create a SageMaker endpoint based on an existed 
SageMaker model and an existed SageMaker
+endpoint config, 
+
+config = endpoint_configuration;
+
+If you need to create all of SageMaker model, SageMaker 
endpoint-config and SageMaker endpoint, 
+
+config = {
+'Model': model_configuration,
+
+'EndpointConfig': endpoint_config_configuration,
+
+'Endpoint': endpoint_configuration
+}
+
+For details of the configuration parameter of model_configuration, See:
+
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/sagemaker.html#SageMaker.Client.create_model
+
+For details of the configuration parameter of 
endpoint_config_configuration, See:
+
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/sagemaker.html#SageMaker.Client.create_endpoint_config
+
+For details of the configuration parameter of endpoint_configuration, 
See:
+
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/sagemaker.html#SageMaker.Client.create_endpoint
+:type config: dict
+:param aws_conn_id: The AWS connection ID to use.
+:type aws_conn_id: str
+:param wait_for_completion: Whether the operator should wait until the 
endpoint creation finishes.
+:type wait_for_completion: bool
+:param check_interval: If wait is set to True, this is the time interval, 
in seconds, that this operation waits
+before polling the status of the endpoint creation.
+:type check_interval: int
+:param max_ingestion_time: If wait is set to True, this operation fails if 
the endpoint creation doesn't finish
+within max_ingestion_time seconds. If you set this parameter to None 
it never times out.
+:type max_ingestion_time: int
+:param operation: Whether to create an endpoint or update an endpoint. 
Must be either 'create' or 'update'.
+:type operation: str
+"""  # noqa
+
+@apply_defaults
+def __init__(self,
+ config,
+ wait_for_completion=True,
+ check_interval=30,
+ max_ingestion_time=None,
+ operation='create',
+ *args, **kwargs):
+super(SageMakerEndpointOperator, self).__init__(config=config,
+*args, **kwargs)
+
+self.config = config
+self.wait_for_completion = wait_for_completion
+self.check_interval = check_interval
+self.max_ingestion_time = max_ingestion_time
+self.operation = operation.lower()
+self.create_integer_fields()
+
+def create_integer_fields(self):
+if 'EndpointConfig' in self.config:
+self.integer_fields = [
+['EndpointConfig', 'ProductionVariants', 
'InitialInstanceCount']
+]
+
+def expand_role(self):
+if 'Model' not in self.config:
+return
+hook = AwsHook(self.aws_conn_id)
+config = self.config['Model']
+if 'ExecutionRoleArn' in config:
+config['ExecutionRoleArn'] = 

[GitHub] ashb commented on a change in pull request #4126: [AIRFLOW-2524] More AWS SageMaker operators, sensors for model, endpoint-config and endpoint

2018-11-02 Thread GitBox
ashb commented on a change in pull request #4126: [AIRFLOW-2524] More AWS 
SageMaker operators, sensors for model, endpoint-config and endpoint
URL: https://github.com/apache/incubator-airflow/pull/4126#discussion_r230426495
 
 

 ##
 File path: airflow/contrib/operators/sagemaker_endpoint_operator.py
 ##
 @@ -0,0 +1,149 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from airflow.contrib.hooks.aws_hook import AwsHook
+from airflow.contrib.operators.sagemaker_base_operator import 
SageMakerBaseOperator
+from airflow.utils.decorators import apply_defaults
+from airflow.exceptions import AirflowException
+
+
+class SageMakerEndpointOperator(SageMakerBaseOperator):
+
+"""
+Create a SageMaker endpoint.
+
+This operator returns the ARN of the endpoint created in Amazon SageMaker.
+
+:param config:
+The configuration necessary to create an endpoint.
+
+If you need to create a SageMaker endpoint based on an existing SageMaker model and an existing SageMaker
+endpoint config,
+
+config = endpoint_configuration;
+
+If you need to create a SageMaker model, SageMaker endpoint-config and SageMaker endpoint,
+
+config = {
+'Model': model_configuration,
+
+'EndpointConfig': endpoint_config_configuration,
+
+'Endpoint': endpoint_configuration
+}
+
+For details of the configuration parameter of model_configuration, See:
+
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/sagemaker.html#SageMaker.Client.create_model
+
+For details of the configuration parameter of 
endpoint_config_configuration, See:
+
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/sagemaker.html#SageMaker.Client.create_endpoint_config
+
+For details of the configuration parameter of endpoint_configuration, 
See:
+
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/sagemaker.html#SageMaker.Client.create_endpoint
+:type config: dict
+:param aws_conn_id: The AWS connection ID to use.
+:type aws_conn_id: str
+:param wait_for_completion: Whether the operator should wait until the 
endpoint creation finishes.
+:type wait_for_completion: bool
+:param check_interval: If wait is set to True, this is the time interval, 
in seconds, that this operation waits
+before polling the status of the endpoint creation.
+:type check_interval: int
+:param max_ingestion_time: If wait is set to True, this operation fails if 
the endpoint creation doesn't finish
+within max_ingestion_time seconds. If you set this parameter to None 
it never times out.
+:type max_ingestion_time: int
+:param operation: Whether to create an endpoint or update an endpoint. 
Must be either 'create' or 'update'.
+:type operation: str
+"""  # noqa
+
+@apply_defaults
+def __init__(self,
+ config,
+ wait_for_completion=True,
+ check_interval=30,
+ max_ingestion_time=None,
+ operation='create',
+ *args, **kwargs):
+super(SageMakerEndpointOperator, self).__init__(config=config,
+*args, **kwargs)
+
+self.config = config
+self.wait_for_completion = wait_for_completion
+self.check_interval = check_interval
+self.max_ingestion_time = max_ingestion_time
+self.operation = operation.lower()
+self.create_integer_fields()
+
+def create_integer_fields(self):
+if 'EndpointConfig' in self.config:
+self.integer_fields = [
+['EndpointConfig', 'ProductionVariants', 
'InitialInstanceCount']
+]
+
+def expand_role(self):
+if 'Model' not in self.config:
+return
+hook = AwsHook(self.aws_conn_id)
+config = self.config['Model']
+if 'ExecutionRoleArn' in config:
+config['ExecutionRoleArn'] = 

[GitHub] ashb commented on a change in pull request #4126: [AIRFLOW-2524] More AWS SageMaker operators, sensors for model, endpoint-config and endpoint

2018-11-02 Thread GitBox
ashb commented on a change in pull request #4126: [AIRFLOW-2524] More AWS 
SageMaker operators, sensors for model, endpoint-config and endpoint
URL: https://github.com/apache/incubator-airflow/pull/4126#discussion_r230425965
 
 

 ##
 File path: airflow/contrib/operators/sagemaker_endpoint_operator.py
 ##
 @@ -0,0 +1,149 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from airflow.contrib.hooks.aws_hook import AwsHook
+from airflow.contrib.operators.sagemaker_base_operator import 
SageMakerBaseOperator
+from airflow.utils.decorators import apply_defaults
+from airflow.exceptions import AirflowException
+
+
+class SageMakerEndpointOperator(SageMakerBaseOperator):
+
+"""
+Create a SageMaker endpoint.
+
+This operator returns the ARN of the endpoint created in Amazon SageMaker.
+
+:param config:
+The configuration necessary to create an endpoint.
+
+If you need to create a SageMaker endpoint based on an existing SageMaker model and an existing SageMaker
+endpoint config,
+
+config = endpoint_configuration;
+
+If you need to create a SageMaker model, SageMaker endpoint-config and SageMaker endpoint,
+
+config = {
+'Model': model_configuration,
+
+'EndpointConfig': endpoint_config_configuration,
+
+'Endpoint': endpoint_configuration
+}
+
+For details of the configuration parameter of model_configuration, See:
+
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/sagemaker.html#SageMaker.Client.create_model
+
+For details of the configuration parameter of 
endpoint_config_configuration, See:
+
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/sagemaker.html#SageMaker.Client.create_endpoint_config
+
+For details of the configuration parameter of endpoint_configuration, 
See:
+
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/sagemaker.html#SageMaker.Client.create_endpoint
+:type config: dict
+:param aws_conn_id: The AWS connection ID to use.
+:type aws_conn_id: str
+:param wait_for_completion: Whether the operator should wait until the 
endpoint creation finishes.
+:type wait_for_completion: bool
+:param check_interval: If wait is set to True, this is the time interval, 
in seconds, that this operation waits
+before polling the status of the endpoint creation.
+:type check_interval: int
+:param max_ingestion_time: If wait is set to True, this operation fails if 
the endpoint creation doesn't finish
+within max_ingestion_time seconds. If you set this parameter to None 
it never times out.
+:type max_ingestion_time: int
+:param operation: Whether to create an endpoint or update an endpoint. 
Must be either 'create' or 'update'.
+:type operation: str
+"""  # noqa
+
+@apply_defaults
+def __init__(self,
+ config,
+ wait_for_completion=True,
+ check_interval=30,
+ max_ingestion_time=None,
+ operation='create',
+ *args, **kwargs):
+super(SageMakerEndpointOperator, self).__init__(config=config,
+*args, **kwargs)
+
+self.config = config
+self.wait_for_completion = wait_for_completion
+self.check_interval = check_interval
+self.max_ingestion_time = max_ingestion_time
+self.operation = operation.lower()
+self.create_integer_fields()
+
+def create_integer_fields(self):
+if 'EndpointConfig' in self.config:
+self.integer_fields = [
+['EndpointConfig', 'ProductionVariants', 
'InitialInstanceCount']
+]
+
+def expand_role(self):
+if 'Model' not in self.config:
+return
+hook = AwsHook(self.aws_conn_id)
 
 Review comment:
   This creates a new hook, but execute uses `self.hook` - is that existing 
hook (which is a sub-class of AwsHook so has 

[GitHub] ashb commented on a change in pull request #4126: [AIRFLOW-2524] More AWS SageMaker operators, sensors for model, endpoint-config and endpoint

2018-11-02 Thread GitBox
ashb commented on a change in pull request #4126: [AIRFLOW-2524] More AWS 
SageMaker operators, sensors for model, endpoint-config and endpoint
URL: https://github.com/apache/incubator-airflow/pull/4126#discussion_r230426320
 
 

 ##
 File path: airflow/contrib/operators/sagemaker_endpoint_operator.py
 ##
 @@ -0,0 +1,149 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from airflow.contrib.hooks.aws_hook import AwsHook
+from airflow.contrib.operators.sagemaker_base_operator import 
SageMakerBaseOperator
+from airflow.utils.decorators import apply_defaults
+from airflow.exceptions import AirflowException
+
+
+class SageMakerEndpointOperator(SageMakerBaseOperator):
+
+"""
+Create a SageMaker endpoint.
+
+This operator returns the ARN of the endpoint created in Amazon SageMaker.
+
+:param config:
+The configuration necessary to create an endpoint.
+
+If you need to create a SageMaker endpoint based on an existing SageMaker model and an existing SageMaker
+endpoint config,
+
+config = endpoint_configuration;
+
+If you need to create a SageMaker model, SageMaker endpoint-config and SageMaker endpoint,
+
+config = {
+'Model': model_configuration,
+
+'EndpointConfig': endpoint_config_configuration,
+
+'Endpoint': endpoint_configuration
+}
+
+For details of the configuration parameter of model_configuration, See:
+
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/sagemaker.html#SageMaker.Client.create_model
+
+For details of the configuration parameter of 
endpoint_config_configuration, See:
+
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/sagemaker.html#SageMaker.Client.create_endpoint_config
+
+For details of the configuration parameter of endpoint_configuration, 
See:
+
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/sagemaker.html#SageMaker.Client.create_endpoint
+:type config: dict
+:param aws_conn_id: The AWS connection ID to use.
+:type aws_conn_id: str
+:param wait_for_completion: Whether the operator should wait until the 
endpoint creation finishes.
+:type wait_for_completion: bool
+:param check_interval: If wait is set to True, this is the time interval, 
in seconds, that this operation waits
+before polling the status of the endpoint creation.
+:type check_interval: int
+:param max_ingestion_time: If wait is set to True, this operation fails if 
the endpoint creation doesn't finish
+within max_ingestion_time seconds. If you set this parameter to None 
it never times out.
+:type max_ingestion_time: int
+:param operation: Whether to create an endpoint or update an endpoint. 
Must be either 'create' or 'update'.
+:type operation: str
+"""  # noqa
+
+@apply_defaults
+def __init__(self,
+ config,
+ wait_for_completion=True,
+ check_interval=30,
+ max_ingestion_time=None,
+ operation='create',
+ *args, **kwargs):
+super(SageMakerEndpointOperator, self).__init__(config=config,
+*args, **kwargs)
+
+self.config = config
+self.wait_for_completion = wait_for_completion
+self.check_interval = check_interval
+self.max_ingestion_time = max_ingestion_time
+self.operation = operation.lower()
+self.create_integer_fields()
+
+def create_integer_fields(self):
+if 'EndpointConfig' in self.config:
+self.integer_fields = [
+['EndpointConfig', 'ProductionVariants', 
'InitialInstanceCount']
+]
+
+def expand_role(self):
+if 'Model' not in self.config:
+return
+hook = AwsHook(self.aws_conn_id)
+config = self.config['Model']
+if 'ExecutionRoleArn' in config:
+config['ExecutionRoleArn'] = 

[GitHub] ashb commented on a change in pull request #4126: [AIRFLOW-2524] More AWS SageMaker operators, sensors for model, endpoint-config and endpoint

2018-11-02 Thread GitBox
ashb commented on a change in pull request #4126: [AIRFLOW-2524] More AWS 
SageMaker operators, sensors for model, endpoint-config and endpoint
URL: https://github.com/apache/incubator-airflow/pull/4126#discussion_r230427126
 
 

 ##
 File path: airflow/contrib/operators/sagemaker_model_operator.py
 ##
 @@ -0,0 +1,67 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from airflow.contrib.hooks.aws_hook import AwsHook
+from airflow.contrib.operators.sagemaker_base_operator import 
SageMakerBaseOperator
+from airflow.utils.decorators import apply_defaults
+from airflow.exceptions import AirflowException
+
+
+class SageMakerModelOperator(SageMakerBaseOperator):
+
+"""
+Create a SageMaker model
+This operator returns The ARN of the model created in Amazon SageMaker
 
 Review comment:
   Blank line needed
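
   For reference, a minimal sketch (illustrative only, base class elided) of the docstring layout being asked for: a one-line summary followed by a blank line.

```python
class SageMakerModelOperator(object):  # base class elided for illustration
    """
    Create a SageMaker model.

    This operator returns the ARN of the model created in Amazon SageMaker.
    """
```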


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-3270) Apache airflow 1.10.0 integration with LDAP anonmyously

2018-11-02 Thread Ash Berlin-Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16673311#comment-16673311
 ] 

Ash Berlin-Taylor commented on AIRFLOW-3270:


Well, your stack trace lines are for 1.8, so this might not be true anymore, but 
could you try making this change in airflow/contrib/auth/backends/ldap_auth.py, 
in the {{get_ldap_connection}} function:

{code}
diff --git a/airflow/contrib/auth/backends/ldap_auth.py 
b/airflow/contrib/auth/backends/ldap_auth.py
index 13b49f90..42ad7026 100644
--- a/airflow/contrib/auth/backends/ldap_auth.py
+++ b/airflow/contrib/auth/backends/ldap_auth.py
@@ -51,6 +51,13 @@ class LdapException(Exception):
 
 
 def get_ldap_connection(dn=None, password=None):
+# When coming from config we can't set None; the best we can do is set it
+# to an empty string
+if dn == "":
+dn = None
+if password == "":
+password = None
+
 tls_configuration = None
 use_ssl = False
 try:
{code}
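
A self-contained sketch of the same normalisation (hypothetical host and helper name, not the backend code): ldap3 performs an anonymous simple bind when user and password are None, while blank airflow.cfg options can only arrive as empty strings.

{code:python}
from ldap3 import Server, Connection

def get_anonymous_connection(dn, password):
    # Blank airflow.cfg options come through as "", so normalise them to None
    # before handing them to ldap3.
    dn = dn or None
    password = password or None
    server = Server("ldap.example.com", port=389)  # hypothetical host
    # With user and password both None this is an anonymous simple bind.
    return Connection(server, user=dn, password=password)

conn = get_anonymous_connection("", "")
{code}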

> Apache airflow 1.10.0 integration with LDAP anonmyously
> ---
>
> Key: AIRFLOW-3270
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3270
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: authentication
>Affects Versions: 1.10.0
>Reporter: Hari Krishna ADDEPALLI LN
>Priority: Blocker
>
> Please advise what to include in airflow.cfg when going to integrate with 
> LDAP anonymously ? We are using DS389 as LDAP server vendor name. 
>  
> {noformat}
> [webserver] 
> authenticate = True 
> auth_backend = airflow.contrib.auth.backends.ldap_auth  
> {noformat}
>  
> And 
>  
> {noformat}
> [ldap] 
> uri = ldap://nsp-daf178e8.ad1.prd.us-phx.odc.im:389 
> user_filter = memberOf=cn=rvs-all-prd_usphx,ou=groups,dc=odc,dc=im
> user_name_attr = uid 
> group_member_attr =
> superuser_filter = memberOf=cn=rvd-sudo_all-prd_usphx,ou=groups,dc=odc,dc=im 
> data_profiler_filter = 
> bind_user = 
> bind_password = 
> basedn = ou=people,dc=odc,dc=im 
> cacert = /opt/orchestration/airflow/ldap_ca.crt 
> search_scope = LEVEL
> {noformat}
> I am hitting below exception:
> {noformat}
>   File "/usr/local/lib/python3.5/site-packages/ldap3/operation/search.py", 
> line 215, in parse_filter     
> raise LDAPInvalidFilterError('malformed filter') 
> ldap3.core.exceptions.LDAPInvalidFilterError: malformed filter
> {noformat}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-3270) Apache airflow 1.10.0 integration with LDAP anonmyously

2018-11-02 Thread Hari Krishna ADDEPALLI LN (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16673302#comment-16673302
 ] 

Hari Krishna ADDEPALLI LN commented on AIRFLOW-3270:


[~ashb]:

We also tried using the below; the same issue persists. Please advise on a resolution.
user_filter = objectClass=*

> Apache airflow 1.10.0 integration with LDAP anonmyously
> ---
>
> Key: AIRFLOW-3270
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3270
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: authentication
>Affects Versions: 1.10.0
>Reporter: Hari Krishna ADDEPALLI LN
>Priority: Blocker
>
> Please advise what to include in airflow.cfg when going to integrate with 
> LDAP anonymously ? We are using DS389 as LDAP server vendor name. 
>  
> {noformat}
> [webserver] 
> authenticate = True 
> auth_backend = airflow.contrib.auth.backends.ldap_auth  
> {noformat}
>  
> And 
>  
> {noformat}
> [ldap] 
> uri = ldap://nsp-daf178e8.ad1.prd.us-phx.odc.im:389 
> user_filter = memberOf=cn=rvs-all-prd_usphx,ou=groups,dc=odc,dc=im
> user_name_attr = uid 
> group_member_attr =
> superuser_filter = memberOf=cn=rvd-sudo_all-prd_usphx,ou=groups,dc=odc,dc=im 
> data_profiler_filter = 
> bind_user = 
> bind_password = 
> basedn = ou=people,dc=odc,dc=im 
> cacert = /opt/orchestration/airflow/ldap_ca.crt 
> search_scope = LEVEL
> {noformat}
> I am hitting below exception:
> {noformat}
>   File "/usr/local/lib/python3.5/site-packages/ldap3/operation/search.py", 
> line 215, in parse_filter     
> raise LDAPInvalidFilterError('malformed filter') 
> ldap3.core.exceptions.LDAPInvalidFilterError: malformed filter
> {noformat}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (AIRFLOW-3270) Apache airflow 1.10.0 integration with LDAP anonmyously

2018-11-02 Thread Ash Berlin-Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16673287#comment-16673287
 ] 

Ash Berlin-Taylor edited comment on AIRFLOW-3270 at 11/2/18 3:44 PM:
-

[~ashb]: the '\{{ = }}' at the end of the line is a copy/paste issue on this 
JIRA. Below is the correct one without formatting, and the full exception 
stack is included.

 
{code}
[ldap]

uri = ldap://nsp-daf178e8.ad1.prd.us-phx.odc.im:389

user_filter = memberOf=cn=rvs-all-prd_usphx,ou=groups,dc=odc,dc=im

user_name_attr = uid

group_member_attr =

superuser_filter = memberOf=cn=rvd-sudo_all-prd_usphx,ou=groups,dc=odc,dc=im

data_profiler_filter =

bind_user =

bind_password =

basedn = ou=people,dc=odc,dc=im

cacert = /opt/orchestration/airflow/ldap_ca.crt

search_scope = LEVEL
{code}

 ===

===

{code}
[2018-10-30 04:01:04,520] ERROR in app: Exception on /admin/airflow/login [POST]
Traceback (most recent call last):
File "/usr/local/lib/python3.5/site-packages/flask/app.py", line 1988, in 
wsgi_app
response = self.full_dispatch_request()
File "/usr/local/lib/python3.5/site-packages/flask/app.py", line 1641, in 
full_dispatch_request
rv = self.handle_user_exception(e)
File "/usr/local/lib/python3.5/site-packages/flask/app.py", line 1544, in 
handle_user_exception
reraise(exc_type, exc_value, tb)
File "/usr/local/lib/python3.5/site-packages/flask/_compat.py", line 33, in 
reraise
raise value
File "/usr/local/lib/python3.5/site-packages/flask/app.py", line 1639, in 
full_dispatch_request
rv = self.dispatch_request()
File "/usr/local/lib/python3.5/site-packages/flask/app.py", line 1625, in 
dispatch_request
return self.view_functions[rule.endpoint](**req.view_args)
File "/usr/local/lib/python3.5/site-packages/flask_admin/base.py", line 69, in 
inner
return self._run_view(f, *args, **kwargs)
File "/usr/local/lib/python3.5/site-packages/flask_admin/base.py", line 368, in 
_run_view
return fn(self, *args, **kwargs)
File "/usr/local/lib/python3.5/site-packages/airflow/www/views.py", line 650, 
in login
return airflow.login.login(self, request)
File 
"/usr/local/lib/python3.5/site-packages/airflow/contrib/auth/backends/ldap_auth.py",
 line 268, in login
LdapUser.try_login(username, password)
File 
"/usr/local/lib/python3.5/site-packages/airflow/contrib/auth/backends/ldap_auth.py",
 line 180, in try_login
search_scope=native(search_scope))
File "/usr/local/lib/python3.5/site-packages/ldap3/core/connection.py", line 
779, in search
[2018-10-30 04:01:04,520] [72] \{app.py:1587} ERROR - Exception on 
/admin/airflow/login [POST]
Traceback (most recent call last):
File "/usr/local/lib/python3.5/site-packages/flask/app.py", line 1988, in 
wsgi_app
response = self.full_dispatch_request()
File "/usr/local/lib/python3.5/site-packages/flask/app.py", line 1641, in 
full_dispatch_request
rv = self.handle_user_exception(e)
File "/usr/local/lib/python3.5/site-packages/flask/app.py", line 1544, in 
handle_user_exception
reraise(exc_type, exc_value, tb)
File "/usr/local/lib/python3.5/site-packages/flask/_compat.py", line 33, in 
reraise
raise value
File "/usr/local/lib/python3.5/site-packages/flask/app.py", line 1639, in 
full_dispatch_request
rv = self.dispatch_request()
File "/usr/local/lib/python3.5/site-packages/flask/app.py", line 1625, in 
dispatch_request
return self.view_functions[rule.endpoint](**req.view_args)
File "/usr/local/lib/python3.5/site-packages/flask_admin/base.py", line 69, in 
inner
return self._run_view(f, *args, **kwargs)
File "/usr/local/lib/python3.5/site-packages/flask_admin/base.py", line 368, in 
_run_view
return fn(self, *args, **kwargs)
File "/usr/local/lib/python3.5/site-packages/airflow/www/views.py", line 650, 
in login
return airflow.login.login(self, request)
File 
"/usr/local/lib/python3.5/site-packages/airflow/contrib/auth/backends/ldap_auth.py",
 line 268, in login
LdapUser.try_login(username, password)
File 
"/usr/local/lib/python3.5/site-packages/airflow/contrib/auth/backends/ldap_auth.py",
 line 180, in try_login
search_scope=native(search_scope))
File "/usr/local/lib/python3.5/site-packages/ldap3/core/connection.py", line 
779, in search
check_names=self.check_names)
File "/usr/local/lib/python3.5/site-packages/ldap3/operation/search.py", line 
372, in search_operation
request['filter'] = compile_filter(parse_filter(search_filter, schema, 
auto_escape, auto_encode, validator, check_names).elements[0]) # parse the 
searchFilter string and compile it starting from the root node
File "/usr/local/lib/python3.5/site-packages/ldap3/operation/search.py", line 
215, in parse_filter
raise LDAPInvalidFilterError('malformed filter')
ldap3.core.exceptions.LDAPInvalidFilterError: 

[jira] [Comment Edited] (AIRFLOW-3270) Apache airflow 1.10.0 integration with LDAP anonmyously

2018-11-02 Thread Hari Krishna ADDEPALLI LN (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16673287#comment-16673287
 ] 

Hari Krishna ADDEPALLI LN edited comment on AIRFLOW-3270 at 11/2/18 3:42 PM:
-

[~ashb]: the '\{{ = }}' at the end of the line is a copy/paste issue on this 
JIRA. Below is the correct one without formatting, and the full exception 
stack is included.

 

[ldap]

uri = ldap://nsp-daf178e8.ad1.prd.us-phx.odc.im:389

user_filter = memberOf=cn=rvs-all-prd_usphx,ou=groups,dc=odc,dc=im

user_name_attr = uid

group_member_attr =

superuser_filter = memberOf=cn=rvd-sudo_all-prd_usphx,ou=groups,dc=odc,dc=im

data_profiler_filter =

bind_user =

bind_password =

basedn = ou=people,dc=odc,dc=im

cacert = /opt/orchestration/airflow/ldap_ca.crt

search_scope = LEVEL

 ===

===

[2018-10-30 04:01:04,520] ERROR in app: Exception on /admin/airflow/login [POST]
Traceback (most recent call last):
File "/usr/local/lib/python3.5/site-packages/flask/app.py", line 1988, in 
wsgi_app
response = self.full_dispatch_request()
File "/usr/local/lib/python3.5/site-packages/flask/app.py", line 1641, in 
full_dispatch_request
rv = self.handle_user_exception(e)
File "/usr/local/lib/python3.5/site-packages/flask/app.py", line 1544, in 
handle_user_exception
reraise(exc_type, exc_value, tb)
File "/usr/local/lib/python3.5/site-packages/flask/_compat.py", line 33, in 
reraise
raise value
File "/usr/local/lib/python3.5/site-packages/flask/app.py", line 1639, in 
full_dispatch_request
rv = self.dispatch_request()
File "/usr/local/lib/python3.5/site-packages/flask/app.py", line 1625, in 
dispatch_request
return self.view_functions[rule.endpoint](**req.view_args)
File "/usr/local/lib/python3.5/site-packages/flask_admin/base.py", line 69, in 
inner
return self._run_view(f, *args, **kwargs)
File "/usr/local/lib/python3.5/site-packages/flask_admin/base.py", line 368, in 
_run_view
return fn(self, *args, **kwargs)
File "/usr/local/lib/python3.5/site-packages/airflow/www/views.py", line 650, 
in login
return airflow.login.login(self, request)
File 
"/usr/local/lib/python3.5/site-packages/airflow/contrib/auth/backends/ldap_auth.py",
 line 268, in login
LdapUser.try_login(username, password)
File 
"/usr/local/lib/python3.5/site-packages/airflow/contrib/auth/backends/ldap_auth.py",
 line 180, in try_login
search_scope=native(search_scope))
File "/usr/local/lib/python3.5/site-packages/ldap3/core/connection.py", line 
779, in search
[2018-10-30 04:01:04,520] [72] \{app.py:1587} ERROR - Exception on 
/admin/airflow/login [POST]
Traceback (most recent call last):
File "/usr/local/lib/python3.5/site-packages/flask/app.py", line 1988, in 
wsgi_app
response = self.full_dispatch_request()
File "/usr/local/lib/python3.5/site-packages/flask/app.py", line 1641, in 
full_dispatch_request
rv = self.handle_user_exception(e)
File "/usr/local/lib/python3.5/site-packages/flask/app.py", line 1544, in 
handle_user_exception
reraise(exc_type, exc_value, tb)
File "/usr/local/lib/python3.5/site-packages/flask/_compat.py", line 33, in 
reraise
raise value
File "/usr/local/lib/python3.5/site-packages/flask/app.py", line 1639, in 
full_dispatch_request
rv = self.dispatch_request()
File "/usr/local/lib/python3.5/site-packages/flask/app.py", line 1625, in 
dispatch_request
return self.view_functions[rule.endpoint](**req.view_args)
File "/usr/local/lib/python3.5/site-packages/flask_admin/base.py", line 69, in 
inner
return self._run_view(f, *args, **kwargs)
File "/usr/local/lib/python3.5/site-packages/flask_admin/base.py", line 368, in 
_run_view
return fn(self, *args, **kwargs)
File "/usr/local/lib/python3.5/site-packages/airflow/www/views.py", line 650, 
in login
return airflow.login.login(self, request)
File 
"/usr/local/lib/python3.5/site-packages/airflow/contrib/auth/backends/ldap_auth.py",
 line 268, in login
LdapUser.try_login(username, password)
File 
"/usr/local/lib/python3.5/site-packages/airflow/contrib/auth/backends/ldap_auth.py",
 line 180, in try_login
search_scope=native(search_scope))
File "/usr/local/lib/python3.5/site-packages/ldap3/core/connection.py", line 
779, in search
check_names=self.check_names)
File "/usr/local/lib/python3.5/site-packages/ldap3/operation/search.py", line 
372, in search_operation
request['filter'] = compile_filter(parse_filter(search_filter, schema, 
auto_escape, auto_encode, validator, check_names).elements[0]) # parse the 
searchFilter string and compile it starting from the root node
File "/usr/local/lib/python3.5/site-packages/ldap3/operation/search.py", line 
215, in parse_filter
raise LDAPInvalidFilterError('malformed filter')
ldap3.core.exceptions.LDAPInvalidFilterError: malformed 

[jira] [Commented] (AIRFLOW-3292) `delete_dag` endpoint and cli commands don't delete on exact dag_id matching

2018-11-02 Thread Ash Berlin-Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16673289#comment-16673289
 ] 

Ash Berlin-Taylor commented on AIRFLOW-3292:


{{.}} in a dag id is reserved for subdags, so deleting {{schema}} also deletes 
its subdags. If {{schema.table1}} is _NOT_ a sub-dag then we should probably 
add better validation of the dag id, to only allow dots in dag ids when they 
are used for subdags.

Does that help?
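
For illustration, a sketch (not the actual {{delete_dag}} code) of why the dot behaves this way: subdag ids are matched by a dot-prefix pattern in addition to the exact dag_id.

{code:python}
dag_id = "schema"
existing = ["schema", "schema.table1", "schema.table2", "schema_replace"]

# Exact match plus a "<dag_id>."-style prefix match for subdags.
deleted = [d for d in existing if d == dag_id or d.startswith(dag_id + ".")]
print(deleted)  # ['schema', 'schema.table1', 'schema.table2'] -- 'schema_replace' is untouched
{code}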

> `delete_dag` endpoint and cli commands don't delete on exact dag_id matching
> 
>
> Key: AIRFLOW-3292
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3292
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: api, cli
>Affects Versions: 1.10.0
>Reporter: Teresa Martyny
>Priority: Major
>
> If you have the following dag ids: `schema`, `schema.table1`, 
> `schema.table2`, `schema_replace`
> When you hit the delete_dag endpoint with the dag id: `schema`, it will 
> delete `schema`, `schema.table1`, and `schema.table2`, not just `schema`. 
> Underscores are fine so it doesn't delete `schema_replace`, but periods are 
> not.
> If this is expected behavior, clarifying that functionality in the docs would 
> be great, and then I can submit a feature request for the ability to use 
> regex for exact matching with this command and endpoint.
> Thanks!! 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-3270) Apache airflow 1.10.0 integration with LDAP anonmyously

2018-11-02 Thread Hari Krishna ADDEPALLI LN (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16673287#comment-16673287
 ] 

Hari Krishna ADDEPALLI LN commented on AIRFLOW-3270:


[~ashb]: the '\{{ = }}' at the end of the line is a copy/paste issue on this 
JIRA. This is the correct one without formatting, and the full exception 
stack is included.

 

[ldap]

uri = ldap://nsp-daf178e8.ad1.prd.us-phx.odc.im:389

user_filter = memberOf=cn=rvs-all-prd_usphx,ou=groups,dc=odc,dc=im

user_name_attr = uid

group_member_attr =

superuser_filter = memberOf=cn=rvd-sudo_all-prd_usphx,ou=groups,dc=odc,dc=im

data_profiler_filter =

bind_user =

bind_password =

basedn = ou=people,dc=odc,dc=im

cacert = /opt/orchestration/airflow/ldap_ca.crt

search_scope = LEVEL

 
{noformat}
[2018-10-30 04:01:04,520] ERROR in app: Exception on /admin/airflow/login 
[POST] Traceback (most recent call last): File 
"/usr/local/lib/python3.5/site-packages/flask/app.py", line 1988, in wsgi_app 
response = self.full_dispatch_request() File 
"/usr/local/lib/python3.5/site-packages/flask/app.py", line 1641, in 
full_dispatch_request rv = self.handle_user_exception(e) File 
"/usr/local/lib/python3.5/site-packages/flask/app.py", line 1544, in 
handle_user_exception reraise(exc_type, exc_value, tb) File 
"/usr/local/lib/python3.5/site-packages/flask/_compat.py", line 33, in reraise 
raise value File "/usr/local/lib/python3.5/site-packages/flask/app.py", line 
1639, in full_dispatch_request rv = self.dispatch_request() File 
"/usr/local/lib/python3.5/site-packages/flask/app.py", line 1625, in 
dispatch_request return self.view_functions[rule.endpoint](**req.view_args) File 
"/usr/local/lib/python3.5/site-packages/flask_admin/base.py", line 69, in inner 
return self._run_view(f, *args, **kwargs) File 
"/usr/local/lib/python3.5/site-packages/flask_admin/base.py", line 368, in 
_run_view return fn(self, *args, **kwargs) File 
"/usr/local/lib/python3.5/site-packages/airflow/www/views.py", line 650, in 
login return airflow.login.login(self, request) File 
"/usr/local/lib/python3.5/site-packages/airflow/contrib/auth/backends/ldap_auth.py",
 line 268, in login LdapUser.try_login(username, password) File 
"/usr/local/lib/python3.5/site-packages/airflow/contrib/auth/backends/ldap_auth.py",
 line 180, in try_login search_scope=native(search_scope)) File 
"/usr/local/lib/python3.5/site-packages/ldap3/core/connection.py", line 779, in 
search [2018-10-30 04:01:04,520] [72] {app.py:1587} ERROR - Exception on 
/admin/airflow/login [POST] Traceback (most recent call last): File 
"/usr/local/lib/python3.5/site-packages/flask/app.py", line 1988, in wsgi_app 
response = self.full_dispatch_request() File 
"/usr/local/lib/python3.5/site-packages/flask/app.py", line 1641, in 
full_dispatch_request rv = self.handle_user_exception(e) File 
"/usr/local/lib/python3.5/site-packages/flask/app.py", line 1544, in 
handle_user_exception reraise(exc_type, exc_value, tb) File 
"/usr/local/lib/python3.5/site-packages/flask/_compat.py", line 33, in reraise 
raise value File "/usr/local/lib/python3.5/site-packages/flask/app.py", line 
1639, in full_dispatch_request rv = self.dispatch_request() File 
"/usr/local/lib/python3.5/site-packages/flask/app.py", line 1625, in 
dispatch_request return self.view_functions[rule.endpoint](**req.view_args) File 
"/usr/local/lib/python3.5/site-packages/flask_admin/base.py", line 69, in inner 
return self._run_view(f, *args, **kwargs) File 
"/usr/local/lib/python3.5/site-packages/flask_admin/base.py", line 368, in 
_run_view return fn(self, *args, **kwargs) File 
"/usr/local/lib/python3.5/site-packages/airflow/www/views.py", line 650, in 
login return airflow.login.login(self, request) File 
"/usr/local/lib/python3.5/site-packages/airflow/contrib/auth/backends/ldap_auth.py",
 line 268, in login LdapUser.try_login(username, password) File 
"/usr/local/lib/python3.5/site-packages/airflow/contrib/auth/backends/ldap_auth.py",
 line 180, in try_login search_scope=native(search_scope)) File 
"/usr/local/lib/python3.5/site-packages/ldap3/core/connection.py", line 779, in 
search check_names=self.check_names) File 
"/usr/local/lib/python3.5/site-packages/ldap3/operation/search.py", line 372, 
in search_operation request['filter'] = 
compile_filter(parse_filter(search_filter, schema, auto_escape, auto_encode, 
validator, check_names).elements[0]) # parse the searchFilter string and 
compile it starting from the root node File 
"/usr/local/lib/python3.5/site-packages/ldap3/operation/search.py", line 215, 
in parse_filter raise LDAPInvalidFilterError('malformed filter') 
ldap3.core.exceptions.LDAPInvalidFilterError: malformed filter 
check_names=self.check_names) File 
"/usr/local/lib/python3.5/site-packages/ldap3/operation/search.py", line 372, 
in search_operation request['filter'] = 
compile_filter(parse_filter(search_filter, schema, auto_escape, auto_encode, 
validator, check_names).elements[0]) # 

[jira] [Created] (AIRFLOW-3292) `delete_dag` endpoint and cli commands don't delete on exact dag_id matching

2018-11-02 Thread TERESA M MARTYNY (JIRA)
TERESA M MARTYNY created AIRFLOW-3292:
-

 Summary: `delete_dag` endpoint and cli commands don't delete on 
exact dag_id matching
 Key: AIRFLOW-3292
 URL: https://issues.apache.org/jira/browse/AIRFLOW-3292
 Project: Apache Airflow
  Issue Type: Bug
  Components: api, cli
Affects Versions: 1.10.0
Reporter: TERESA M MARTYNY


If you have the following dag ids: `schema`, `schema.table1`, `schema.table2`, 
`schema_replace`

When you hit the delete_dag endpoint with the dag id: `schema`, it will delete 
`schema`, `schema.table1`, and `schema.table2`, not just `schema`. Underscores 
are fine so it doesn't delete `schema_replace`, but periods are not.

If this is expected behavior, clarifying that functionality in the docs would 
be great, and then I can submit a feature request for the ability to use regex 
for exact matching with this command and endpoint.

Thanks!! 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] codecov-io edited a comment on issue #4111: [AIRFLOW-3266] Add AWS Athena Operator and hook

2018-11-02 Thread GitBox
codecov-io edited a comment on issue #4111: [AIRFLOW-3266] Add AWS Athena 
Operator and hook
URL: 
https://github.com/apache/incubator-airflow/pull/4111#issuecomment-433705746
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4111?src=pr=h1)
 Report
   > Merging 
[#4111](https://codecov.io/gh/apache/incubator-airflow/pull/4111?src=pr=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/92cb5c74d808c5d734134cf267a1de55ce333db4?src=pr=desc)
 will **not change** coverage.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/4111/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/4111?src=pr=tree)
   
   ```diff
   @@           Coverage Diff           @@
   ##           master    #4111   +/-   ##
   =======================================
     Coverage   76.67%   76.67%
   =======================================
     Files         199      199
     Lines       16212    16212
   =======================================
     Hits        12430    12430
     Misses       3782     3782
   ```
   
   
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4111?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4111?src=pr=footer).
 Last update 
[92cb5c7...bff5c52](https://codecov.io/gh/apache/incubator-airflow/pull/4111?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-1753) Can't install on windows 10

2018-11-02 Thread Tony Mao (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-1753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16673208#comment-16673208
 ] 

Tony Mao commented on AIRFLOW-1753:
---

Actually, even if the installation completes, running `airflow initdb` will 
fail as well.

> Can't install on windows 10
> ---
>
> Key: AIRFLOW-1753
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1753
> Project: Apache Airflow
>  Issue Type: Bug
>Affects Versions: 1.8.0
>Reporter: Lakshman Udayakantha
>Priority: Major
>
> When I installed airflow using the "pip install airflow" command, two errors 
> popped up.
> 1.  link.exe failed with exit status 1158
> 2.\x86_amd64\\cl.exe' failed with exit status 2
> The first issue can be solved by referring to 
> https://stackoverflow.com/questions/43858836/python-installing-clarifai-vs14-0-link-exe-failed-with-exit-status-1158/44563421#44563421.
> But the second issue is still there; there was no solution by googling either. 
> How can that issue be prevented so airflow can be installed on Windows 10 x64?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] phani8996 commented on a change in pull request #4111: [AIRFLOW-3266] Add AWS Athena Operator and hook

2018-11-02 Thread GitBox
phani8996 commented on a change in pull request #4111: [AIRFLOW-3266] Add AWS 
Athena Operator and hook
URL: https://github.com/apache/incubator-airflow/pull/4111#discussion_r230402321
 
 

 ##
 File path: tests/contrib/operators/test_aws_athena_operator.py
 ##
 @@ -48,42 +49,62 @@
 }
 
 
+class Iterator(object):
+
+def __init__(self, tuple):
+self.tuple = tuple
+
+def __iter__(self):
+return self
+
+def __next__(self):
+return self.tuple[randrange(len(self.tuple))]
+
+
 class TestAWSAthenaOperator(unittest.TestCase):
 
 def setUp(self):
 configuration.load_test_config()
 
 self.athena = AWSAthenaOperator(task_id='test_aws_athena_operator', 
query='SELECT * FROM TEST_TABLE',
 database='TEST_DATABASE', 
output_location='s3://test_s3_bucket/',
-
client_request_token='eac427d0-1c6d-4dfb-96aa-2835d3ac6595')
+
client_request_token='eac427d0-1c6d-4dfb-96aa-2835d3ac6595',
+sleep_time=1)
 
 def test_init(self):
 self.assertEqual(self.athena.task_id, MOCK_DATA['task_id'])
 self.assertEqual(self.athena.query, MOCK_DATA['query'])
 self.assertEqual(self.athena.database, MOCK_DATA['database'])
 self.assertEqual(self.athena.aws_conn_id, 'aws_default')
 self.assertEqual(self.athena.client_request_token, 
MOCK_DATA['client_request_token'])
+self.assertEqual(self.athena.sleep_time, 1)
+
+@mock.patch.object(AWSAthenaHook, 'check_query_status', 
side_effect=Iterator(("RUNNING", "SUCCESS",)))
 
 Review comment:
   I will add multiple test cases for short and long-running queries. 
Explaining this whole logic as a comment would be a little odd :P


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] ashb commented on a change in pull request #4111: [AIRFLOW-3266] Add AWS Athena Operator and hook

2018-11-02 Thread GitBox
ashb commented on a change in pull request #4111: [AIRFLOW-3266] Add AWS Athena 
Operator and hook
URL: https://github.com/apache/incubator-airflow/pull/4111#discussion_r230401305
 
 

 ##
 File path: tests/contrib/operators/test_aws_athena_operator.py
 ##
 @@ -48,42 +49,62 @@
 }
 
 
+class Iterator(object):
+
+def __init__(self, tuple):
+self.tuple = tuple
+
+def __iter__(self):
+return self
+
+def __next__(self):
+return self.tuple[randrange(len(self.tuple))]
+
+
 class TestAWSAthenaOperator(unittest.TestCase):
 
 def setUp(self):
 configuration.load_test_config()
 
 self.athena = AWSAthenaOperator(task_id='test_aws_athena_operator', 
query='SELECT * FROM TEST_TABLE',
 database='TEST_DATABASE', 
output_location='s3://test_s3_bucket/',
-
client_request_token='eac427d0-1c6d-4dfb-96aa-2835d3ac6595')
+
client_request_token='eac427d0-1c6d-4dfb-96aa-2835d3ac6595',
+sleep_time=1)
 
 def test_init(self):
 self.assertEqual(self.athena.task_id, MOCK_DATA['task_id'])
 self.assertEqual(self.athena.query, MOCK_DATA['query'])
 self.assertEqual(self.athena.database, MOCK_DATA['database'])
 self.assertEqual(self.athena.aws_conn_id, 'aws_default')
 self.assertEqual(self.athena.client_request_token, 
MOCK_DATA['client_request_token'])
+self.assertEqual(self.athena.sleep_time, 1)
+
+@mock.patch.object(AWSAthenaHook, 'check_query_status', 
side_effect=Iterator(("RUNNING", "SUCCESS",)))
 
 Review comment:
   How about two checks/calls: one that has `side_effect=("RUNNING", "RUNNING", 
"RUNNING", "SUCCESS",)` and a second that does `side_effect=("SUCCESS",)`?
   
   Or at the very least, capture your reasoning in a comment in the code :)
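
   A self-contained sketch of why the list/tuple form is deterministic: with an iterable `side_effect`, each call to the mock returns the next value in order, so the test knows exactly how many poll cycles happen before `SUCCESS`.

```python
from unittest import mock

# One queued status per call: three polls return RUNNING, the fourth SUCCESS.
check_status = mock.Mock(side_effect=("RUNNING", "RUNNING", "RUNNING", "SUCCESS"))

polls = 0
while check_status() != "SUCCESS":
    polls += 1

print(polls)  # always 3 -- no randomness in how long the fake query "runs"
```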


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] phani8996 commented on a change in pull request #4111: [AIRFLOW-3266] Add AWS Athena Operator and hook

2018-11-02 Thread GitBox
phani8996 commented on a change in pull request #4111: [AIRFLOW-3266] Add AWS 
Athena Operator and hook
URL: https://github.com/apache/incubator-airflow/pull/4111#discussion_r230399221
 
 

 ##
 File path: tests/contrib/operators/test_aws_athena_operator.py
 ##
 @@ -48,42 +49,62 @@
 }
 
 
+class Iterator(object):
+
+def __init__(self, tuple):
+self.tuple = tuple
+
+def __iter__(self):
+return self
+
+def __next__(self):
+return self.tuple[randrange(len(self.tuple))]
+
+
 class TestAWSAthenaOperator(unittest.TestCase):
 
 def setUp(self):
 configuration.load_test_config()
 
 self.athena = AWSAthenaOperator(task_id='test_aws_athena_operator', 
query='SELECT * FROM TEST_TABLE',
 database='TEST_DATABASE', 
output_location='s3://test_s3_bucket/',
-
client_request_token='eac427d0-1c6d-4dfb-96aa-2835d3ac6595')
+
client_request_token='eac427d0-1c6d-4dfb-96aa-2835d3ac6595',
+sleep_time=1)
 
 def test_init(self):
 self.assertEqual(self.athena.task_id, MOCK_DATA['task_id'])
 self.assertEqual(self.athena.query, MOCK_DATA['query'])
 self.assertEqual(self.athena.database, MOCK_DATA['database'])
 self.assertEqual(self.athena.aws_conn_id, 'aws_default')
 self.assertEqual(self.athena.client_request_token, 
MOCK_DATA['client_request_token'])
+self.assertEqual(self.athena.sleep_time, 1)
+
+@mock.patch.object(AWSAthenaHook, 'check_query_status', 
side_effect=Iterator(("RUNNING", "SUCCESS",)))
 
 Review comment:
   At first I thought of having `side_effect=("RUNNING", "SUCCESS",)`, but 
later I realised that in a real query the task won't reach the `SUCCESS` state 
in a single sleep cycle. Randomly picking a state sounds a little odd, but for 
our case it serves the purpose: we will get multiple `RUNNING` states, and 
after a few sleep cycles we will encounter a `SUCCESS` state and 
`poll_query_results` will return the final state. To make this more 
understandable we could add a bias towards the `RUNNING` state to make it look 
realistic. Returning `SUCCESS` on the first check represents a fast query, and 
`SUCCESS` after a few checks represents a long-running query, which we cannot 
achieve with a plain `side_effect=("RUNNING", "SUCCESS",)`.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] ashb commented on a change in pull request #4111: [AIRFLOW-3266] Add AWS Athena Operator and hook

2018-11-02 Thread GitBox
ashb commented on a change in pull request #4111: [AIRFLOW-3266] Add AWS Athena 
Operator and hook
URL: https://github.com/apache/incubator-airflow/pull/4111#discussion_r230394621
 
 

 ##
 File path: tests/contrib/operators/test_aws_athena_operator.py
 ##
 @@ -48,42 +49,62 @@
 }
 
 
+class Iterator(object):
+
+def __init__(self, tuple):
+self.tuple = tuple
+
+def __iter__(self):
+return self
+
+def __next__(self):
+return self.tuple[randrange(len(self.tuple))]
+
+
 class TestAWSAthenaOperator(unittest.TestCase):
 
 def setUp(self):
 configuration.load_test_config()
 
 self.athena = AWSAthenaOperator(task_id='test_aws_athena_operator', 
query='SELECT * FROM TEST_TABLE',
 database='TEST_DATABASE', 
output_location='s3://test_s3_bucket/',
-
client_request_token='eac427d0-1c6d-4dfb-96aa-2835d3ac6595')
+
client_request_token='eac427d0-1c6d-4dfb-96aa-2835d3ac6595',
+sleep_time=1)
 
 def test_init(self):
 self.assertEqual(self.athena.task_id, MOCK_DATA['task_id'])
 self.assertEqual(self.athena.query, MOCK_DATA['query'])
 self.assertEqual(self.athena.database, MOCK_DATA['database'])
 self.assertEqual(self.athena.aws_conn_id, 'aws_default')
 self.assertEqual(self.athena.client_request_token, 
MOCK_DATA['client_request_token'])
+self.assertEqual(self.athena.sleep_time, 1)
+
+@mock.patch.object(AWSAthenaHook, 'check_query_status', 
side_effect=Iterator(("RUNNING", "SUCCESS",)))
 
 Review comment:
   Randomness in tests seems like a bad plan at first sight.
   
   Can you explain your reasoning for doing this, rather than, say, just 
`side_effect=["RUNNING", "SUCCESS"]`, which would return running the first 
time and success the second time? Right now with randomness we don't know how 
many times it might have to check the query, or am I missing something?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] ashb commented on a change in pull request #4111: [AIRFLOW-3266] Add AWS Athena Operator and hook

2018-11-02 Thread GitBox
ashb commented on a change in pull request #4111: [AIRFLOW-3266] Add AWS Athena 
Operator and hook
URL: https://github.com/apache/incubator-airflow/pull/4111#discussion_r230394035
 
 

 ##
 File path: airflow/contrib/operators/aws_athena_operator.py
 ##
 @@ -0,0 +1,97 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+#
+
+from airflow.models import BaseOperator
+from airflow.utils.decorators import apply_defaults
+from airflow.contrib.hooks.aws_athena_hook import AWSAthenaHook
+
+
+class AWSAthenaOperator(BaseOperator):
+"""
+Airflow operator to run presto queries on athena.
+
+:param query: Presto to be run on athena. (templated)
+:type query: str
+:param database: Database to select. (templated)
+:type database: str
+:param output_location: s3 path to write the query results into. 
(templated)
+:type output_location: str
+:param aws_conn_id: aws connection to use.
+:type aws_conn_id: str
+"""
+
+ui_color = '#44b5e2'
+template_fields = ('query', 'database', 'output_location')
+
+@apply_defaults
+def __init__(self, query, database, output_location, 
aws_conn_id='aws_default', *args, **kwargs):
+super(AWSAthenaOperator, self).__init__(*args, **kwargs)
+self.query = query
+self.database = database
+self.output_location = output_location
+self.aws_conn_id = aws_conn_id
+self.client_request_token = kwargs.get('client_request_token')
+self.query_execution_context = kwargs.get('query_execution_context') 
or {}
+self.result_configuration = kwargs.get('result_configuration') or {}
 
 Review comment:
   Importantly, they should not have a default of `{}` in the function signature - see 
https://docs.python-guide.org/writing/gotchas/
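   
   A small illustration of the gotcha behind that link (not code from this PR): a 
default `{}` is created once, at function definition time, and then shared between calls.

```python
def bad(result_configuration={}):
    # The same dict object is reused on every call made without an argument.
    result_configuration['calls'] = result_configuration.get('calls', 0) + 1
    return result_configuration

bad()  # {'calls': 1}
bad()  # {'calls': 2} - state leaked from the previous call

def good(result_configuration=None):
    # The usual fix: default to None and build a fresh dict inside the body.
    result_configuration = result_configuration or {}
    result_configuration['calls'] = result_configuration.get('calls', 0) + 1
    return result_configuration
```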


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] phani8996 commented on a change in pull request #4111: [AIRFLOW-3266] Add AWS Athena Operator and hook

2018-11-02 Thread GitBox
phani8996 commented on a change in pull request #4111: [AIRFLOW-3266] Add AWS 
Athena Operator and hook
URL: https://github.com/apache/incubator-airflow/pull/4111#discussion_r230393373
 
 

 ##
 File path: airflow/contrib/operators/aws_athena_operator.py
 ##
 @@ -0,0 +1,97 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+#
+
+from airflow.models import BaseOperator
+from airflow.utils.decorators import apply_defaults
+from airflow.contrib.hooks.aws_athena_hook import AWSAthenaHook
+
+
+class AWSAthenaOperator(BaseOperator):
+"""
+Airflow operator to run presto queries on athena.
+
+:param query: Presto to be run on athena. (templated)
+:type query: str
+:param database: Database to select. (templated)
+:type database: str
+:param output_location: s3 path to write the query results into. 
(templated)
+:type output_location: str
+:param aws_conn_id: aws connection to use.
+:type aws_conn_id: str
+"""
+
+ui_color = '#44b5e2'
+template_fields = ('query', 'database', 'output_location')
+
+@apply_defaults
+def __init__(self, query, database, output_location, 
aws_conn_id='aws_default', *args, **kwargs):
+super(AWSAthenaOperator, self).__init__(*args, **kwargs)
+self.query = query
+self.database = database
+self.output_location = output_location
+self.aws_conn_id = aws_conn_id
+self.client_request_token = kwargs.get('client_request_token')
+self.query_execution_context = kwargs.get('query_execution_context') 
or {}
+self.result_configuration = kwargs.get('result_configuration') or {}
 
 Review comment:
   I will move them to args with default values.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] ashb commented on a change in pull request #4111: [AIRFLOW-3266] Add AWS Athena Operator and hook

2018-11-02 Thread GitBox
ashb commented on a change in pull request #4111: [AIRFLOW-3266] Add AWS Athena 
Operator and hook
URL: https://github.com/apache/incubator-airflow/pull/4111#discussion_r230392180
 
 

 ##
 File path: airflow/contrib/operators/aws_athena_operator.py
 ##
 @@ -0,0 +1,97 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+#
+
+from airflow.models import BaseOperator
+from airflow.utils.decorators import apply_defaults
+from airflow.contrib.hooks.aws_athena_hook import AWSAthenaHook
+
+
+class AWSAthenaOperator(BaseOperator):
+"""
+Airflow operator to run presto queries on athena.
+
+:param query: Presto to be run on athena. (templated)
+:type query: str
+:param database: Database to select. (templated)
+:type database: str
+:param output_location: s3 path to write the query results into. 
(templated)
+:type output_location: str
+:param aws_conn_id: aws connection to use.
+:type aws_conn_id: str
+"""
+
+ui_color = '#44b5e2'
+template_fields = ('query', 'database', 'output_location')
+
+@apply_defaults
+def __init__(self, query, database, output_location, 
aws_conn_id='aws_default', *args, **kwargs):
+super(AWSAthenaOperator, self).__init__(*args, **kwargs)
+self.query = query
+self.database = database
+self.output_location = output_location
+self.aws_conn_id = aws_conn_id
+self.client_request_token = kwargs.get('client_request_token')
+self.query_execution_context = kwargs.get('query_execution_context') 
or {}
+self.result_configuration = kwargs.get('result_configuration') or {}
 
 Review comment:
   The problem here is that kwargs might contain `result_configuration` -  and 
this will cause the following warning to be emitted:
   
   ```
   Invalid arguments were passed to AWSAthenaOperator. Support for passing 
such arguments will be dropped in Airflow 2.0. Invalid arguments were: args=[], 
kwargs={'result_configuration':...}
   ```
   
   In short, if an operator understands an arg/kwarg, it must not be passed on to 
the super constructor (currently on line 50). 
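   
   One hedged sketch of how that could look (illustrative only, not the final 
signature): accept the operator-specific arguments explicitly, with `None` defaults, 
so they never end up in the `**kwargs` handed to `BaseOperator`.

```python
@apply_defaults
def __init__(self, query, database, output_location,
             aws_conn_id='aws_default',
             client_request_token=None,
             query_execution_context=None,
             result_configuration=None,
             *args, **kwargs):
    super(AWSAthenaOperator, self).__init__(*args, **kwargs)
    self.query = query
    self.database = database
    self.output_location = output_location
    self.aws_conn_id = aws_conn_id
    self.client_request_token = client_request_token
    # Build fresh dicts here instead of using mutable default arguments.
    self.query_execution_context = query_execution_context or {}
    self.result_configuration = result_configuration or {}
```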


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] sprzedwojski commented on a change in pull request #4124: [AIRFLOW-3276] Cloud SQL: database create / patch / delete operators

2018-11-02 Thread GitBox
sprzedwojski commented on a change in pull request #4124: [AIRFLOW-3276] Cloud 
SQL: database create / patch / delete operators
URL: https://github.com/apache/incubator-airflow/pull/4124#discussion_r230388466
 
 

 ##
 File path: airflow/contrib/hooks/gcp_sql_hook.py
 ##
 @@ -144,6 +144,96 @@ def delete_instance(self, project_id, instance):
 operation_name = response["name"]
 return self._wait_for_operation_to_complete(project_id, operation_name)
 
+    def get_database(self, project_id, instance, database):
+        """
+        Retrieves a database resource from a Cloud SQL instance.
+
+        :param project_id: Project ID of the project that contains the instance.
+        :type project_id: str
+        :param instance: Database instance ID. This does not include the project ID.
+        :type instance: str
+        :param database: Name of the database in the instance.
+        :type database: str
+        :return: A Cloud SQL database resource, as described in
+            https://cloud.google.com/sql/docs/mysql/admin-api/v1beta4/databases#resource
+        :rtype: dict
+        """
+        return self.get_conn().databases().get(
+            project=project_id,
+            instance=instance,
+            database=database
+        ).execute(num_retries=NUM_RETRIES)
+
+    def create_database(self, project, instance, body):
+        """
+        Creates a new database inside a Cloud SQL instance.
+
+        :param project: Project ID of the project that contains the instance.
+        :type project: str
+        :param instance: Database instance ID. This does not include the project ID.
+        :type instance: str
+        :param body: The request body, as described in
+            https://cloud.google.com/sql/docs/mysql/admin-api/v1beta4/databases/insert#request-body
+        :type body: dict
+        :return: True if the operation succeeded, raises an error otherwise
+        :rtype: bool
+        """
+        response = self.get_conn().databases().insert(
+            project=project,
+            instance=instance,
+            body=body
+        ).execute(num_retries=NUM_RETRIES)
+        operation_name = response["name"]
+        return self._wait_for_operation_to_complete(project, operation_name)
+
+    def patch_database(self, project, instance, database, body):
 
 Review comment:
   Yes, sorry, forgot to explain this. You're right, `patch` is a partial 
update, while `update` is a full update.
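   
   For illustration only (field names follow the Cloud SQL Admin API database 
resource and the values are placeholders): a patch body carries just the fields to 
change, while an update body describes the whole resource.

```python
# Partial update: only the listed fields are modified.
db_patch_body = {
    "charset": "utf16",
}

# Full update: the stored resource is replaced by this representation.
db_update_body = {
    "name": "testdb",
    "project": "example-project",
    "instance": "testinstance",
    "charset": "utf16",
    "collation": "utf16_general_ci",
}
```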


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Created] (AIRFLOW-3291) Update S3KeySensor to not need s3:GetObject permissions

2018-11-02 Thread Ash Berlin-Taylor (JIRA)
Ash Berlin-Taylor created AIRFLOW-3291:
--

 Summary: Update S3KeySensor to not need s3:GetObject permissions
 Key: AIRFLOW-3291
 URL: https://issues.apache.org/jira/browse/AIRFLOW-3291
 Project: Apache Airflow
  Issue Type: Improvement
Reporter: Ash Berlin-Taylor


The S3KeySensor/S3Hook as currently written requires {{s3:GetObject}} 
permissions on the bucket (it does a HeadObject API call). It would be nice if it 
could use ListBucket instead, since for our use case we don't really want Airflow 
reading the files; EMR does that for us.

This would be doable by changing the implementation of check_key_exists and 
check_wildcard_exists(?) to use a list call only, rather than trying to load the Key 
object.
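
A minimal sketch of the idea, using boto3 directly rather than the hook's actual 
method names (which would need adapting): {{list_objects_v2}} only requires 
{{s3:ListBucket}}, unlike HeadObject.

{code:python}
import boto3


def key_exists(bucket, key):
    """Return True if `key` exists in `bucket`, using only a list call."""
    client = boto3.client('s3')
    response = client.list_objects_v2(Bucket=bucket, Prefix=key, MaxKeys=1)
    return any(obj['Key'] == key for obj in response.get('Contents', []))
{code}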



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-3290) Issue when loading in UTF-8 into Postgres with pg_hook.copy_expert

2018-11-02 Thread Frank Cash (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Frank Cash updated AIRFLOW-3290:

Description: 
Problem:

 

Working on loading an ETL from mysql => postgres using both the mysql and 
postgres hooks.  I am moving usernames containing utf-8 characters into my 
postgres store.  The file is written as a UTF-8 file and thus has a BOM header.  I 
would like to be able to define the encoding that the open function uses in 
[https://github.com/apache/incubator-airflow/blob/53b89b98371c7bb993b242c341d3941e9ce09f9a/airflow/hooks/postgres_hook.py#L63.]
  An example username would be `frank—cash` with the dash being an em dash. 
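
A sketch of the desired behaviour, not the current hook API (the table, file and 
connection details are placeholders): opening the file with {{encoding='utf-8-sig'}} 
drops the BOM before the COPY runs.

{code:python}
import psycopg2

conn = psycopg2.connect("dbname=analytics")  # placeholder connection string
with open('/tmp/users.csv', encoding='utf-8-sig') as f, conn.cursor() as cur:
    # utf-8-sig strips the BOM so the first value is not mangled on load
    cur.copy_expert("COPY users FROM STDIN WITH CSV HEADER", f)
conn.commit()
{code}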

> Issue when loading in UTF-8 into Postgres with pg_hook.copy_expert
> --
>
> Key: AIRFLOW-3290
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3290
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Frank Cash
>Priority: Major
>
> Problem:
>  
> Working on loading an ETL from mysql => postgres using both the mysql and 
> postgres hooks.  I am moving usernames containing utf-8 characters into my 
> postgres store.  The file is written as a UTF-8 file and thus has a BOM header.  
> I would like to be able to define the encoding that the open function uses in 
> [https://github.com/apache/incubator-airflow/blob/53b89b98371c7bb993b242c341d3941e9ce09f9a/airflow/hooks/postgres_hook.py#L63.]
>   An example username would be `frank—cash` with the dash being an em dash. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-3276) Google Cloud SQL database create / patch / delete operators

2018-11-02 Thread Kaxil Naik (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kaxil Naik updated AIRFLOW-3276:

Component/s: gcp

> Google Cloud SQL database create / patch / delete operators
> ---
>
> Key: AIRFLOW-3276
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3276
> Project: Apache Airflow
>  Issue Type: New Feature
>  Components: gcp
>Reporter: Szymon Przedwojski
>Assignee: Szymon Przedwojski
>Priority: Minor
> Fix For: 1.10.1
>
>
> Operators allowing to invoke Google Cloud SQL's database methods:
> - CloudSqlInstanceDatabaseCreateOperator 
> ([https://cloud.google.com/sql/docs/mysql/admin-api/v1beta4/databases/insert])
> - CloudSqlInstanceDatabasePatchOperator 
> ([https://cloud.google.com/sql/docs/mysql/admin-api/v1beta4/databases/patch])
> - CloudSqlInstanceDatabaseDeleteOperator 
> ([https://cloud.google.com/sql/docs/mysql/admin-api/v1beta4/databases/delete])



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] kaxil commented on issue #4124: [AIRFLOW-3276] Cloud SQL: database create / patch / delete operators

2018-11-02 Thread GitBox
kaxil commented on issue #4124: [AIRFLOW-3276] Cloud SQL: database create / 
patch / delete operators
URL: 
https://github.com/apache/incubator-airflow/pull/4124#issuecomment-435383137
 
 
   Thanks @sprzedwojski 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Resolved] (AIRFLOW-3276) Google Cloud SQL database create / patch / delete operators

2018-11-02 Thread Kaxil Naik (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kaxil Naik resolved AIRFLOW-3276.
-
   Resolution: Fixed
Fix Version/s: 1.10.1

Resolved by https://github.com/apache/incubator-airflow/pull/4124

> Google Cloud SQL database create / patch / delete operators
> ---
>
> Key: AIRFLOW-3276
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3276
> Project: Apache Airflow
>  Issue Type: New Feature
>  Components: gcp
>Reporter: Szymon Przedwojski
>Assignee: Szymon Przedwojski
>Priority: Minor
> Fix For: 1.10.1
>
>
> Operators allowing to invoke Google Cloud SQL's database methods:
> - CloudSqlInstanceDatabaseCreateOperator 
> ([https://cloud.google.com/sql/docs/mysql/admin-api/v1beta4/databases/insert])
> - CloudSqlInstanceDatabasePatchOperator 
> ([https://cloud.google.com/sql/docs/mysql/admin-api/v1beta4/databases/patch])
> - CloudSqlInstanceDatabaseDeleteOperator 
> ([https://cloud.google.com/sql/docs/mysql/admin-api/v1beta4/databases/delete])



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (AIRFLOW-3290) Issue when loading in UTF-8 into Postgres with pg_hook.copy_expert

2018-11-02 Thread Frank Cash (JIRA)
Frank Cash created AIRFLOW-3290:
---

 Summary: Issue when loading in UTF-8 into Postgres with 
pg_hook.copy_expert
 Key: AIRFLOW-3290
 URL: https://issues.apache.org/jira/browse/AIRFLOW-3290
 Project: Apache Airflow
  Issue Type: Bug
Reporter: Frank Cash






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-3276) Google Cloud SQL database create / patch / delete operators

2018-11-02 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16673108#comment-16673108
 ] 

ASF GitHub Bot commented on AIRFLOW-3276:
-

kaxil closed pull request #4124: [AIRFLOW-3276] Cloud SQL: database create / 
patch / delete operators
URL: https://github.com/apache/incubator-airflow/pull/4124
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/airflow/contrib/example_dags/example_gcp_sql.py 
b/airflow/contrib/example_dags/example_gcp_sql.py
index a484456f6e..136c88c843 100644
--- a/airflow/contrib/example_dags/example_gcp_sql.py
+++ b/airflow/contrib/example_dags/example_gcp_sql.py
@@ -18,26 +18,30 @@
 # under the License.
 
 """
-Example Airflow DAG that deploys, updates, patches and deletes a Cloud SQL 
instance
-in Google Cloud Platform.
+Example Airflow DAG that creates, patches and deletes a Cloud SQL instance, 
and also
+creates, patches and deletes a database inside the instance, in Google Cloud 
Platform.
 
-This DAG relies on the following Airflow variables
-https://airflow.apache.org/concepts.html#variables
+This DAG relies on the following environment variables
 * PROJECT_ID - Google Cloud Platform project for the Cloud SQL instance.
 * INSTANCE_NAME - Name of the Cloud SQL instance.
+* DB_NAME - Name of the database inside a Cloud SQL instance.
 """
 
+import os
 import datetime
 
 import airflow
 from airflow import models
 
 from airflow.contrib.operators.gcp_sql_operator import 
CloudSqlInstanceCreateOperator, \
-CloudSqlInstancePatchOperator, CloudSqlInstanceDeleteOperator
+CloudSqlInstancePatchOperator, CloudSqlInstanceDeleteOperator, \
+CloudSqlInstanceDatabaseCreateOperator, 
CloudSqlInstanceDatabasePatchOperator, \
+CloudSqlInstanceDatabaseDeleteOperator
 
 # [START howto_operator_cloudsql_arguments]
-PROJECT_ID = models.Variable.get('PROJECT_ID', '')
-INSTANCE_NAME = models.Variable.get('INSTANCE_NAME', '')
+PROJECT_ID = os.environ.get('PROJECT_ID', 'example-project')
+INSTANCE_NAME = os.environ.get('INSTANCE_NAME', 'testinstance')
+DB_NAME = os.environ.get('DB_NAME', 'testdb')
 # [END howto_operator_cloudsql_arguments]
 
 # Bodies below represent Cloud SQL instance resources:
@@ -97,6 +101,19 @@
 }
 }
 # [END howto_operator_cloudsql_patch_body]
+# [START howto_operator_cloudsql_db_create_body]
+db_create_body = {
+"instance": INSTANCE_NAME,
+"name": DB_NAME,
+"project": PROJECT_ID
+}
+# [END howto_operator_cloudsql_db_create_body]
+# [START howto_operator_cloudsql_db_patch_body]
+db_patch_body = {
+"charset": "utf16",
+"collation": "utf16_general_ci"
+}
+# [END howto_operator_cloudsql_db_patch_body]
 
 default_args = {
 'start_date': airflow.utils.dates.days_ago(1)
@@ -123,6 +140,31 @@
 task_id='sql_instance_patch_task'
 )
 # [END howto_operator_cloudsql_patch]
+# [START howto_operator_cloudsql_db_create]
+sql_db_create_task = CloudSqlInstanceDatabaseCreateOperator(
+project_id=PROJECT_ID,
+body=db_create_body,
+instance=INSTANCE_NAME,
+task_id='sql_db_create_task'
+)
+# [END howto_operator_cloudsql_db_create]
+# [START howto_operator_cloudsql_db_patch]
+sql_db_patch_task = CloudSqlInstanceDatabasePatchOperator(
+project_id=PROJECT_ID,
+body=db_patch_body,
+instance=INSTANCE_NAME,
+database=DB_NAME,
+task_id='sql_db_patch_task'
+)
+# [END howto_operator_cloudsql_db_patch]
+# [START howto_operator_cloudsql_db_delete]
+sql_db_delete_task = CloudSqlInstanceDatabaseDeleteOperator(
+project_id=PROJECT_ID,
+instance=INSTANCE_NAME,
+database=DB_NAME,
+task_id='sql_db_delete_task'
+)
+# [END howto_operator_cloudsql_db_delete]
 # [START howto_operator_cloudsql_delete]
 sql_instance_delete_task = CloudSqlInstanceDeleteOperator(
 project_id=PROJECT_ID,
@@ -131,4 +173,6 @@
 )
 # [END howto_operator_cloudsql_delete]
 
-sql_instance_create_task >> sql_instance_patch_task >> 
sql_instance_delete_task
+sql_instance_create_task >> sql_instance_patch_task \
+>> sql_db_create_task >> sql_db_patch_task \
+>> sql_db_delete_task >> sql_instance_delete_task
diff --git a/airflow/contrib/hooks/gcp_sql_hook.py 
b/airflow/contrib/hooks/gcp_sql_hook.py
index e0b3f92d8f..549ceaf49c 100644
--- a/airflow/contrib/hooks/gcp_sql_hook.py
+++ b/airflow/contrib/hooks/gcp_sql_hook.py
@@ -144,6 +144,96 @@ def delete_instance(self, project_id, instance):
 operation_name = response["name"]
 return self._wait_for_operation_to_complete(project_id, operation_name)
 
+def get_database(self, project_id, 

[GitHub] kaxil closed pull request #4124: [AIRFLOW-3276] Cloud SQL: database create / patch / delete operators

2018-11-02 Thread GitBox
kaxil closed pull request #4124: [AIRFLOW-3276] Cloud SQL: database create / 
patch / delete operators
URL: https://github.com/apache/incubator-airflow/pull/4124
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/airflow/contrib/example_dags/example_gcp_sql.py 
b/airflow/contrib/example_dags/example_gcp_sql.py
index a484456f6e..136c88c843 100644
--- a/airflow/contrib/example_dags/example_gcp_sql.py
+++ b/airflow/contrib/example_dags/example_gcp_sql.py
@@ -18,26 +18,30 @@
 # under the License.
 
 """
-Example Airflow DAG that deploys, updates, patches and deletes a Cloud SQL 
instance
-in Google Cloud Platform.
+Example Airflow DAG that creates, patches and deletes a Cloud SQL instance, 
and also
+creates, patches and deletes a database inside the instance, in Google Cloud 
Platform.
 
-This DAG relies on the following Airflow variables
-https://airflow.apache.org/concepts.html#variables
+This DAG relies on the following environment variables
 * PROJECT_ID - Google Cloud Platform project for the Cloud SQL instance.
 * INSTANCE_NAME - Name of the Cloud SQL instance.
+* DB_NAME - Name of the database inside a Cloud SQL instance.
 """
 
+import os
 import datetime
 
 import airflow
 from airflow import models
 
 from airflow.contrib.operators.gcp_sql_operator import 
CloudSqlInstanceCreateOperator, \
-CloudSqlInstancePatchOperator, CloudSqlInstanceDeleteOperator
+CloudSqlInstancePatchOperator, CloudSqlInstanceDeleteOperator, \
+CloudSqlInstanceDatabaseCreateOperator, 
CloudSqlInstanceDatabasePatchOperator, \
+CloudSqlInstanceDatabaseDeleteOperator
 
 # [START howto_operator_cloudsql_arguments]
-PROJECT_ID = models.Variable.get('PROJECT_ID', '')
-INSTANCE_NAME = models.Variable.get('INSTANCE_NAME', '')
+PROJECT_ID = os.environ.get('PROJECT_ID', 'example-project')
+INSTANCE_NAME = os.environ.get('INSTANCE_NAME', 'testinstance')
+DB_NAME = os.environ.get('DB_NAME', 'testdb')
 # [END howto_operator_cloudsql_arguments]
 
 # Bodies below represent Cloud SQL instance resources:
@@ -97,6 +101,19 @@
 }
 }
 # [END howto_operator_cloudsql_patch_body]
+# [START howto_operator_cloudsql_db_create_body]
+db_create_body = {
+"instance": INSTANCE_NAME,
+"name": DB_NAME,
+"project": PROJECT_ID
+}
+# [END howto_operator_cloudsql_db_create_body]
+# [START howto_operator_cloudsql_db_patch_body]
+db_patch_body = {
+"charset": "utf16",
+"collation": "utf16_general_ci"
+}
+# [END howto_operator_cloudsql_db_patch_body]
 
 default_args = {
 'start_date': airflow.utils.dates.days_ago(1)
@@ -123,6 +140,31 @@
 task_id='sql_instance_patch_task'
 )
 # [END howto_operator_cloudsql_patch]
+# [START howto_operator_cloudsql_db_create]
+sql_db_create_task = CloudSqlInstanceDatabaseCreateOperator(
+project_id=PROJECT_ID,
+body=db_create_body,
+instance=INSTANCE_NAME,
+task_id='sql_db_create_task'
+)
+# [END howto_operator_cloudsql_db_create]
+# [START howto_operator_cloudsql_db_patch]
+sql_db_patch_task = CloudSqlInstanceDatabasePatchOperator(
+project_id=PROJECT_ID,
+body=db_patch_body,
+instance=INSTANCE_NAME,
+database=DB_NAME,
+task_id='sql_db_patch_task'
+)
+# [END howto_operator_cloudsql_db_patch]
+# [START howto_operator_cloudsql_db_delete]
+sql_db_delete_task = CloudSqlInstanceDatabaseDeleteOperator(
+project_id=PROJECT_ID,
+instance=INSTANCE_NAME,
+database=DB_NAME,
+task_id='sql_db_delete_task'
+)
+# [END howto_operator_cloudsql_db_delete]
 # [START howto_operator_cloudsql_delete]
 sql_instance_delete_task = CloudSqlInstanceDeleteOperator(
 project_id=PROJECT_ID,
@@ -131,4 +173,6 @@
 )
 # [END howto_operator_cloudsql_delete]
 
-sql_instance_create_task >> sql_instance_patch_task >> 
sql_instance_delete_task
+sql_instance_create_task >> sql_instance_patch_task \
+>> sql_db_create_task >> sql_db_patch_task \
+>> sql_db_delete_task >> sql_instance_delete_task
diff --git a/airflow/contrib/hooks/gcp_sql_hook.py 
b/airflow/contrib/hooks/gcp_sql_hook.py
index e0b3f92d8f..549ceaf49c 100644
--- a/airflow/contrib/hooks/gcp_sql_hook.py
+++ b/airflow/contrib/hooks/gcp_sql_hook.py
@@ -144,6 +144,96 @@ def delete_instance(self, project_id, instance):
 operation_name = response["name"]
 return self._wait_for_operation_to_complete(project_id, operation_name)
 
+def get_database(self, project_id, instance, database):
+"""
+Retrieves a database resource from a Cloud SQL instance.
+
+:param project_id: Project ID of the project that contains the 
instance.
+:type project_id: str
+:param instance: Database 

[GitHub] kaxil commented on a change in pull request #4124: [AIRFLOW-3276] Cloud SQL: database create / patch / delete operators

2018-11-02 Thread GitBox
kaxil commented on a change in pull request #4124: [AIRFLOW-3276] Cloud SQL: 
database create / patch / delete operators
URL: https://github.com/apache/incubator-airflow/pull/4124#discussion_r230375578
 
 

 ##
 File path: airflow/contrib/hooks/gcp_sql_hook.py
 ##
 @@ -144,6 +144,96 @@ def delete_instance(self, project_id, instance):
 operation_name = response["name"]
 return self._wait_for_operation_to_complete(project_id, operation_name)
 
+    def get_database(self, project_id, instance, database):
+        """
+        Retrieves a database resource from a Cloud SQL instance.
+
+        :param project_id: Project ID of the project that contains the instance.
+        :type project_id: str
+        :param instance: Database instance ID. This does not include the project ID.
+        :type instance: str
+        :param database: Name of the database in the instance.
+        :type database: str
+        :return: A Cloud SQL database resource, as described in
+            https://cloud.google.com/sql/docs/mysql/admin-api/v1beta4/databases#resource
+        :rtype: dict
+        """
+        return self.get_conn().databases().get(
+            project=project_id,
+            instance=instance,
+            database=database
+        ).execute(num_retries=NUM_RETRIES)
+
+    def create_database(self, project, instance, body):
+        """
+        Creates a new database inside a Cloud SQL instance.
+
+        :param project: Project ID of the project that contains the instance.
+        :type project: str
+        :param instance: Database instance ID. This does not include the project ID.
+        :type instance: str
+        :param body: The request body, as described in
+            https://cloud.google.com/sql/docs/mysql/admin-api/v1beta4/databases/insert#request-body
+        :type body: dict
+        :return: True if the operation succeeded, raises an error otherwise
+        :rtype: bool
+        """
+        response = self.get_conn().databases().insert(
+            project=project,
+            instance=instance,
+            body=body
+        ).execute(num_retries=NUM_RETRIES)
+        operation_name = response["name"]
+        return self._wait_for_operation_to_complete(project, operation_name)
+
+    def patch_database(self, project, instance, database, body):
 
 Review comment:
   I didn't know that they are two different operations. I am still trying to figure 
out the difference between the two. It looks like patch does a partial update only.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] kaxil commented on a change in pull request #4124: [AIRFLOW-3276] Cloud SQL: database create / patch / delete operators

2018-11-02 Thread GitBox
kaxil commented on a change in pull request #4124: [AIRFLOW-3276] Cloud SQL: 
database create / patch / delete operators
URL: https://github.com/apache/incubator-airflow/pull/4124#discussion_r230375202
 
 

 ##
 File path: airflow/contrib/hooks/gcp_sql_hook.py
 ##
 @@ -144,6 +144,96 @@ def delete_instance(self, project_id, instance):
 operation_name = response["name"]
 return self._wait_for_operation_to_complete(project_id, operation_name)
 
+    def get_database(self, project_id, instance, database):
+        """
+        Retrieves a database resource from a Cloud SQL instance.
+
+        :param project_id: Project ID of the project that contains the instance.
+        :type project_id: str
+        :param instance: Database instance ID. This does not include the project ID.
+        :type instance: str
+        :param database: Name of the database in the instance.
+        :type database: str
+        :return: A Cloud SQL database resource, as described in
+            https://cloud.google.com/sql/docs/mysql/admin-api/v1beta4/databases#resource
+        :rtype: dict
+        """
+        return self.get_conn().databases().get(
+            project=project_id,
+            instance=instance,
+            database=database
+        ).execute(num_retries=NUM_RETRIES)
+
+    def create_database(self, project, instance, body):
+        """
+        Creates a new database inside a Cloud SQL instance.
+
+        :param project: Project ID of the project that contains the instance.
+        :type project: str
+        :param instance: Database instance ID. This does not include the project ID.
+        :type instance: str
+        :param body: The request body, as described in
+            https://cloud.google.com/sql/docs/mysql/admin-api/v1beta4/databases/insert#request-body
+        :type body: dict
+        :return: True if the operation succeeded, raises an error otherwise
+        :rtype: bool
+        """
+        response = self.get_conn().databases().insert(
+            project=project,
+            instance=instance,
+            body=body
+        ).execute(num_retries=NUM_RETRIES)
+        operation_name = response["name"]
+        return self._wait_for_operation_to_complete(project, operation_name)
+
+    def patch_database(self, project, instance, database, body):
 
 Review comment:
   Ya. Makes sense then.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Resolved] (AIRFLOW-3262) Can't get log containing Response when using SimpleHttpOperator

2018-11-02 Thread Kaxil Naik (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kaxil Naik resolved AIRFLOW-3262.
-
Resolution: Fixed

Resolved by https://github.com/apache/incubator-airflow/pull/4102

> Can't get log containing Response when using SimpleHttpOperator
> ---
>
> Key: AIRFLOW-3262
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3262
> Project: Apache Airflow
>  Issue Type: Improvement
>Reporter: Kaxil Naik
>Assignee: Kaxil Naik
>Priority: Trivial
> Fix For: 1.10.1
>
>
> When you use SimpleHttpOperator for things like ElasticSearch, you want to 
> get the response in the logs as well. Currently, the only workaround is to 
> use `xcom_push` and push the content to xcom and in the next task get the 
> response.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-3262) Can't get log containing Response when using SimpleHttpOperator

2018-11-02 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16673100#comment-16673100
 ] 

ASF GitHub Bot commented on AIRFLOW-3262:
-

kaxil closed pull request #4102: [AIRFLOW-3262] Add param to log response when 
using SimpleHttpOperator
URL: https://github.com/apache/incubator-airflow/pull/4102
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/airflow/operators/http_operator.py 
b/airflow/operators/http_operator.py
index 0585a92a86..3e00de96eb 100644
--- a/airflow/operators/http_operator.py
+++ b/airflow/operators/http_operator.py
@@ -46,6 +46,10 @@ class SimpleHttpOperator(BaseOperator):
 'requests' documentation (options to modify timeout, ssl, etc.)
 :type extra_options: A dictionary of options, where key is string and value
 depends on the option that's being modified.
+:param xcom_push: Push the response to Xcom (default: False)
+:type xcom_push: bool
+:param log_response: Log the response (default: False)
+:type log_response: bool
 """
 
 template_fields = ('endpoint', 'data',)
@@ -61,7 +65,9 @@ def __init__(self,
  response_check=None,
  extra_options=None,
  xcom_push=False,
- http_conn_id='http_default', *args, **kwargs):
+ http_conn_id='http_default',
+ log_response=False,
+ *args, **kwargs):
 """
 If xcom_push is True, response of an HTTP request will also
 be pushed to an XCom.
@@ -75,6 +81,7 @@ def __init__(self,
 self.response_check = response_check
 self.extra_options = extra_options or {}
 self.xcom_push_flag = xcom_push
+self.log_response = log_response
 
 def execute(self, context):
 http = HttpHook(self.method, http_conn_id=self.http_conn_id)
@@ -90,3 +97,5 @@ def execute(self, context):
 raise AirflowException("Response check returned False.")
 if self.xcom_push_flag:
 return response.text
+if self.log_response:
+self.log.info(response.text)
diff --git a/tests/operators/test_http_operator.py 
b/tests/operators/test_http_operator.py
new file mode 100644
index 00..6ab2c03bb5
--- /dev/null
+++ b/tests/operators/test_http_operator.py
@@ -0,0 +1,63 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+import os
+import unittest
+
+from mock import patch
+from airflow.operators.http_operator import SimpleHttpOperator
+
+try:
+from unittest import mock
+except ImportError:
+try:
+import mock
+except ImportError:
+mock = None
+
+
+class AnyStringWith(str):
+"""
+Helper class to check if a substring is a part of a string
+"""
+def __eq__(self, other):
+return self in other
+
+
+class SimpleHttpOpTests(unittest.TestCase):
+def setUp(self):
+# Creating a local Http connection to Airflow Webserver
+os.environ['AIRFLOW_CONN_HTTP_GOOGLE'] = 'http://www.google.com'
+
+def test_response_in_logs(self):
+"""
+Test that when using SimpleHttpOperator with 'GET' on localhost:8080,
+the log contains 'Google' in it
+"""
+operator = SimpleHttpOperator(
+task_id='test_HTTP_op',
+method='GET',
+endpoint='/',
+http_conn_id='HTTP_GOOGLE',
+log_response=True,
+)
+
+with patch.object(operator.log, 'info') as mock_info:
+operator.execute(None)
+mock_info.assert_called_with(AnyStringWith('Google'))


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:

[GitHub] kaxil closed pull request #4102: [AIRFLOW-3262] Add param to log response when using SimpleHttpOperator

2018-11-02 Thread GitBox
kaxil closed pull request #4102: [AIRFLOW-3262] Add param to log response when 
using SimpleHttpOperator
URL: https://github.com/apache/incubator-airflow/pull/4102
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/airflow/operators/http_operator.py 
b/airflow/operators/http_operator.py
index 0585a92a86..3e00de96eb 100644
--- a/airflow/operators/http_operator.py
+++ b/airflow/operators/http_operator.py
@@ -46,6 +46,10 @@ class SimpleHttpOperator(BaseOperator):
 'requests' documentation (options to modify timeout, ssl, etc.)
 :type extra_options: A dictionary of options, where key is string and value
 depends on the option that's being modified.
+:param xcom_push: Push the response to Xcom (default: False)
+:type xcom_push: bool
+:param log_response: Log the response (default: False)
+:type log_response: bool
 """
 
 template_fields = ('endpoint', 'data',)
@@ -61,7 +65,9 @@ def __init__(self,
  response_check=None,
  extra_options=None,
  xcom_push=False,
- http_conn_id='http_default', *args, **kwargs):
+ http_conn_id='http_default',
+ log_response=False,
+ *args, **kwargs):
 """
 If xcom_push is True, response of an HTTP request will also
 be pushed to an XCom.
@@ -75,6 +81,7 @@ def __init__(self,
 self.response_check = response_check
 self.extra_options = extra_options or {}
 self.xcom_push_flag = xcom_push
+self.log_response = log_response
 
 def execute(self, context):
 http = HttpHook(self.method, http_conn_id=self.http_conn_id)
@@ -90,3 +97,5 @@ def execute(self, context):
 raise AirflowException("Response check returned False.")
 if self.xcom_push_flag:
 return response.text
+if self.log_response:
+self.log.info(response.text)
diff --git a/tests/operators/test_http_operator.py 
b/tests/operators/test_http_operator.py
new file mode 100644
index 00..6ab2c03bb5
--- /dev/null
+++ b/tests/operators/test_http_operator.py
@@ -0,0 +1,63 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+import os
+import unittest
+
+from mock import patch
+from airflow.operators.http_operator import SimpleHttpOperator
+
+try:
+from unittest import mock
+except ImportError:
+try:
+import mock
+except ImportError:
+mock = None
+
+
+class AnyStringWith(str):
+"""
+Helper class to check if a substring is a part of a string
+"""
+def __eq__(self, other):
+return self in other
+
+
+class SimpleHttpOpTests(unittest.TestCase):
+def setUp(self):
+# Creating a local Http connection to Airflow Webserver
+os.environ['AIRFLOW_CONN_HTTP_GOOGLE'] = 'http://www.google.com'
+
+def test_response_in_logs(self):
+"""
+Test that when using SimpleHttpOperator with 'GET' on localhost:8080,
+the log contains 'Google' in it
+"""
+operator = SimpleHttpOperator(
+task_id='test_HTTP_op',
+method='GET',
+endpoint='/',
+http_conn_id='HTTP_GOOGLE',
+log_response=True,
+)
+
+with patch.object(operator.log, 'info') as mock_info:
+operator.execute(None)
+mock_info.assert_called_with(AnyStringWith('Google'))


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] sprzedwojski commented on a change in pull request #4124: [AIRFLOW-3276] Cloud SQL: database create / patch / delete operators

2018-11-02 Thread GitBox
sprzedwojski commented on a change in pull request #4124: [AIRFLOW-3276] Cloud 
SQL: database create / patch / delete operators
URL: https://github.com/apache/incubator-airflow/pull/4124#discussion_r230349527
 
 

 ##
 File path: airflow/contrib/hooks/gcp_sql_hook.py
 ##
 @@ -144,6 +144,96 @@ def delete_instance(self, project_id, instance):
 operation_name = response["name"]
 return self._wait_for_operation_to_complete(project_id, operation_name)
 
+    def get_database(self, project_id, instance, database):
+        """
+        Retrieves a database resource from a Cloud SQL instance.
+
+        :param project_id: Project ID of the project that contains the instance.
+        :type project_id: str
+        :param instance: Database instance ID. This does not include the project ID.
+        :type instance: str
+        :param database: Name of the database in the instance.
+        :type database: str
+        :return: A Cloud SQL database resource, as described in
+            https://cloud.google.com/sql/docs/mysql/admin-api/v1beta4/databases#resource
+        :rtype: dict
+        """
+        return self.get_conn().databases().get(
+            project=project_id,
+            instance=instance,
+            database=database
+        ).execute(num_retries=NUM_RETRIES)
+
+    def create_database(self, project, instance, body):
+        """
+        Creates a new database inside a Cloud SQL instance.
+
+        :param project: Project ID of the project that contains the instance.
+        :type project: str
+        :param instance: Database instance ID. This does not include the project ID.
+        :type instance: str
+        :param body: The request body, as described in
+            https://cloud.google.com/sql/docs/mysql/admin-api/v1beta4/databases/insert#request-body
+        :type body: dict
+        :return: True if the operation succeeded, raises an error otherwise
+        :rtype: bool
+        """
+        response = self.get_conn().databases().insert(
+            project=project,
+            instance=instance,
+            body=body
+        ).execute(num_retries=NUM_RETRIES)
+        operation_name = response["name"]
+        return self._wait_for_operation_to_complete(project, operation_name)
+
+    def patch_database(self, project, instance, database, body):
 
 Review comment:
   Actually, in Cloud SQL, both for instances and for databases, 
[patch](https://cloud.google.com/sql/docs/mysql/admin-api/v1beta4/instances/patch)
 and 
[update](https://cloud.google.com/sql/docs/mysql/admin-api/v1beta4/instances/update)
 are two different operations.
   
   I've decided for now to implement only the patch operator; however, in the 
future there could be a separate update operator, too.
   
   So I'd rather stick to the same names that Cloud SQL uses to avoid any 
confusion.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-3270) Apache airflow 1.10.0 integration with LDAP anonymously

2018-11-02 Thread Ash Berlin-Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16673000#comment-16673000
 ] 

Ash Berlin-Taylor commented on AIRFLOW-3270:


Can you include the full stack trace? The problem could be in your 
{{superuser_filter}} (the '{{ = }}' at the end of it looks suspect at first 
sight).

> Apache airflow 1.10.0 integration with LDAP anonymously
> ---
>
> Key: AIRFLOW-3270
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3270
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: authentication
>Affects Versions: 1.10.0
>Reporter: Hari Krishna ADDEPALLI LN
>Priority: Blocker
>
> Please advise what to include in airflow.cfg when integrating with LDAP 
> anonymously? We are using DS389 as the LDAP server vendor. 
>  
> {noformat}
> [webserver] 
> authenticate = True 
> auth_backend = airflow.contrib.auth.backends.ldap_auth  
> {noformat}
>  
> And 
>  
> {noformat}
> [ldap] 
> uri = ldap://nsp-daf178e8.ad1.prd.us-phx.odc.im:389 
> user_filter = memberOf=cn=rvs-all-prd_usphx,ou=groups,dc=odc,dc=im
> user_name_attr = uid 
> group_member_attr =
> superuser_filter = memberOf=cn=rvd-sudo_all-prd_usphx,ou=groups,dc=odc,dc=im 
> data_profiler_filter = 
> bind_user = 
> bind_password = 
> basedn = ou=people,dc=odc,dc=im 
> cacert = /opt/orchestration/airflow/ldap_ca.crt 
> search_scope = LEVEL
> {noformat}
> I am hitting below exception:
> {noformat}
>   File "/usr/local/lib/python3.5/site-packages/ldap3/operation/search.py", 
> line 215, in parse_filter     
> raise LDAPInvalidFilterError('malformed filter') 
> ldap3.core.exceptions.LDAPInvalidFilterError: malformed filter
> {noformat}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-3289) BashOperator mangles {{\}} escapes in commands

2018-11-02 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor updated AIRFLOW-3289:
---
Summary: BashOperator mangles {{\}} escapes in commands  (was: sed called 
from BashOperator not working as expected)

> BashOperator mangles {{\}} escapes in commands
> --
>
> Key: AIRFLOW-3289
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3289
> Project: Apache Airflow
>  Issue Type: Bug
>Affects Versions: 1.10.0
>Reporter: Nikolay Semyachkin
>Priority: Major
> Attachments: example.csv, issue_proof.py
>
>
> I want to call a sed command on csv file to replace empty values (,,) with \N.
> I can do it with the following bash command 
> {code:java}
> cat example.csv | sed 's;,,;,\\N,;g' > example_processed.csv{code}
> But when I try to do the same with airflow BashOperator, it substitutes ,, 
> with N (instead of \N).
>  
> I attached the code and csv file to reproduce.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-3289) sed called from BashOperator not working as expected

2018-11-02 Thread Ash Berlin-Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16672996#comment-16672996
 ] 

Ash Berlin-Taylor commented on AIRFLOW-3289:


I suspect a workaround for now is to specify \{{N}} - there may be some 
escaping bug in the bash operator.
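
A hedged sketch of that workaround, assuming the missing backslash is being consumed 
by the Python string literal in the DAG file (if the mangling happens elsewhere this 
may not apply): a raw string, or doubled backslashes, keeps both characters so the 
shell and sed still see {{\\N}}.

{code:python}
from airflow.operators.bash_operator import BashOperator

# `dag` is assumed to be defined elsewhere in the DAG file.
fill_nulls = BashOperator(
    task_id='fill_nulls',
    # The raw string preserves both backslashes in the sed replacement.
    bash_command=r"cat example.csv | sed 's;,,;,\\N,;g' > example_processed.csv",
    dag=dag,
)
{code}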

> sed called from BashOperator not working as expected
> 
>
> Key: AIRFLOW-3289
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3289
> Project: Apache Airflow
>  Issue Type: Bug
>Affects Versions: 1.10.0
>Reporter: Nikolay Semyachkin
>Priority: Major
> Attachments: example.csv, issue_proof.py
>
>
> I want to call a sed command on csv file to replace empty values (,,) with \N.
> I can do it with the following bash command 
> {code:java}
> cat example.csv | sed 's;,,;,\\N,;g' > example_processed.csv{code}
> But when I try to do the same with airflow BashOperator, it substitutes ,, 
> with N (instead of \N).
>  
> I attached the code and csv file to reproduce.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-3270) Apache airflow 1.10.0 integration with LDAP anonymously

2018-11-02 Thread Hari Krishna ADDEPALLI LN (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Krishna ADDEPALLI LN updated AIRFLOW-3270:
---
Priority: Blocker  (was: Critical)

> Apache airflow 1.10.0 integration with LDAP anonymously
> ---
>
> Key: AIRFLOW-3270
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3270
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: authentication
>Affects Versions: 1.10.0
>Reporter: Hari Krishna ADDEPALLI LN
>Priority: Blocker
>
> Please advise what to include in airflow.cfg when integrating with LDAP 
> anonymously? We are using DS389 as the LDAP server vendor. 
>  
> {noformat}
> [webserver] 
> authenticate = True 
> auth_backend = airflow.contrib.auth.backends.ldap_auth  
> {noformat}
>  
> And 
>  
> {noformat}
> [ldap] 
> uri = ldap://nsp-daf178e8.ad1.prd.us-phx.odc.im:389 
> user_filter = memberOf=cn=rvs-all-prd_usphx,ou=groups,dc=odc,dc=im
> user_name_attr = uid 
> group_member_attr =
> superuser_filter = memberOf=cn=rvd-sudo_all-prd_usphx,ou=groups,dc=odc,dc=im 
> data_profiler_filter = 
> bind_user = 
> bind_password = 
> basedn = ou=people,dc=odc,dc=im 
> cacert = /opt/orchestration/airflow/ldap_ca.crt 
> search_scope = LEVEL
> {noformat}
> I am hitting below exception:
> {noformat}
>   File "/usr/local/lib/python3.5/site-packages/ldap3/operation/search.py", 
> line 215, in parse_filter     
> raise LDAPInvalidFilterError('malformed filter') 
> ldap3.core.exceptions.LDAPInvalidFilterError: malformed filter
> {noformat}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] kaxil commented on a change in pull request #4124: [AIRFLOW-3276] Cloud SQL: database create / patch / delete operators

2018-11-02 Thread GitBox
kaxil commented on a change in pull request #4124: [AIRFLOW-3276] Cloud SQL: 
database create / patch / delete operators
URL: https://github.com/apache/incubator-airflow/pull/4124#discussion_r230343865
 
 

 ##
 File path: airflow/contrib/hooks/gcp_sql_hook.py
 ##
 @@ -144,6 +144,96 @@ def delete_instance(self, project_id, instance):
 operation_name = response["name"]
 return self._wait_for_operation_to_complete(project_id, operation_name)
 
+    def get_database(self, project_id, instance, database):
+        """
+        Retrieves a database resource from a Cloud SQL instance.
+
+        :param project_id: Project ID of the project that contains the instance.
+        :type project_id: str
+        :param instance: Database instance ID. This does not include the project ID.
+        :type instance: str
+        :param database: Name of the database in the instance.
+        :type database: str
+        :return: A Cloud SQL database resource, as described in
+            https://cloud.google.com/sql/docs/mysql/admin-api/v1beta4/databases#resource
+        :rtype: dict
+        """
+        return self.get_conn().databases().get(
+            project=project_id,
+            instance=instance,
+            database=database
+        ).execute(num_retries=NUM_RETRIES)
+
+    def create_database(self, project, instance, body):
+        """
+        Creates a new database inside a Cloud SQL instance.
+
+        :param project: Project ID of the project that contains the instance.
+        :type project: str
+        :param instance: Database instance ID. This does not include the project ID.
+        :type instance: str
+        :param body: The request body, as described in
+            https://cloud.google.com/sql/docs/mysql/admin-api/v1beta4/databases/insert#request-body
+        :type body: dict
+        :return: True if the operation succeeded, raises an error otherwise
+        :rtype: bool
+        """
+        response = self.get_conn().databases().insert(
+            project=project,
+            instance=instance,
+            body=body
+        ).execute(num_retries=NUM_RETRIES)
+        operation_name = response["name"]
+        return self._wait_for_operation_to_complete(project, operation_name)
+
+    def patch_database(self, project, instance, database, body):
 
 Review comment:
   Hmm... Let's make this `update_database`. There was something similar in the 
last PR as well - can we change patch to update in that too?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Created] (AIRFLOW-3289) sed called from BashOperator not working as expected

2018-11-02 Thread Nikolay Semyachkin (JIRA)
Nikolay Semyachkin created AIRFLOW-3289:
---

 Summary: sed called from BashOperator not working as expected
 Key: AIRFLOW-3289
 URL: https://issues.apache.org/jira/browse/AIRFLOW-3289
 Project: Apache Airflow
  Issue Type: Bug
Affects Versions: 1.10.0
Reporter: Nikolay Semyachkin
 Attachments: example.csv, issue_proof.py

I want to call a sed command on csv file to replace empty values (,,) with \N.

I can do it with the following bash command 
{code:java}
cat example.csv | sed 's;,,;,\\N,;g' > example_processed.csv{code}
But when I try to do the same with airflow BashOperator, it substitutes ,, with 
N (instead of \N).
 
I attached the code and csv file to reproduce.
 
 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (AIRFLOW-3287) Core tests DB clean up to be run in the right moment

2018-11-02 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor resolved AIRFLOW-3287.

Resolution: Fixed

Thanks, good spot!

> Core tests DB clean up to be run in the right moment
> 
>
> Key: AIRFLOW-3287
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3287
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: core
>Reporter: Jarosław Śmietanka
>Assignee: Jarosław Śmietanka
>Priority: Minor
>
> While running a single unit test of a Dataproc operator, I spotted that the 
> database clean-up code in the `tests.core` module is triggered, which should not 
> happen since I am running a unit test from a completely different module.
> A proposed solution is to move this database clean-up code into CoreTest as 
> `tearDown`.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-3287) Core tests DB clean up to be run in the right moment

2018-11-02 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16672913#comment-16672913
 ] 

ASF GitHub Bot commented on AIRFLOW-3287:
-

ashb closed pull request #4122: [AIRFLOW-3287] Moving DB clean-up code into the 
CoreTest.tearDown()
URL: https://github.com/apache/incubator-airflow/pull/4122
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/tests/core.py b/tests/core.py
index 679ddbc125..64195b6349 100644
--- a/tests/core.py
+++ b/tests/core.py
@@ -89,19 +89,6 @@
 import pickle
 
 
-def reset(dag_id=TEST_DAG_ID):
-session = Session()
-tis = session.query(models.TaskInstance).filter_by(dag_id=dag_id)
-tis.delete()
-session.commit()
-session.close()
-
-
-configuration.conf.load_test_config()
-if os.environ.get('KUBERNETES_VERSION') is None:
-reset()
-
-
 class OperatorSubclass(BaseOperator):
 """
 An operator to test template substitution
@@ -130,6 +117,16 @@ def setUp(self):
 self.run_after_loop = self.dag_bash.get_task('run_after_loop')
 self.run_this_last = self.dag_bash.get_task('run_this_last')
 
+def tearDown(self):
+if os.environ.get('KUBERNETES_VERSION') is None:
+session = Session()
+session.query(models.TaskInstance).filter_by(
+dag_id=TEST_DAG_ID).delete()
+session.query(models.TaskFail).filter_by(
+dag_id=TEST_DAG_ID).delete()
+session.commit()
+session.close()
+
 def test_schedule_dag_no_previous_runs(self):
 """
 Tests scheduling a dag with no previous runs


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Core tests DB clean up to be run in the right moment
> 
>
> Key: AIRFLOW-3287
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3287
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: core
>Reporter: Jarosław Śmietanka
>Assignee: Jarosław Śmietanka
>Priority: Minor
>
> While running a single unit test of some Dataproc operator, I've spotted that 
> the database clean-up code in the `tests.core` module is triggered, which should 
> not happen since I run the unit test from a completely different module.
> A proposed solution is to move this database clean-up code into the CoreTest 
> class as a `tearDown` method.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] ashb closed pull request #4122: [AIRFLOW-3287] Moving DB clean-up code into the CoreTest.tearDown()

2018-11-02 Thread GitBox
ashb closed pull request #4122: [AIRFLOW-3287] Moving DB clean-up code into the 
CoreTest.tearDown()
URL: https://github.com/apache/incubator-airflow/pull/4122
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/tests/core.py b/tests/core.py
index 679ddbc125..64195b6349 100644
--- a/tests/core.py
+++ b/tests/core.py
@@ -89,19 +89,6 @@
 import pickle
 
 
-def reset(dag_id=TEST_DAG_ID):
-    session = Session()
-    tis = session.query(models.TaskInstance).filter_by(dag_id=dag_id)
-    tis.delete()
-    session.commit()
-    session.close()
-
-
-configuration.conf.load_test_config()
-if os.environ.get('KUBERNETES_VERSION') is None:
-    reset()
-
-
 class OperatorSubclass(BaseOperator):
     """
     An operator to test template substitution
@@ -130,6 +117,16 @@ def setUp(self):
         self.run_after_loop = self.dag_bash.get_task('run_after_loop')
         self.run_this_last = self.dag_bash.get_task('run_this_last')
 
+    def tearDown(self):
+        if os.environ.get('KUBERNETES_VERSION') is None:
+            session = Session()
+            session.query(models.TaskInstance).filter_by(
+                dag_id=TEST_DAG_ID).delete()
+            session.query(models.TaskFail).filter_by(
+                dag_id=TEST_DAG_ID).delete()
+            session.commit()
+            session.close()
+
     def test_schedule_dag_no_previous_runs(self):
         """
         Tests scheduling a dag with no previous runs


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] codecov-io commented on issue #4126: [AIRFLOW-2524] More AWS SageMaker operators, sensors for model, endpoint-config and endpoint

2018-11-02 Thread GitBox
codecov-io commented on issue #4126: [AIRFLOW-2524] More AWS SageMaker 
operators, sensors for model, endpoint-config and endpoint
URL: 
https://github.com/apache/incubator-airflow/pull/4126#issuecomment-435335196
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4126?src=pr=h1)
 Report
   > Merging 
[#4126](https://codecov.io/gh/apache/incubator-airflow/pull/4126?src=pr=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/bc3108edc1f63208be1d3bf8893c22bb12c7bc9f?src=pr=desc)
 will **not change** coverage.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/4126/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/4126?src=pr=tree)
   
   ```diff
   @@           Coverage Diff           @@
   ##           master    #4126   +/-   ##
   =======================================
     Coverage   76.66%   76.66%           
   =======================================
     Files         199      199           
     Lines       16209    16209           
   =======================================
     Hits        12427    12427           
     Misses       3782     3782           
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/incubator-airflow/pull/4126?src=pr=tree) 
| Coverage Δ | |
   |---|---|---|
   | 
[airflow/hooks/S3\_hook.py](https://codecov.io/gh/apache/incubator-airflow/pull/4126/diff?src=pr=tree#diff-YWlyZmxvdy9ob29rcy9TM19ob29rLnB5)
 | `93.03% <ø> (ø)` | :arrow_up: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4126?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4126?src=pr=footer).
 Last update 
[bc3108e...b43dfad](https://codecov.io/gh/apache/incubator-airflow/pull/4126?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-2524) Airflow integration with AWS Sagemaker

2018-11-02 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16672834#comment-16672834
 ] 

ASF GitHub Bot commented on AIRFLOW-2524:
-

yangaws opened a new pull request #4126: [AIRFLOW-2524] Add operators, sensors 
for model, endpoint-config and endpoint
URL: https://github.com/apache/incubator-airflow/pull/4126
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [x] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-2524
 - In case you are fixing a typo in the documentation you can prepend your 
commit with \[AIRFLOW-XXX\], code changes always need a Jira issue.
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI 
changes:
   
   1. Add operators for SageMaker model, endpoint config and endpoint
   2. Add sensor for SageMaker endpoint
   3. Update docstrings
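   
   For illustration, a hedged usage sketch of the new endpoint operator; the class 
and parameter names are assumptions based on this description, and endpoint_config 
/ a_dag are placeholders.
   
   ```python
   from airflow.contrib.operators.sagemaker_endpoint_operator import (
       SageMakerEndpointOperator,
   )
   
   # Sketch only: endpoint_config would hold the model, endpoint-config and
   # endpoint definitions expected by SageMaker; an empty dict is a placeholder.
   endpoint_config = {}
   
   deploy_endpoint = SageMakerEndpointOperator(
       task_id="deploy_sagemaker_endpoint",
       config=endpoint_config,
       aws_conn_id="aws_default",
       wait_for_completion=True,
       dag=a_dag)
   ```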
   
   ### Tests
   
   - [x] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   ### Commits
   
   - [ ] My commits all reference Jira issues in their subject lines, and I have 
squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [x] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - When adding new operators/hooks/sensors, the autoclass documentation 
generation needs to be added.
   
   ### Code Quality
   
   - [x] Passes `flake8`
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Airflow integration with AWS Sagemaker
> --
>
> Key: AIRFLOW-2524
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2524
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: aws, contrib
>Reporter: Rajeev Srinivasan
>Assignee: Yang Yu
>Priority: Major
>  Labels: AWS
> Fix For: 1.10.1
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Would it be possible to orchestrate an end to end  AWS  Sagemaker job using 
> Airflow.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] yangaws opened a new pull request #4126: [AIRFLOW-2524] Add operators, sensors for model, endpoint-config and endpoint

2018-11-02 Thread GitBox
yangaws opened a new pull request #4126: [AIRFLOW-2524] Add operators, sensors 
for model, endpoint-config and endpoint
URL: https://github.com/apache/incubator-airflow/pull/4126
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [x] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-2524
 - In case you are fixing a typo in the documentation you can prepend your 
commit with \[AIRFLOW-XXX\], code changes always need a Jira issue.
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI 
changes:
   
   1. Add operators for SageMaker model, endpoint config and endpoint
   2. Add sensor for SageMaker endpoint
   3. Update docstrings
   
   ### Tests
   
   - [x] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   ### Commits
   
   - [ ] My commits all reference Jira issues in their subject lines, and I have 
squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [x] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - When adding new operators/hooks/sensors, the autoclass documentation 
generation needs to be added.
   
   ### Code Quality
   
   - [x] Passes `flake8`
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] kaxil commented on a change in pull request #4006: [AIRFLOW-3164] Verify server certificate when connecting to LDAP

2018-11-02 Thread GitBox
kaxil commented on a change in pull request #4006: [AIRFLOW-3164] Verify server 
certificate when connecting to LDAP
URL: https://github.com/apache/incubator-airflow/pull/4006#discussion_r230315644
 
 

 ##
 File path: airflow/contrib/auth/backends/ldap_auth.py
 ##
@@ -55,16 +55,20 @@ class LdapException(Exception):
 
 
 def get_ldap_connection(dn=None, password=None):
-    tls_configuration = None
-    use_ssl = False
+    cacert = None
     try:
         cacert = configuration.conf.get("ldap", "cacert")
-        tls_configuration = Tls(validate=ssl.CERT_REQUIRED, ca_certs_file=cacert)
-        use_ssl = True
-    except Exception:
+    except AirflowConfigException:
         pass
 
-    server = Server(configuration.conf.get("ldap", "uri"), use_ssl, tls_configuration)
+    tls_configuration = Tls(validate=ssl.CERT_REQUIRED,
+                            version=ssl.PROTOCOL_SSLv23,
 
 Review comment:
   I agree with this. We should definitely default to `TLSv1.2`. We underwent a 
pentest for one of our environments, and this was raised as an issue.
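
   For illustration, a minimal sketch of pinning the ldap3 Tls configuration to TLS 
1.2; ssl.PROTOCOL_TLSv1_2 assumes a reasonably recent Python/OpenSSL, and the CA 
bundle path and server URI below are placeholders, not values from this PR.

   ```python
   import ssl

   from ldap3 import Server, Tls

   # Sketch: validate the server certificate and pin the protocol to TLS 1.2
   # instead of PROTOCOL_SSLv23. The CA path and URI are placeholders for the
   # values read from the [ldap] section of airflow.cfg.
   tls_configuration = Tls(validate=ssl.CERT_REQUIRED,
                           version=ssl.PROTOCOL_TLSv1_2,
                           ca_certs_file="/opt/airflow/ldap_ca.crt")
   server = Server("ldaps://ldap.example.com:636",
                   use_ssl=True,
                   tls=tls_configuration)
   ```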


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] codecov-io edited a comment on issue #4124: [AIRFLOW-3276] Cloud SQL: database create / patch / delete operators

2018-11-02 Thread GitBox
codecov-io edited a comment on issue #4124: [AIRFLOW-3276] Cloud SQL: database 
create / patch / delete operators
URL: 
https://github.com/apache/incubator-airflow/pull/4124#issuecomment-435311241
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4124?src=pr=h1)
 Report
   > Merging 
[#4124](https://codecov.io/gh/apache/incubator-airflow/pull/4124?src=pr=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/bc3108edc1f63208be1d3bf8893c22bb12c7bc9f?src=pr=desc)
 will **decrease** coverage by `<.01%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/4124/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/4124?src=pr=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #4124      +/-   ##
   ==========================================
   - Coverage   76.66%   76.66%   -0.01%     
   ==========================================
     Files         199      199             
     Lines       16209    16209             
   ==========================================
   - Hits        12427    12426       -1     
   - Misses       3782     3783       +1     
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/incubator-airflow/pull/4124?src=pr=tree) 
| Coverage Δ | |
   |---|---|---|
   | 
[airflow/models.py](https://codecov.io/gh/apache/incubator-airflow/pull/4124/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMucHk=)
 | `92.04% <0%> (-0.05%)` | :arrow_down: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4124?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4124?src=pr=footer).
 Last update 
[bc3108e...1fec034](https://codecov.io/gh/apache/incubator-airflow/pull/4124?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] codecov-io commented on issue #4124: [AIRFLOW-3276] Cloud SQL: database create / patch / delete operators

2018-11-02 Thread GitBox
codecov-io commented on issue #4124: [AIRFLOW-3276] Cloud SQL: database create 
/ patch / delete operators
URL: 
https://github.com/apache/incubator-airflow/pull/4124#issuecomment-435311241
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4124?src=pr=h1)
 Report
   > Merging 
[#4124](https://codecov.io/gh/apache/incubator-airflow/pull/4124?src=pr=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/bc3108edc1f63208be1d3bf8893c22bb12c7bc9f?src=pr=desc)
 will **decrease** coverage by `<.01%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/4124/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/4124?src=pr=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #4124      +/-   ##
   ==========================================
   - Coverage   76.66%   76.66%   -0.01%     
   ==========================================
     Files         199      199             
     Lines       16209    16209             
   ==========================================
   - Hits        12427    12426       -1     
   - Misses       3782     3783       +1     
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/incubator-airflow/pull/4124?src=pr=tree) 
| Coverage Δ | |
   |---|---|---|
   | 
[airflow/models.py](https://codecov.io/gh/apache/incubator-airflow/pull/4124/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMucHk=)
 | `92.04% <0%> (-0.05%)` | :arrow_down: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4124?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4124?src=pr=footer).
 Last update 
[bc3108e...1fec034](https://codecov.io/gh/apache/incubator-airflow/pull/4124?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-2715) Dataflow template operator does not support region parameter

2018-11-02 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16672768#comment-16672768
 ] 

ASF GitHub Bot commented on AIRFLOW-2715:
-

janhicken opened a new pull request #4125: [AIRFLOW-2715] Pick up the region 
setting while launching Dataflow templates
URL: https://github.com/apache/incubator-airflow/pull/4125
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [x] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-XXX
 - In case you are fixing a typo in the documentation you can prepend your 
commit with \[AIRFLOW-XXX\], code changes always need a Jira issue.
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI 
changes:
   
   To launch an instance of a Dataflow template in the configured region,
   the API service.projects().locations().templates() must be used instead of
   service.projects().templates(). Otherwise, all jobs will
   always be started in us-central1.
   
   To make matters worse, the polling for the job status already honors the
   region parameter, so in the current implementation it will search for the
   job in the wrong region. Because the job's status is not found, the
   corresponding Airflow task will hang.
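   
   For illustration, a minimal sketch of the regional launch call against the 
Dataflow v1b3 API via googleapiclient; the project, region, template path and body 
are placeholders, and application default credentials are assumed.
   
   ```python
   from googleapiclient.discovery import build
   
   # Sketch: launch a Dataflow template in a specific regional endpoint by
   # going through projects().locations().templates() rather than
   # projects().templates(). All literal values below are placeholders.
   service = build('dataflow', 'v1b3', cache_discovery=False)
   request = service.projects().locations().templates().launch(
       projectId='my-gcp-project',
       location='europe-west1',
       gcsPath='gs://my-bucket/templates/my-template',
       body={'jobName': 'example-job', 'parameters': {}})
   response = request.execute(num_retries=5)
   ```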
   
   ### Tests
   
   - [x] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   Should work with existing tests.
   
   ### Commits
   
   - [x] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [x] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - When adding new operators/hooks/sensors, the autoclass documentation 
generation needs to be added.
   
   ### Code Quality
   
   - [ ] Passes `flake8`
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Dataflow template operator does not support region parameter
> ---
>
> Key: AIRFLOW-2715
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2715
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: operators
>Affects Versions: 1.9.0
>Reporter: Mohammed Tameem
>Priority: Critical
> Fix For: 2.0.0
>
>
> The DataflowTemplateOperator uses dataflow.projects.templates.launch, which 
> has a region parameter but only supports execution of the Dataflow job in the 
> us-central1 region. Alternatively, there is another API, 
> dataflow.projects.locations.templates.launch, which supports execution of the 
> template in all regional endpoints provided by Google Cloud.
> It would be great if:
>  # The base REST API of this operator could be changed from 
> "dataflow.projects.templates.launch" to 
> "dataflow.projects.locations.templates.launch"
>  # A templated region parameter was included in the operator to run the 
> Dataflow job in the requested regional endpoint.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] janhicken opened a new pull request #4125: [AIRFLOW-2715] Pick up the region setting while launching Dataflow templates

2018-11-02 Thread GitBox
janhicken opened a new pull request #4125: [AIRFLOW-2715] Pick up the region 
setting while launching Dataflow templates
URL: https://github.com/apache/incubator-airflow/pull/4125
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [x] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-XXX
 - In case you are fixing a typo in the documentation you can prepend your 
commit with \[AIRFLOW-XXX\], code changes always need a Jira issue.
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI 
changes:
   
   To launch an instance of a Dataflow template in the configured region,
   the API service.projects().locations().templates() must be used instead of
   service.projects().templates(). Otherwise, all jobs will
   always be started in us-central1.
   
   To make matters worse, the polling for the job status already honors the
   region parameter, so in the current implementation it will search for the
   job in the wrong region. Because the job's status is not found, the
   corresponding Airflow task will hang.
   
   ### Tests
   
   - [x] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   Should work with existing tests.
   
   ### Commits
   
   - [x] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [x] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - When adding new operators/hooks/sensors, the autoclass documentation 
generation needs to be added.
   
   ### Code Quality
   
   - [ ] Passes `flake8`
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] sprzedwojski opened a new pull request #4124: [AIRFLOW-3276] Cloud SQL: database create / patch / delete operators

2018-11-02 Thread GitBox
sprzedwojski opened a new pull request #4124: [AIRFLOW-3276] Cloud SQL: 
database create / patch / delete operators
URL: https://github.com/apache/incubator-airflow/pull/4124
 
 
   ### Jira
   
   - [x] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW-3276) issues and references 
them in the PR title.
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI 
changes:
   Adding 3 new operators to communicate with Google Cloud Platform's Cloud SQL 
database resources:
   
   - **CloudSqlInstanceDatabaseCreateOperator**
   Creates a new database inside a Cloud SQL instance.
   - **CloudSqlInstanceDatabasePatchOperator**
   Updates a resource containing information about a database inside a Cloud 
SQL instance using patch semantics.
   - **CloudSqlInstanceDatabaseDeleteOperator**
   Deletes a database from a Cloud SQL instance.
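   
   For illustration, a hedged usage sketch of the create operator; the parameter 
names follow this description and the Cloud SQL databases.insert API, but may 
differ slightly from the merged code, and all values (including a_dag) are 
placeholders.
   
   ```python
   from airflow.contrib.operators.gcp_sql_operator import (
       CloudSqlInstanceDatabaseCreateOperator,
   )
   
   # Sketch only: the body mirrors the Cloud SQL databases.insert resource.
   create_database = CloudSqlInstanceDatabaseCreateOperator(
       task_id="create_cloudsql_database",
       project_id="my-gcp-project",
       instance="my-cloudsql-instance",
       body={"name": "my_database", "instance": "my-cloudsql-instance"},
       dag=a_dag)
   ```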
   
   ### Tests
   
   - [x] My PR adds the following unit tests:
   `test_gcp_sql_operator.py`
   
   ### Commits
   
   - [x] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [x] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - When adding new operators/hooks/sensors, the autoclass documentation 
generation needs to be added.
   
   ### Code Quality
   
   - [x] Passes `flake8`
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-3276) Google Cloud SQL database create / patch / delete operators

2018-11-02 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16672747#comment-16672747
 ] 

ASF GitHub Bot commented on AIRFLOW-3276:
-

sprzedwojski opened a new pull request #4124: [AIRFLOW-3276] Cloud SQL: 
database create / patch / delete operators
URL: https://github.com/apache/incubator-airflow/pull/4124
 
 
   ### Jira
   
   - [x] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW-3276) issues and references 
them in the PR title.
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI 
changes:
   Adding 3 new operators to communicate with Google Cloud Platform's Cloud SQL 
database resources:
   
   - **CloudSqlInstanceDatabaseCreateOperator**
   Creates a new database inside a Cloud SQL instance.
   - **CloudSqlInstanceDatabasePatchOperator**
   Updates a resource containing information about a database inside a Cloud 
SQL instance using patch semantics.
   - **CloudSqlInstanceDatabaseDeleteOperator**
   Deletes a database from a Cloud SQL instance.
   
   ### Tests
   
   - [x] My PR adds the following unit tests:
   `test_gcp_sql_operator.py`
   
   ### Commits
   
   - [x] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [x] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - When adding new operators/hooks/sensors, the autoclass documentation 
generation needs to be added.
   
   ### Code Quality
   
   - [x] Passes `flake8`
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Google Cloud SQL database create / patch / delete operators
> ---
>
> Key: AIRFLOW-3276
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3276
> Project: Apache Airflow
>  Issue Type: New Feature
>Reporter: Szymon Przedwojski
>Assignee: Szymon Przedwojski
>Priority: Minor
>
> Operators allowing to invoke Google Cloud SQL's database methods:
> - CloudSqlInstanceDatabaseCreateOperator 
> ([https://cloud.google.com/sql/docs/mysql/admin-api/v1beta4/databases/insert])
> - CloudSqlInstanceDatabasePatchOperator 
> ([https://cloud.google.com/sql/docs/mysql/admin-api/v1beta4/databases/patch])
> - CloudSqlInstanceDatabaseDeleteOperator 
> ([https://cloud.google.com/sql/docs/mysql/admin-api/v1beta4/databases/delete])



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-3270) Apache Airflow 1.10.0 integration with LDAP anonymously

2018-11-02 Thread Hari Krishna ADDEPALLI LN (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16672687#comment-16672687
 ] 

Hari Krishna ADDEPALLI LN commented on AIRFLOW-3270:


Here - please respond! [~ashb] - please advise a resolution for this.

> Apache Airflow 1.10.0 integration with LDAP anonymously
> ---
>
> Key: AIRFLOW-3270
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3270
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: authentication
>Affects Versions: 1.10.0
>Reporter: Hari Krishna ADDEPALLI LN
>Priority: Critical
>
> Please advise what to include in airflow.cfg when integrating with LDAP 
> anonymously. We are using DS389 as the LDAP server. 
>  
> {noformat}
> [webserver] 
> authenticate = True 
> auth_backend = airflow.contrib.auth.backends.ldap_auth  
> {noformat}
>  
> And 
>  
> {noformat}
> [ldap] 
> uri = ldap://nsp-daf178e8.ad1.prd.us-phx.odc.im:389 
> user_filter = memberOf=cn=rvs-all-prd_usphx,ou=groups,dc=odc,dc=im
> user_name_attr = uid 
> group_member_attr =
> superuser_filter = memberOf=cn=rvd-sudo_all-prd_usphx,ou=groups,dc=odc,dc=im 
> data_profiler_filter = 
> bind_user = 
> bind_password = 
> basedn = ou=people,dc=odc,dc=im 
> cacert = /opt/orchestration/airflow/ldap_ca.crt 
> search_scope = LEVEL
> {noformat}
> I am hitting below exception:
> {noformat}
>   File "/usr/local/lib/python3.5/site-packages/ldap3/operation/search.py", 
> line 215, in parse_filter     
> raise LDAPInvalidFilterError('malformed filter') 
> ldap3.core.exceptions.LDAPInvalidFilterError: malformed filter
> {noformat}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)