[jira] [Created] (AIRFLOW-6976) Correct cli dag command ignore-first-depends-on-past

2020-03-02 Thread zhongjiajie (Jira)
zhongjiajie created AIRFLOW-6976:


 Summary: Correct cli dag command ignore-first-depends-on-past
 Key: AIRFLOW-6976
 URL: https://issues.apache.org/jira/browse/AIRFLOW-6976
 Project: Apache Airflow
  Issue Type: Improvement
  Components: cli
Affects Versions: 1.10.9
Reporter: zhongjiajie
Assignee: zhongjiajie


ref PR [https://github.com/apache/airflow/pull/7490]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [airflow] zhongjiajie commented on a change in pull request #7490: [AIRFLOW-6860] Default ignore_first_depends_on_past to True

2020-03-02 Thread GitBox
zhongjiajie commented on a change in pull request #7490: [AIRFLOW-6860] Default 
ignore_first_depends_on_past to True
URL: https://github.com/apache/airflow/pull/7490#discussion_r386845124
 
 

 ##
 File path: airflow/cli/commands/dag_command.py
 ##
 @@ -67,6 +67,13 @@ def dag_backfill(args, dag=None):
 
 signal.signal(signal.SIGTERM, sigint_handler)
 
+import warnings
+warnings.warn('--ignore_first_depends_on_past is deprecated as the value is always set to True',
 
 Review comment:
   I will submit a new PR to correct it.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] zhongjiajie commented on a change in pull request #7490: [AIRFLOW-6860] Default ignore_first_depends_on_past to True

2020-03-02 Thread GitBox
zhongjiajie commented on a change in pull request #7490: [AIRFLOW-6860] Default 
ignore_first_depends_on_past to True
URL: https://github.com/apache/airflow/pull/7490#discussion_r386844032
 
 

 ##
 File path: airflow/cli/commands/dag_command.py
 ##
 @@ -67,6 +67,13 @@ def dag_backfill(args, dag=None):
 
 signal.signal(signal.SIGTERM, sigint_handler)
 
+import warnings
+warnings.warn('--ignore_first_depends_on_past is deprecated as the value is always set to True',
 
 Review comment:
   This should be `--ignore-first-depends-on-past`, per 
https://github.com/apache/airflow/pull/7148
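
A minimal sketch of the corrected warning text, with the flag spelled with hyphens as the review comment asks. The helper name `warn_deprecated_flag` and the `DeprecationWarning` category are assumptions made for illustration; neither is taken from the PR.

```python
import warnings


def warn_deprecated_flag(flag_was_passed: bool) -> None:
    """Warn when the deprecated CLI flag is supplied (hypothetical helper)."""
    if flag_was_passed:
        # The flag name uses hyphens, matching the CLI argument spelling.
        warnings.warn(
            "--ignore-first-depends-on-past is deprecated as the value is "
            "always set to True",
            category=DeprecationWarning,  # assumed category, not from the PR
            stacklevel=2,
        )
```

Passing `stacklevel=2` attributes the warning to the caller rather than to the helper itself.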




[jira] [Created] (AIRFLOW-6975) Base AWSHook AssumeRoleWithSAML

2020-03-02 Thread Bjorn Olsen (Jira)
Bjorn Olsen created AIRFLOW-6975:


 Summary: Base AWSHook AssumeRoleWithSAML
 Key: AIRFLOW-6975
 URL: https://issues.apache.org/jira/browse/AIRFLOW-6975
 Project: Apache Airflow
  Issue Type: Improvement
  Components: aws
Affects Versions: 1.10.9
Reporter: Bjorn Olsen
Assignee: Bjorn Olsen


The base AWS hook currently supports AssumeRole, but we also need it to 
support AssumeRoleWithSAML.

+Current+

[https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_temp_request.html#api_assumerole]

The AssumeRole API operation is useful for allowing existing IAM users to 
access AWS resources that they don't already have access to.

(This requires an AWS IAM user)

+Proposed addition+

[https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_temp_request.html#api_assumerolewithsaml]

The AssumeRoleWithSAML API operation returns a set of temporary security 
credentials for federated users who are authenticated by your organization's 
existing identity system.

(This allows federated login using another IDP rather than requiring an AWS IAM 
user).

 

+Use case+

We need to be able to authenticate an AD user against our IDP (Windows Active 
Directory).

We can obtain a SAML assertion from our IDP, and then provide it to AWS STS to 
exchange it for AWS temporary credentials, thus authorising us to use AWS 
services. 

The AWS AssumeRoleWithSAML API is intended for this use case, and the Base AWS 
Hook should be updated to allow for this method of authentication.
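
The exchange described in the use case could be sketched as follows. This is an illustration, not the hook's implementation: the function name `assume_role_with_saml` and its arguments are hypothetical, and `sts_client` is assumed to behave like `boto3.client("sts")`, whose `assume_role_with_saml` call takes `RoleArn`, `PrincipalArn` and `SAMLAssertion`.

```python
from typing import Any, Dict


def assume_role_with_saml(
    sts_client: Any, role_arn: str, principal_arn: str, saml_assertion: str
) -> Dict[str, str]:
    """Exchange a base64-encoded SAML assertion for temporary AWS credentials.

    `sts_client` is assumed to behave like `boto3.client("sts")`.
    """
    response = sts_client.assume_role_with_saml(
        RoleArn=role_arn,              # role to assume
        PrincipalArn=principal_arn,    # SAML provider registered in IAM
        SAMLAssertion=saml_assertion,  # assertion obtained from the IDP
    )
    creds = response["Credentials"]
    # Key names chosen so the result can be passed to boto3.session.Session.
    return {
        "aws_access_key_id": creds["AccessKeyId"],
        "aws_secret_access_key": creds["SecretAccessKey"],
        "aws_session_token": creds["SessionToken"],
    }
```

The returned dictionary can be unpacked into `boto3.session.Session(**creds)` to build clients that use the temporary credentials.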





[GitHub] [airflow] stale[bot] commented on issue #7184: [AIRFLOW-6574] Adding private_environment to docker operator.

2020-03-02 Thread GitBox
stale[bot] commented on issue #7184: [AIRFLOW-6574] Adding private_environment 
to docker operator.
URL: https://github.com/apache/airflow/pull/7184#issuecomment-593738311
 
 
   This issue has been automatically marked as stale because it has not had 
recent activity. It will be closed if no further activity occurs. Thank you for 
your contributions.
   




[GitHub] [airflow] stale[bot] commented on issue #7177: [AIRFLOW-6571] Rewrite BigQueryExecuteQueryOperator to use python client

2020-03-02 Thread GitBox
stale[bot] commented on issue #7177: [AIRFLOW-6571] Rewrite 
BigQueryExecuteQueryOperator to use python client
URL: https://github.com/apache/airflow/pull/7177#issuecomment-593738317
 
 
   This issue has been automatically marked as stale because it has not had 
recent activity. It will be closed if no further activity occurs. Thank you for 
your contributions.
   




[GitHub] [airflow] stale[bot] commented on issue #7087: [AIRFLOW-XXXX] Clarify breaking changes to macros in updating.md

2020-03-02 Thread GitBox
stale[bot] commented on issue #7087: [AIRFLOW-XXXX] Clarify breaking changes to 
macros in updating.md
URL: https://github.com/apache/airflow/pull/7087#issuecomment-593738315
 
 
   This issue has been automatically marked as stale because it has not had 
recent activity. It will be closed if no further activity occurs. Thank you for 
your contributions.
   




[jira] [Updated] (AIRFLOW-6974) Using MS SQL Server 17 as a backend, Migration cc1e65623dc7_add_max_tries_column_to_task_instance.py fails if load_examples = True

2020-03-02 Thread Tony Brookes (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tony Brookes updated AIRFLOW-6974:
--
Description: 
This took me a while to figure out as there was another issue with the 
migration in question which is 
cc1e65623dc7_add_max_tries_column_to_task_instance.py

This file USED to have an issue where it would sit there forever during an 
initdb on MS SQL Server, essentially deadlocked with itself.

I couldn't figure out why it was still sitting there for me, given that I was 
using the version of the migration where this had been fixed, so I went looking 
at the locks on the DB.  I found TWO processes running on the DB both 
originating inside the airflow initdb Python instance.

The first was happily sitting there trying to query the max_retries column on a 
table, but the other was attempting to query the table "slot_pool" from within 
example_subdag_operator.py .  I killed the session which was querying that 
table and of course my Python process crashed, helpfully with a stack trace.

The session I killed was interacting with the DB running in EXAMPLES and was 
actually complaining that the table was not a valid object name.  As soon as I 
set load_examples = False, the initdb process ran through in a few seconds and 
all was well.  But with load_examples = True it would reliably hang on this 
specific migration every single time.

I have attached a full stack trace from when I terminated the second DB session.

  was:
This took me a while to figure out as there was another issue with the 
migration in question which is 
cc1e65623dc7_add_max_tries_column_to_task_instance.py

This file USED to have an issue where it would sit there forever during an 
initdb essentially deadlocked with itself.

I couldn't figure out why it was still sitting there for me, given that I was 
using the version of the migration where this had been fixed, so I went looking 
at the locks on the DB.  I found TWO processes running on the DB both 
originating inside the airflow initdb Python instance.

The first was happily sitting there trying to query the max_retries column on a 
table, but the OTHER was attempting to query the table "slot_pool" from within 
example_subdag_operator.py .  I killed the session which was querying that 
table and of course my Python process crashed, but helpfully with a stack trace.

The session I killed was interacting with the DB running in EXAMPLES and was 
actually complaining that the table was not a valid object name.  As soon as I 
set load_examples = False, the initdb process ran through in a few seconds and 
all was well.  But with load_examples = True it would reliably hang on this 
specific migration every single time.

I have attached a full stack trace from when I terminated the second DB session.


> Using MS SQL Server 17 as a backend, Migration 
> cc1e65623dc7_add_max_tries_column_to_task_instance.py fails if load_examples 
> = True
> --
>
> Key: AIRFLOW-6974
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6974
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: db
>Affects Versions: 1.10.9
>Reporter: Tony Brookes
>Priority: Minor
> Attachments: airflow-mssql-stack-trace.txt
>
>
> This took me a while to figure out as there was another issue with the 
> migration in question which is 
> cc1e65623dc7_add_max_tries_column_to_task_instance.py
> This file USED to have an issue where it would sit there forever during an 
> initdb on MS SQL Server, essentially deadlocked with itself.
> I couldn't figure out why it was still sitting there for me, given that I was 
> using the version of the migration where this had been fixed, so I went 
> looking at the locks on the DB.  I found TWO processes running on the DB both 
> originating inside the airflow initdb Python instance.
> The first was happily sitting there trying to query the max_retries column on 
> a table, but the other was attempting to query the table "slot_pool" from 
> within example_subdag_operator.py .  I killed the session which was querying 
> that table and of course my Python process crashed, helpfully with a stack 
> trace.
> The session I killed was interacting with the DB running in EXAMPLES and was 
> actually complaining that the table was not a valid object name.  As soon as 
> I set load_examples = False, the initdb process ran through in a few seconds 
> and all was well.  But with load_examples = True it would reliably hang on 
> this specific migration every single time.
> I have attached a full stack trace from when I terminated the second DB 
> session.





[jira] [Created] (AIRFLOW-6974) Using MS SQL Server 17 as a backend, Migration cc1e65623dc7_add_max_tries_column_to_task_instance.py fails if load_examples = True

2020-03-02 Thread Tony Brookes (Jira)
Tony Brookes created AIRFLOW-6974:
-

 Summary: Using MS SQL Server 17 as a backend, Migration 
cc1e65623dc7_add_max_tries_column_to_task_instance.py fails if load_examples = 
True
 Key: AIRFLOW-6974
 URL: https://issues.apache.org/jira/browse/AIRFLOW-6974
 Project: Apache Airflow
  Issue Type: Bug
  Components: db
Affects Versions: 1.10.9
Reporter: Tony Brookes
 Attachments: airflow-mssql-stack-trace.txt

This took me a while to figure out as there was another issue with the 
migration in question which is 
cc1e65623dc7_add_max_tries_column_to_task_instance.py

This file USED to have an issue where it would sit there forever during an 
initdb essentially deadlocked with itself.

I couldn't figure out why it was still sitting there for me, given that I was 
using the version of the migration where this had been fixed, so I went looking 
at the locks on the DB.  I found TWO processes running on the DB both 
originating inside the airflow initdb Python instance.

The first was happily sitting there trying to query the max_retries column on a 
table, but the OTHER was attempting to query the table "slot_pool" from within 
example_subdag_operator.py .  I killed the session which was querying that 
table and of course my Python process crashed, but helpfully with a stack trace.

The session I killed was interacting with the DB running in EXAMPLES and was 
actually complaining that the table was not a valid object name.  As soon as I 
set load_examples = False, the initdb process ran through in a few seconds and 
all was well.  But with load_examples = True it would reliably hang on this 
specific migration every single time.

I have attached a full stack trace from when I terminated the second DB session.
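
The workaround above (running initdb with example loading turned off) can also be applied without editing airflow.cfg. The sketch below relies on Airflow's `AIRFLOW__{SECTION}__{KEY}` environment-variable overrides and assumes `load_examples` lives in the `[core]` section; the helper name `env_without_examples` is hypothetical.

```python
import os
from typing import Dict, Mapping, Optional


def env_without_examples(base_env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment with example DAG loading disabled."""
    env = dict(os.environ if base_env is None else base_env)
    # Equivalent to `load_examples = False` under [core] in airflow.cfg.
    env["AIRFLOW__CORE__LOAD_EXAMPLES"] = "False"
    return env


# Example use (not executed here): run the migrations with the override.
# import subprocess
# subprocess.run(["airflow", "initdb"], env=env_without_examples(), check=True)
```

This only sidesteps the hang; it does not explain why the example DAG queries "slot_pool" while the migration is still running.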





[GitHub] [airflow] kaxil closed pull request #7054: WIP: Test PR

2020-03-02 Thread GitBox
kaxil closed pull request #7054: WIP: Test PR
URL: https://github.com/apache/airflow/pull/7054
 
 
   




[GitHub] [airflow] kaxil opened a new pull request #7054: WIP: Test PR

2020-03-02 Thread GitBox
kaxil opened a new pull request #7054: WIP: Test PR
URL: https://github.com/apache/airflow/pull/7054
 
 
   ---
   Link to JIRA issue: https://issues.apache.org/jira/browse/AIRFLOW-
   
   - [ ] Description above provides context of the change
   - [ ] Commit message starts with `[AIRFLOW-]`, where AIRFLOW- = JIRA 
ID*
   - [ ] Unit tests coverage for changes (not needed for documentation changes)
   - [ ] Commits follow "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)"
   - [ ] Relevant documentation is updated including usage instructions.
   - [ ] I will engage committers as explained in [Contribution Workflow 
Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example).
   
   (*) For document-only changes, no JIRA issue is needed. Commit message 
starts `[AIRFLOW-]`.
   
   ---
   In case of fundamental code change, Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals))
 is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party 
License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in 
[UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
   Read the [Pull Request 
Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines)
 for more information.
   




[GitHub] [airflow] lucafuji commented on a change in pull request #6870: [AIRFLOW-0578] Check return code

2020-03-02 Thread GitBox
lucafuji commented on a change in pull request #6870: [AIRFLOW-0578] Check 
return code
URL: https://github.com/apache/airflow/pull/6870#discussion_r381445504
 
 

 ##
 File path: airflow/jobs/local_task_job.py
 ##
 @@ -95,6 +95,14 @@ def signal_handler(signum, frame):
 # Monitor the task to see if it's done
 return_code = self.task_runner.return_code()
 if return_code is not None:
+    if return_code != 0:
+        self.task_instance.refresh_from_db()
+        # there is one case we should not treat non zero return
+        # code as failed: the job has been killed externally.
+        if (not self.terminating) or self.task_instance.state == State.FAILED:
 
 Review comment:
   1. "self.terminating" means the job has been killed externally, not that 
the state is "failed". It is set in heartbeat_callback:L137. Basically, 
whenever "ti.state != State.RUNNING", the job is terminating.
   
   There are two cases in which this happens:
   a. terminate is called explicitly on a StandardTaskRunner; the return code 
is then -9 and task_instance.state is not failed. In that case, we should not 
treat the non-zero exit code as a failure.
   b. the task instance is explicitly set to failed; in this case, we should 
treat the non-zero exit code as a failure.
   
   2. As mentioned above, "job failure" is handled in base_job.py:run:L230. If 
no exception is thrown here, the job state will not be marked as failed.
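
The decision described in this comment can be sketched as a small pure function. The function name `is_task_failure` and the string constants standing in for Airflow's `State` values are assumptions for illustration; the condition itself mirrors the diff under review.

```python
from typing import Optional

FAILED = "failed"    # stand-in for State.FAILED
RUNNING = "running"  # stand-in for State.RUNNING


def is_task_failure(return_code: Optional[int], terminating: bool, ti_state: str) -> bool:
    """Decide whether a finished task runner should be reported as failed."""
    if return_code is None or return_code == 0:
        # Still running, or exited cleanly.
        return False
    # Non-zero exit: a failure unless the job was terminated externally
    # without the task instance having been marked as failed.
    return (not terminating) or ti_state == FAILED
```

Case (a) above corresponds to `is_task_failure(-9, True, "shutdown")`, which is not a failure, and case (b) to `is_task_failure(-9, True, FAILED)`, which is.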




[GitHub] [airflow] codecov-io edited a comment on issue #7423: [AIRFLOW-3126] Add option to specify additional K8s volumes

2020-03-02 Thread GitBox
codecov-io edited a comment on issue #7423: [AIRFLOW-3126] Add option to 
specify additional K8s volumes
URL: https://github.com/apache/airflow/pull/7423#issuecomment-586568480
 
 
   # [Codecov](https://codecov.io/gh/apache/airflow/pull/7423?src=pr=h1) 
Report
   > Merging 
[#7423](https://codecov.io/gh/apache/airflow/pull/7423?src=pr=desc) into 
[master](https://codecov.io/gh/apache/airflow/commit/2ea9278f76bf71aafb5601160602bf7f4194242f?src=pr=desc)
 will **increase** coverage by `0.1%`.
   > The diff coverage is `100%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/airflow/pull/7423/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/airflow/pull/7423?src=pr=tree)
   
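   ```diff
   @@            Coverage Diff            @@
   ##           master    #7423     +/-   ##
   =========================================
   + Coverage   86.73%   86.84%    +0.1%
   =========================================
     Files         897      897
     Lines       42751    42797      +46
   =========================================
   + Hits        37081    37165      +84
   + Misses       5670     5632      -38
   ```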
   
   
   | [Impacted Files](https://codecov.io/gh/apache/airflow/pull/7423?src=pr=tree) | Coverage Δ | |
   |---|---|---|
   | [airflow/kubernetes/worker\_configuration.py](https://codecov.io/gh/apache/airflow/pull/7423/diff?src=pr=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3dvcmtlcl9jb25maWd1cmF0aW9uLnB5) | `99.35% <100%> (+0.04%)` | :arrow_up: |
   | [airflow/executors/kubernetes\_executor.py](https://codecov.io/gh/apache/airflow/pull/7423/diff?src=pr=tree#diff-YWlyZmxvdy9leGVjdXRvcnMva3ViZXJuZXRlc19leGVjdXRvci5weQ==) | `60.23% <100%> (+3.06%)` | :arrow_up: |
   | [airflow/jobs/scheduler\_job.py](https://codecov.io/gh/apache/airflow/pull/7423/diff?src=pr=tree#diff-YWlyZmxvdy9qb2JzL3NjaGVkdWxlcl9qb2IucHk=) | `90.64% <0%> (+0.14%)` | :arrow_up: |
   | [airflow/jobs/backfill\_job.py](https://codecov.io/gh/apache/airflow/pull/7423/diff?src=pr=tree#diff-YWlyZmxvdy9qb2JzL2JhY2tmaWxsX2pvYi5weQ==) | `92.15% <0%> (+0.28%)` | :arrow_up: |
   | [airflow/hooks/dbapi\_hook.py](https://codecov.io/gh/apache/airflow/pull/7423/diff?src=pr=tree#diff-YWlyZmxvdy9ob29rcy9kYmFwaV9ob29rLnB5) | `91.73% <0%> (+1.65%)` | :arrow_up: |
   | [airflow/providers/postgres/hooks/postgres.py](https://codecov.io/gh/apache/airflow/pull/7423/diff?src=pr=tree#diff-YWlyZmxvdy9wcm92aWRlcnMvcG9zdGdyZXMvaG9va3MvcG9zdGdyZXMucHk=) | `94.36% <0%> (+16.9%)` | :arrow_up: |
   | [...roviders/google/cloud/operators/postgres\_to\_gcs.py](https://codecov.io/gh/apache/airflow/pull/7423/diff?src=pr=tree#diff-YWlyZmxvdy9wcm92aWRlcnMvZ29vZ2xlL2Nsb3VkL29wZXJhdG9ycy9wb3N0Z3Jlc190b19nY3MucHk=) | `85.29% <0%> (+32.35%)` | :arrow_up: |
   | [airflow/providers/postgres/operators/postgres.py](https://codecov.io/gh/apache/airflow/pull/7423/diff?src=pr=tree#diff-YWlyZmxvdy9wcm92aWRlcnMvcG9zdGdyZXMvb3BlcmF0b3JzL3Bvc3RncmVzLnB5) | `100% <0%> (+50%)` | :arrow_up: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/airflow/pull/7423?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/airflow/pull/7423?src=pr=footer). 
Last update 
[2ea9278...f026674](https://codecov.io/gh/apache/airflow/pull/7423?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   




[GitHub] [airflow] tbobik1 commented on issue #7386: [AIRFLOW-6761] Fix WorkGroup param in AWSAthenaHook

2020-03-02 Thread GitBox
tbobik1 commented on issue #7386: [AIRFLOW-6761] Fix WorkGroup param in 
AWSAthenaHook
URL: https://github.com/apache/airflow/pull/7386#issuecomment-593672865
 
 
   I had the same problem on version 1.10.9 when querying Athena with 
aws_athena_hook.py. I changed Workgroup to WorkGroup, and I also had to set 
WorkGroup = 'primary' for it to work; using 'default' gave me errors. 
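
A rough sketch of the call with the corrected parameter name, for illustration only. The function name `run_athena_query` and its arguments are hypothetical, and `athena_client` is assumed to behave like `boto3.client("athena")`, whose `start_query_execution` call accepts a `WorkGroup` keyword.

```python
from typing import Any


def run_athena_query(
    athena_client: Any,
    query: str,
    database: str,
    output_location: str,
    workgroup: str = "primary",
) -> str:
    """Start an Athena query and return its execution id."""
    response = athena_client.start_query_execution(
        QueryString=query,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
        WorkGroup=workgroup,  # note the capital "G" in the parameter name
    )
    return response["QueryExecutionId"]
```

The default of `"primary"` follows the report above that `'default'` produced errors.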




[jira] [Updated] (AIRFLOW-6968) Failing when executing a dag: 'log file does not exist'

2020-03-02 Thread Jira


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Omar Suárez updated AIRFLOW-6968:
-
Description: 
I am facing this error:

 

{{*** Log file does not exist: 
/usr/local/airflow/logs/example_python_operator_virtualenv/virtualenv_python/2020-03-01T19:08:54.619529+00:00/1.log

Fetching from: 
[http://894194e3daed:8793/log/example_python_operator_virtualenv/virtualenv_python/2020-03-01T19:08:54.619529+00:00/1.log]

Failed to fetch log file from worker. HTTPConnectionPool(host='894194e3daed', 
port=8793): Max retries exceeded with url: 
/log/example_python_operator_virtualenv/virtualenv_python/2020-03-01T19:08:54.619529+00:00/1.log
 (Caused by NewConnectionError(': Failed to establish a new connection: [Errno 111] Connection 
refused',))}}

 

when trying to execute a DAG using the PythonVirtualenvOperator.

Attached is the configuration of the logs inside the 'airflow.cfg' file.

I am using the Airflow Docker image from puckel.

 

  was:
I am facing this error:

 

{{*** Log file does not exist: 
/usr/local/airflow/logs/example_python_operator_virtualenv/virtualenv_python/2020-03-01T19:08:54.619529+00:00/1.log
*** Fetching from: 
http://894194e3daed:8793/log/example_python_operator_virtualenv/virtualenv_python/2020-03-01T19:08:54.619529+00:00/1.log
*** Failed to fetch log file from worker. 
HTTPConnectionPool(host='894194e3daed', port=8793): Max retries exceeded with 
url: 
/log/example_python_operator_virtualenv/virtualenv_python/2020-03-01T19:08:54.619529+00:00/1.log
 (Caused by NewConnectionError(': Failed to establish a new connection: [Errno 111] Connection 
refused',))}}

 

when trying to execute a DAG using the PythonVirtualenvOperator.

Attached is the configuration of the logs inside the 'airflow.cfg' file.

I am using the Airflow Docker image from puckel.

 


> Failing when executing a dag: 'log file does not exist'
> ---
>
> Key: AIRFLOW-6968
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6968
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: logging
>Affects Versions: 1.10.2
>Reporter: Omar Suárez
>Priority: Minor
> Attachments: airflow_log_config.png
>
>
> I am facing this error:
>  
> {{*** Log file does not exist: 
> /usr/local/airflow/logs/example_python_operator_virtualenv/virtualenv_python/2020-03-01T19:08:54.619529+00:00/1.log
> Fetching from: 
> [http://894194e3daed:8793/log/example_python_operator_virtualenv/virtualenv_python/2020-03-01T19:08:54.619529+00:00/1.log]
> Failed to fetch log file from worker. HTTPConnectionPool(host='894194e3daed', 
> port=8793): Max retries exceeded with url: 
> /log/example_python_operator_virtualenv/virtualenv_python/2020-03-01T19:08:54.619529+00:00/1.log
>  (Caused by NewConnectionError(' 0x7f1d0966c6d8>: Failed to establish a new connection: [Errno 111] Connection 
> refused',))}}
>  
> when trying to execute a DAG using the PythonVirtualenvOperator.
> Attached is the configuration of the logs inside the 'airflow.cfg' file.
> I am using the Airflow Docker image from puckel.
>  





[jira] [Commented] (AIRFLOW-6860) Default ignore_first_depends_on_past to True

2020-03-02 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-6860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17049694#comment-17049694
 ] 

ASF subversion and git services commented on AIRFLOW-6860:
--

Commit 2ea9278f76bf71aafb5601160602bf7f4194242f in airflow's branch 
refs/heads/master from Ping Zhang
[ https://gitbox.apache.org/repos/asf?p=airflow.git;h=2ea9278 ]

[AIRFLOW-6860] Default ignore_first_depends_on_past to True (#7490)



> Default ignore_first_depends_on_past to True
> 
>
> Key: AIRFLOW-6860
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6860
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: cli
>Affects Versions: 1.10.9
>Reporter: Ping Zhang
>Assignee: Ping Zhang
>Priority: Minor
>
> to avoid 
> BackfillJob is deadlocked. Some of the deadlocked tasks were unable to run 
> because of "depends_on_past" relationships.





[jira] [Commented] (AIRFLOW-6860) Default ignore_first_depends_on_past to True

2020-03-02 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-6860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17049693#comment-17049693
 ] 

ASF GitHub Bot commented on AIRFLOW-6860:
-

KevinYang21 commented on pull request #7490: [AIRFLOW-6860] Default 
ignore_first_depends_on_past to True
URL: https://github.com/apache/airflow/pull/7490
 
 
   
 



> Default ignore_first_depends_on_past to True
> 
>
> Key: AIRFLOW-6860
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6860
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: cli
>Affects Versions: 1.10.9
>Reporter: Ping Zhang
>Assignee: Ping Zhang
>Priority: Minor
>
> to avoid 
> BackfillJob is deadlocked. Some of the deadlocked tasks were unable to run 
> because of "depends_on_past" relationships.





[GitHub] [airflow] KevinYang21 merged pull request #7490: [AIRFLOW-6860] Default ignore_first_depends_on_past to True

2020-03-02 Thread GitBox
KevinYang21 merged pull request #7490: [AIRFLOW-6860] Default 
ignore_first_depends_on_past to True
URL: https://github.com/apache/airflow/pull/7490
 
 
   




[GitHub] [airflow] leahecole commented on issue #7191: [AIRFLOW-4030] second attempt to add singularity to airflow

2020-03-02 Thread GitBox
leahecole commented on issue #7191: [AIRFLOW-4030] second attempt to add 
singularity to airflow
URL: https://github.com/apache/airflow/pull/7191#issuecomment-593639001
 
 
   Hi @vsoch! Jarek spoke super highly about this contribution and I'd love to 
talk about how we can get you to the summit - I just followed you on twitter 
(@leahecole there too) - I think one good way would be if you're willing to 
give a talk about it - I'd be happy to help you through submitting the talk 
proposal. Feel free to DM me there to talk more details and thanks for your 
Airflow contribution! :) 




[GitHub] [airflow] kaxil removed a comment on issue #6788: [AIRFLOW-5944] Rendering templated_fields without accessing DAG files

2020-03-02 Thread GitBox
kaxil removed a comment on issue #6788: [AIRFLOW-5944] Rendering 
templated_fields without accessing DAG files
URL: https://github.com/apache/airflow/pull/6788#issuecomment-592308818
 
 
   **ToDo**:
   
   - [ ] Add new column in SerializedDagTable to store unrendered template 
fields
   - [x] Add tests
   




[GitHub] [airflow] kaxil edited a comment on issue #6788: [AIRFLOW-5944] Rendering templated_fields without accessing DAG files

2020-03-02 Thread GitBox
kaxil edited a comment on issue #6788: [AIRFLOW-5944] Rendering 
templated_fields without accessing DAG files
URL: https://github.com/apache/airflow/pull/6788#issuecomment-592308818
 
 
   **ToDo**:
   
   - [ ] Add new column in SerializedDagTable to store unrendered template 
fields
   - [x] Add tests
   




[GitHub] [airflow] kaxil commented on a change in pull request #6788: [AIRFLOW-5944] Rendering templated_fields without accessing DAG files

2020-03-02 Thread GitBox
kaxil commented on a change in pull request #6788: [AIRFLOW-5944] Rendering 
templated_fields without accessing DAG files
URL: https://github.com/apache/airflow/pull/6788#discussion_r386659768
 
 

 ##
 File path: airflow/models/dagrun.py
 ##
 @@ -470,6 +472,12 @@ def verify_integrity(self, session=None):
 1, 1)
 ti = TI(task, self.execution_date)
 session.add(ti)
+session.commit()
+
+if STORE_SERIALIZED_DAGS:
+    RenderedTaskInstanceFields.delete_old_records(ti.task_id, ti.dag_id)
+    if not RenderedTaskInstanceFields.has_templated_fields(ti, session):
+        session.add(RenderedTaskInstanceFields(ti))
 
 Review comment:
   ToDo: Change the location of this piece of code. 




[GitHub] [airflow] kaxil commented on a change in pull request #6788: [AIRFLOW-5944] Rendering templated_fields without accessing DAG files

2020-03-02 Thread GitBox
kaxil commented on a change in pull request #6788: [AIRFLOW-5944] Rendering 
templated_fields without accessing DAG files
URL: https://github.com/apache/airflow/pull/6788#discussion_r386659768
 
 

 ##
 File path: airflow/models/dagrun.py
 ##
 @@ -470,6 +472,12 @@ def verify_integrity(self, session=None):
 1, 1)
 ti = TI(task, self.execution_date)
 session.add(ti)
+session.commit()
+
+if STORE_SERIALIZED_DAGS:
+RenderedTaskInstanceFields.delete_old_records(ti.task_id, 
ti.dag_id)
+if not RenderedTaskInstanceFields.has_templated_fields(ti, 
session):
+session.add(RenderedTaskInstanceFields(ti))
 
 Review comment:
   I want to change the location of this piece of code. 




[jira] [Commented] (AIRFLOW-6747) UI - Show count of tasks in each dag on the main dags page

2020-03-02 Thread t oo (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-6747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17049657#comment-17049657
 ] 

t oo commented on AIRFLOW-6747:
---

As part of this, the new column could also be used to indicate whether a DAG ID 
is no longer in the DagBag.

> UI - Show count of tasks in each dag on the main dags page
> --
>
> Key: AIRFLOW-6747
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6747
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: ui
>Affects Versions: 1.10.7
>Reporter: t oo
>Assignee: Ebrima Jallow
>Priority: Minor
>  Labels: gsoc, gsoc2020, mentor
>
> Main DAGs page in UI - would benefit from showing a new column: number of 
> tasks for each dag id



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [airflow] BasPH commented on issue #5910: [WIP][AIRFLOW-5307] Move BaseOperator to airflow.operators.base_operator

2020-03-02 Thread GitBox
BasPH commented on issue #5910: [WIP][AIRFLOW-5307] Move BaseOperator to 
airflow.operators.base_operator
URL: https://github.com/apache/airflow/pull/5910#issuecomment-593616994
 
 
   Okay, I'll wait for #7596 to be merged first.




[GitHub] [airflow] dferguson992 commented on a change in pull request #7407: [AIRFLOW-6786] Add KafkaConsumerHook, KafkaProduerHook and KafkaSensor

2020-03-02 Thread GitBox
dferguson992 commented on a change in pull request #7407: [AIRFLOW-6786] Add 
KafkaConsumerHook, KafkaProduerHook and KafkaSensor
URL: https://github.com/apache/airflow/pull/7407#discussion_r386641018
 
 

 ##
 File path: airflow/contrib/hooks/kafka_consumer_hook.py
 ##
 @@ -0,0 +1,71 @@
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from airflow.hooks import base_hook as BaseHook
+from kafka import KafkaConsumer
+
+
+class KafkaConsumerHook(BaseHook):
+
+DEFAULT_HOST = 'localhost'
+DEFAULT_PORT = 9092
+
+def __init__(self, conn_id, topic):
+super(KafkaConsumerHook, self).__init__(None)
+self.conn = self.get_connection(conn_id)
 
 Review comment:
   Addressed in the newest commit.




[GitHub] [airflow] potiuk commented on issue #7596: [AIRFLOW-6870] [WIP] remove imports from models/__init__.py

2020-03-02 Thread GitBox
potiuk commented on issue #7596: [AIRFLOW-6870] [WIP] remove imports from 
models/__init__.py
URL: https://github.com/apache/airflow/pull/7596#issuecomment-593596094
 
 
   @msb217 -> Thanks for your consideration :). Can you please push a fixup 
on top of the old one? This way we can review just the difference. If not, 
just push everything. I think this change needs to be committed as a single push 
even if it is huge. The pre-commit hooks can be done as the next step, I think.




[GitHub] [airflow] potiuk commented on issue #5910: [WIP][AIRFLOW-5307] Move BaseOperator to airflow.operators.base_operator

2020-03-02 Thread GitBox
potiuk commented on issue #5910: [WIP][AIRFLOW-5307] Move BaseOperator to 
airflow.operators.base_operator
URL: https://github.com/apache/airflow/pull/5910#issuecomment-593592953
 
 
   Hey @BasPH -> I think it's a good idea, but you have to be aware that we are 
moving a lot of stuff out of models/__init__.py in #7596 and also lazy-loading 
it in models/__init__.py, so it's probably better to rebase after that one is 
merged. We need to ensure backwards compatibility, so we need to keep the 
old `from models import BaseOperator` import, and with lazy loading / PEP-562 we 
can get this working.
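
The lazy-loading idea mentioned above (PEP 562, a module-level `__getattr__`) can be sketched roughly as follows. This is only an illustration, not the code from #7596: the package name `demo_models` and the helper `make_lazy_package` are hypothetical, and `fractions.Fraction` merely stands in for a class such as `BaseOperator` that now lives in another module. It assumes Python 3.7 or newer.

```python
import importlib
import sys
import types


def make_lazy_package(name, lazy_attrs):
    """Create a module whose listed attributes are imported on first access.

    lazy_attrs maps a public name to a (module path, attribute name) pair.
    Hypothetical helper, for illustration only.
    """
    pkg = types.ModuleType(name)

    def __getattr__(attr):
        # PEP 562: called only when normal module attribute lookup fails.
        try:
            module_path, target = lazy_attrs[attr]
        except KeyError:
            raise AttributeError(f"module {name!r} has no attribute {attr!r}") from None
        value = getattr(importlib.import_module(module_path), target)
        setattr(pkg, attr, value)  # cache so later lookups skip __getattr__
        return value

    pkg.__getattr__ = __getattr__
    sys.modules[name] = pkg
    return pkg


# Stand-in: pretend fractions.Fraction is the class that moved elsewhere.
demo_models = make_lazy_package(
    "demo_models", {"BaseOperator": ("fractions", "Fraction")}
)

# The old import path keeps working; the target module is resolved only here.
from demo_models import BaseOperator
```

In a real package the `__getattr__` function would simply be defined at module level in `__init__.py`; the dynamic module here only keeps the sketch self-contained.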




[GitHub] [airflow] BasPH commented on issue #5910: [WIP][AIRFLOW-5307] Move BaseOperator to airflow.operators.base_operator

2020-03-02 Thread GitBox
BasPH commented on issue #5910: [WIP][AIRFLOW-5307] Move BaseOperator to 
airflow.operators.base_operator
URL: https://github.com/apache/airflow/pull/5910#issuecomment-593580264
 
 
   @kaxil given we now have the BashOperator in airflow.operators.bash, and the 
PythonOperator in airflow.operators.python, I'd like to place this in 
airflow.operators.base, while we're at it. What do you think?




[GitHub] [airflow] msb217 edited a comment on issue #7596: [AIRFLOW-6870] [WIP] remove imports from models/__init__.py

2020-03-02 Thread GitBox
msb217 edited a comment on issue #7596: [AIRFLOW-6870] [WIP] remove imports 
from models/__init__.py
URL: https://github.com/apache/airflow/pull/7596#issuecomment-593537042
 
 
   @potiuk @nuclearpinguin I've addressed the majority of your comments; 
however, considering the size of this PR, and for sanity's sake, would you like me 
to close this PR and break it down into separate PRs and Jira issues?
   
   For example:
   1. Implement `resetdb` and `all_models` case
   2. Change module path for `DAG` to `from airflow import DAG` for 
`example_dags`
   3. Change paths to model modules + lazy load user facing models in 
`airflow.models`
   4. Pre-commit hooks to be done by either of you guys
   
   I just don't want to drive you guys insane with such a large review :) Or I 
can just push if you guys don't mind




[GitHub] [airflow] potiuk commented on issue #7608: [AIRFLOW-6972] Shorter frequently used commands in Breeze

2020-03-02 Thread GitBox
potiuk commented on issue #7608: [AIRFLOW-6972] Shorter frequently used 
commands in Breeze
URL: https://github.com/apache/airflow/pull/7608#issuecomment-593563472
 
 
   Sure. Shorter is better :).




[GitHub] [airflow] mtagle commented on issue #7475: [AIRFLOW-6855]: Escape project_dataset_table in SQL query in gcs to bq …

2020-03-02 Thread GitBox
mtagle commented on issue #7475: [AIRFLOW-6855]: Escape project_dataset_table 
in SQL query in gcs to bq …
URL: https://github.com/apache/airflow/pull/7475#issuecomment-593554401
 
 
   I had some trouble running the tests locally, so I'm exploiting Travis to 
see if I did it right.




[GitHub] [airflow] codecov-io edited a comment on issue #7584: [AIRFLOW-6956] Extract kill_child_processes_by_pids from DagFileProcessorManager

2020-03-02 Thread GitBox
codecov-io edited a comment on issue #7584: [AIRFLOW-6956] Extract 
kill_child_processes_by_pids from DagFileProcessorManager
URL: https://github.com/apache/airflow/pull/7584#issuecomment-592517938
 
 
   # [Codecov](https://codecov.io/gh/apache/airflow/pull/7584?src=pr=h1) 
Report
   > Merging 
[#7584](https://codecov.io/gh/apache/airflow/pull/7584?src=pr=desc) into 
[master](https://codecov.io/gh/apache/airflow/commit/1d16de7af0ba0b6c8493b105a6751693d2ef30f2?src=pr=desc)
 will **decrease** coverage by `0.23%`.
   > The diff coverage is `85%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/airflow/pull/7584/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/airflow/pull/7584?src=pr=tree)
   
   ```diff
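   @@            Coverage Diff             @@
   ##           master    #7584      +/-   ##
   ==========================================
   - Coverage   86.81%   86.58%   -0.24%
   ==========================================
     Files         896      897       +1
     Lines       42704    42747      +43
   ==========================================
   - Hits        37074    37011      -63
   - Misses       5630     5736     +106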
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/airflow/pull/7584?src=pr=tree) | Coverage Δ | |
   |---|---|---|
   | [airflow/utils/dag\_processing.py](https://codecov.io/gh/apache/airflow/pull/7584/diff?src=pr=tree#diff-YWlyZmxvdy91dGlscy9kYWdfcHJvY2Vzc2luZy5weQ==) | `89.57% <50%> (+2.92%)` | :arrow_up: |
   | [airflow/utils/process\_utils.py](https://codecov.io/gh/apache/airflow/pull/7584/diff?src=pr=tree#diff-YWlyZmxvdy91dGlscy9wcm9jZXNzX3V0aWxzLnB5) | `73.25% <88.88%> (+4.13%)` | :arrow_up: |
   | [airflow/kubernetes/volume\_mount.py](https://codecov.io/gh/apache/airflow/pull/7584/diff?src=pr=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3ZvbHVtZV9tb3VudC5weQ==) | `44.44% <0%> (-55.56%)` | :arrow_down: |
   | [airflow/kubernetes/volume.py](https://codecov.io/gh/apache/airflow/pull/7584/diff?src=pr=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3ZvbHVtZS5weQ==) | `52.94% <0%> (-47.06%)` | :arrow_down: |
   | [airflow/kubernetes/pod\_launcher.py](https://codecov.io/gh/apache/airflow/pull/7584/diff?src=pr=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3BvZF9sYXVuY2hlci5weQ==) | `47.18% <0%> (-45.08%)` | :arrow_down: |
   | [...viders/cncf/kubernetes/operators/kubernetes\_pod.py](https://codecov.io/gh/apache/airflow/pull/7584/diff?src=pr=tree#diff-YWlyZmxvdy9wcm92aWRlcnMvY25jZi9rdWJlcm5ldGVzL29wZXJhdG9ycy9rdWJlcm5ldGVzX3BvZC5weQ==) | `69.69% <0%> (-25.26%)` | :arrow_down: |
   | [airflow/kubernetes/refresh\_config.py](https://codecov.io/gh/apache/airflow/pull/7584/diff?src=pr=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3JlZnJlc2hfY29uZmlnLnB5) | `50.98% <0%> (-23.53%)` | :arrow_down: |
   | [airflow/config\_templates/airflow\_local\_settings.py](https://codecov.io/gh/apache/airflow/pull/7584/diff?src=pr=tree#diff-YWlyZmxvdy9jb25maWdfdGVtcGxhdGVzL2FpcmZsb3dfbG9jYWxfc2V0dGluZ3MucHk=) | `64.28% <0%> (-1.76%)` | :arrow_down: |
   | [airflow/models/dag.py](https://codecov.io/gh/apache/airflow/pull/7584/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMvZGFnLnB5) | `91.55% <0%> (-0.03%)` | :arrow_down: |
   | [airflow/utils/log/cloudwatch\_task\_handler.py](https://codecov.io/gh/apache/airflow/pull/7584/diff?src=pr=tree#diff-YWlyZmxvdy91dGlscy9sb2cvY2xvdWR3YXRjaF90YXNrX2hhbmRsZXIucHk=) | `100% <0%> (ø)` | |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/airflow/pull/7584?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/airflow/pull/7584?src=pr=footer). 
Last update 
[1d16de7...7d83211](https://codecov.io/gh/apache/airflow/pull/7584?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   




[GitHub] [airflow] mik-laj edited a comment on issue #7597: [AIRFLOW-6497] Avoid loading DAGs in the main scheduler loop

2020-03-02 Thread GitBox
mik-laj edited a comment on issue #7597: [AIRFLOW-6497] Avoid loading DAGs in 
the main scheduler loop
URL: https://github.com/apache/airflow/pull/7597#issuecomment-593430117
 
 
   It is worth noting that this also solves one more problem: the modules are 
now always reloaded.
   
https://issues.apache.org/jira/projects/AIRFLOW/issues/AIRFLOW-6497?filter=allopenissues
   So when someone makes a change in an additional module, the new version is 
executed instead of the old one. Stale modules can be a problem because the 
handler is often stored in a helper module that is shared among many DAGs.
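
The stale-module problem described above can be reproduced in any long-running Python process using only the standard library. The sketch below is an illustration under stated assumptions: the module name `dag_helper_demo` and its `handler` function are hypothetical, and `importlib.reload` is used only to show the difference between a cached and a fresh module. It does not show how #7597 itself achieves the reloading.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical helper module shared by several DAG files.
    helper = Path(tmp) / "dag_helper_demo.py"
    helper.write_text("def handler():\n    return 'v1'\n")
    sys.path.insert(0, tmp)
    try:
        import dag_helper_demo

        first = dag_helper_demo.handler()

        # Edit the helper on disk. The new source has a different length,
        # so any cached bytecode is treated as out of date.
        helper.write_text("def handler():\n    return 'version 2'\n")
        importlib.invalidate_caches()

        # A plain re-import returns the cached module from sys.modules.
        stale = importlib.import_module("dag_helper_demo").handler()

        # An explicit reload re-executes the changed source.
        fresh = importlib.reload(dag_helper_demo).handler()
    finally:
        sys.path.remove(tmp)
        sys.modules.pop("dag_helper_demo", None)

print(first, stale, fresh)
```

The second lookup still runs the old code because Python caches imported modules per process; only a reload (or a fresh process) picks up the edited helper.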




[GitHub] [airflow] mik-laj commented on issue #7503: [AIRFLOW-3607] fixed the bug with picking just special cases while maintaining the p…

2020-03-02 Thread GitBox
mik-laj commented on issue #7503: [AIRFLOW-3607] fixed the bug with picking 
just special cases while maintaining the p…
URL: https://github.com/apache/airflow/pull/7503#issuecomment-593540891
 
 
   @houqp What do you think?




[GitHub] [airflow] mik-laj merged pull request #7606: [AIRFLOW-XXXX] Add Ternary Data to README.md

2020-03-02 Thread GitBox
mik-laj merged pull request #7606: [AIRFLOW-XXXX] Add Ternary Data to README.md
URL: https://github.com/apache/airflow/pull/7606
 
 
   




[GitHub] [airflow] boring-cyborg[bot] commented on issue #7606: [AIRFLOW-XXXX] Add Ternary Data to README.md

2020-03-02 Thread GitBox
boring-cyborg[bot] commented on issue #7606: [AIRFLOW-XXXX] Add Ternary Data to 
README.md
URL: https://github.com/apache/airflow/pull/7606#issuecomment-593541241
 
 
   Awesome work, congrats on your first merged pull request!
   




[GitHub] [airflow] msb217 edited a comment on issue #7596: [AIRFLOW-6870] [WIP] remove imports from models/__init__.py

2020-03-02 Thread GitBox
msb217 edited a comment on issue #7596: [AIRFLOW-6870] [WIP] remove imports 
from models/__init__.py
URL: https://github.com/apache/airflow/pull/7596#issuecomment-593537042
 
 
   @potiuk @nuclearpinguin I've addressed the majority of your comments; 
however, considering the size of this PR, and for sanity's sake, would you like me 
to close this PR and break it down into separate PRs and Jira issues?
   
   For example:
   1. Implement `resetdb` and `all_models` case
   2. Change module path for `DAG` to `from airflow import DAG` for 
`example_dags`
   3. Change paths to model modules
   4. Pre-commit hooks to be done by either of you guys
   
   I just don't want to drive you guys insane with such a large review :) Or I 
can just push if you guys don't mind




[GitHub] [airflow] msb217 commented on issue #7596: [AIRFLOW-6870] [WIP] remove imports from models/__init__.py

2020-03-02 Thread GitBox
msb217 commented on issue #7596: [AIRFLOW-6870] [WIP] remove imports from 
models/__init__.py
URL: https://github.com/apache/airflow/pull/7596#issuecomment-593537042
 
 
   @potiuk @nuclearpinguin I've addressed the majority of your comments; 
however, considering the size of this PR, and for sanity's sake, would you like me 
to close this PR and break it down into separate PRs and Jira issues?
   
   For example:
   1. Implement `resetdb` and `all_models` case
   2. Change module path for `DAG` to `from airflow import DAG` for 
`example_dags`
   3. Change paths to model modules
   4. Pre-commit hooks to be done by either of you guys
   
   I just don't want to drive you guys insane with such a large review :)




[GitHub] [airflow] nuclearpinguin opened a new pull request #7609: [AIRFLOW-6973] Make GCSCreateBucketOperator idempotent

2020-03-02 Thread GitBox
nuclearpinguin opened a new pull request #7609: [AIRFLOW-6973] Make 
GCSCreateBucketOperator idempotent
URL: https://github.com/apache/airflow/pull/7609
 
 
   ---
   Issue link: WILL BE INSERTED BY 
[boring-cyborg](https://github.com/kaxil/boring-cyborg)
   
   Make sure to mark the boxes below before creating PR: [x]
   
   - [x] Description above provides context of the change
   - [x] Commit message/PR title starts with `[AIRFLOW-NNNN]`. AIRFLOW-NNNN = 
JIRA ID*
   - [x] Unit tests coverage for changes (not needed for documentation changes)
   - [x] Commits follow "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)"
   - [x] Relevant documentation is updated including usage instructions.
   - [x] I will engage committers as explained in [Contribution Workflow 
Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example).
   
   * For document-only changes commit message can start with 
`[AIRFLOW-XXXX]`.
   
   ---
   In case of fundamental code change, Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals))
 is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party 
License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in 
[UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
   Read the [Pull Request 
Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines)
 for more information.
   




[jira] [Commented] (AIRFLOW-6973) Make GCSCreateBucketOperator idempotent

2020-03-02 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-6973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17049494#comment-17049494
 ] 

ASF GitHub Bot commented on AIRFLOW-6973:
-

nuclearpinguin commented on pull request #7609: [AIRFLOW-6973] Make 
GCSCreateBucketOperator idempotent
URL: https://github.com/apache/airflow/pull/7609
 
 
   ---
   Issue link: WILL BE INSERTED BY 
[boring-cyborg](https://github.com/kaxil/boring-cyborg)
   
   Make sure to mark the boxes below before creating PR: [x]
   
   - [x] Description above provides context of the change
   - [x] Commit message/PR title starts with `[AIRFLOW-NNNN]`. AIRFLOW-NNNN = 
JIRA ID*
   - [x] Unit tests coverage for changes (not needed for documentation changes)
   - [x] Commits follow "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)"
   - [x] Relevant documentation is updated including usage instructions.
   - [x] I will engage committers as explained in [Contribution Workflow 
Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example).
   
   * For document-only changes commit message can start with 
`[AIRFLOW-XXXX]`.
   
   ---
   In case of fundamental code change, Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals))
 is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party 
License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in 
[UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
   Read the [Pull Request 
Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines)
 for more information.
   
 



> Make GCSCreateBucketOperator idempotent
> ---
>
> Key: AIRFLOW-6973
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6973
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: gcp, operators
>Affects Versions: 2.0.0
>Reporter: Tomasz Urbaszek
>Priority: Minor
>






[GitHub] [airflow] mik-laj commented on issue #6670: [AIRFLOW-4816]MySqlToS3Operator

2020-03-02 Thread GitBox
mik-laj commented on issue #6670: [AIRFLOW-4816]MySqlToS3Operator
URL: https://github.com/apache/airflow/pull/6670#issuecomment-593527684
 
 
   You should remove the "_operator" suffix from the file name. After that, this 
operator should be in the `airflow/providers/amazon/aws/operators` directory. 
   More information:  
https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-21%3A+Changes+in+import+paths
   Best regards,
   Kamil
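
The naming rule above can be checked with a few lines of Python. This is a hedged sketch, not part of Airflow: `suggest_name` is a hypothetical helper, and the suffix list simply mirrors the `illegal_suffixes` used by `test_no_illegal_suffixes`.

```python
from pathlib import PurePosixPath

# Suffixes rejected by test_no_illegal_suffixes.
ILLEGAL_SUFFIXES = ("_operator.py", "_hook.py", "_sensor.py")


def suggest_name(path):
    """Return the path with an illegal suffix stripped, or None if the name is fine.

    Hypothetical helper, for illustration only.
    """
    p = PurePosixPath(path)
    for suffix in ILLEGAL_SUFFIXES:
        if p.name.endswith(suffix):
            # Drop the suffix but keep the ".py" extension.
            return str(p.with_name(p.name[: -len(suffix)] + ".py"))
    return None


print(suggest_name("tests/providers/amazon/aws/operators/test_mysql_to_s3_operator.py"))
```

A file that already follows the convention, such as `mysql_to_s3.py`, yields `None`.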




[jira] [Commented] (AIRFLOW-6972) Shorter frequently used commands in Breeze

2020-03-02 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-6972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17049487#comment-17049487
 ] 

ASF GitHub Bot commented on AIRFLOW-6972:
-

mik-laj commented on pull request #7608: [AIRFLOW-6972] Shorter frequently used 
commands in Breeze
URL: https://github.com/apache/airflow/pull/7608
 
 
   I often make a typo in the word "environment".
   
   ---
   Issue link: WILL BE INSERTED BY 
[boring-cyborg](https://github.com/kaxil/boring-cyborg)
   
   Make sure to mark the boxes below before creating PR: [x]
   
   - [X] Description above provides context of the change
   - [x] Commit message/PR title starts with `[AIRFLOW-NNNN]`. AIRFLOW-NNNN = 
JIRA ID*
   - [X] Unit tests coverage for changes (not needed for documentation changes)
   - [X] Commits follow "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)"
   - [X] Relevant documentation is updated including usage instructions.
   - [X] I will engage committers as explained in [Contribution Workflow 
Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example).
   
   * For document-only changes commit message can start with 
`[AIRFLOW-XXXX]`.
   
   ---
   In case of fundamental code change, Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals))
 is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party 
License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in 
[UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
   Read the [Pull Request 
Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines)
 for more information.
   
 



> Shorter frequently used commands in Breeze
> --
>
> Key: AIRFLOW-6972
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6972
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: breeze
>Affects Versions: 1.10.9
>Reporter: Kamil Bregula
>Priority: Major
>






[GitHub] [airflow] mik-laj opened a new pull request #7608: [AIRFLOW-6972] Shorter frequently used commands in Breeze

2020-03-02 Thread GitBox
mik-laj opened a new pull request #7608: [AIRFLOW-6972] Shorter frequently used 
commands in Breeze
URL: https://github.com/apache/airflow/pull/7608
 
 
   I often make a typo in the word "environment".
   
   ---
   Issue link: WILL BE INSERTED BY 
[boring-cyborg](https://github.com/kaxil/boring-cyborg)
   
   Make sure to mark the boxes below before creating PR: [x]
   
   - [X] Description above provides context of the change
   - [x] Commit message/PR title starts with `[AIRFLOW-NNNN]`. AIRFLOW-NNNN = 
JIRA ID*
   - [X] Unit tests coverage for changes (not needed for documentation changes)
   - [X] Commits follow "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)"
   - [X] Relevant documentation is updated including usage instructions.
   - [X] I will engage committers as explained in [Contribution Workflow 
Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example).
   
   * For document-only changes commit message can start with 
`[AIRFLOW-XXXX]`.
   
   ---
   In case of fundamental code change, Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals))
 is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party 
License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in 
[UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
   Read the [Pull Request 
Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines)
 for more information.
   




[jira] [Created] (AIRFLOW-6973) Make GCSCreateBucketOperator idempotent

2020-03-02 Thread Tomasz Urbaszek (Jira)
Tomasz Urbaszek created AIRFLOW-6973:


 Summary: Make GCSCreateBucketOperator idempotent
 Key: AIRFLOW-6973
 URL: https://issues.apache.org/jira/browse/AIRFLOW-6973
 Project: Apache Airflow
  Issue Type: Improvement
  Components: gcp, operators
Affects Versions: 2.0.0
Reporter: Tomasz Urbaszek








[jira] [Created] (AIRFLOW-6972) Shorter frequently used commands in Breeze.

2020-03-02 Thread Kamil Bregula (Jira)
Kamil Bregula created AIRFLOW-6972:
--

 Summary: Shorter frequently used commands in Breeze.
 Key: AIRFLOW-6972
 URL: https://issues.apache.org/jira/browse/AIRFLOW-6972
 Project: Apache Airflow
  Issue Type: Improvement
  Components: breeze
Affects Versions: 1.10.9
Reporter: Kamil Bregula








[jira] [Updated] (AIRFLOW-6972) Shorter frequently used commands in Breeze

2020-03-02 Thread Kamil Bregula (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kamil Bregula updated AIRFLOW-6972:
---
Summary: Shorter frequently used commands in Breeze  (was: Shorter 
frequently used commands in Breeze.)

> Shorter frequently used commands in Breeze
> --
>
> Key: AIRFLOW-6972
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6972
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: breeze
>Affects Versions: 1.10.9
>Reporter: Kamil Bregula
>Priority: Major
>






[GitHub] [airflow] JavierLopezT commented on issue #6670: [AIRFLOW-4816]MySqlToS3Operator

2020-03-02 Thread GitBox
JavierLopezT commented on issue #6670: [AIRFLOW-4816]MySqlToS3Operator
URL: https://github.com/apache/airflow/pull/6670#issuecomment-593514642
 
 
   Also, what is this? Would this prevent the code from passing all the checks?
   ```
   _ TestOperatorsHooks.test_no_illegal_suffixes __
   self = 
       def test_no_illegal_suffixes(self):
           illegal_suffixes = ["_operator.py", "_hook.py", "_sensor.py"]
           files = itertools.chain(*[
               glob.glob(f"{ROOT_FOLDER}/{part}/providers/**/{resource_type}/*.py", recursive=True)
               for resource_type in ["operators", "hooks", "sensors", "example_dags"]
               for part in ["airlfow", "tests"]
           ])
   
           invalid_files = [
               f
               for f in files
               if any(f.endswith(suffix) for suffix in illegal_suffixes)
           ]
   
   >       self.assertEqual([], invalid_files)
   E       AssertionError: Lists differ: [] != ['/opt/airflow/tests/providers/amazon/aws/[35 chars].py']
   E       
   E       Second list contains 1 additional elements.
   E       First extra element 0:
   E       '/opt/airflow/tests/providers/amazon/aws/operators/test_mysql_to_s3_operator.py'
   E       
   E       - []
   E       + ['/opt/airflow/tests/providers/amazon/aws/operators/test_mysql_to_s3_operator.py']
   tests/test_project_structure.py:267: AssertionError
   ```
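
Reading the output above, the check appears to fail because the new test file name ends in `_operator.py`, one of the suffixes the test rejects for files under `providers/`. A minimal sketch of that check follows; `has_illegal_suffix` and the renamed path are illustrative assumptions, not code from the repository.

```python
# Suffixes copied from the test output above.
ILLEGAL_SUFFIXES = ("_operator.py", "_hook.py", "_sensor.py")


def has_illegal_suffix(path):
    """Return True if the file name ends with one of the rejected suffixes."""
    # str.endswith accepts a tuple and matches any of its members.
    return path.endswith(ILLEGAL_SUFFIXES)


reported = "/opt/airflow/tests/providers/amazon/aws/operators/test_mysql_to_s3_operator.py"
# Hypothetical new name: the same file without the "_operator" suffix.
renamed = "/opt/airflow/tests/providers/amazon/aws/operators/test_mysql_to_s3.py"

print(has_illegal_suffix(reported))
print(has_illegal_suffix(renamed))
```

If that reading is right, renaming the test file so it no longer ends in `_operator.py` should let the check pass.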


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] JavierLopezT edited a comment on issue #6670: [AIRFLOW-4816]MySqlToS3Operator

2020-03-02 Thread GitBox
JavierLopezT edited a comment on issue #6670: [AIRFLOW-4816]MySqlToS3Operator
URL: https://github.com/apache/airflow/pull/6670#issuecomment-593511992
 
 
   I am stuck again with the test. Could you help me, please? Sorry for 
bothering you again.
```
   AssertionError: Expected call: load_file(bucket_name='bucket', 
filename=, key='key')
   Actual call: load_file(bucket_name='bucket', filename=, key='key')
   ```
   The code in the operator is:
   ```
   with tempfile.NamedTemporaryFile(mode='r+', suffix='.csv') as tmp_csv:
       tmp_csv.file.write(data_df.to_csv(index=self.index, header=self.header))
       tmp_csv.file.seek(0)
       s3_conn.load_file(filename=tmp_csv.name,
                         key=self.s3_key,
                         bucket_name=self.s3_bucket)
   ```
   And the testing code (its latest version) is:
   
   ```
   @mock.patch("airflow.operators.mysql_to_s3_operator.tempfile.NamedTemporaryFile")
   ...
   temp_mock.assert_called_once_with(mode='r+', suffix=".csv")
   temp_mock.return_value.__enter__.return_value.name = "file"
   mock_s3_hook.return_value.load_file.assert_called_once_with(filename=temp_mock.__enter__.name,
                                                               key=s3_key,
                                                               bucket_name=s3_bucket)
   ```
   I have tried also with:
   ```
   mock_s3_hook.return_value.load_file.assert_called_once_with(filename="file",
                                                               key=s3_key,
                                                               bucket_name=s3_bucket)
   ```
   and
   ```
   mock_s3_hook.return_value.load_file.assert_called_once_with(filename=temp_mock.name,
                                                               key=s3_key,
                                                               bucket_name=s3_bucket)
   ```
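
One likely cause, offered as a sketch rather than a confirmed diagnosis: the object bound by `as tmp_csv` is `temp_mock.return_value.__enter__.return_value`, so `temp_mock.__enter__.name` and `temp_mock.name` refer to different mocks, and the `name` has to be configured before the operator runs. The example below uses a hypothetical `upload_csv` function as a stand-in for the operator body and patches `tempfile.NamedTemporaryFile` directly; neither name comes from the pull request.

```python
import tempfile
from unittest import mock


def upload_csv(s3_conn, csv_text, key, bucket_name):
    """Hypothetical stand-in for the operator code quoted above."""
    with tempfile.NamedTemporaryFile(mode='r+', suffix='.csv') as tmp_csv:
        tmp_csv.file.write(csv_text)
        tmp_csv.file.seek(0)
        s3_conn.load_file(filename=tmp_csv.name, key=key, bucket_name=bucket_name)


def check_upload_csv():
    s3_conn = mock.MagicMock()
    with mock.patch("tempfile.NamedTemporaryFile") as temp_mock:
        # Configure the mock BEFORE calling the code under test.
        # NamedTemporaryFile(...) -> return_value
        # "with ... as tmp_csv"   -> return_value.__enter__.return_value
        temp_mock.return_value.__enter__.return_value.name = "file"
        upload_csv(s3_conn, "a,b\n1,2\n", key="key", bucket_name="bucket")

    temp_mock.assert_called_once_with(mode='r+', suffix='.csv')
    s3_conn.load_file.assert_called_once_with(
        filename="file", key="key", bucket_name="bucket")
    return True
```

The same ordering applies when the patch is applied as a decorator: set `return_value.__enter__.return_value.name` first, run the operator, then make the assertions against the literal `"file"`.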


[jira] [Commented] (AIRFLOW-6971) Fix return type in CloudSpeechToTextRecognizeSpeechOperator

2020-03-02 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-6971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17049442#comment-17049442
 ] 

ASF GitHub Bot commented on AIRFLOW-6971:
-

nuclearpinguin commented on pull request #7607: [AIRFLOW-6971] Fix return type 
in CloudSpeechToTextRecognizeSpeechOp
URL: https://github.com/apache/airflow/pull/7607
 
 
   

> Fix return type in CloudSpeechToTextRecognizeSpeechOperator 
> 
>
> Key: AIRFLOW-6971
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6971
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: gcp
>Affects Versions: 2.0.0
>Reporter: Tomasz Urbaszek
>Priority: Major
>




[GitHub] [airflow] nuclearpinguin opened a new pull request #7607: [AIRFLOW-6971] Fix return type in CloudSpeechToTextRecognizeSpeechOp

2020-03-02 Thread GitBox
nuclearpinguin opened a new pull request #7607: [AIRFLOW-6971] Fix return type 
in CloudSpeechToTextRecognizeSpeechOp
URL: https://github.com/apache/airflow/pull/7607
 
 
   
   ---
   Issue link: WILL BE INSERTED BY 
[boring-cyborg](https://github.com/kaxil/boring-cyborg)
   
   Make sure to mark the boxes below before creating PR: [x]
   
   - [x] Description above provides context of the change
   - [x] Commit message/PR title starts with `[AIRFLOW-]`. AIRFLOW- = 
JIRA ID*
   - [x] Unit tests coverage for changes (not needed for documentation changes)
   - [x] Commits follow "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)"
   - [x] Relevant documentation is updated including usage instructions.
   - [x] I will engage committers as explained in [Contribution Workflow 
Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example).
   
   * For document-only changes commit message can start with 
`[AIRFLOW-]`.
   
   ---
   In case of fundamental code change, Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals))
 is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party 
License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in 
[UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
   Read the [Pull Request 
Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines)
 for more information.
   


[jira] [Created] (AIRFLOW-6971) Fix return type in CloudSpeechToTextRecognizeSpeechOperator

2020-03-02 Thread Tomasz Urbaszek (Jira)
Tomasz Urbaszek created AIRFLOW-6971:


 Summary: Fix return type in 
CloudSpeechToTextRecognizeSpeechOperator 
 Key: AIRFLOW-6971
 URL: https://issues.apache.org/jira/browse/AIRFLOW-6971
 Project: Apache Airflow
  Issue Type: Improvement
  Components: gcp
Affects Versions: 2.0.0
Reporter: Tomasz Urbaszek






[GitHub] [airflow] potiuk commented on issue #7570: [AIRFLOW-6946] Switch to MySQL 5.7 in 2.0 as base

2020-03-02 Thread GitBox
potiuk commented on issue #7570: [AIRFLOW-6946] Switch to MySQL 5.7 in 2.0 as 
base
URL: https://github.com/apache/airflow/pull/7570#issuecomment-593506014
 
 
   Hey @ashb @kaxil @ANiteckiP @anitakar -> This change moves us to MySQL 5.7. 
It also contains the fix for the utf8mb4 problem I discovered while testing 
Unicode DAGs.


[GitHub] [airflow] boring-cyborg[bot] commented on issue #7595: [AIRFLOW-XXXX] Fix typo from upstream to downstream

2020-03-02 Thread GitBox
boring-cyborg[bot] commented on issue #7595: [AIRFLOW-XXXX] Fix typo from 
upstream to downstream
URL: https://github.com/apache/airflow/pull/7595#issuecomment-593505971
 
 
   Awesome work, congrats on your first merged pull request!
   


[GitHub] [airflow] feluelle merged pull request #7595: [AIRFLOW-XXXX] Fix typo from upstream to downstream

2020-03-02 Thread GitBox
feluelle merged pull request #7595: [AIRFLOW-XXXX] Fix typo from upstream to 
downstream
URL: https://github.com/apache/airflow/pull/7595
 
 
   


[GitHub] [airflow] mhousley opened a new pull request #7606: [AIRFLOW-XXXX] Add Ternary Data to README.md

2020-03-02 Thread GitBox
mhousley opened a new pull request #7606: [AIRFLOW-XXXX] Add Ternary Data to 
README.md
URL: https://github.com/apache/airflow/pull/7606
 
 
   ---
   Issue link: WILL BE INSERTED BY 
[boring-cyborg](https://github.com/kaxil/boring-cyborg)
   
   Make sure to mark the boxes below before creating PR: [x]
   
   - [x] Description above provides context of the change
   - [x] Commit message/PR title starts with `[AIRFLOW-]`. AIRFLOW- = 
JIRA ID*
   - [x] Unit tests coverage for changes (not needed for documentation changes)
   - [x] Commits follow "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)"
   - [x] Relevant documentation is updated including usage instructions.
   - [x] I will engage committers as explained in [Contribution Workflow 
Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example).
   
   * For document-only changes commit message can start with 
`[AIRFLOW-]`.
   
   ---
   In case of fundamental code change, Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals))
 is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party 
License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in 
[UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
   Read the [Pull Request 
Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines)
 for more information.
   


[GitHub] [airflow] mhousley closed pull request #7605: Add Ternary Data to README.md

2020-03-02 Thread GitBox
mhousley closed pull request #7605: Add Ternary Data to README.md
URL: https://github.com/apache/airflow/pull/7605
 
 
   


[GitHub] [airflow] mhousley opened a new pull request #7605: Add Ternary Data to README.md

2020-03-02 Thread GitBox
mhousley opened a new pull request #7605: Add Ternary Data to README.md
URL: https://github.com/apache/airflow/pull/7605
 
 
   ---
   Issue link: WILL BE INSERTED BY 
[boring-cyborg](https://github.com/kaxil/boring-cyborg)
   
   Make sure to mark the boxes below before creating PR: [x]
   
   - [x ] Description above provides context of the change
   - [ ] Commit message/PR title starts with `[AIRFLOW-]`. AIRFLOW- = 
JIRA ID*
   - [ ] Unit tests coverage for changes (not needed for documentation changes)
   - [ ] Commits follow "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)"
   - [ ] Relevant documentation is updated including usage instructions.
   - [ ] I will engage committers as explained in [Contribution Workflow 
Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example).
   
   * For document-only changes commit message can start with 
`[AIRFLOW-]`.
   
   ---
   In case of fundamental code change, Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals))
 is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party 
License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in 
[UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
   Read the [Pull Request 
Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines)
 for more information.
   


[GitHub] [airflow] marwan116 commented on issue #7595: [AIRFLOW-XXXX] Fix typo from upstream to downstream

2020-03-02 Thread GitBox
marwan116 commented on issue #7595: [AIRFLOW-XXXX] Fix typo from upstream to 
downstream
URL: https://github.com/apache/airflow/pull/7595#issuecomment-593498283
 
 
   Thank you - so for documentation fixes it is actually XXXX - sorry, I was a 
bit confused about that. Fixed it.


[GitHub] [airflow] mik-laj commented on issue #7595: [AIRFLOW-6971] Fix typo from upstream to downstream

2020-03-02 Thread GitBox
mik-laj commented on issue #7595: [AIRFLOW-6971] Fix typo from upstream to 
downstream
URL: https://github.com/apache/airflow/pull/7595#issuecomment-593496847
 
 
   Here is the pull request guideline: 
https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines
   


[GitHub] [airflow] marwan116 commented on issue #7595: [AIRFLOW-6971] Fix typo from upstream to downstream

2020-03-02 Thread GitBox
marwan116 commented on issue #7595: [AIRFLOW-6971] Fix typo from upstream to 
downstream
URL: https://github.com/apache/airflow/pull/7595#issuecomment-593489182
 
 
   Thank you for responding - yes I noted that the format should start with 
[AIRFLOW-] but I am not sure what  - I just went to JIRA - checked the 
most recent number (6970) and I incremented mine by one ... would be good if 
there is a resource on how to create the JIRA issue  number -


[GitHub] [airflow] marwan116 edited a comment on issue #7595: [AIRFLOW-6971] Fix typo from upstream to downstream

2020-03-02 Thread GitBox
marwan116 edited a comment on issue #7595: [AIRFLOW-6971] Fix typo from 
upstream to downstream
URL: https://github.com/apache/airflow/pull/7595#issuecomment-593489182
 
 
   Thank you for responding - yes I noted that the format should start with 
[AIRFLOW-] but I am not sure what  is supposed to be - I just went to 
JIRA - checked the most recent number (6970) and I incremented mine by one ... 
would be good if there is a resource on how to create the JIRA issue  
number -


[jira] [Commented] (AIRFLOW-6970) Improve GCP Video Intelligence system tests

2020-03-02 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-6970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17049374#comment-17049374
 ] 

ASF GitHub Bot commented on AIRFLOW-6970:
-

nuclearpinguin commented on pull request #7604: [AIRFLOW-6970] Improve GCP 
Video Intelligence system tests
URL: https://github.com/apache/airflow/pull/7604
 
 

> Improve GCP Video Intelligence system tests
> ---
>
> Key: AIRFLOW-6970
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6970
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: gcp, tests
>Affects Versions: 2.0.0
>Reporter: Tomasz Urbaszek
>Priority: Minor
>




[GitHub] [airflow] nuclearpinguin opened a new pull request #7604: [AIRFLOW-6970] Improve GCP Video Intelligence system tests

2020-03-02 Thread GitBox
nuclearpinguin opened a new pull request #7604: [AIRFLOW-6970] Improve GCP 
Video Intelligence system tests
URL: https://github.com/apache/airflow/pull/7604
 
 
   ---
   Issue link: WILL BE INSERTED BY 
[boring-cyborg](https://github.com/kaxil/boring-cyborg)
   
   Make sure to mark the boxes below before creating PR: [x]
   
   - [x] Description above provides context of the change
   - [x] Commit message/PR title starts with `[AIRFLOW-]`. AIRFLOW- = 
JIRA ID*
   - [x] Unit tests coverage for changes (not needed for documentation changes)
   - [x] Commits follow "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)"
   - [x] Relevant documentation is updated including usage instructions.
   - [x] I will engage committers as explained in [Contribution Workflow 
Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example).
   
   * For document-only changes commit message can start with 
`[AIRFLOW-]`.
   
   ---
   In case of fundamental code change, Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals))
 is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party 
License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in 
[UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
   Read the [Pull Request 
Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines)
 for more information.
   


[GitHub] [airflow] mik-laj commented on a change in pull request #7163: [AIRFLOW-6542] add spark-on-k8s operator/hook/sensor

2020-03-02 Thread GitBox
mik-laj commented on a change in pull request #7163: [AIRFLOW-6542] add 
spark-on-k8s operator/hook/sensor
URL: https://github.com/apache/airflow/pull/7163#discussion_r386495451
 
 

 ##
 File path: airflow/providers/cncf/kubernetes/operators/spark_kubernetes.py
 ##
 @@ -0,0 +1,88 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+from typing import Optional
+
+import yaml
+from kubernetes import client
+
+from airflow.exceptions import AirflowException
+from airflow.models import BaseOperator
+from airflow.providers.cncf.kubernetes.hooks.kubernetes import KubernetesHook
+from airflow.utils.decorators import apply_defaults
+
+
+class SparkKubernetesOperator(BaseOperator):
+"""
+Creates sparkApplication object in kubernetes cluster:
+
+   .. seealso::
+For more detail about Spark Application Object have a look at the 
reference:
+
https://github.com/GoogleCloudPlatform/spark-on-k8s-operator/blob/master/docs/api-docs.md#sparkapplication
+
+:param application_file: filepath to kubernetes custom_resource_definition 
of sparkApplication
+:type application_file:  str
+:param namespace: kubernetes namespace to put sparkApplication
+:type namespace: str
+:param kubernetes_conn_id: the connection to Kubernetes cluster
+:type conn_id: str
+"""
+
+template_fields = ['application_file', 'namespace']
+template_ext = ('yaml', 'yml', 'json')
+ui_color = '#f4a460'
+
+@apply_defaults
+def __init__(self,
+ application_file: str,
+ namespace: Optional[str] = None,
+ conn_id: str = 'kubernetes_default',
+ *args, **kwargs) -> None:
+super().__init__(*args, **kwargs)
+self.application_file = application_file
+self.namespace = namespace
+self.conn_id = conn_id
+
+def execute(self, context):
+self.log.info("Creating sparkApplication")
+hook = KubernetesHook(conn_id=self.conn_id)
+api_client = hook.get_conn()
+api = client.CustomObjectsApi(api_client)
+application_dict = self._load_application_to_dict()
+if self.namespace is None:
+namespace = hook.get_namespace()
+else:
+namespace = self.namespace
+try:
+response = api.create_namespaced_custom_object(
 
 Review comment:
   This logic should live in the hook so that other custom operators can reuse 
it. We can then add more hook methods, for example one that starts the 
application and waits for its completion, depending on the situation.
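
A rough sketch of what moving the call into the hook could look like. Everything here is hypothetical: the class and method names are invented for illustration, and the `group`, `version`, and `plural` defaults are assumptions about the spark-on-k8s-operator custom resource, not values taken from this pull request. The only real API referenced is `CustomObjectsApi.create_namespaced_custom_object` from the `kubernetes` client, and the API object is injected so the example runs without a cluster.

```python
from unittest import mock


class SparkApplicationHookSketch:
    """Hypothetical hook-side helper; not the actual KubernetesHook API."""

    def __init__(self, custom_objects_api, default_namespace="default",
                 group="sparkoperator.k8s.io", version="v1beta2",
                 plural="sparkapplications"):
        # custom_objects_api is expected to behave like
        # kubernetes.client.CustomObjectsApi (assumed, injected for testing).
        self.api = custom_objects_api
        self.default_namespace = default_namespace
        self.group = group
        self.version = version
        self.plural = plural

    def create_spark_application(self, body, namespace=None):
        """Create the custom object, falling back to the hook's namespace."""
        return self.api.create_namespaced_custom_object(
            group=self.group,
            version=self.version,
            namespace=namespace or self.default_namespace,
            plural=self.plural,
            body=body,
        )


# Usage with a stand-in API object instead of a real cluster connection.
fake_api = mock.MagicMock()
hook = SparkApplicationHookSketch(fake_api, default_namespace="spark-jobs")
hook.create_spark_application({"kind": "SparkApplication"})
```

With the call in one place, the operator's `execute` would shrink to loading the application file and calling the hook, and a sensor or a "create and wait" method could reuse the same code.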


[GitHub] [airflow] mik-laj commented on a change in pull request #7163: [AIRFLOW-6542] add spark-on-k8s operator/hook/sensor

2020-03-02 Thread GitBox
mik-laj commented on a change in pull request #7163: [AIRFLOW-6542] add 
spark-on-k8s operator/hook/sensor
URL: https://github.com/apache/airflow/pull/7163#discussion_r386491368
 
 

 ##
 File path: 
airflow/providers/cncf/kubernetes/example_dags/example_spark_kubernetes_operator_spark_pi.yaml
 ##
 @@ -0,0 +1,57 @@
+#
 
 Review comment:
   Can you add this file to MANIFEST.in? 
   https://github.com/apache/airflow/blob/master/MANIFEST.in


[GitHub] [airflow] mik-laj commented on a change in pull request #6576: [AIRFLOW-5922] Add option to specify the mysql client library used in MySqlHook

2020-03-02 Thread GitBox
mik-laj commented on a change in pull request #6576: [AIRFLOW-5922] Add option 
to specify the mysql client library used in MySqlHook
URL: https://github.com/apache/airflow/pull/6576#discussion_r386484225
 
 

 ##
 File path: airflow/providers/mysql/hooks/mysql.py
 ##
 @@ -16,10 +16,13 @@
 # specific language governing permissions and limitations
 # under the License.
 
+"""
+This module allows to connect to a MySQL database.
+"""
+
 import json
 
 import MySQLdb
 
 Review comment:
   Do we need this import here? Maybe we can load this library only in specific 
cases?
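
One way to act on this suggestion is to defer each driver import until a connection is actually requested. The sketch below is an assumption about how that could look, not the code in the pull request: `CLIENT_MODULES` and `load_client_module` are hypothetical names, and the importer is passed in as a parameter only so the example can be exercised without either driver installed.

```python
import importlib

# Hypothetical mapping from the `client` extra to the module implementing it.
CLIENT_MODULES = {
    "mysqlclient": "MySQLdb",
    "mysql-connector-python": "mysql.connector",
}


def load_client_module(client_name, import_module=importlib.import_module):
    """Import the driver for client_name only at the point of use."""
    # No driver is imported at module load time, so a missing optional
    # dependency only fails for users who actually select it.
    return import_module(CLIENT_MODULES[client_name])
```

An equivalent and simpler form is a plain `import MySQLdb` statement inside the branch that uses it, mirroring the `import mysql.connector` already placed inside `get_conn` in the diff.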


[GitHub] [airflow] mik-laj commented on a change in pull request #6576: [AIRFLOW-5922] Add option to specify the mysql client library used in MySqlHook

2020-03-02 Thread GitBox
mik-laj commented on a change in pull request #6576: [AIRFLOW-5922] Add option 
to specify the mysql client library used in MySqlHook
URL: https://github.com/apache/airflow/pull/6576#discussion_r386483618
 
 

 ##
 File path: airflow/providers/mysql/hooks/mysql.py
 ##
 @@ -113,8 +107,44 @@ def get_conn(self):
 conn_config['unix_socket'] = conn.extra_dejson['unix_socket']
 if local_infile:
 conn_config["local_infile"] = 1
-conn = MySQLdb.connect(**conn_config)
-return conn
+return conn_config
+
+def _get_conn_config_mysql_connector_python(self, conn):
+conn_config = {
+'user': conn.login,
+'password': conn.password or '',
+'host': conn.host or 'localhost',
+'database': self.schema or conn.schema or '',
+'port': int(conn.port) if conn.port else 3306
+}
+
+if conn.extra_dejson.get('allow_local_infile', False):
+conn_config["allow_local_infile"] = True
+
+return conn_config
+
+def get_conn(self):
+"""
+Establishes a connection to a mysql database
+by extracting the connection configuration from the Airflow connection.
+
+.. note:: By default it connects to the database via the mysqlclient 
library.
+But you can also choose the mysql-connector-python library which 
lets you connect through ssl
+without any further ssl parameters required.
+
+:return: a mysql connection object
+"""
+conn = self.connection or self.get_connection(self.mysql_conn_id)  # 
pylint: disable=no-member
+
+client_name = conn.extra_dejson.get('client', 'mysqlclient')
+
+if client_name == 'mysql-connector-python':
+import mysql.connector
+conn_config = self._get_conn_config_mysql_connector_python(conn)
+return mysql.connector.connect(**conn_config)
+
+conn_config = self._get_conn_config_mysql_client(conn)
 
 Review comment:
   Can you raise an exception when an invalid value is provided? Silently 
falling back to a default client can hide typos that are difficult to detect.
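
A minimal sketch of the requested validation, assuming the two client names that appear in the diff are the only supported ones. `SUPPORTED_CLIENTS` and `resolve_client_name` are hypothetical names, and `ValueError` is used only to keep the example self-contained; the hook would more likely raise `AirflowException`.

```python
# Client names taken from the diff above; the tuple itself is hypothetical.
SUPPORTED_CLIENTS = ("mysqlclient", "mysql-connector-python")


def resolve_client_name(extra):
    """Return the configured client name, rejecting unknown values."""
    client_name = extra.get("client", "mysqlclient")
    if client_name not in SUPPORTED_CLIENTS:
        # Fail loudly instead of silently falling back to the default client.
        raise ValueError(
            "Unknown MySQL client {!r}; expected one of {}".format(
                client_name, ", ".join(SUPPORTED_CLIENTS)))
    return client_name
```

With this check, a misspelled value such as `mysqlclinet` in the connection extras raises immediately rather than connecting through the default library.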


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Created] (AIRFLOW-6970) Improve GCP Video Intelligence system tests

2020-03-02 Thread Tomasz Urbaszek (Jira)
Tomasz Urbaszek created AIRFLOW-6970:


 Summary: Improve GCP Video Intelligence system tests
 Key: AIRFLOW-6970
 URL: https://issues.apache.org/jira/browse/AIRFLOW-6970
 Project: Apache Airflow
  Issue Type: Improvement
  Components: gcp, tests
Affects Versions: 2.0.0
Reporter: Tomasz Urbaszek






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [airflow] kaxil commented on a change in pull request #6788: WIP: [AIRFLOW-5944] Rendering templated_fields without accessing DAG files

2020-03-02 Thread GitBox
kaxil commented on a change in pull request #6788: WIP: [AIRFLOW-5944] 
Rendering templated_fields without accessing DAG files
URL: https://github.com/apache/airflow/pull/6788#discussion_r386480827
 
 

 ##
 File path: airflow/www/views.py
 ##
 @@ -570,25 +572,37 @@ def dag_details(self, session=None):
 @has_dag_access(can_dag_read=True)
 @has_access
 @action_logging
-def rendered(self):
+@provide_session
+def rendered(self, session=None):
 dag_id = request.args.get('dag_id')
 task_id = request.args.get('task_id')
 execution_date = request.args.get('execution_date')
 dttm = timezone.parse(execution_date)
 form = DateTimeForm(data={'execution_date': dttm})
 root = request.args.get('root', '')
-# Loads dag from file
-logging.info("Processing DAG file to render template.")
-dag = dagbag.get_dag(dag_id, from_file_only=True)
+
+logging.info("Retrieving rendered templates.")
+dag = dagbag.get_dag(dag_id)
+
 task = copy.copy(dag.get_task(task_id))
 ti = models.TaskInstance(task=task, execution_date=dttm)
 try:
-ti.render_templates()
+if STORE_SERIALIZED_DAGS:
+rtif = RenderedTaskInstanceFields.get_templated_fields(ti)
+if rtif:
+for field_name, rendered_value in rtif.items():
+setattr(task, field_name, rendered_value)
+else:
+# ToDo: Fetch raw strings from RenderedTaskInstanceFields table
+flash("Template field not found")
+else:
+ti.render_templates()
 
 Review comment:
   Done !


[GitHub] [airflow] kaxil commented on a change in pull request #6788: WIP: [AIRFLOW-5944] Rendering templated_fields without accessing DAG files

2020-03-02 Thread GitBox
kaxil commented on a change in pull request #6788: WIP: [AIRFLOW-5944] 
Rendering templated_fields without accessing DAG files
URL: https://github.com/apache/airflow/pull/6788#discussion_r386480519
 
 

 ##
 File path: airflow/utils/operator_helpers.py
 ##
 @@ -84,3 +85,24 @@ def context_to_airflow_vars(context, in_env_var_format=False):
 params[AIRFLOW_VAR_NAME_FORMAT_MAPPING['AIRFLOW_CONTEXT_DAG_RUN_ID'][
 name_format]] = dag_run.run_id
 return params
+
+
+def serialize_template_field(template_field):
 
 Review comment:
   Moved


[GitHub] [airflow] kaxil commented on a change in pull request #6788: WIP: [AIRFLOW-5944] Rendering templated_fields without accessing DAG files

2020-03-02 Thread GitBox
kaxil commented on a change in pull request #6788: WIP: [AIRFLOW-5944] 
Rendering templated_fields without accessing DAG files
URL: https://github.com/apache/airflow/pull/6788#discussion_r386480422
 
 

 ##
 File path: airflow/serialization/serialized_objects.py
 ##
 @@ -319,6 +320,9 @@ def serialize_operator(cls, op: BaseOperator) -> dict:
 if op.operator_extra_links:
 serialize_op['_operator_extra_links'] = \
 cls._serialize_operator_extra_links(op.operator_extra_links)
+serialize_op['_templated_fields'] = {
+field: serialize_template_field(getattr(op, field)) for field in op.template_fields
 
 Review comment:
   Removing the unrendered part from this PR. I will create a separate PR to add it.


[GitHub] [airflow] feluelle commented on issue #6576: [AIRFLOW-5922] Add option to specify the mysql client library used in MySqlHook

2020-03-02 Thread GitBox
feluelle commented on issue #6576: [AIRFLOW-5922] Add option to specify the 
mysql client library used in MySqlHook
URL: https://github.com/apache/airflow/pull/6576#issuecomment-593466069
 
 
   @potiuk @mik-laj any final comments before I merge it? I would really like 
to get more feedback on this.


[GitHub] [airflow] feluelle edited a comment on issue #6578: [AIRFLOW-5907] Add S3 to MySql Operator

2020-03-02 Thread GitBox
feluelle edited a comment on issue #6578: [AIRFLOW-5907] Add S3 to MySql 
Operator
URL: https://github.com/apache/airflow/pull/6578#issuecomment-593462941
 
 
   @RosterIn WDYT of this now?
   
   I personally don't like the `extra_options` (which I added to `mysql`'s 
`bulk_load_custom` myself :D). Now you are able to specify these options in the 
transfer operation, so you can load JSON, CSV, or whatever else, as long as LOAD 
DATA supports it.
   
   **EDIT:**
   I don't like the `extra_options` because it is not fully clear which options 
from https://dev.mysql.com/doc/refman/8.0/en/load-data.html they can be.
   But I would say the issue lies more with `bulk_load_custom` than with the 
implementation of S3ToMySql.


[GitHub] [airflow] feluelle edited a comment on issue #6578: [AIRFLOW-5907] Add S3 to MySql Operator

2020-03-02 Thread GitBox
feluelle edited a comment on issue #6578: [AIRFLOW-5907] Add S3 to MySql 
Operator
URL: https://github.com/apache/airflow/pull/6578#issuecomment-593462941
 
 
   @RosterIn WDYT of this now?
   
   I personally don't like the `extra_options` (which I added to `mysql`'s 
`bulk_load_custom` myself :D). Now you are able to specify these options in the 
transfer operation, so you can load JSON, CSV, or whatever else, as long as LOAD 
DATA supports it.
   
   **EDIT:**
   I don't like the `extra_options` because it is not fully clear which options 
from https://dev.mysql.com/doc/refman/8.0/en/load-data.html they can be.


[GitHub] [airflow] feluelle commented on issue #6578: [AIRFLOW-5907] Add S3 to MySql Operator

2020-03-02 Thread GitBox
feluelle commented on issue #6578: [AIRFLOW-5907] Add S3 to MySql Operator
URL: https://github.com/apache/airflow/pull/6578#issuecomment-593462941
 
 
   @RosterIn WDYT of this now?
   
   I personally don't like the `extra_options` (which I added to `mysql`'s 
`bulk_load_custom` myself :D). Now you are able to specify these options in the 
transfer operation, so you can load JSON, CSV, or whatever else, as long as LOAD 
DATA supports it.


[jira] [Commented] (AIRFLOW-2325) Task logging with AWS Cloud watch

2020-03-02 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17049286#comment-17049286
 ] 

ASF subversion and git services commented on AIRFLOW-2325:
--

Commit 1e3cdddcd87be3c0f11b43efea11cdbddaff4470 in airflow's branch 
refs/heads/master from Daniel Hegberg
[ https://gitbox.apache.org/repos/asf?p=airflow.git;h=1e3cddd ]

[AIRFLOW-2325] Add CloudwatchTaskHandler option for remote task logging to 
Cloudwatch (#7437)



> Task logging with AWS Cloud watch
> -
>
> Key: AIRFLOW-2325
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2325
> Project: Apache Airflow
>  Issue Type: New Feature
>  Components: logging
>Reporter: Fang-Pen Lin
>Assignee: Daniel Hegberg
>Priority: Minor
> Fix For: 2.0.0
>
>
> In many cases, it's ideal to use remote logging while running Airflow in 
> production, as workers can easily be scaled down or up, or may be 
> running in containers, where local storage is not meant to 
> persist. In that case, the S3 task logging handler could be used:
> [https://github.com/apache/incubator-airflow/blob/master/airflow/utils/log/s3_task_handler.py]
> However, it comes with a drawback. The S3 logging handler only uploads the log when 
> the task completes or fails. For long-running tasks, it's hard to know 
> what's going on with the process until it finishes.
> To get closer to real-time logging, I built a logging handler based on AWS 
> CloudWatch. It uses a third-party Python package, `watchtower`:
>  
> [https://github.com/kislyuk/watchtower/tree/master/watchtower]
>  
> I created a PR here [https://github.com/apache/incubator-airflow/pull/3229]. 
> Basically, I just copy-pasted the code I wrote for my own project. It works 
> fine with the 1.9 release, but was never tested with the master branch. Also, there is a 
> bug in watchtower that causes the task runner to hang forever when it completes. I 
> created an issue in their repo
> [https://github.com/kislyuk/watchtower/issues/57]
> and a PR addressing that issue 
> [https://github.com/kislyuk/watchtower/pull/58]
>  
> The PR is still far from ready to be reviewed, but I just want to get some 
> feedback before I spend more time on it. I would like to know whether you want 
> this CloudWatch handler to go into the main repo, or would prefer it to 
> be a standalone third-party module. In the latter case, I can close this 
> ticket and create a standalone repo on my own. If the PR is welcome, then I 
> can spend more time polishing it based on your feedback and adding tests, 
> documentation, and other things.
>  



[jira] [Resolved] (AIRFLOW-2325) Task logging with AWS Cloud watch

2020-03-02 Thread Felix Uellendall (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Felix Uellendall resolved AIRFLOW-2325.
---
Fix Version/s: 2.0.0
   Resolution: Fixed

[jira] [Commented] (AIRFLOW-2325) Task logging with AWS Cloud watch

2020-03-02 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17049275#comment-17049275
 ] 

ASF GitHub Bot commented on AIRFLOW-2325:
-

feluelle commented on pull request #7437: [AIRFLOW-2325] Add 
CloudwatchTaskHandler option for remote task loggi…
URL: https://github.com/apache/airflow/pull/7437
 
 
   
 

[GitHub] [airflow] feluelle commented on issue #7437: [AIRFLOW-2325] Add CloudwatchTaskHandler option for remote task loggi…

2020-03-02 Thread GitBox
feluelle commented on issue #7437: [AIRFLOW-2325] Add CloudwatchTaskHandler 
option for remote task loggi…
URL: https://github.com/apache/airflow/pull/7437#issuecomment-593443949
 
 
   Nice work @dhegberg!  


[GitHub] [airflow] feluelle merged pull request #7437: [AIRFLOW-2325] Add CloudwatchTaskHandler option for remote task loggi…

2020-03-02 Thread GitBox
feluelle merged pull request #7437: [AIRFLOW-2325] Add CloudwatchTaskHandler 
option for remote task loggi…
URL: https://github.com/apache/airflow/pull/7437
 
 
   


[GitHub] [airflow] boring-cyborg[bot] commented on issue #7437: [AIRFLOW-2325] Add CloudwatchTaskHandler option for remote task loggi…

2020-03-02 Thread GitBox
boring-cyborg[bot] commented on issue #7437: [AIRFLOW-2325] Add 
CloudwatchTaskHandler option for remote task loggi…
URL: https://github.com/apache/airflow/pull/7437#issuecomment-593443592
 
 
   Awesome work, congrats on your first merged pull request!
   


[GitHub] [airflow] mik-laj commented on issue #7597: [AIRFLOW-6497] Avoid loading DAGs in the main scheduler loop

2020-03-02 Thread GitBox
mik-laj commented on issue #7597: [AIRFLOW-6497] Avoid loading DAGs in the main 
scheduler loop
URL: https://github.com/apache/airflow/pull/7597#issuecomment-593431538
 
 
   This also protects the scheduler from being killed by faulty error-handling 
functions. If sys.exit(1) appears in the handler code, the scheduler will not 
be stopped; it will only affect one DAG.
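
A minimal sketch of the isolation described above, not Airflow's own code: when the failing code runs in a separate process, `sys.exit(1)` ends only that process, and the parent merely sees a non-zero exit code. The helper name `run_isolated` and the inline snippet standing in for a handler are hypothetical.

```python
import subprocess
import sys


def run_isolated(source):
    """Run `source` in a child interpreter and return the child's exit code."""
    completed = subprocess.run([sys.executable, '-c', source])
    return completed.returncode


if __name__ == '__main__':
    # The child calls sys.exit(1); only the child process terminates.
    exit_code = run_isolated('import sys; sys.exit(1)')
    print('child exit code:', exit_code)
    print('parent is still running')
```

The parent can then log the failure for that one file and carry on with the rest.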


[GitHub] [airflow] mik-laj edited a comment on issue #7597: [AIRFLOW-6497] Avoid loading DAGs in the main scheduler loop

2020-03-02 Thread GitBox
mik-laj edited a comment on issue #7597: [AIRFLOW-6497] Avoid loading DAGs in 
the main scheduler loop
URL: https://github.com/apache/airflow/pull/7597#issuecomment-593431538
 
 
   This also protects the scheduler from being killed by faulty error-handling 
functions. If sys.exit(1) appears in the handler code, the scheduler will not 
be stopped; it will only affect one DAG file.


[GitHub] [airflow] mik-laj commented on a change in pull request #7597: [AIRFLOW-6497] Avoid loading DAGs in the main scheduler loop

2020-03-02 Thread GitBox
mik-laj commented on a change in pull request #7597: [AIRFLOW-6497] Avoid 
loading DAGs in the main scheduler loop
URL: https://github.com/apache/airflow/pull/7597#discussion_r386425807
 
 

 ##
 File path: airflow/jobs/scheduler_job.py
 ##
 @@ -1616,26 +1633,6 @@ def _validate_and_run_task_instances(self, simple_dag_bag: SimpleDagBag) -> bool
 self._process_executor_events(simple_dag_bag)
 return True
 
-def _process_and_execute_tasks(self, simple_dag_bag):
 
 Review comment:
   I will revert this change and propose it in a separate PR.


[GitHub] [airflow] mik-laj commented on issue #7597: [AIRFLOW-6497] Avoid loading DAGs in the main scheduler loop

2020-03-02 Thread GitBox
mik-laj commented on issue #7597: [AIRFLOW-6497] Avoid loading DAGs in the main 
scheduler loop
URL: https://github.com/apache/airflow/pull/7597#issuecomment-593430117
 
 
   It is worth noting that this also solves one more problem: the modules are 
always reloaded.
   
https://issues.apache.org/jira/projects/AIRFLOW/issues/AIRFLOW-6497?filter=allopenissues
   So when someone changes an additional module, the new version is executed 
rather than the old one. Stale modules can be a problem because the 
handler is often stored in helper functions and shared among many DAGs.
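
The stale-module effect can be illustrated with a small sketch. It assumes nothing about Airflow itself: it only shows that a long-running Python process keeps the first imported version of a module in `sys.modules`, while a newly started process reads the file again. The module name `shared_helper_demo` and the function name are hypothetical.

```python
import importlib
import os
import subprocess
import sys
import tempfile

MODULE_NAME = 'shared_helper_demo'  # hypothetical helper module


def cached_vs_fresh():
    """Return (value seen by this process, value seen by a new process)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, MODULE_NAME + '.py')
        with open(path, 'w') as handle:
            handle.write("VERSION = 'old'\n")
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            importlib.import_module(MODULE_NAME)
            # Edit the helper on disk after it has been imported once.
            with open(path, 'w') as handle:
                handle.write("VERSION = 'updated'\n")
            # A second import in the same process returns the cached module.
            cached = importlib.import_module(MODULE_NAME).VERSION
            # A new interpreter imports the edited file from scratch.
            env = dict(os.environ, PYTHONPATH=tmp)
            fresh = subprocess.run(
                [sys.executable, '-c',
                 'import {0}; print({0}.VERSION)'.format(MODULE_NAME)],
                env=env, stdout=subprocess.PIPE,
                universal_newlines=True, check=True,
            ).stdout.strip()
        finally:
            sys.path.remove(tmp)
            sys.modules.pop(MODULE_NAME, None)
    return cached, fresh


if __name__ == '__main__':
    print(cached_vs_fresh())
```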


[GitHub] [airflow] mik-laj edited a comment on issue #7597: [AIRFLOW-6497] Avoid loading DAGs in the main scheduler loop

2020-03-02 Thread GitBox
mik-laj edited a comment on issue #7597: [AIRFLOW-6497] Avoid loading DAGs in 
the main scheduler loop
URL: https://github.com/apache/airflow/pull/7597#issuecomment-593427244
 
 
   > Do you have any numbers for this please?
   
   It is very difficult to measure because it depends on the specific DAG file. 
Some DAG files take 30 seconds or more to load. During this time, the 
scheduler loop is stopped and does not start any new tasks. I can measure how 
long it takes to load example_dags, but that is just a subset of cases, so it 
doesn't provide real-world values. Still, I created a spreadsheet.
   When I ran the following script:
   ```python
   import os
   import sys
   import time
   from contextlib import contextmanager
   
   import psutil
   
   from airflow.models import DagBag
   
   
   @contextmanager
   def timing_ctx():
       # Print the wall-clock time spent inside the with-block.
       time1 = time.time()
       try:
           yield
       finally:
           time2 = time.time()
           diff = (time2 - time1) * 1000.0
           print('Time: %0.3f ms' % diff)
   
   
   def get_process_memory():
       # Resident set size of the current process, in bytes.
       process = psutil.Process(os.getpid())
       return process.memory_info().rss
   
   
   @contextmanager
   def memory_ctx():
       # Print the change in resident memory across the with-block.
       before = get_process_memory()
       try:
           yield
       finally:
           after = get_process_memory()
           diff = after - before
           print('Memory: %d bytes' % diff)
   
   
   filename = sys.argv[1]
   
   with timing_ctx(), memory_ctx():
       print("Filename:", filename)
       DagBag(dag_folder=filename, include_examples=False, store_serialized_dags=False)
   ```
   ```
   find airflow/providers/google/cloud/example_dags/ -type f | sort | grep -v "__init__.py" | xargs -n 1 readlink -e | xargs -t -n 1 python /files/performance/load_dag_perf_test.py
   ```
   I got the following values:
   
https://docs.google.com/spreadsheets/d/1T0kLEQLSU5ujxU-W_PoxddbkEgWx70EQkpwjRLNWaic/edit?usp=sharing
   In this case, IPC communication should be very fast. I suspect it should 
take less than 3% of the total DAG loading time. So it can be assumed that the 
main loop gets faster by roughly the time it takes to load the module.
   
   
   
   


[jira] [Commented] (AIRFLOW-6747) UI - Show count of tasks in each dag on the main dags page

2020-03-02 Thread Ebrima Jallow (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-6747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17049250#comment-17049250
 ] 

Ebrima Jallow commented on AIRFLOW-6747:


Thanks. I am looking into it. 

> UI - Show count of tasks in each dag on the main dags page
> --
>
> Key: AIRFLOW-6747
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6747
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: ui
>Affects Versions: 1.10.7
>Reporter: t oo
>Assignee: Ebrima Jallow
>Priority: Minor
>  Labels: gsoc, gsoc2020, mentor
>
> Main DAGs page in UI - would benefit from showing a new column: number of 
> tasks for each dag id



[GitHub] [airflow] mik-laj commented on issue #7597: [AIRFLOW-6497] Avoid loading DAGs in the main scheduler loop

2020-03-02 Thread GitBox
mik-laj commented on issue #7597: [AIRFLOW-6497] Avoid loading DAGs in the main 
scheduler loop
URL: https://github.com/apache/airflow/pull/7597#issuecomment-593427244
 
 
   > Do you have any numbers for this please?
   
   It is very difficult to measure because it depends on the specific DAG file. 
Some DAG files take 30 seconds or more to load. During this time, the 
scheduler loop is stopped and does not start any new tasks. I can measure how 
long it takes to load example_dags, but that is just a subset of cases, so it 
doesn't provide real-world values. Still, I created a spreadsheet.
   When I ran the following script:
   ```python
   import os
   import sys
   import time
   from contextlib import contextmanager
   
   import psutil
   
   from airflow.models import DagBag
   
   
   @contextmanager
   def timing_ctx():
       # Print the wall-clock time spent inside the with-block.
       time1 = time.time()
       try:
           yield
       finally:
           time2 = time.time()
           diff = (time2 - time1) * 1000.0
           print('Time: %0.3f ms' % diff)
   
   
   def get_process_memory():
       # Resident set size of the current process, in bytes.
       process = psutil.Process(os.getpid())
       return process.memory_info().rss
   
   
   @contextmanager
   def memory_ctx():
       # Print the change in resident memory across the with-block.
       before = get_process_memory()
       try:
           yield
       finally:
           after = get_process_memory()
           diff = after - before
           print('Memory: %d bytes' % diff)
   
   
   filename = sys.argv[1]
   
   with timing_ctx(), memory_ctx():
       print("Filename:", filename)
       DagBag(dag_folder=filename, include_examples=False, store_serialized_dags=False)
   ```
   ```
   find airflow/providers/google/cloud/example_dags/ -type f | sort | grep -v "__init__.py" | xargs -n 1 readlink -e | xargs -t -n 1 python /files/performance/load_dag_perf_test.py
   ```
   I got the following values:
   
https://docs.google.com/spreadsheets/d/1T0kLEQLSU5ujxU-W_PoxddbkEgWx70EQkpwjRLNWaic/edit?usp=sharing
   
   
   
   


[jira] [Assigned] (AIRFLOW-824) Allow writing to XCOM values via API

2020-03-02 Thread Robin Miller (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robin Miller reassigned AIRFLOW-824:


Assignee: (was: Robin Miller)

> Allow writing to XCOM values via API
> 
>
> Key: AIRFLOW-824
> URL: https://issues.apache.org/jira/browse/AIRFLOW-824
> Project: Apache Airflow
>  Issue Type: Improvement
>Reporter: Robin Miller
>Priority: Minor
>




[jira] [Closed] (AIRFLOW-6931) One migration failed during "airflow initdb" in mssql server 2017

2020-03-02 Thread Baoshan Gu (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Baoshan Gu closed AIRFLOW-6931.
---
Resolution: Not A Problem

> One migration failed during "airflow initdb" in mssql server 2017
> -
>
> Key: AIRFLOW-6931
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6931
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: database
>Affects Versions: 1.10.9
> Environment: microsoft sqlserver 2017
>Reporter: Baoshan Gu
>Priority: Major
>
> Running "airflow initdb" produced this error:
> {code:java}
>  _mssql.MSSQLDatabaseException: (5074, b"The object 
> 'UQ__dag_run__F78A9899295C1915' is dependent on column 
> 'execution_date'.DB-Lib error message 20018, severity 16:\nGeneral SQL Server 
> error: Check messages from the SQL Server\nDB-Lib error message 20018, 
> severity 16:\nGeneral SQL Server error: Check messages from the SQL Server\n")
> {code}
> The issue is that the migration file 
> [74effc47d867_change_datetime_to_datetime2_6_on_mssql_.py|https://github.com/apache/airflow/blob/master/airflow/migrations/versions/74effc47d867_change_datetime_to_datetime2_6_on_mssql_.py#L235]
>  does not find all constraints. 
> Confirmed that changing it to case-insensitive selection works:
> {code}(tc.CONSTRAINT_TYPE = 'PRIMARY KEY' or LOWER(tc.CONSTRAINT_TYPE) = 
> 'unique'){code}
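
To illustrate the case-sensitivity problem described above, here is a minimal Python sketch. It does not touch a database: the sample rows, the helper name `select_constraints`, and the mixed-case literal `'Unique'` in the case-sensitive branch are assumptions made for illustration and are not quoted from the migration file.

```python
# Hypothetical (constraint_name, constraint_type) pairs, shaped like rows a
# query against INFORMATION_SCHEMA.TABLE_CONSTRAINTS might return.
ROWS = [
    ("PK__dag_run", "PRIMARY KEY"),
    ("UQ__dag_run__F78A9899295C1915", "UNIQUE"),
    ("FK__example", "FOREIGN KEY"),
]


def select_constraints(rows, case_insensitive):
    """Return the constraint names kept by the filter under each comparison mode."""
    selected = []
    for name, ctype in rows:
        if case_insensitive:
            # Mirrors: tc.CONSTRAINT_TYPE = 'PRIMARY KEY' or LOWER(tc.CONSTRAINT_TYPE) = 'unique'
            match = ctype == "PRIMARY KEY" or ctype.lower() == "unique"
        else:
            # Exact comparison, as under a case-sensitive collation.
            # The literal 'Unique' is an assumed example of a mixed-case filter value.
            match = ctype == "PRIMARY KEY" or ctype == "Unique"
        if match:
            selected.append(name)
    return selected


if __name__ == "__main__":
    print(select_constraints(ROWS, case_insensitive=False))
    print(select_constraints(ROWS, case_insensitive=True))
```

Under the exact comparison the unique constraint is skipped, so it would still depend on the column when the column type is altered; the lowercase comparison picks it up.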





[jira] [Commented] (AIRFLOW-6931) One migration failed during "airflow initdb" in mssql server 2017

2020-03-02 Thread Baoshan Gu (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-6931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17049221#comment-17049221
 ] 

Baoshan Gu commented on AIRFLOW-6931:
-

Changing the SQL Server collation to Latin1_General_CI_AI works without any code 
changes. I am closing the ticket.



[jira] [Commented] (AIRFLOW-6931) One migration failed during "airflow initdb" in mssql server 2017

2020-03-02 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-6931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17049219#comment-17049219
 ] 

ASF GitHub Bot commented on AIRFLOW-6931:
-

BaoshanGu commented on pull request #7574: [AIRFLOW-6931] Fixed migrations to 
find all dependencies for mssql
URL: https://github.com/apache/airflow/pull/7574
 
 
   
 



[GitHub] [airflow] BaoshanGu commented on issue #7574: [AIRFLOW-6931] Fixed migrations to find all dependencies for mssql

2020-03-02 Thread GitBox
BaoshanGu commented on issue #7574: [AIRFLOW-6931] Fixed migrations to find all 
dependencies for mssql
URL: https://github.com/apache/airflow/pull/7574#issuecomment-593409257
 
 
   Changing the SQL Server collation to Latin1_General_CI_AI works without any code 
changes. I am closing the PR.




[GitHub] [airflow] BaoshanGu closed pull request #7574: [AIRFLOW-6931] Fixed migrations to find all dependencies for mssql

2020-03-02 Thread GitBox
BaoshanGu closed pull request #7574: [AIRFLOW-6931] Fixed migrations to find 
all dependencies for mssql
URL: https://github.com/apache/airflow/pull/7574
 
 
   




[GitHub] [airflow] mik-laj commented on a change in pull request #7597: [AIRFLOW-6497] Avoid loading DAGs in the main scheduler loop

2020-03-02 Thread GitBox
mik-laj commented on a change in pull request #7597: [AIRFLOW-6497] Avoid 
loading DAGs in the main scheduler loop
URL: https://github.com/apache/airflow/pull/7597#discussion_r386390242
 
 

 ##
 File path: airflow/utils/dag_processing.py
 ##
 @@ -655,6 +693,7 @@ def start(self):
 # Update number of loop iteration.
 self._num_run += 1
 
+simple_dags = self.collect_results()
 
 Review comment:
   Yes. First we need to create the processes, and only then can we read the results. 
Otherwise, we will never get a value on the first iteration of the loop. 
Another solution: we can increase the number of loop iterations: 
https://github.com/apache/airflow/blob/cb455dc81162680f90edcd78400e1ef46c09766d/tests/utils/test_dag_processing.py#L291
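
The ordering argument in this comment can be sketched with a toy loop. This is only an illustration under simplifying assumptions: `ToyManager`, `start_new_processes`, and `run_loop` are hypothetical names, and workers are assumed to finish instantly, which real DAG file processors do not.

```python
class ToyManager:
    """Hypothetical stand-in for a manager that launches workers and later reads their results."""

    def __init__(self, jobs):
        self._pending = list(jobs)  # work not yet started
        self._running = []          # work started, result not yet read

    def start_new_processes(self):
        # Pretend to launch one worker per pending job.
        self._running.extend(self._pending)
        self._pending.clear()

    def collect_results(self):
        # Results exist only for workers that were already started.
        finished, self._running = self._running, []
        return [f"result:{job}" for job in finished]


def run_loop(collect_first, iterations=1):
    """Run the toy loop and return everything collected."""
    manager = ToyManager(["dag_a", "dag_b"])
    results = []
    for _ in range(iterations):
        if collect_first:
            results.extend(manager.collect_results())
            manager.start_new_processes()
        else:
            manager.start_new_processes()
            results.extend(manager.collect_results())
    return results


if __name__ == "__main__":
    print(run_loop(collect_first=True))                 # nothing on the first pass
    print(run_loop(collect_first=False))                # results on the first pass
    print(run_loop(collect_first=True, iterations=2))   # results only on the second pass
```

Collecting before starting yields nothing on the first iteration, which is why either the order has to change or a test needs one more loop iteration.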




[GitHub] [airflow] zhongjiajie edited a comment on issue #7593: [AIRFLOW-6959] Use NULL as dag.description default value and change r…

2020-03-02 Thread GitBox
zhongjiajie edited a comment on issue #7593: [AIRFLOW-6959] Use NULL as 
dag.description default value and change r…
URL: https://github.com/apache/airflow/pull/7593#issuecomment-592950147
 
 
   **The detailed changes are as below**
   
   **UPDATE AT 2020-03-02**: sections `homepage without description` and `dag detail page without description` are the same as before.
   
   | section | old | new |
   | :-- | :-- | :-- |
   | database | empty string ![](https://i.loli.net/2020/02/29/QyXGLTSztmh7vUD.png) | null value ![](https://i.loli.net/2020/02/29/zCwlgTdiyGbkrPB.png) |
   | ~~homepage without description~~ | | |
   | dag detail page with description | ![](https://i.loli.net/2020/02/29/CzXxy5tURBnNLDv.png) | ![](https://i.loli.net/2020/02/29/MN4UX8IPpLWfgiS.png) |
   | ~~dag detail page without description~~ | | |
   



