[jira] [Commented] (AIRFLOW-6556) Improving unclear and incomplete documentation

2020-01-16 Thread Jacob Ward (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-6556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17016746#comment-17016746
 ] 

Jacob Ward commented on AIRFLOW-6556:
-

Absolutely!

> Improving unclear and incomplete documentation
> --
>
> Key: AIRFLOW-6556
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6556
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: master
>Reporter: Jacob Ward
>Assignee: Jarek Potiuk
>Priority: Trivial
>
> To help improve documentation it was discussed in the mailing list that users 
> of Airflow should have somewhere to report missing, incomplete or unclear 
> documentation. Any users who find this should comment on this ticket.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (AIRFLOW-6556) Improving unclear and incomplete documentation

2020-01-15 Thread Jacob Ward (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-6556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17016155#comment-17016155
 ] 

Jacob Ward edited comment on AIRFLOW-6556 at 1/15/20 5:09 PM:
--

Here's a list of issues my team and I have noticed. I'll add more as I
notice them.
 * [Kubernetes|https://airflow.apache.org/docs/stable/kubernetes.html]:
 ** An explanation of how the KubernetesExecutor (KE) and KubernetesPodOperator
(KPO) work.
 ** An explanation of the difference between the KE and the KPO (I see this
question and the resulting confusion come up a lot in Slack).
 ** An explanation of exactly what the [kubernetes] config options are used for
with the KE (something that I have only just started to understand is that
these config options mostly apply to the workers, since the scheduler does not
_have_ to run inside the cluster with KE).
 ** (Minor) The KPO hyperlink on the Kubernetes doc page sends you to the KE
page. Also, the links on this page send you to readthedocs.io.
 * Some information [(possibly
here)|https://airflow.apache.org/docs/stable/concepts.html?highlight=trigger%20rule#dag-assignment]
 about how assigning operators to DAGs after creation means the default_args
aren't applied.
 * In general I think the Concepts/Core Ideas section has a lot of interesting
and useful information, but since it is one long page it is harder to find the
information you need. I think it would be better split into several pages with
more information and examples for each concept.
 * Some information on best practices, how to avoid common pitfalls, etc.
 * [LocalExecutor is
missing|https://airflow.apache.org/docs/stable/executor/index.html], and some
general information on the architecture and how these Executors are supposed to
work would be useful. I often see new users confused about what the Executor
really is, and they sometimes mix it up with operators.
 * [Security|https://airflow.apache.org/docs/stable/security.html#viewer] -
What are all the permissions, and what does each one actually mean? E.g.
'can_show' is not descriptive enough to know exactly what it means.
 * It may be covered by Kaxil's updates to the config documentation, but a page
on how new DAGs are picked up, parsed and stored, and on what the DagBag is and
how it interacts with (or is interacted with by) the scheduler/webserver/db,
would be useful.
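
For the default_args item in the list above, a toy model may help show why late assignment loses the defaults. This is only a minimal sketch and not Airflow source: `ToyDag` and `ToyOperator` are hypothetical stand-ins, and the assumption is that defaults are merged once, while the operator's constructor runs, so a DAG attached afterwards is never consulted.

```python
class ToyDag:
    """Hypothetical stand-in for a DAG that carries default_args."""

    def __init__(self, dag_id, default_args=None):
        self.dag_id = dag_id
        self.default_args = dict(default_args or {})


class ToyOperator:
    """Hypothetical stand-in for an operator.

    Assumption: defaults are merged once, inside __init__, and only
    when a DAG is passed in at construction time.
    """

    def __init__(self, task_id, dag=None, **kwargs):
        merged = {}
        if dag is not None:
            merged.update(dag.default_args)  # DAG-level defaults first
        merged.update(kwargs)  # explicit arguments win over defaults
        self.task_id = task_id
        self.retries = merged.get("retries", 0)
        self.dag = dag


dag = ToyDag("example", default_args={"retries": 3})

# DAG passed at creation: the defaults are merged in.
at_creation = ToyOperator("at_creation", dag=dag)

# DAG attached after creation: nothing re-applies the defaults.
late = ToyOperator("late")
late.dag = dag

print(at_creation.retries, late.retries)
```

In this toy model the first operator ends up with 3 retries and the second keeps the built-in 0, which is the kind of surprise the documentation could call out.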


[jira] [Commented] (AIRFLOW-6556) Improving unclear and incomplete documentation

2020-01-15 Thread Jacob Ward (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-6556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17016155#comment-17016155
 ] 

Jacob Ward commented on AIRFLOW-6556:
-

Here's a list of issues my team and I have noticed. I'll add more as I
notice them.
 * [Kubernetes|https://airflow.apache.org/docs/stable/kubernetes.html]:
 ** An explanation of how the KubernetesExecutor (KE) and KubernetesPodOperator
(KPO) work.
 ** An explanation of the difference between the KE and the KPO (I see this
question and the resulting confusion come up a lot in Slack).
 ** An explanation of exactly what the [kubernetes] config options are used for
with the KE (something that I have only just started to understand is that
these config options mostly apply to the workers, since the scheduler does not
_have_ to run inside the cluster with KE).
 ** (Minor) The KPO hyperlink on the Kubernetes doc page sends you to the KE
page. Also, the links on this page send you to readthedocs.io.
 * Some information [(possibly
here)|https://airflow.apache.org/docs/stable/concepts.html?highlight=trigger%20rule#dag-assignment]
 about how assigning operators to DAGs after creation means the default_args
aren't applied.
 * In general I think the Concepts/Core Ideas section has a lot of interesting
and useful information, but since it is one long page it is harder to find the
information you need. I think it would be better split into several pages with
more information and examples for each concept.
 * Some information on best practices, how to avoid common pitfalls, etc.
 * [LocalExecutor is
missing|https://airflow.apache.org/docs/stable/executor/index.html], and some
general information on the architecture and how these Executors are supposed to
work would be useful. I often see new users confused about what the Executor
really is, and they sometimes mix it up with operators.
 * [Security|https://airflow.apache.org/docs/stable/security.html#viewer] -
What are all the permissions, and what does each one actually mean? E.g.
'can_show' is not descriptive enough to know exactly what it means.






[jira] [Created] (AIRFLOW-6556) Improving unclear and incomplete documentation

2020-01-14 Thread Jacob Ward (Jira)
Jacob Ward created AIRFLOW-6556:
---

 Summary: Improving unclear and incomplete documentation
 Key: AIRFLOW-6556
 URL: https://issues.apache.org/jira/browse/AIRFLOW-6556
 Project: Apache Airflow
  Issue Type: Improvement
  Components: documentation
Affects Versions: master
Reporter: Jacob Ward
Assignee: Jarek Potiuk


To help improve documentation it was discussed in the mailing list that users 
of Airflow should have somewhere to report missing, incomplete or unclear 
documentation. Any users who find this should comment on this ticket.





[jira] [Closed] (AIRFLOW-5859) Tasks locking and heartbeat warnings

2019-11-21 Thread Jacob Ward (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-5859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacob Ward closed AIRFLOW-5859.
---
Resolution: Invalid

> Tasks locking and heartbeat warnings
> 
>
> Key: AIRFLOW-5859
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5859
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: DagRun
>Affects Versions: 1.10.6
> Environment: Airflow using LocalExecutor and Postgres
>Reporter: Jacob Ward
>Priority: Major
>
> I'm having two potentially related issues.
> Issue 1:
> Some of my tasks (only PythonOperators so far) start and then do nothing. A
> task that usually executes in 10-30 minutes ran for over 24 hours without
> failing and without any error messages in the logs (other than the heartbeat
> warnings shown below).
> So far this has only happened to tasks inside sub-dags; I'm not sure whether
> that is related.
> The logs for the sub-dag show:
> {code:}
> [2019-11-05 16:41:34,364] {logging_mixin.py:112} INFO - [2019-11-05 
> 16:41:34,364] {backfill_job.py:363} INFO - [backfill progress] | finished run 
> 0 of 1 | tasks waiting: 7 | succeeded: 5 | running: 1 | failed: 0 | skipped: 
> 1 | deadlocked: 0 | not ready: 7
> [2019-11-05 16:41:34,831] {logging_mixin.py:112} INFO - [2019-11-05 
> 16:41:34,830] {local_task_job.py:124} WARNING - Time since last 
> heartbeat(0.01 s) < heartrate(5.0 s), sleeping for 4.986693 s
> [2019-11-05 16:41:39,376] {logging_mixin.py:112} INFO - [2019-11-05 
> 16:41:39,376] {backfill_job.py:363} INFO - [backfill progress] | finished run 
> 0 of 1 | tasks waiting: 7 | succeeded: 5 | running: 1 | failed: 0 | skipped: 
> 1 | deadlocked: 0 | not ready: 7
> [2019-11-05 16:41:39,859] {logging_mixin.py:112} INFO - [2019-11-05 
> 16:41:39,859] {local_task_job.py:124} WARNING - Time since last 
> heartbeat(0.01 s) < heartrate(5.0 s), sleeping for 4.986141 s
> {code}
> repeated during that 24hr period.
> Issue 2:
> In all of the logs this warning message is printed every 5 seconds:
> {code:}
> [2019-11-06 14:25:29,466] {logging_mixin.py:112} INFO - [2019-11-06 
> 14:25:29,465] {local_task_job.py:124} WARNING - Time since last 
> heartbeat(0.01 s) < heartrate(5.0 s), sleeping for 4.987354 s {code}
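
The numbers in the warning quoted above are consistent with a simple throttle: if less than one heartrate interval has passed since the last heartbeat, sleep for the remainder. The following is a minimal sketch of that arithmetic only, not code copied from local_task_job.py; `seconds_to_sleep` is a hypothetical helper name, and the elapsed time of 0.012646 s is inferred from the log (5.0 - 4.987354), since the message prints it rounded to 0.01 s.

```python
def seconds_to_sleep(time_since_last_heartbeat, heartrate):
    """Return how long to wait so heartbeats are at least `heartrate` apart.

    Hypothetical helper: it models only the arithmetic implied by the
    warning text, under the assumption that the loop sleeps for the
    remainder of the interval.
    """
    return max(0.0, heartrate - time_since_last_heartbeat)


# Elapsed time inferred from the quoted warning (printed there as 0.01 s).
sleep_for = seconds_to_sleep(0.012646, 5.0)
print(f"sleeping for {sleep_for:.6f} s")

# Once a full interval has passed, no sleep is needed.
print(seconds_to_sleep(6.0, 5.0))
```

Read this way, the message describes the loop pacing itself every 5 seconds and does not, by itself, point to a fault in the task.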





[jira] [Commented] (AIRFLOW-5859) Tasks locking and heartbeat warnings

2019-11-14 Thread Jacob Ward (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16974088#comment-16974088
 ] 

Jacob Ward commented on AIRFLOW-5859:
-

I'm still having this issue, and oddly it seems to happen only with specific
tasks. I investigated one, and it appears the Python code executed
successfully but Airflow still has the task in the `Running` state.






[jira] [Updated] (AIRFLOW-5859) Tasks locking and heartbeat warnings

2019-11-06 Thread Jacob Ward (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-5859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacob Ward updated AIRFLOW-5859:

Description: 
I'm having two potentially related issues.

Issue 1:

Some of my tasks (only PythonOperators so far) start and then do nothing. A
task that usually executes in 10-30 minutes ran for over 24 hours without
failing and without any error messages in the logs (other than the heartbeat
warnings shown below).

So far this has only happened to tasks inside sub-dags; I'm not sure whether
that is related.

The logs for the sub-dag show:


{code:}
[2019-11-05 16:41:34,364] {logging_mixin.py:112} INFO - [2019-11-05 
16:41:34,364] {backfill_job.py:363} INFO - [backfill progress] | finished run 0 
of 1 | tasks waiting: 7 | succeeded: 5 | running: 1 | failed: 0 | skipped: 1 | 
deadlocked: 0 | not ready: 7
[2019-11-05 16:41:34,831] {logging_mixin.py:112} INFO - [2019-11-05 
16:41:34,830] {local_task_job.py:124} WARNING - Time since last heartbeat(0.01 
s) < heartrate(5.0 s), sleeping for 4.986693 s
[2019-11-05 16:41:39,376] {logging_mixin.py:112} INFO - [2019-11-05 
16:41:39,376] {backfill_job.py:363} INFO - [backfill progress] | finished run 0 
of 1 | tasks waiting: 7 | succeeded: 5 | running: 1 | failed: 0 | skipped: 1 | 
deadlocked: 0 | not ready: 7
[2019-11-05 16:41:39,859] {logging_mixin.py:112} INFO - [2019-11-05 
16:41:39,859] {local_task_job.py:124} WARNING - Time since last heartbeat(0.01 
s) < heartrate(5.0 s), sleeping for 4.986141 s
{code}
repeated during that 24hr period.
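
One way to confirm that nothing moved between the two progress entries above is to parse the counters and compare them. This is a rough sketch under stated assumptions: `parse_progress` and `is_stalled` are hypothetical helper names (not part of Airflow), and the input is the text after the "[backfill progress]" marker, joined back onto one line.

```python
import re

# Two consecutive progress entries from the log above, each joined back onto
# one line and trimmed to start at the "[backfill progress]" marker.
FIRST = (
    "[backfill progress] | finished run 0 of 1 | tasks waiting: 7 | "
    "succeeded: 5 | running: 1 | failed: 0 | skipped: 1 | deadlocked: 0 | "
    "not ready: 7"
)
SECOND = FIRST  # the 16:41:39 entry reports the same counters as 16:41:34

# Matches "| name: count"; the "finished run 0 of 1" field has no colon
# and is therefore skipped.
COUNTER = re.compile(r"\|\s*([a-z ]+):\s*(\d+)")


def parse_progress(line):
    """Return the 'name: count' pairs of a progress line as a dict."""
    return {name.strip(): int(count) for name, count in COUNTER.findall(line)}


def is_stalled(earlier, later):
    """Hypothetical check: no counter moved and something is still running."""
    return earlier == later and later.get("running", 0) > 0


first, second = parse_progress(FIRST), parse_progress(SECOND)
print(first)
print("stalled:", is_stalled(first, second))
```

Comparing entries that are hours apart, not seconds, would be a stronger signal; the two entries here are only 5 seconds apart.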


Issue 2:

In all of the logs this warning message is printed every 5 seconds:


{code:}
[2019-11-06 14:25:29,466] {logging_mixin.py:112} INFO - [2019-11-06 
14:25:29,465] {local_task_job.py:124} WARNING - Time since last heartbeat(0.01 
s) < heartrate(5.0 s), sleeping for 4.987354 s {code}




[jira] [Created] (AIRFLOW-5859) Tasks locking and heartbeat warnings

2019-11-06 Thread Jacob Ward (Jira)
Jacob Ward created AIRFLOW-5859:
---

 Summary: Tasks locking and heartbeat warnings
 Key: AIRFLOW-5859
 URL: https://issues.apache.org/jira/browse/AIRFLOW-5859
 Project: Apache Airflow
  Issue Type: Bug
  Components: DagRun
Affects Versions: 1.10.6
 Environment: Airflow using LocalExecutor and Postgres
Reporter: Jacob Ward


I'm having two potentially related issues.

Issue 1:

Some of my tasks (only PythonOperators so far) start and then do nothing. A
task that usually executes in 10-30 minutes ran for over 24 hours without
failing and without any error messages in the logs (other than the heartbeat
warnings shown below).

So far this has only happened to tasks inside sub-dags; I'm not sure whether
that is related.

The logs for the sub-dag show:


{code:java}
[2019-11-05 16:41:34,364] {logging_mixin.py:112} INFO - [2019-11-05 
16:41:34,364] {backfill_job.py:363} INFO - [backfill progress] | finished run 0 
of 1 | tasks waiting: 7 | succeeded: 5 | running: 1 | failed: 0 | skipped: 1 | 
deadlocked: 0 | not ready: 7
[2019-11-05 16:41:34,831] {logging_mixin.py:112} INFO - [2019-11-05 
16:41:34,830] {local_task_job.py:124} WARNING - Time since last heartbeat(0.01 
s) < heartrate(5.0 s), sleeping for 4.986693 s
[2019-11-05 16:41:39,376] {logging_mixin.py:112} INFO - [2019-11-05 
16:41:39,376] {backfill_job.py:363} INFO - [backfill progress] | finished run 0 
of 1 | tasks waiting: 7 | succeeded: 5 | running: 1 | failed: 0 | skipped: 1 | 
deadlocked: 0 | not ready: 7
[2019-11-05 16:41:39,859] {logging_mixin.py:112} INFO - [2019-11-05 
16:41:39,859] {local_task_job.py:124} WARNING - Time since last heartbeat(0.01 
s) < heartrate(5.0 s), sleeping for 4.986141 s
{code}
repeated during that 24hr period.


Issue 2:

In all of the logs this warning message is printed every 5 seconds:

{code:java}
[2019-11-06 14:25:29,466] {logging_mixin.py:112} INFO - [2019-11-06 
14:25:29,465] {local_task_job.py:124} WARNING - Time since last heartbeat(0.01 
s) < heartrate(5.0 s), sleeping for 4.987354 s {code}


