[jira] [Created] (AIRFLOW-4854) Env Var passed to K8 worker pod should be case sensitive

2019-06-26 Thread raman (JIRA)
raman created AIRFLOW-4854:
--

 Summary: Env Var passed to K8 worker pod should be case sensitive
 Key: AIRFLOW-4854
 URL: https://issues.apache.org/jira/browse/AIRFLOW-4854
 Project: Apache Airflow
  Issue Type: Improvement
  Components: executors
Affects Versions: 1.10.2
Reporter: raman


We are using the KubernetesExecutor, and there is a use case where users 
provide environment variables in lower case that are read by their custom 
operator in the K8 pod. But currently it seems that the K8 executor 
transforms environment variable names to upper case 
[https://github.com/apache/airflow/blob/f520d02cc1f41f9861f479f984bb52bda3860d30/airflow/kubernetes/secret.py#L43].

These should be passed in a case-sensitive manner.
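A minimal sketch of the proposed behaviour (the class shape below is illustrative, not the actual code in airflow/kubernetes/secret.py): the env var name supplied by the user is kept as-is instead of being upper-cased.

```python
class Secret:
    """Illustrative sketch of a K8 secret-to-env-var mapping.

    The real secret.py upper-cases deploy_target (the env var name);
    the proposal is to preserve the user's original casing.
    """

    def __init__(self, deploy_type, deploy_target, secret, key=None):
        self.deploy_type = deploy_type
        # Proposed: no .upper() here, so the name stays case sensitive.
        self.deploy_target = deploy_target
        self.secret = secret
        self.key = key


# A lower-case env var name survives unchanged into the worker pod spec.
s = Secret('env', 'my_lower_case_var', 'airflow-secrets', 'sql_alchemy_conn')
```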



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (AIRFLOW-4789) Add Mysql Exception Handling while pushing data to XCom

2019-06-13 Thread raman (JIRA)
raman created AIRFLOW-4789:
--

 Summary: Add Mysql Exception Handling while pushing data to XCom
 Key: AIRFLOW-4789
 URL: https://issues.apache.org/jira/browse/AIRFLOW-4789
 Project: Apache Airflow
  Issue Type: Improvement
  Components: xcom
Affects Versions: 1.10.2
Reporter: raman


We have seen MySQL exceptions while setting XCom data. It would be better to 
add exception handling there.
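As a rough sketch of what such handling could look like (the wrapper below is illustrative, not Airflow's actual XCom API; retrying is just one possible policy):

```python
import logging
import time

log = logging.getLogger(__name__)


def xcom_set_with_retry(set_fn, key, value, retries=3, delay=0.1):
    """Retry an XCom write when the metadata DB raises a transient error."""
    for attempt in range(1, retries + 1):
        try:
            return set_fn(key, value)
        except Exception as exc:  # e.g. a MySQL OperationalError
            log.warning("XCom set failed (attempt %d/%d): %s",
                        attempt, retries, exc)
            if attempt == retries:
                raise
            time.sleep(delay)


# Simulate a store whose first write hits a transient DB error.
calls = {"n": 0}

def flaky_set(key, value):
    calls["n"] += 1
    if calls["n"] == 1:
        raise RuntimeError("MySQL server has gone away")
    return (key, value)

result = xcom_set_with_retry(flaky_set, "return_value", 42, delay=0.0)
```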





[jira] [Assigned] (AIRFLOW-4586) Task getting stuck in Queued State

2019-05-31 Thread raman (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-4586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

raman reassigned AIRFLOW-4586:
--

Assignee: raman

> Task getting stuck in Queued State
> --
>
> Key: AIRFLOW-4586
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4586
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: scheduler
>Affects Versions: 1.10.0
>Reporter: raman
>Assignee: raman
>Priority: Major
>
> We are observing intermittently that tasks get stuck in the queued state and 
> never get executed by Airflow. On debugging this we found that one of the 
> queued dependencies was not met, due to which the task did not move from the 
> queued to the running state, so it remained queued (are_dependencies_met 
> returned false for QUEUE_DEPS inside 
> _check_and_change_state_before_execution).
> Looking into the scheduler code, it seems the scheduler does not reschedule 
> tasks in the queued state, so the task was never added to the executor queue 
> again and remained stuck. There is logic inside 
> _check_and_change_state_before_execution to move a task from queued to the 
> None state (which gets picked up by the scheduler for rescheduling) if 
> RUN_DEPS are not met, but this logic seems to be missing for QUEUE_DEPS. The 
> task should be moved to the None state when QUEUE_DEPS are not met as well.





[jira] [Commented] (AIRFLOW-4586) Task getting stuck in Queued State

2019-05-31 Thread raman (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-4586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16852848#comment-16852848
 ] 

raman commented on AIRFLOW-4586:


Inside the _process_task_instances(self, dag, queue, session=None) function, 
the scheduler does not consider task instances in the queued state (rightly 
so, otherwise it might add duplicate tasks to the executor queue), so a task 
remains queued if its queue deps are not met.

Below is the code snippet for the same:

for run in active_dag_runs:
    self.log.debug("Examining active DAG run: %s", run)
    # this needs a fresh session sometimes tis get detached
    tis = run.get_task_instances(state=(State.NONE,
                                        State.UP_FOR_RETRY,
                                        State.UP_FOR_RESCHEDULE))

> Task getting stuck in Queued State
> --
>
> Key: AIRFLOW-4586
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4586
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: scheduler
>Affects Versions: 1.10.0
>Reporter: raman
>Priority: Major
>
> We are observing intermittently that tasks get stuck in the queued state and 
> never get executed by Airflow. On debugging this we found that one of the 
> queued dependencies was not met, due to which the task did not move from the 
> queued to the running state, so it remained queued (are_dependencies_met 
> returned false for QUEUE_DEPS inside 
> _check_and_change_state_before_execution).
> Looking into the scheduler code, it seems the scheduler does not reschedule 
> tasks in the queued state, so the task was never added to the executor queue 
> again and remained stuck. There is logic inside 
> _check_and_change_state_before_execution to move a task from queued to the 
> None state (which gets picked up by the scheduler for rescheduling) if 
> RUN_DEPS are not met, but this logic seems to be missing for QUEUE_DEPS. The 
> task should be moved to the None state when QUEUE_DEPS are not met as well.





[jira] [Created] (AIRFLOW-4594) Kubernetes watcher updating Task state based on Pod state

2019-05-30 Thread raman (JIRA)
raman created AIRFLOW-4594:
--

 Summary: Kubernetes watcher updating Task state based on Pod state
 Key: AIRFLOW-4594
 URL: https://issues.apache.org/jira/browse/AIRFLOW-4594
 Project: Apache Airflow
  Issue Type: Bug
  Components: executors
Affects Versions: 1.10.1
Reporter: raman


Currently KubernetesJobWatcher updates the task state based on the pod state 
inside the process_status function. If it sees the pod state as failed, it 
marks the task as failed, which is not correct behaviour: the task might 
instead move to the retry or reschedule state.

A pod might fail for various reasons (a machine or network issue, a crash), 
so its state does not necessarily reflect the task state.
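One possible direction, sketched here with assumed names (this is not the actual KubernetesJobWatcher code), is to consult the task's retry budget before settling on a final state:

```python
def resolve_task_state(pod_phase, try_number, max_tries):
    """Map a failed pod to a task state using the retry budget
    (illustrative sketch; names and signature are assumptions)."""
    if pod_phase != "Failed":
        return None  # let the normal status flow handle other phases
    if try_number < max_tries:
        return "up_for_retry"  # the pod died, but the task can retry
    return "failed"
```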





[jira] [Created] (AIRFLOW-4586) Task getting stuck in Queued State

2019-05-29 Thread raman (JIRA)
raman created AIRFLOW-4586:
--

 Summary: Task getting stuck in Queued State
 Key: AIRFLOW-4586
 URL: https://issues.apache.org/jira/browse/AIRFLOW-4586
 Project: Apache Airflow
  Issue Type: Bug
  Components: scheduler
Affects Versions: 1.10.0
Reporter: raman


We are observing intermittently that tasks get stuck in the queued state and 
never get executed by Airflow. On debugging this we found that one of the 
queued dependencies was not met, due to which the task did not move from the 
queued to the running state, so it remained queued (are_dependencies_met 
returned false for QUEUE_DEPS inside _check_and_change_state_before_execution).

Looking into the scheduler code, it seems the scheduler does not reschedule 
tasks in the queued state, so the task was never added to the executor queue 
again and remained stuck. There is logic inside 
_check_and_change_state_before_execution to move a task from queued to the 
None state (which gets picked up by the scheduler for rescheduling) if 
RUN_DEPS are not met, but this logic seems to be missing for QUEUE_DEPS. The 
task should be moved to the None state when QUEUE_DEPS are not met as well.
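The proposed fix could be sketched roughly as follows (the function shape is an assumption for illustration, not the actual Airflow code):

```python
def next_state(current, queue_deps_met):
    """Decide a task's next state in
    _check_and_change_state_before_execution (illustrative).

    If a QUEUED task's QUEUE_DEPS are not met, reset it to None so the
    scheduler picks it up for rescheduling, mirroring what is already
    done when RUN_DEPS are not met.
    """
    if current == "queued" and not queue_deps_met:
        return None  # back to the schedulable pool
    if current == "queued":
        return "running"
    return current
```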





[jira] [Assigned] (AIRFLOW-4091) Support for providing kube client arguments through airflow.cfg

2019-03-14 Thread raman (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-4091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

raman reassigned AIRFLOW-4091:
--

Assignee: raman

> Support for providing kube client arguments through airflow.cfg
> ---
>
> Key: AIRFLOW-4091
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4091
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: configuration, executor, kubernetes
>Affects Versions: 1.10.2
>Reporter: raman
>Assignee: raman
>Priority: Major
>
> KubeClient supports various config options like async, _request_timeout, 
> etc., but there is no way to provide these through airflow.cfg.
> So add the functionality to provide the different kube client config 
> options through the cfg.





[jira] [Created] (AIRFLOW-4091) Support for providing kube client arguments through airflow.cfg

2019-03-14 Thread raman (JIRA)
raman created AIRFLOW-4091:
--

 Summary: Support for providing kube client arguments through 
airflow.cfg
 Key: AIRFLOW-4091
 URL: https://issues.apache.org/jira/browse/AIRFLOW-4091
 Project: Apache Airflow
  Issue Type: Improvement
  Components: configuration, executor, kubernetes
Affects Versions: 1.10.2
Reporter: raman


KubeClient supports various config options like async, _request_timeout, 
etc., but there is no way to provide these through airflow.cfg.

So add the functionality to provide the different kube client config options 
through the cfg.
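A rough sketch of what this could look like, assuming a [kubernetes] section in airflow.cfg (the option names below are made up for illustration):

```python
import configparser

# Hypothetical airflow.cfg fragment carrying kube client options.
cfg_text = """
[kubernetes]
kube_client_request_timeout = 60
kube_client_async = False
"""

parser = configparser.ConfigParser()
parser.read_string(cfg_text)

# Collect the options into kwargs that could be forwarded to the
# kube client's API calls.
kube_client_kwargs = {
    "_request_timeout": parser.getint("kubernetes",
                                      "kube_client_request_timeout"),
    "async_req": parser.getboolean("kubernetes", "kube_client_async"),
}
```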





[jira] [Assigned] (AIRFLOW-3917) Support for running Airflow scheduler(with KubernetesExecutor) outside of K8 cluster

2019-03-04 Thread raman (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

raman reassigned AIRFLOW-3917:
--

Assignee: raman

> Support for running Airflow scheduler(with KubernetesExecutor) outside of K8 
> cluster
> 
>
> Key: AIRFLOW-3917
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3917
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: executor, kubernetes, scheduler
>Affects Versions: 1.10.1, 1.10.2
>Reporter: raman
>Assignee: raman
>Priority: Major
>
> Currently it seems that the Airflow scheduler needs to run inside the K8 
> cluster as a long-running pod.
> Though there is an "in_cluster" config exposed through airflow.cfg, there 
> does not seem to be a provision to pass the required config file while 
> creating the kube_client.
> Following is the code snippet from kubernetes_executor.py's start function, 
> where kube_client is configured without passing any config:
> self.kube_client = get_kube_client()





[jira] [Created] (AIRFLOW-3936) Support for emptyDir volume in KubernetesExecutor

2019-02-21 Thread raman (JIRA)
raman created AIRFLOW-3936:
--

 Summary: Support for emptyDir volume in KubernetesExecutor
 Key: AIRFLOW-3936
 URL: https://issues.apache.org/jira/browse/AIRFLOW-3936
 Project: Apache Airflow
  Issue Type: Improvement
  Components: executor, kubernetes, scheduler
Affects Versions: 1.10.1
Reporter: raman


Currently it seems that the K8 executor expects dags_volume_claim or git_repo 
to always be defined through airflow.cfg. Otherwise it does not come up.
Though there is support for an "emptyDir" volume in worker_configuration.py, 
kubernetes_executor fails in the _validate function if these configs are not 
defined.
Our DAG files are stored in a remote location and can be synced to the worker 
pod through an init/side-car container. We are exploring whether it makes 
sense to allow the K8 executor to come up when dags_volume_claim and git_repo 
are not defined. In such cases the worker pod would look for the DAGs in the 
emptyDir at the worker_airflow_dags path (as it does for git-sync). DAG files 
can be made available at the worker_airflow_dags path through an 
init/side-car container.
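The relaxed validation being explored could be sketched as follows (the function and flag names are assumptions, not the actual _validate code):

```python
def validate_dag_source(dags_volume_claim, git_repo, allow_empty_dir=True):
    """Accept an emptyDir populated by an init/side-car container when
    neither a volume claim nor a git repo is configured (illustrative)."""
    if dags_volume_claim or git_repo:
        return "mounted"
    if allow_empty_dir:
        # DAGs are expected under the worker_airflow_dags path,
        # synced in by an init/side-car container.
        return "emptydir"
    raise ValueError("Either dags_volume_claim or git_repo must be set")
```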

 





[jira] [Created] (AIRFLOW-3917) Support for running Airflow scheduler(with KubernetesExecutor) outside of K8 cluster

2019-02-18 Thread raman (JIRA)
raman created AIRFLOW-3917:
--

 Summary: Support for running Airflow scheduler(with 
KubernetesExecutor) outside of K8 cluster
 Key: AIRFLOW-3917
 URL: https://issues.apache.org/jira/browse/AIRFLOW-3917
 Project: Apache Airflow
  Issue Type: Improvement
  Components: executor, kubernetes, scheduler
Affects Versions: 1.10.2, 1.10.1
Reporter: raman


Currently it seems that the Airflow scheduler needs to run inside the K8 
cluster as a long-running pod.

Though there is an "in_cluster" config exposed through airflow.cfg, there 
does not seem to be a provision to pass the required config file while 
creating the kube_client.

Following is the code snippet from kubernetes_executor.py's start function, 
where kube_client is configured without passing any config:

self.kube_client = get_kube_client()
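A sketch of the kind of provision being asked for (the signature and return shapes below are illustrative stand-ins, not the real get_kube_client):

```python
def get_kube_client(in_cluster=True, config_file=None):
    """Build a kube client config either from the in-cluster service
    account or from an explicit kubeconfig file (illustrative)."""
    if in_cluster:
        return {"mode": "in_cluster"}
    # Out-of-cluster: load credentials from a kubeconfig path.
    return {"mode": "kubeconfig",
            "config_file": config_file or "~/.kube/config"}


# A scheduler running outside the cluster could then do:
client = get_kube_client(in_cluster=False,
                         config_file="/etc/airflow/kubeconfig")
```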





[jira] [Created] (AIRFLOW-3916) Remove Metastore dependency for K8 Worker pod heartbeat

2019-02-18 Thread raman (JIRA)
raman created AIRFLOW-3916:
--

 Summary: Remove Metastore dependency for K8 Worker pod heartbeat
 Key: AIRFLOW-3916
 URL: https://issues.apache.org/jira/browse/AIRFLOW-3916
 Project: Apache Airflow
  Issue Type: Improvement
  Components: executor, kubernetes, worker
Affects Versions: 1.10.2, 1.10.1
Reporter: raman


Currently the K8 worker pod uses the metastore for continuous heartbeats, 
which might make the metastore a bottleneck with thousands of running pods.

Airflow in K8 executor mode uses KubernetesJobWatcher to track the status of 
submitted worker pods. It can be leveraged to remove the worker pods' 
dependency on the metastore for heartbeating.
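One way the watcher could stand in for metastore heartbeats, sketched with assumed names (this is not actual Airflow code):

```python
import time


class PodLiveness:
    """Track worker-pod liveness from watcher events instead of
    metastore heartbeats (illustrative sketch)."""

    def __init__(self, timeout=300.0):
        self.timeout = timeout
        self.last_seen = {}

    def on_watch_event(self, pod_name, now=None):
        # Every watcher event for a pod counts as a heartbeat.
        self.last_seen[pod_name] = time.time() if now is None else now

    def is_alive(self, pod_name, now=None):
        now = time.time() if now is None else now
        seen = self.last_seen.get(pod_name)
        return seen is not None and (now - seen) <= self.timeout


tracker = PodLiveness(timeout=10.0)
tracker.on_watch_event("worker-pod-1", now=100.0)
```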





[jira] [Closed] (AIRFLOW-3882) Optimise APIs to create DAG model from DB instead of calling DagBag()

2019-02-12 Thread raman (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

raman closed AIRFLOW-3882.
--
Resolution: Not A Problem

> Optimise APIs to create DAG model from DB instead of calling DagBag()
> -
>
> Key: AIRFLOW-3882
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3882
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: api, webserver
>Affects Versions: 1.10.2
>Reporter: raman
>Priority: Minor
>
> Currently the following APIs create the DAG object by calling DagBag() on 
> all the files, which is not an optimised way. There is an alternative: 
> create the DAG object from the DB.
> -> get_dag_run_state
> -> get_dag_runs
> -> /dags//tasks/
> -> /dags//dag_runs//tasks/
>  





[jira] [Created] (AIRFLOW-3882) Optimise APIs to create DAG model from DB instead of calling DagBag()

2019-02-12 Thread raman (JIRA)
raman created AIRFLOW-3882:
--

 Summary: Optimise APIs to create DAG model from DB instead of 
calling DagBag()
 Key: AIRFLOW-3882
 URL: https://issues.apache.org/jira/browse/AIRFLOW-3882
 Project: Apache Airflow
  Issue Type: Improvement
  Components: api, webserver
Affects Versions: 1.10.2
Reporter: raman


Currently the following APIs create the DAG object by calling DagBag() on all 
the files, which is not an optimised way. There is an alternative: create the 
DAG object from the DB.

-> get_dag_run_state

-> get_dag_runs

-> /dags//tasks/

-> /dags//dag_runs//tasks/
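The difference can be sketched like this (the registry and table shapes below are assumptions for illustration, not Airflow's real internals):

```python
parse_calls = {"n": 0}

def parse_all_dag_files():
    """Stand-in for DagBag(): pretend to parse every DAG file."""
    parse_calls["n"] += 1
    return {"example_dag": {"dag_id": "example_dag"},
            "other_dag": {"dag_id": "other_dag"}}

# Stand-in for the dag table in the metadata DB.
DAG_TABLE = {"example_dag": {"dag_id": "example_dag"}}

def get_dag_via_dagbag(dag_id):
    # Current behaviour: a full parse of all files per API call.
    return parse_all_dag_files().get(dag_id)

def get_dag_via_db(dag_id):
    # Proposed: a single lookup against the metadata DB.
    return DAG_TABLE.get(dag_id)

a = get_dag_via_dagbag("example_dag")
b = get_dag_via_db("example_dag")
```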

 





[jira] [Created] (AIRFLOW-3878) K8 Pod creation error if dag_id exceeds 63 characters

2019-02-12 Thread raman (JIRA)
raman created AIRFLOW-3878:
--

 Summary: K8 Pod creation error if dag_id exceeds 63 characters 
 Key: AIRFLOW-3878
 URL: https://issues.apache.org/jira/browse/AIRFLOW-3878
 Project: Apache Airflow
  Issue Type: Bug
  Components: executor, kubernetes
Affects Versions: 1.10.2
Reporter: raman


Getting the following error while creating a K8 worker pod. The DAG id length 
is > 63 chars, and as per 
[https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/] a 
label value must be <= 63 characters.

HTTP response body: 
{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"unable
 to parse requirement: invalid label value: \"\": must be no more than 
63 characters","reason":"BadRequest","code":400}

The K8 executor sets the dag_id as a label value while creating the K8 pod. 
Avoid adding dag_id and task_id as pod labels if their size exceeds 63 chars.
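The proposed guard could look roughly like this (the helper name is an assumption):

```python
MAX_LABEL_LEN = 63  # Kubernetes limit on a label value's length


def safe_pod_labels(dag_id, task_id):
    """Only attach dag_id/task_id as pod labels when they fit the
    63-character Kubernetes label-value limit (illustrative sketch)."""
    labels = {}
    if len(dag_id) <= MAX_LABEL_LEN:
        labels["dag_id"] = dag_id
    if len(task_id) <= MAX_LABEL_LEN:
        labels["task_id"] = task_id
    return labels


# A 70-char dag_id is dropped; the short task_id is kept.
labels = safe_pod_labels("d" * 70, "short_task")
```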

 





[jira] [Created] (AIRFLOW-3865) Expose API to get Python code corresponding to a Dag Id

2019-02-10 Thread raman (JIRA)
raman created AIRFLOW-3865:
--

 Summary: Expose API to get Python code corresponding to a Dag Id
 Key: AIRFLOW-3865
 URL: https://issues.apache.org/jira/browse/AIRFLOW-3865
 Project: Apache Airflow
  Issue Type: Improvement
  Components: api, webserver
Affects Versions: 1.9.0, 2.0.0
Reporter: raman
Assignee: raman


Currently Airflow's experimental API does not expose an endpoint to download 
a DAG's Python code. It might be required for sharing and debugging purposes.

The proposal here is to expose the following GET endpoint to allow 
downloading the code for a dag_id:

/dags//code
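A sketch of what the handler behind such an endpoint might do (the registry shape and names are assumptions, not Airflow's actual internals):

```python
import os
import tempfile


def get_dag_code(dag_registry, dag_id):
    """Look up the DAG's source file and return its contents
    (illustrative handler sketch)."""
    dag = dag_registry.get(dag_id)
    if dag is None:
        return 404, "DAG not found"
    with open(dag["fileloc"], encoding="utf-8") as f:
        return 200, f.read()


# Demo with a throwaway DAG file.
fd, path = tempfile.mkstemp(suffix=".py")
os.write(fd, b"# my dag definition\n")
os.close(fd)
status, body = get_dag_code({"example_dag": {"fileloc": path}}, "example_dag")
```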

 





[jira] [Updated] (AIRFLOW-3516) Improve K8 Executors Pod creation Rate

2019-01-03 Thread raman (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

raman updated AIRFLOW-3516:
---
Component/s: (was: scheduler)

> Improve K8 Executors Pod creation Rate
> --
>
> Key: AIRFLOW-3516
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3516
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: executor, kubernetes
>Affects Versions: 1.10.1
>Reporter: raman
>Assignee: raman
>Priority: Major
>
> It seems that the Airflow scheduler creates worker pods sequentially (one 
> pod per scheduler loop) in the Kubernetes executor's sync function, which 
> is called from the base executor's heartbeat() function.
> One enhancement could be to create the K8 pods in batches, where the batch 
> size can be configured through the airflow.cfg file.





[jira] [Assigned] (AIRFLOW-3516) Improve K8 Executors Pod creation Rate

2019-01-03 Thread raman (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

raman reassigned AIRFLOW-3516:
--

Assignee: raman

> Improve K8 Executors Pod creation Rate
> --
>
> Key: AIRFLOW-3516
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3516
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: executor, kubernetes, scheduler
>Affects Versions: 1.10.1
>Reporter: raman
>Assignee: raman
>Priority: Major
>
> It seems that the Airflow scheduler creates worker pods sequentially (one 
> pod per scheduler loop) in the Kubernetes executor's sync function, which 
> is called from the base executor's heartbeat() function.
> One enhancement could be to create the K8 pods in batches, where the batch 
> size can be configured through the airflow.cfg file.
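The batching idea could be sketched like this (names are illustrative; batch_size would come from airflow.cfg):

```python
def drain_in_batches(task_queue, run_pod, batch_size=5):
    """Launch up to batch_size worker pods per scheduler sync instead
    of one pod per loop (illustrative sketch)."""
    launched = 0
    while task_queue and launched < batch_size:
        run_pod(task_queue.pop(0))
        launched += 1
    return launched


created = []
queue = ["t1", "t2", "t3", "t4", "t5", "t6", "t7"]
n = drain_in_batches(queue, created.append, batch_size=5)
```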





[jira] [Assigned] (AIRFLOW-3613) Update Readme to add adobe as airflow user

2019-01-01 Thread raman (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

raman reassigned AIRFLOW-3613:
--

Assignee: raman

> Update Readme to add adobe as airflow user
> --
>
> Key: AIRFLOW-3613
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3613
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: Documentation
>Affects Versions: 1.10.0, 1.10.1
>Reporter: raman
>Assignee: raman
>Priority: Minor
>






[jira] [Created] (AIRFLOW-3613) Update Readme to add adobe as airflow user

2019-01-01 Thread raman (JIRA)
raman created AIRFLOW-3613:
--

 Summary: Update Readme to add adobe as airflow user
 Key: AIRFLOW-3613
 URL: https://issues.apache.org/jira/browse/AIRFLOW-3613
 Project: Apache Airflow
  Issue Type: Improvement
  Components: Documentation
Affects Versions: 1.10.1, 1.10.0
Reporter: raman





