danccooper commented on a change in pull request #6377: [AIRFLOW-5589] monitor pods by labels instead of names
URL: https://github.com/apache/airflow/pull/6377#discussion_r338024754
##########
File path: airflow/contrib/operators/kubernetes_pod_operator.py
##########
@@ -112,55 +113,60 @@ class KubernetesPodOperator(BaseOperator): # pylint: disable=too-many-instance-
"""
template_fields = ('cmds', 'arguments', 'env_vars', 'config_file')
+ @staticmethod
+ def create_labels_for_pod(context):
+ """
+ Generate labels for the pod s.t. we can track it in case of Operator crash
+
+ :param context:
+ :return:
+ """
+ labels = {
+ 'dag_id': context['dag'].dag_id,
+ 'task_id': context['task'].task_id,
+ 'exec_date': context['ts'],
+ 'try_number': context['ti'].try_number,
Review comment:
I think we need to consider the desired behaviour here.
With try_number in the labels, a retry will not match the pod from an earlier
try, so we can end up creating duplicate pods when a previous try is still
running, e.g. after a task timeout where the pod was never killed or the kill
failed. That reintroduces the duplicate-pod scenario this PR is trying to
solve. It would certainly cause issues for my jobs, since we don't use
try_number when building output locations, for example.
One option is to check for a pod from the previous try first and, if it
exists, kill it before running the next one.
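
A rough sketch of that last option, to make the idea concrete. The helper names
`build_selector` and `kill_pods_from_previous_tries` are hypothetical and not
part of this PR. It assumes `client` behaves like a
`kubernetes.client.CoreV1Api` instance, and that the label values are already
strings that are valid as Kubernetes label values.

```python
def build_selector(labels, ignore=('try_number',)):
    """Build a label selector string, leaving out the keys listed in `ignore`."""
    return ','.join(
        '{}={}'.format(key, value)
        for key, value in sorted(labels.items())
        if key not in ignore
    )


def kill_pods_from_previous_tries(client, namespace, labels):
    """
    Delete pods that match on every label except try_number and that belong
    to a different try than the current one. Returns the deleted pod names.

    Assumption: `client` exposes list_namespaced_pod and delete_namespaced_pod
    like kubernetes.client.CoreV1Api. Some older client releases also require
    a `body` argument on delete_namespaced_pod.
    """
    current_try = str(labels['try_number'])
    selector = build_selector(labels)
    deleted = []
    pods = client.list_namespaced_pod(namespace, label_selector=selector)
    for pod in pods.items:
        pod_labels = pod.metadata.labels or {}
        # Skip the pod for the current try, if one already exists.
        if pod_labels.get('try_number') != current_try:
            client.delete_namespaced_pod(pod.metadata.name, namespace)
            deleted.append(pod.metadata.name)
    return deleted
```

Calling something like this before creating the new pod would keep try_number
as a label while still guaranteeing at most one live pod per task instance.
Whether a leftover pod should be killed or re-attached to is exactly the
behaviour question raised above.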