guptakumartanuj commented on a change in pull request #5037: [AIRFLOW-4237]
Including Try Number of Task in Gantt Chart
URL: https://github.com/apache/airflow/pull/5037#discussion_r289127472
##########
File path: airflow/www/views.py
##########
@@ -1738,10 +1738,21 @@ def gantt(self, session=None):
gantt_bar_items = []
for ti in tis:
end_date = ti.end_date or timezone.utcnow()
- gantt_bar_items.append((ti.task_id, ti.start_date, end_date,
ti.state))
+ try_count = ti.try_number
+ if ti.state == State.FAILED or ti.state == State.SUCCESS:
+ try_count = ti.try_number - 1
+ gantt_bar_items.append((ti.task_id, ti.start_date, end_date,
ti.state, try_count))
+
+ tf_count = 0
+ prev_task_id = ""
for tf in ti_fails:
end_date = tf.end_date or timezone.utcnow()
- gantt_bar_items.append((tf.task_id, tf.start_date, end_date,
State.FAILED))
+ try_count = 1
+ if tf_count != 0 and tf.task_id == prev_task_id:
+ try_count = try_count + 1
Review comment:
Answers to your questions:
1. I understand the concern, but that order is already maintained by this
ordered
[list](https://github.com/apache/airflow/pull/5037/files#diff-948e87b4f8f644b3ad8c7950958df033R1728),
which was built with the Gantt chart in mind. In any case, all metadata apart
from the try number already arrives in sorted order.
The following use case illustrates this:
mysql> select * from task_fail as tf where TF.task_id = 'get_op' AND
TF.execution_date = '2019-05-29 05:30:00.000000' AND
TF.dag_id='example_http_operator1';
+----+---------+------------------------+----------------------------+----------------------------+----------------------------+----------+
| id | task_id | dag_id                 | execution_date             | start_date                 | end_date                   | duration |
+----+---------+------------------------+----------------------------+----------------------------+----------------------------+----------+
|  8 | get_op  | example_http_operator1 | 2019-05-29 05:30:00.000000 | 2019-05-30 22:37:45.177828 | 2019-05-30 22:37:51.449197 |        6 |
| 27 | get_op  | example_http_operator1 | 2019-05-29 05:30:00.000000 | 2019-05-30 22:43:08.766188 | 2019-05-30 22:43:10.270735 |        2 |
| 46 | get_op  | example_http_operator1 | 2019-05-29 05:30:00.000000 | 2019-05-30 22:48:20.494280 | 2019-05-30 22:48:22.069887 |        2 |
| 62 | get_op  | example_http_operator1 | 2019-05-29 05:30:00.000000 | 2019-05-30 22:53:27.867508 | 2019-05-30 22:53:29.474945 |        2 |
+----+---------+------------------------+----------------------------+----------------------------+----------------------------+----------+
4 rows in set (0.00 sec)
mysql> select * from task_fail as tf where TF.task_id = 'post_op' AND
TF.execution_date = '2019-05-29 05:30:00.000000' AND
TF.dag_id='example_http_operator1';
+----+---------+------------------------+----------------------------+----------------------------+----------------------------+----------+
| id | task_id | dag_id                 | execution_date             | start_date                 | end_date                   | duration |
+----+---------+------------------------+----------------------------+----------------------------+----------------------------+----------+
| 17 | post_op | example_http_operator1 | 2019-05-29 05:30:00.000000 | 2019-05-30 22:38:19.100704 | 2019-05-30 22:38:26.619226 |        8 |
| 36 | post_op | example_http_operator1 | 2019-05-29 05:30:00.000000 | 2019-05-30 22:43:36.747317 | 2019-05-30 22:43:38.238442 |        1 |
| 53 | post_op | example_http_operator1 | 2019-05-29 05:30:00.000000 | 2019-05-30 22:48:48.267570 | 2019-05-30 22:48:49.595367 |        1 |
| 69 | post_op | example_http_operator1 | 2019-05-29 05:30:00.000000 | 2019-05-30 22:53:55.739041 | 2019-05-30 22:53:57.038769 |        1 |
+----+---------+------------------------+----------------------------+----------------------------+----------------------------+----------+
4 rows in set (0.00 sec)
Explanation: the list is formed by taking each task in turn and appending its
rows, as in the query results above. The screenshot below confirms the same
results:
<img width="1560" alt="Screen Shot 2019-05-31 at 12 24 37 AM"
src="https://user-images.githubusercontent.com/20847563/58656806-93a64500-833a-11e9-8df3-ab91d5ce1c65.png">
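To make the grouping argument concrete, here is a minimal sketch of how a per-task try count could be derived from failure rows that are grouped by task and ordered by start date. It is not the code from this PR (the diff excerpt above is truncated); `FailRow` and `number_failures` are hypothetical names, and the rows are copied from the two query results above with only the relevant columns.

```python
from collections import namedtuple
from itertools import groupby
from operator import attrgetter

# Hypothetical stand-in for a task_fail row; only the fields used here.
FailRow = namedtuple("FailRow", ["id", "task_id", "start_date"])


def number_failures(rows):
    """Return (task_id, start_date, try_count) tuples, counting from 1 per task."""
    # Group rows by task, oldest failure first within each task.
    ordered = sorted(rows, key=attrgetter("task_id", "start_date"))
    numbered = []
    for task_id, group in groupby(ordered, key=attrgetter("task_id")):
        for try_count, row in enumerate(group, start=1):
            numbered.append((task_id, row.start_date, try_count))
    return numbered


# Rows in id order, i.e. interleaved across the two tasks.
rows = [
    FailRow(8, "get_op", "2019-05-30 22:37:45.177828"),
    FailRow(17, "post_op", "2019-05-30 22:38:19.100704"),
    FailRow(27, "get_op", "2019-05-30 22:43:08.766188"),
    FailRow(36, "post_op", "2019-05-30 22:43:36.747317"),
    FailRow(46, "get_op", "2019-05-30 22:48:20.494280"),
    FailRow(53, "post_op", "2019-05-30 22:48:49.595367"),
    FailRow(62, "get_op", "2019-05-30 22:53:27.867508"),
    FailRow(69, "post_op", "2019-05-30 22:53:55.739041"),
]

numbered = number_failures(rows)
for item in numbered:
    print(item)
```

With the rows grouped this way, each task's failures are numbered 1 through 4 independently of how the rows were interleaved in the table.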
2. Yes, it is not appended twice; the reasoning is the same as above. All the
metadata except the try count is populated through the same logic as before. I
have only added the try count to both sets of gantt_bar_items. Beyond that, I
am showing the try count in every state. Since the task_fail table has no try
number column, I added logic to derive it, and that is how the try number is
appended to the gantt_bar_items of the failed tasks as well.
In conclusion, we need to display try_count every time a user opens the Gantt
UI of Airflow, irrespective of state, so that it is consistent across
different states.
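For the task-instance side, the diff above subtracts one from `ti.try_number` when the state is FAILED or SUCCESS. A minimal sketch of that adjustment follows, written without Airflow imports; the state strings and the helper name `displayed_try_count` are assumptions for illustration, not names from the PR.

```python
# Assumed stand-ins for State.FAILED and State.SUCCESS.
FINISHED_STATES = {"failed", "success"}


def displayed_try_count(state, try_number):
    """Try count to show in the Gantt bar, mirroring the diff's adjustment."""
    if state in FINISHED_STATES:
        # Finished tasks: the diff subtracts one from try_number.
        return try_number - 1
    return try_number


print(displayed_try_count("success", 2))
print(displayed_try_count("running", 2))
```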