tirkarthi commented on PR #38434: URL: https://github.com/apache/airflow/pull/38434#issuecomment-2016741776
One other thing I noticed is that the "mean run duration" was actually the "mean total duration". Because the bars are stacked, the value used to calculate the mean was "queued duration + run duration". I mistook it for run duration because most of my task instances spent less than a few seconds in the queued state.

Below is an example of the change in value. The difference is more visible when tasks spend more time in the queued state than in actual execution: the mean run duration markline then falls below the mean queued duration markline, whereas the mean total duration markline will be above the queued duration markline.

To have the median only for run duration, we have to add `"valueDim": 2` to the second markline. Is "mean total" more useful than "mean run"?

Example with mean queued, run and total plotted.

```python
>>> import statistics
>>> queued
[1.75, 1.72, 1.9, 1.58, 1.81]
>>> run
[28.18, 20.2, 16.16, 1.19, 22.19]
>>> statistics.median(run)  # run duration only
20.2
>>> statistics.median([q + r for q, r in zip(queued, run)])  # total duration (queued + run)
21.919999999999998
```
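The queue-dominated case described above can be sketched as follows. The durations are made-up values chosen so that tasks wait much longer than they execute; they are not taken from the PR or from any real DAG.

```python
import statistics

# Hypothetical durations in seconds (not from the PR): tasks that wait in the
# queue much longer than they execute.
queued = [30.0, 42.0, 35.0, 50.0, 38.0]
run = [2.0, 3.5, 1.5, 4.0, 2.5]

# A stacked bar's height is queued + run, so a statistic computed on the
# stacked value describes total duration, not run duration.
total = [q + r for q, r in zip(queued, run)]

for name, values in (("queued", queued), ("run", run), ("total", total)):
    print(
        f"{name:>6}: mean={statistics.mean(values):.2f} "
        f"median={statistics.median(values):.2f}"
    )
```

With this data, both the mean and the median of `run` fall below those of `queued`, while the values for `total` sit above `queued`, which is the markline ordering the comment describes.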
