Github user squito commented on the pull request:

    https://github.com/apache/spark/pull/9051#issuecomment-148808343
  
    Also jumping in late, but I agree with @andrewor14: I think we should just 
change duration to (1), since that would be the most useful.  My vote is for 
(last task end) - (first task start).  I see the argument for sum(task time) as 
well and am not strongly opposed to it, but in that case it would definitely 
need to be renamed from "duration", maybe to "total cpu time"?
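
    To make the difference between the two candidates concrete, here is a 
rough sketch in Python.  The (start, end) tuples and the function names are 
made up for illustration; they are not Spark APIs.

```python
from typing import List, Tuple

# Hypothetical task record: (start_ms, end_ms). Not a Spark type.
Task = Tuple[int, int]


def wall_clock_duration(tasks: List[Task]) -> int:
    """(last task end) - (first task start)."""
    first_start = min(start for start, _ in tasks)
    last_end = max(end for _, end in tasks)
    return last_end - first_start


def summed_task_time(tasks: List[Task]) -> int:
    """sum(task time): adds up every task's own run time."""
    return sum(end - start for start, end in tasks)


# Three overlapping tasks, so the two numbers differ.
example_tasks = [(0, 400), (100, 900), (150, 300)]
print(wall_clock_duration(example_tasks))  # 900
print(summed_task_time(example_tasks))     # 1350
```

    With overlapping tasks the sum exceeds the wall-clock span, which is why 
the sum would need a name other than "duration".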
    
    I do see the case for having something to help diagnose skew, but I'm not 
sure "max task time" alone really helps much.  I don't think there is one 
metric which is going to capture that plus the overall duration that's been 
discussed.  If we only want one metric on the page, I'd vote for the new 
"duration" over max task time.  I don't think max task time is really that 
useful in isolation.  It's useful on the stage page b/c you've also got the 
distribution.  It seems like you really want something like (max task time - 
90% task time) / (90% task time).  But we can probably spend all day arguing 
about our favorite skew metric ... which makes me wonder if this really belongs 
in the standard UI or not.
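
    For reference, the ratio above could be sketched like this in Python.  It 
assumes "90% task time" means the 90th percentile of task durations, computed 
here with the nearest-rank method; both the function name and that choice are 
assumptions for illustration, not anything in Spark.

```python
from typing import List


def skew_ratio(durations_ms: List[int]) -> float:
    """(max task time - p90 task time) / (p90 task time).

    p90 uses the nearest-rank method: the value at rank ceil(0.9 * n)
    in the sorted list (1-based). This is an assumed definition.
    """
    if not durations_ms:
        raise ValueError("need at least one task duration")
    ordered = sorted(durations_ms)
    n = len(ordered)
    rank = (9 * n + 9) // 10  # integer form of ceil(0.9 * n)
    p90 = ordered[rank - 1]
    if p90 == 0:
        raise ValueError("p90 task time is zero; ratio is undefined")
    return (ordered[-1] - p90) / p90


# Nine tasks at 100 ms and one straggler at 1000 ms.
print(skew_ratio([100] * 9 + [1000]))  # 9.0
# Perfectly even tasks show no skew.
print(skew_ratio([100] * 10))          # 0.0
```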


