Github user mridulm commented on the issue:

    https://github.com/apache/spark/pull/16867
  
    The cost of the median heap could be higher than a TreeMap's, IMO; for
example, there is an additional dequeue + enqueue whenever a rebalance is
required. If the cost is high enough, we might want to take another look at the
PR. For example:
    * if the overhead is not negligible, we would want to disable this when
speculative execution is disabled;
    * depending on how high the cost is, disable it when numTasks is below some
threshold.
    Essentially, the question is whether the benefit of the median computation
outweighs the cost of maintaining the data structure. Looking at a range of
task cardinalities would help us arrive at a better judgement.
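    To make the rebalance cost concrete, here is a minimal sketch (my own
illustration, not the PR's actual code; class and method names are hypothetical)
of the usual two-heap median structure: each insert is O(log n), and at most one
extra poll + offer is paid when the halves need rebalancing.

```java
import java.util.Collections;
import java.util.PriorityQueue;

// Sketch of a two-heap median structure, to illustrate the per-insert
// rebalance cost discussed above. Not the PR's implementation.
public class MedianHeap {
    // Max-heap holding the smaller half of the values.
    private final PriorityQueue<Long> lower =
        new PriorityQueue<>(Collections.reverseOrder());
    // Min-heap holding the larger half of the values.
    private final PriorityQueue<Long> upper = new PriorityQueue<>();

    public void insert(long v) {
        if (lower.isEmpty() || v <= lower.peek()) {
            lower.offer(v);
        } else {
            upper.offer(v);
        }
        // Rebalance: at most one extra poll + offer per insert.
        if (lower.size() > upper.size() + 1) {
            upper.offer(lower.poll());
        } else if (upper.size() > lower.size() + 1) {
            lower.offer(upper.poll());
        }
    }

    public double median() {
        if (lower.size() == upper.size()) {
            return (lower.peek() + upper.peek()) / 2.0;
        }
        return lower.size() > upper.size() ? lower.peek() : upper.peek();
    }
}
```

    By contrast, a TreeMap gives O(log n) inserts too, but locating the median
requires either a linear walk or extra bookkeeping, which is presumably what
motivated the two-heap approach; the question above is whether its constant
factors pay off when speculation is off or numTasks is small.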
    
    @kayousterhout, @squito any other comments on how to measure the impact of
the PR?

