Github user mridulm commented on the pull request:

    https://github.com/apache/spark/pull/1391#issuecomment-48836408
  
    On Jul 13, 2014 3:16 PM, "nishkamravi2" <[email protected]> wrote:
    >
    > Mridul, I think you are missing the point. We understand that this
    parameter will in a lot of cases have to be specified by the developer,
    since there is no easy way to model it (that's why we are retaining it as a
    configurable parameter). However, the question is what a good
    default value would be.
    >
    
    It does not help to estimate using the wrong variable.
    Any correlation that exists is incidental and app-specific, as I
    elaborated before.
    
    The only actual correlation between executor memory and overhead is the
    Java VM's overhead in managing very large heaps (and that is very high as a
    fraction). Other factors in Spark have a far higher impact than this.
    
    > "I would like a good default estimate of overhead ... But that is not
    > fraction of executor memory. "
    >
    > You are mistaken. It may not be a directly correlated variable, but it is
    most certainly indirectly correlated. And it is probably correlated to
    other app-specific parameters as well.
    
    Please see above.
    
    >
    > "Until the magic explanatory variable is found, which one is less
    problematic for end users -- a flat constant that frequently has to be
    tuned, or an imperfect model that could get it right in more cases?"
    >
    > This is the right point of view.
    
    Which has been our view even in previous discussions :-)
    It is unfortunate that we did not approximate this better from the start
    and went with the constant from the prototypical impl.
    
    Note that this estimate would be very sensitive to Spark internals.
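
To make the two options under discussion concrete, here is a minimal sketch of a flat constant versus an imperfect model based on executor memory. The function names, the 384 MB constant, and the 0.10 fraction are hypothetical placeholders chosen for illustration; none of them come from this thread or from Spark's actual code.

```python
# Hypothetical sketch: two ways to pick a default memory overhead (in MB).
# All names and numbers here are illustrative assumptions, not Spark's API.

def flat_overhead_mb(executor_memory_mb: int, constant_mb: int = 384) -> int:
    """Flat default: ignores executor memory entirely."""
    return constant_mb


def proportional_overhead_mb(
    executor_memory_mb: int,
    fraction: float = 0.10,
    floor_mb: int = 384,
) -> int:
    """Imperfect model: a fraction of executor memory, with a lower bound."""
    return max(floor_mb, int(executor_memory_mb * fraction))


if __name__ == "__main__":
    for mem in (1024, 8192, 16384):
        print(mem, flat_overhead_mb(mem), proportional_overhead_mb(mem))
```

Either default would still need to be overridable, since executor memory is only an indirect proxy for the real overhead.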
    

