GitHub user MartinWeindel commented on the pull request:

    https://github.com/apache/spark/pull/1860#issuecomment-53203084
  
    Hey Patrick,
    
    First of all, let me emphasize again that this is only a work-around. 
    The real problem is that Mesos only makes offers when a slave has at 
    least 32 MB of memory free, which conflicts with allocating all memory 
    to Spark executors and none to tasks.
    You seem to be right that this work-around does not help if the 
    executors already consume all memory (leaving a remainder of <= 31 MB).
    So I don't know whether it avoids the deadlock in all cases.
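    
    To make the scenario concrete, here is a small sketch with made-up 
    numbers (only the 32 MB threshold comes from Mesos; the slave and 
    executor sizes are hypothetical):
    
    ```scala
    // Hypothetical sizes; only the 32 MB offer threshold comes from Mesos.
    object OfferStarvationSketch {
      val slaveMemMb    = 4096  // total memory on one Mesos slave
      val executorMemMb = 2048  // spark.executor.memory per executor
      val minOfferMb    = 32    // Mesos offers a slave only above this

      def main(args: Array[String]): Unit = {
        val freeMb = slaveMemMb - 2 * executorMemMb  // two executors fill the slave
        // freeMb == 0 < 32, so Mesos never offers this slave again; the
        // executors hold no tasks yet, and tasks need offers to launch,
        // so nothing can ever run or finish: the deadlock.
        println(s"free = $freeMb MB, offerable: ${freeMb >= minOfferMb}")
      }
    }
    ```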
    
    I can only argue from an experimental point of view: I have not seen 
    the deadlock in my cluster anymore since applying this patch (I have 
    tested under very heavy workload).
    I suspect the chance is very small that another executor starts before 
    at least one task of the first executor has started.
    In any case, once a task finishes, at least 32 MB of memory becomes 
    allocatable again, so Mesos will always make further offers and the 
    deadlock is avoided.
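    
    In code, the work-around amounts to attaching a small "mem" resource 
    to every fine-grained task. Roughly like the following sketch against 
    the Mesos protobuf API (not the exact patch; the object and helper 
    names and the executor parameter are mine):
    
    ```scala
    import org.apache.mesos.Protos._
    import com.google.protobuf.ByteString

    // Sketch only: reserve a small "mem" resource per task so that every
    // finished task frees at least 32 MB on its slave, keeping the slave
    // above Mesos's minimum offer threshold.
    object TaskWithMemReservation {
      def memResource(megabytes: Double): Resource =
        Resource.newBuilder()
          .setName("mem")
          .setType(Value.Type.SCALAR)
          .setScalar(Value.Scalar.newBuilder().setValue(megabytes))
          .build()

      def buildTask(taskId: TaskID, slaveId: SlaveID,
                    executor: ExecutorInfo, data: Array[Byte]): TaskInfo =
        TaskInfo.newBuilder()
          .setTaskId(taskId)
          .setSlaveId(slaveId)
          .setName(s"task ${taskId.getValue}")
          .setExecutor(executor)
          .addResources(memResource(32))  // the extra per-task reservation
          .setData(ByteString.copyFrom(data))
          .build()
    }
    ```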
    
    BTW, I have also played with changing the executor memory so that some 
    Mesos slave memory is always left over, but to my surprise this did 
    not avoid the deadlocks reliably.
    
    So I'm not sure whether this patch should be integrated into the Spark 
    source code.
    But I hope it helps in understanding the issue. And maybe it makes the 
    fine-grained mode usable for setups similar to mine until a better 
    solution has been found.
    
    If I can help in any way, just tell me.
    
    Best regards,
    Martin
    
    
    On 24.08.2014 at 19:16, Patrick Wendell wrote:
    >
    > Hey Martin,
    >
    > I'm having a bit of trouble seeing how this works around the issue. 
    > From what I can tell the issue is that if someone creates Executors 
    > that consume all memory, Mesos will refuse to make offers for the 
    > tasks. However, this fix just adds 32MB of memory as a requirement for 
    > the task... but it seems like if the offer is never made in the first 
    > place, this will make no difference. Can you describe a sequence of 
    > offers where this change alters the execution? Thanks for looking into 
    > this!
    >
    >   * Patrick
    >

