[GitHub] spark issue #16774: [SPARK-19357][ML][WIP] Adding parallel model evaluation ...

BryanCutler Thu, 16 Feb 2017 15:38:42 -0800

Github user BryanCutler commented on the issue:

    https://github.com/apache/spark/pull/16774
  
    > I think setting the default to match current behavior is best (i.e. 1).
    
    I agree, I just wanted to bring it up in case others had differing views, since 
it was a concern in the JIRA.
    
    > I think it's important to note in the docs that this will actually only 
really work as expected if the FAIR scheduler is enabled, otherwise I don't 
think things will actually be executed concurrently.
    
    I ran some tests with and without the FAIR scheduler enabled.  Without the 
FAIR scheduler, if there are enough resources in the cluster (e.g. cores), then 
the tasks run concurrently.  If there are not enough resources, the jobs wait.  
With the FAIR scheduler, the tasks run concurrently and share the available 
cores.  Given equal resources, both complete model selection in about the same 
time, so I think the main benefit of FAIR is for multi-user setups, so that one 
user's jobs don't get starved.
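
    To illustrate the reasoning above, here is a toy model in plain Python. It 
is not Spark code and not how the tests were run; it is only a sketch under 
simplifying assumptions: every task takes one time step, a fixed number of 
cores is available, "fifo" drains jobs in submission order, and "fair" hands 
out cores round-robin. The function name `simulate` and its parameters are 
made up for this example. (In Spark itself the policy is selected with the 
`spark.scheduler.mode` setting.)

```python
def simulate(jobs, cores, policy):
    """Return the time step at which each job finishes in a toy scheduler.

    jobs   -- list of task counts, one entry per job (each task takes 1 step)
    cores  -- number of tasks that can run in a single time step
    policy -- "fifo" (earlier jobs get cores first) or "fair" (round-robin)
    """
    remaining = list(jobs)
    finish = [0] * len(jobs)
    step = 0
    while any(remaining):
        step += 1
        free = cores
        if policy == "fifo":
            # Earlier jobs take as many cores as they can use; later jobs
            # only get whatever is left over in this step.
            for j in range(len(remaining)):
                run = min(free, remaining[j])
                remaining[j] -= run
                free -= run
        else:
            # Hand out one core at a time to every job with pending tasks.
            while free and any(remaining):
                for j in range(len(remaining)):
                    if free and remaining[j]:
                        remaining[j] -= 1
                        free -= 1
        # Record the step in which a job ran its last task.
        for j in range(len(remaining)):
            if jobs[j] > 0 and remaining[j] == 0 and finish[j] == 0:
                finish[j] = step
    return finish


# Three jobs of four tasks each, sharing three cores.
fifo = simulate([4, 4, 4], cores=3, policy="fifo")
fair = simulate([4, 4, 4], cores=3, policy="fair")
print("fifo finish steps:", fifo, "total:", max(fifo))
print("fair finish steps:", fair, "total:", max(fair))
```

    In this model the last job finishes at the same step under both policies, 
which matches the observation that total model selection time is about equal. 
What changes is how the waiting is distributed: under "fifo" the later jobs 
wait for the earlier ones, while under "fair" all jobs make progress together. 
With enough cores for every task (e.g. `cores=12` here), both policies run 
everything in a single step.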



