GitHub user srowen commented on the pull request:
https://github.com/apache/spark/pull/663#issuecomment-42346939
See also `Executors.newCachedThreadPool` if the intent is to create a new
thread whenever no idle one is available. That's closer in behavior to always
creating a new `Thread`. You may still get loads of simultaneous threads,
though, and may not save many allocations.
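For reference, a minimal sketch of what that looks like (the task body here is purely illustrative):

```scala
import java.util.concurrent.Executors

// Reuses an idle thread when one exists; otherwise spawns a new thread,
// which is close to creating a raw Thread per task. Idle threads are
// reclaimed after 60 seconds.
val pool = Executors.newCachedThreadPool()

(1 to 10).foreach { i =>
  pool.execute(new Runnable {
    override def run(): Unit =
      println(s"task $i on ${Thread.currentThread().getName}")
  })
}
pool.shutdown()
```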
Capping the number of threads helps on both counts (why wouldn't the max
pool size have an effect?), but it also means submission may block when the
executor is busy. Right now the no-arg `LinkedBlockingQueue` will go on
accepting work until it holds 2 billion entries if the executor can't keep up.
That's probably not ideal, and should be capped at something more reasonable.
Not sure whether the possibility of blocking the thread that invokes the task
is a problem or not.
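To make the unbounded-queue point concrete, here's a sketch of the current failure mode; the pool sizes are made up for illustration:

```scala
import java.util.concurrent.{LinkedBlockingQueue, ThreadPoolExecutor, TimeUnit}

// The no-arg LinkedBlockingQueue defaults to a capacity of
// Integer.MAX_VALUE (~2 billion), so execute() never blocks or rejects;
// tasks just pile up if the pool can't keep up. Note that because the
// queue's offer() always succeeds, the pool never grows past its core
// size, so maximumPoolSize has no effect here.
val unbounded = new ThreadPoolExecutor(
  4, 16,                              // core and (ineffective) max threads
  60L, TimeUnit.SECONDS,
  new LinkedBlockingQueue[Runnable]() // capacity = Integer.MAX_VALUE
)
```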
How about setting the max pool size and queue capacity to values that are
deemed quite large? Like, you never want more than 1000 threads or 100K tasks
queued? At least you're putting up some sane defenses against these situations
rather than making 100K threads.
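A rough sketch of that, using the 1000 / 100K figures above as ballpark limits; the core size and rejection policy are my assumptions, not part of the proposal:

```scala
import java.util.concurrent.{LinkedBlockingQueue, ThreadPoolExecutor, TimeUnit}

val capped = new ThreadPoolExecutor(
  10,                                         // core threads (assumed value)
  1000,                                       // hard cap on threads
  60L, TimeUnit.SECONDS,                      // reclaim idle extra threads
  new LinkedBlockingQueue[Runnable](100000),  // bounded queue: 100K tasks max
  new ThreadPoolExecutor.CallerRunsPolicy()   // when saturated, run the task
)                                             // in the submitting thread
```

One standard `ThreadPoolExecutor` caveat: threads beyond the core size are only spawned once the queue is full, so with a 100K-deep queue the pool will mostly sit at the core size. `CallerRunsPolicy` is one way to get the "block the thread that invokes the task" behavior mentioned above, by throttling the submitter rather than throwing `RejectedExecutionException`.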