Hi Clay, all,
Thank you for your response and help.
I have already solved it.
I found that there are several possible causes of this "OutOfMemoryError:
unable to create new native thread" issue.
Running out of mmap ranges may be one of them, but it is not the problem I
hit (I run SLS in Docker).

The key to my issue is this property in sls-runner.xml:
  <property>
    <name>yarn.sls.runner.pool.size</name>
    <value>100000</value>
  </property>
The value I set is far too large.

In TaskRunner.java#L155:
https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/scheduler/TaskRunner.java#L155

executor = new ThreadPoolExecutor(threadPoolSize, threadPoolSize, 0,
      TimeUnit.MILLISECONDS, queue);

The ThreadPoolExecutor constructor is:
ThreadPoolExecutor(int corePoolSize, int maximumPoolSize, long keepAliveTime,
TimeUnit unit, BlockingQueue<Runnable> workQueue)

So both corePoolSize and maximumPoolSize are set to the same value,
threadPoolSize, and TaskRunner.start() then calls prestartAllCoreThreads()
(visible in the stack trace quoted below), which eagerly creates corePoolSize
native threads. If a user sets yarn.sls.runner.pool.size too large, that is
what triggers the OutOfMemoryError.
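
To illustrate the mechanism, here is a minimal standalone sketch (the class
and variable names are mine, not SLS code) of the same construction followed
by prestartAllCoreThreads():

import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class PoolSizeDemo {
  public static void main(String[] args) {
    // Mirrors the yarn.sls.runner.pool.size value from sls-runner.xml above.
    int threadPoolSize = 100000;

    // Same shape as the TaskRunner construction: core == max == threadPoolSize.
    ThreadPoolExecutor executor = new ThreadPoolExecutor(
        threadPoolSize, threadPoolSize, 0, TimeUnit.MILLISECONDS,
        new LinkedBlockingQueue<Runnable>());

    // TaskRunner.start() calls this; it starts corePoolSize native threads
    // immediately, which is where "unable to create new native thread"
    // comes from when the configured pool size is huge.
    executor.prestartAllCoreThreads();

    executor.shutdown();
  }
}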
In fact, for convenience and as a safety margin, users tend to set a large
value.

Here is my proposed solution (a rough sketch follows below):

1. Define a maximum pool size in SLSConfiguration.java and pass it as the
second constructor parameter:
executor = new ThreadPoolExecutor(threadPoolSize, maxPoolSize, 0,
      TimeUnit.MILLISECONDS, queue);

2. We can also validate the configured threadPoolSize: if it is larger than
the maximum, throw an error or log a warning to remind users.
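
Combining the two points, the change could look roughly like this (the
MAX_POOL_SIZE constant, its default, and the message text are only
illustrative, not existing SLS code):

import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class CappedPoolSketch {
  // In the real patch this would be read from SLSConfiguration; 500 is just
  // an example default.
  static final int MAX_POOL_SIZE = 500;

  static ThreadPoolExecutor buildExecutor(int threadPoolSize) {
    if (threadPoolSize > MAX_POOL_SIZE) {
      // Point 2: warn (or fail) instead of silently pre-starting tens of
      // thousands of native threads.
      System.err.println("yarn.sls.runner.pool.size=" + threadPoolSize
          + " is larger than " + MAX_POOL_SIZE + "; capping it.");
      threadPoolSize = MAX_POOL_SIZE;
    }
    // Point 1: the core pool size can never exceed the configured maximum,
    // so prestartAllCoreThreads() stays bounded.
    return new ThreadPoolExecutor(threadPoolSize, MAX_POOL_SIZE, 0,
        TimeUnit.MILLISECONDS, new LinkedBlockingQueue<Runnable>());
  }
}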

I can submit a patch for my solution. What do you think?



------------------------------------------------------------------


Hi Sichen,

I would expect you are running out of mmap ranges on most stock Linux 
kernels. (Each thread takes a mmap slot.) You can increase your 
vm.max_map_count[1] to see if that helps.

-Clay

[1]: A discussion on effecting the change: 
https://www.systutorials.com/241561/maximum-number-of-mmaped-ranges-and-how-to-set-it-on-linux/
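
(For reference, this is a kernel-wide sysctl; when SLS runs inside Docker it
typically has to be raised on the host. The value below is only an example:
  sysctl -w vm.max_map_count=262144
and the same line can go into /etc/sysctl.conf to persist across reboots.)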

On Tue, 24 Jul 2018, 赵思晨(思霖) wrote:

> Hi,
> I am running 200+ jobs, each containing 100 tasks, using slsrun.sh to start
> SLS.
> It fails with this error:
> 
> 2018-07-24 04:47:27,957 INFO capacity.CapacityScheduler: Added node 11.178.150.104:1604 clusterResource: <memory:821760000, vCores:15408000, disk: 6099000000M, resource2: 8025G>
> Exception in thread "main" java.lang.OutOfMemoryError: unable to create new native thread
>         at java.lang.Thread.start0(Native Method)
>         at java.lang.Thread.start(Thread.java:717)
>         at java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:957)
>         at java.util.concurrent.ThreadPoolExecutor.prestartAllCoreThreads(ThreadPoolExecutor.java:1617)
>         at org.apache.hadoop.yarn.sls.scheduler.TaskRunner.start(TaskRunner.java:157)
>         at org.apache.hadoop.yarn.sls.SLSRunner.start(SLSRunner.java:247)
>         at org.apache.hadoop.yarn.sls.SLSRunner.run(SLSRunner.java:950)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>         at org.apache.hadoop.yarn.sls.SLSRunner.main(SLSRunner.java:957)
> 
> I set Xmx and Xms in hadoop-env.sh (-Xmx20480m, -Xms20480m), but it still
> doesn't work.
> 
> Can anyone help me?
> 
> Thanks in advance,
> 
> Sichen
> 
>
