Re: Nutch 1.6 : java.lang.OutOfMemoryError: unable to create new native thread

Sebastian Nagel Sun, 03 Mar 2013 12:41:30 -0800

Hi Kiran,

there are many possible reasons for the problem. Besides the limit on the
number of processes, the stack size in the Java VM and in the system can also
play a role (see java -Xss and ulimit -s).
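
To check both system limits in one go, something like the following might help. It is only a minimal sketch, assuming a Unix-like system (Linux or Mac OS) where Python's standard resource module is available; the function names describe_limit and thread_related_limits are made up for illustration. It reads the same values that ulimit -u and ulimit -s report in the shell, but it says nothing about the Java side (-Xss).

```python
import resource


def describe_limit(name, rlimit_id):
    """Return a one-line summary of the soft and hard value of one limit."""
    soft, hard = resource.getrlimit(rlimit_id)

    def fmt(value):
        # RLIM_INFINITY is what the shell prints as "unlimited".
        return "unlimited" if value == resource.RLIM_INFINITY else str(value)

    return f"{name}: soft={fmt(soft)} hard={fmt(hard)}"


def thread_related_limits():
    """Collect the two limits mentioned above for the current user/process."""
    return [
        # Corresponds to "ulimit -u" (max user processes).
        describe_limit("max user processes", resource.RLIMIT_NPROC),
        # Corresponds to "ulimit -s", but getrlimit reports bytes,
        # while the shell builtin reports kbytes.
        describe_limit("stack size (bytes)", resource.RLIMIT_STACK),
    ]


if __name__ == "__main__":
    for line in thread_related_limits():
        print(line)
```

Run it from the same shell and as the same user that starts bin/nutch, since the limits are inherited per session.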


I think in local mode there should be only one mapper and consequently only
one thread used for parsing. So the number of processes/threads is hardly the
problem, provided that you don't run any other number-crunching tasks in
parallel on your desktop.

Luckily, you should be able to retry via "bin/nutch parse ..."
Then trace the system and the Java process to catch the reason.

Sebastian

On 03/02/2013 08:13 PM, kiran chitturi wrote:
> Sorry, i am looking to crawl 400k documents with the crawl. I said 400 in
> my last message.
> 
> 
> On Sat, Mar 2, 2013 at 2:12 PM, kiran chitturi 
> <[email protected]>wrote:
> 
>> Hi!
>>
>> I am running Nutch 1.6 on a 4 GB Mac OS desktop with Core i5 2.8GHz.
>>
>> Last night i started a crawl on local mode for 5 seeds with the config
>> given below. If the crawl goes well, it should fetch a total of 400
>> documents. The crawling is done on a single host that we own.
>>
>> Config
>> ---------------------
>>
>> fetcher.threads.per.queue - 2
>> fetcher.server.delay - 1
>> fetcher.throughput.threshold.pages - -1
>>
>> crawl script settings
>> ----------------------------
>> timeLimitFetch- 30
>> numThreads - 5
>> topN - 10000
>> mapred.child.java.opts=-Xmx1000m
>>
>>
>> I have noticed today that the crawl has stopped due to an error and i have
>> found the below error in logs.
>>
>> 2013-03-01 21:45:03,767 INFO  parse.ParseSegment - Parsed (0ms):
>>> http://scholar.lib.vt.edu/ejournals/JARS/v33n3/v33n3-letcher.htm
>>> 2013-03-01 21:45:03,790 WARN  mapred.LocalJobRunner - job_local_0001
>>> java.lang.OutOfMemoryError: unable to create new native thread
>>>         at java.lang.Thread.start0(Native Method)
>>>         at java.lang.Thread.start(Thread.java:658)
>>>         at
>>> java.util.concurrent.ThreadPoolExecutor.addThread(ThreadPoolExecutor.java:681)
>>>         at
>>> java.util.concurrent.ThreadPoolExecutor.addIfUnderMaximumPoolSize(ThreadPoolExecutor.java:727)
>>>         at
>>> java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:655)
>>>         at
>>> java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:92)
>>>         at org.apache.nutch.parse.ParseUtil.runParser(ParseUtil.java:159)
>>>         at org.apache.nutch.parse.ParseUtil.parse(ParseUtil.java:93)
>>>         at org.apache.nutch.parse.ParseSegment.map(ParseSegment.java:97)
>>>         at org.apache.nutch.parse.ParseSegment.map(ParseSegment.java:44)
>>>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
>>>         at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
>>>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
>>>         at
>>> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
>>> (END)
>>
>>
>>
>> Did anyone run into the same issue? I am not sure why the new native
>> thread is not being created. The link here [0] says that it might be due to
>> the limit on the number of processes in my OS. Will increasing it solve
>> the issue?
>>
>>
>> [0] - http://ww2.cs.fsu.edu/~czhang/errors.html
>>
>> Thanks!
>>
>> --
>> Kiran Chitturi
>>
> 
> 
>
