[ 
https://issues.apache.org/jira/browse/NUTCH-2781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17090384#comment-17090384
 ] 

ASF GitHub Bot commented on NUTCH-2781:
---------------------------------------

sebastian-nagel opened a new pull request #512:
URL: https://github.com/apache/nutch/pull/512


   - increase default value for NUTCH_HEAPSIZE to 4096 MB (from 1000 MB)
   - remove -Dmapred.child.java.opts=-Xmx1000m from default options in bin/crawl
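
A minimal sketch of how a run script could derive the JVM heap option from NUTCH_HEAPSIZE, falling back to the proposed 4096 MB default. The function name heap_opt is hypothetical and used only for illustration; it is not taken from bin/nutch, whose actual logic may differ.

```shell
#!/bin/sh
# Hypothetical helper (not the actual bin/nutch code): build the -Xmx option
# from NUTCH_HEAPSIZE (a value in MB), using 4096 when it is unset or empty.
heap_opt() {
  echo "-Xmx${NUTCH_HEAPSIZE:-4096}m"
}

# Show the option that would be passed to the JVM in the current environment.
echo "heap option: $(heap_opt)"
```

Under this sketch, a user who needs a different limit would export NUTCH_HEAPSIZE (for example, NUTCH_HEAPSIZE=8192) before invoking the script rather than editing it.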


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


> Increase default Java heap size
> -------------------------------
>
>                 Key: NUTCH-2781
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2781
>             Project: Nutch
>          Issue Type: Improvement
>          Components: runtime
>    Affects Versions: 1.16
>            Reporter: Sebastian Nagel
>            Priority: Minor
>             Fix For: 1.17
>
>
> The Nutch run script (bin/nutch) sets a "conservative" Java heap size of 1000 
> MB. This default was defined [15 years 
> ago|https://github.com/apache/nutch/blame/dcbb0f2bf450c6bec6f45125c68f5c7a0f061474/src/bin/nutch#L24].
> It's probably safe to increase the heap size to a value suitable for
> processing more pages or larger documents. What about 4096 MB?
> Note this overlaps with NUTCH-2501 (Java heap size defined via 
> mapred.child.java.opts in distributed mode).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
