[
https://issues.apache.org/jira/browse/NUTCH-2501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17090480#comment-17090480
]
ASF GitHub Bot commented on NUTCH-2501:
---------------------------------------
sebastian-nagel commented on a change in pull request #279:
URL: https://github.com/apache/nutch/pull/279#discussion_r413699673
##########
File path: src/bin/crawl
##########
@@ -171,6 +175,8 @@ fi
CRAWL_PATH="$1"
LIMIT="$2"
+JAVA_CHILD_HEAP_MB=`expr "$NUTCH_HEAP_MB" / "$NUM_TASKS"`
Review comment:
Hi @mfeltscher, this PR is now superseded by #513 - I've decided not to
add any new environment variables but to document how the task memory can be
set using the existing command-line flags:
```
$> bin/crawl -D mapreduce.map.memory.mb=4608 -D mapreduce.map.java.opts=-Xmx4096m \
     -D mapreduce.reduce.memory.mb=4608 -D mapreduce.reduce.java.opts=-Xmx4096m ...
```
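For illustration, here is a minimal sketch of how such a set of flags could be generated from a single heap size. The function `task_memory_opts` and the variable `HEADROOM_MB` are hypothetical and not part of `bin/crawl`. The 512 MB headroom is only an assumption, inferred from the difference between 4608 and 4096 in the command above:

```shell
# Hypothetical helper, not part of bin/crawl: print the -D options for a
# given JVM heap size (in MB), applied to both map and reduce tasks.

# Assumption: the container gets 512 MB more than the JVM heap,
# mirroring 4608 - 4096 in the command above.
HEADROOM_MB=512

task_memory_opts() {
  heap_mb="$1"
  container_mb=$(( heap_mb + HEADROOM_MB ))
  opts=""
  for phase in map reduce; do
    opts="$opts -D mapreduce.$phase.memory.mb=$container_mb"
    opts="$opts -D mapreduce.$phase.java.opts=-Xmx${heap_mb}m"
  done
  # Strip the leading space before printing
  printf '%s\n' "${opts# }"
}

# Print the options for a 4096 MB heap
task_memory_opts 4096
```

The output could then be passed unquoted, so that the shell splits it into separate arguments, e.g. `bin/crawl $(task_memory_opts 4096) ...`.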
Thanks for the contribution and the discussion!
----------------------------------------------------------------
> Take into account $NUTCH_HEAPSIZE when crawling using crawl script
> ------------------------------------------------------------------
>
> Key: NUTCH-2501
> URL: https://issues.apache.org/jira/browse/NUTCH-2501
> Project: Nutch
> Issue Type: Improvement
> Affects Versions: 1.14
> Reporter: Moreno Feltscher
> Assignee: Sebastian Nagel
> Priority: Major
> Fix For: 1.17
>
>