Hudson (JIRA)
Thu, 02 Oct 2008 21:19:07 -0700
[ https://issues.apache.org/jira/browse/NUTCH-640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12636524#action_12636524 ]
Hudson commented on NUTCH-640:
------------------------------
Integrated in Nutch-trunk #588 (See
[http://hudson.zones.apache.org/hudson/job/Nutch-trunk/588/])
- confusing description "set it to Integer.MAX_VALUE"
> confusing description "set it to Integer.MAX_VALUE"
> ---------------------------------------------------
>
> Key: NUTCH-640
> URL: https://issues.apache.org/jira/browse/NUTCH-640
> Project: Nutch
> Issue Type: Improvement
> Components: documentation
> Affects Versions: 0.9.0
> Reporter: Stijn Vermeeren
> Assignee: Doğacan Güney
> Priority: Minor
> Attachments: NUTCH-640.patch
>
>
> The property "indexer.max.tokens" has the following description in
> nutch-default.xml:
> " The maximum number of tokens that will be indexed for a single field
> in a document. This limits the amount of memory required for
> indexing, so that collections with very large files will not crash
> the indexing process by running out of memory.
> Note that this effectively truncates large documents, excluding
> from the index tokens that occur further in the document. If you
> know your source documents are large, be sure to set this value
> high enough to accommodate the expected size. If you set it to
> Integer.MAX_VALUE, then the only limit is your memory, but you
> should anticipate an OutOfMemoryError."
> Apparently, "set it to Integer.MAX_VALUE" here means <<substitute the integer
> value of Integer.MAX_VALUE>>, and not <<put the text "Integer.MAX_VALUE"
> between the value tags>>. I think this is very confusing and the description
> should be improved.
> I first put <value>Integer.MAX_VALUE</value> in my configuration, and it took
> a long time to figure out what was wrong, especially since Nutch fell back
> to the default value of 10000 instead of reporting an error.
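
The fallback described above can be reproduced with a minimal Java sketch. It assumes the configuration loader parses the property text as a decimal integer and returns the default when parsing fails. The class and method names here are hypothetical and are not taken from the Nutch or Hadoop sources. Only the numeric literal 2147483647 (the value of Integer.MAX_VALUE) parses. The symbolic name does not.

```java
// Hypothetical illustration only: this is NOT Nutch's or Hadoop's actual code.
// It mimics a config getter that falls back to a default on a parse failure.
final class MaxTokensFallbackSketch {

    // Returns the parsed integer, or defaultValue if rawValue is missing
    // or is not a valid decimal integer.
    static int getInt(String rawValue, int defaultValue) {
        if (rawValue == null) {
            return defaultValue;
        }
        try {
            return Integer.parseInt(rawValue.trim());
        } catch (NumberFormatException e) {
            // No error is surfaced to the caller; the default is used instead.
            return defaultValue;
        }
    }

    public static void main(String[] args) {
        int defaultMaxTokens = 10000; // default mentioned in the report

        // The text "Integer.MAX_VALUE" is not a number, so the default wins.
        System.out.println(getInt("Integer.MAX_VALUE", defaultMaxTokens)); // prints 10000

        // The numeric value of Integer.MAX_VALUE parses as intended.
        System.out.println(getInt("2147483647", defaultMaxTokens)); // prints 2147483647
    }
}
```

Under that assumption, <value>2147483647</value> gives the intended limit, while <value>Integer.MAX_VALUE</value> leaves the limit at 10000.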