Marco Ebbinghaus created NUTCH-2666:
---------------------------------------
Summary: increase default value for http.content.limit
Key: NUTCH-2666
URL: https://issues.apache.org/jira/browse/NUTCH-2666
Project: Nutch
Issue Type: Improvement
Components: fetcher
Affects Versions: 1.15
Reporter: Marco Ebbinghaus
The default value for http.content.limit is set to 64 KB. The property's
current description reads: "The length limit for downloaded content using
the http:// protocol, in bytes. If this value is nonnegative (>=0), content
longer than it will be truncated; otherwise, no truncation at all. Do not
confuse this setting with the file.content.limit setting."
Maybe this default value should be increased, as many pages today are larger
than 64 KB.
The description might also be updated, since the limit applies not only to
the http protocol but also to https.
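
To make the truncation rule quoted above concrete, here is a minimal Java
sketch of the semantics as the description states them. It is an illustration
only, not Nutch's actual code: the class ContentLimitSketch and the method
applyLimit are hypothetical names, and the 100 KB page size is an assumed
example value.

```java
import java.util.Arrays;

// Hypothetical illustration of the documented http.content.limit semantics.
// This is NOT taken from the Nutch code base.
class ContentLimitSketch {

    // A nonnegative limit truncates content longer than the limit;
    // a negative limit disables truncation entirely.
    static byte[] applyLimit(byte[] content, int limit) {
        if (limit >= 0 && content.length > limit) {
            return Arrays.copyOf(content, limit);
        }
        return content;
    }

    public static void main(String[] args) {
        byte[] page = new byte[100 * 1024]; // assumed example: a 100 KB page

        // With the 64 KB default, the page is cut to 65536 bytes.
        System.out.println(applyLimit(page, 64 * 1024).length); // 65536

        // With a negative limit, the full 102400 bytes are kept.
        System.out.println(applyLimit(page, -1).length); // 102400
    }
}
```

Under these assumptions, any page larger than 64 KB loses everything past the
first 65536 bytes, which is the behaviour this issue proposes to relax by
raising the default.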