Dear developers,
I have installed a Nutch system on a Linux enterprise server with 8GB RAM.
My Java VM has 4GB of RAM when Nutch starts.
I have configured a web crawler to scan PDF documents (about 3000) on the intranet.
After about 100 PDF docs, there is always an OutOfMemory exception.
I tried
[
https://issues.apache.org/jira/browse/NUTCH-766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12804448#action_12804448
]
Sami Siren commented on NUTCH-766:
--
+1, I'm going to agree on this one here Julien. Other
[
https://issues.apache.org/jira/browse/NUTCH-766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12804546#action_12804546
]
Chris A. Mattmann commented on NUTCH-766:
-
Hi Sami:
{quote}
Chris, can you please
[
https://issues.apache.org/jira/browse/NUTCH-766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12804558#action_12804558
]
Andrzej Bialecki commented on NUTCH-766:
-
I agree with Chris, +1 on keeping the old
Dear Wiki user,
You have subscribed to a wiki page or wiki category on Nutch Wiki for change
notification.
The FrontPage page has been changed by JohnWhelan.
The comment on this change is: Changes to Cygwin mount points have broken the
WhelanLabs Search Engine Manager. No new version is
[
https://issues.apache.org/jira/browse/NUTCH-780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Vu Hoang updated NUTCH-780:
---
Component/s: fetcher (was: ndfs)
Nutch crawler did not read configuration files