Hi All,
We are getting an OOM exception while processing
http://www.fotofinity.com/cgi-bin/homepages.cgi . We have also applied the
NUTCH-497 patch to our source code, but the error actually occurs in
the parse method.
Does anybody have any idea regarding this? Here is the complete
I successfully ran the whole-web crawl on my new Ubuntu OS, and I am
ready to fix the bug. I need someone to guide me in getting the most
up-to-date source code and the bug assignment.
Thank you in advance!!
Adam Shuy, President
ePacific Web Design Hosting
Professional Web/Software developer
Next fetch time is set incorrectly
--
Key: NUTCH-515
URL: https://issues.apache.org/jira/browse/NUTCH-515
Project: Nutch
Issue Type: Bug
Components: fetcher
Affects Versions: 1.0.0
[
https://issues.apache.org/jira/browse/NUTCH-439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12512930
]
Doğacan Güney commented on NUTCH-439:
-
A big +1 from me. Though, it may be useful to break this patch into
You could try looking at these two discussions:
http://www.mail-archive.com/nutch-dev@lucene.apache.org/msg06571.html
http://www.mail-archive.com/nutch-dev@lucene.apache.org/msg06571.html
--Kai
----- Original Message -----
From: Tsengtan A Shuy [EMAIL PROTECTED]
To: nutch-dev@lucene.apache.org;
[
https://issues.apache.org/jira/browse/NUTCH-515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12513019
]
Andrzej Bialecki commented on NUTCH-515:
-
+1 - sorry for the mess up ...
Next fetch time is set
[
https://issues.apache.org/jira/browse/NUTCH-515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12513040
]
Doğacan Güney commented on NUTCH-515:
-
With more than a hundred config options, and with the way we use Hadoop's
[
https://issues.apache.org/jira/browse/NUTCH-506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12513044
]
Doğacan Güney commented on NUTCH-506:
-
If there are no objections, I am going to commit this one.
Just to get
Hi all,
Thanks for your suggestions.
I am running parse on a single URL (
http://www.fotofinity.com/cgi-bin/homepages.cgi). For other URLs, parse
works perfectly. We are getting this error because of the HTML of the page.
The page contains many anchor tags which are not closed properly. Hence