[ http://issues.apache.org/jira/browse/NUTCH-189?page=comments#action_12364270 ]
Bryan Pendleton commented on NUTCH-189:
---------------------------------------

I think this is caused by a similar issue I've been running into in my code, though I'm not testing crawling, so I can't be sure. I'll attach a patch that fixes my issue, which I will report if this isn't the fix for both.

> Injection infinite loop
> -----------------------
>
>          Key: NUTCH-189
>          URL: http://issues.apache.org/jira/browse/NUTCH-189
>      Project: Nutch
>         Type: Bug
>  Environment: Linux
>     Reporter: Andy Liu
>     Priority: Minor
>
> If you inject the crawldb with a url file that doesn't end with a line feed,
> an infinite loop is entered.
> 060104 160950 Running job: job_7uku5w
> 060104 160952  map 0%
> 060104 160954  map 50%
> 060104 160957  map -2631%
> 060104 160959  map -259756%
> 060104 161002  map -538552%
> 060104 161006  map -818413%
> 060104 161009  map -1098421%
> 060104 161011  map -1377851%
> 060104 161014  map -1657718%
> 060104 161018  map -1939534%
> 060104 161021  map -2218515%
> 060104 161023  map -2588212%
> 060104 161026  map -2868787%
> 060104 161030  map -3147637%
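For readers unfamiliar with this class of bug: neither the report nor the comment says where in Nutch the loop occurs, and the patch itself is not part of this message. As a purely illustrative sketch (the class and method names below are hypothetical and are not Nutch code), here is one way in Java to write a line reader that terminates cleanly whether or not the input ends with a line feed. The assumption is that the reader reports how many bytes it consumed, and the caller stops only when that count is zero, so a final unterminated line is returned once and end of input is never mistaken for "more data".

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not taken from Nutch: a line reader whose loop
// condition depends on bytes consumed rather than on seeing '\n'.
final class LineReaderSketch {

    // Reads one line into buf (without the trailing '\n').
    // Returns the number of bytes consumed, including the '\n' if present.
    // Returns 0 only when the stream is already at end of input.
    // Bytes are cast to char directly, so this assumes ASCII input.
    static int readLine(InputStream in, StringBuilder buf) throws IOException {
        buf.setLength(0);
        int consumed = 0;
        int b;
        while ((b = in.read()) != -1) {
            consumed++;
            if (b == '\n') {
                break; // end of line reached
            }
            buf.append((char) b);
        }
        return consumed;
    }

    // Collects every line; a last line with no trailing '\n' is still
    // returned exactly once, after which readLine reports 0 and we stop.
    static List<String> readAllLines(InputStream in) throws IOException {
        List<String> lines = new ArrayList<>();
        StringBuilder buf = new StringBuilder();
        while (readLine(in, buf) > 0) {
            lines.add(buf.toString());
        }
        return lines;
    }
}
```

With this shape, calling `readAllLines` on the bytes of `"a\nb"` and on the bytes of `"a\nb\n"` yields the same two lines. A loop that instead waits for a `'\n'` before advancing, or that ignores the end-of-stream return value, is the kind of code that can spin forever on the first input.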
