Are you running on Windows 2000, Windows XP, or Windows Server? Do you have a virus scanner running? Do you have any firewall software enabled? Is anything blocking ports?
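If you want to rule out a file-locking problem (for example, a virus scanner holding a handle on something under webdb.new), a small standalone check along these lines might help. It is only a sketch and not part of Nutch: the class name StaleDirCheck and the method deleteReportingFailures are made up for illustration, and it assumes Java 8 or later. Note that it really does try to delete the directory you give it, so run it only against a leftover webdb.new while Nutch is stopped.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not a Nutch class.
public class StaleDirCheck {

    /**
     * Tries to delete dir and everything below it.
     * Returns the paths that could not be deleted (empty if all went well
     * or if dir did not exist in the first place).
     */
    public static List<Path> deleteReportingFailures(Path dir) throws IOException {
        List<Path> failed = new ArrayList<>();
        if (!Files.exists(dir)) {
            return failed;
        }
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(dir)) {
            // Reverse order puts children before their parent directories.
            paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : paths) {
            try {
                Files.delete(p);
            } catch (IOException e) {
                // Typically a locked file, or a directory that still has
                // an undeletable child in it.
                failed.add(p);
            }
        }
        return failed;
    }

    public static void main(String[] args) throws IOException {
        // The default path is an assumption; pass your own webdb.new location.
        Path dir = Paths.get(args.length > 0 ? args[0] : "webdb.new");
        List<Path> failed = deleteReportingFailures(dir);
        if (failed.isEmpty()) {
            System.out.println("Nothing left behind under " + dir);
        } else {
            System.out.println("Could not delete:");
            for (Path p : failed) {
                System.out.println("  " + p);
            }
        }
    }
}
```

If it lists files it could not delete, something else probably still has them open, which would fit the "already exists" error on the next run.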
Are you using NDFS or the local file system? Is the disk formatted as NTFS or FAT32? How large is the dataset you are working with? Have you tried splitting the work into several smaller jobs instead of one large job?

--- Nguyen Ngoc Giang <[EMAIL PROTECTED]> wrote:

> Hi all,
>
> I'd like to bring back this topic, which has been ignored several times in
> Nutch mailing list as well as JIRA (
> http://issues.apache.org/jira/browse/NUTCH-94,
> http://issues.apache.org/jira/browse/NUTCH-96,
> http://issues.apache.org/jira/browse/NUTCH-117 ).
> Here is my error stack:
>
> 060104 110314 Finishing update
> 060104 110314 Processing pagesByURL: Sorted 11 instructions in 0.016seconds.
> 060104 110314 Processing pagesByURL: Sorted 687.5 instructions/second
> java.io.IOException: already exists: C:\tomcat\webapps\ROOT\data\db\webdb.new\pagesByURL
>     at org.apache.nutch.io.MapFile$Writer.<init>(MapFile.java:86)
>     at org.apache.nutch.db.WebDBWriter$CloseProcessor.closeDown(WebDBWriter.java:549)
>     at org.apache.nutch.db.WebDBWriter.close(WebDBWriter.java:1544)
>     at org.apache.nutch.tools.UpdateDatabaseTool.close(UpdateDatabaseTool.java:375)
>
> This error happens not only at update time, but also at fetchlist time.
> And the weird thing is that it happens so undeterministically. I debugged
> around and it seems the problem is because some CloseProcessors didn't
> terminate correctly, causing the webdb.new not deletable. Then I try to
> reduce to only 1 thread, with lightweight load (as suggested in the JIRA
> discussion), but it doesn't help. But when I try to run step by step using
> debugging mode of the IDE, there was no problem.
>
> Can anyone help me to figure out this issue? Thanks very much.
>
> Regards,
> Giang

_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
