It looks like a previous crawl was cancelled partway through, either by you or by something else.
Delete the D:\cygwin\home\nutch-0.7.2\bin\fqjoke2\db\webdb.old directory and recrawl. You should be fine.

----- Original Message -----
From: "kevin" <[EMAIL PROTECTED]>
To: <[email protected]>
Sent: Saturday, July 08, 2006 3:34 AM
Subject: Re: [Nutch-general] Link db (traversal + modification)

> Hi,
>
> I ran nutch using this command:
> $ ./nutch crawl urlfq2.txt -dir fqjoke2 -depth 20 -threads 10 >& fq2.log
> During the crawl, the following exception occurred:
>
> 060708 182413 status: segment 20060708181314, 471 pages, 69 errors, 5655871 bytes, 657469 ms
> 060708 182413 status: 0.7163836 pages/s, 67.20696 kb/s, 12008.219 bytes/page
> 060708 182414 Updating D:\cygwin\home\nutch-0.7.2\bin\fqjoke2\db
> Exception in thread "main" java.io.IOException: Impossible condition:
> directories D:\cygwin\home\nutch-0.7.2\bin\fqjoke2\db\webdb.old and
> D:\cygwin\home\nutch-0.7.2\bin\fqjoke2\db\webdb cannot exist simultaneously
>     at org.apache.nutch.db.WebDBWriter.<init>(WebDBWriter.java:1484)
>     at org.apache.nutch.db.WebDBWriter.<init>(WebDBWriter.java:1457)
>     at org.apache.nutch.tools.UpdateDatabaseTool.main(UpdateDatabaseTool.java:360)
>     at org.apache.nutch.tools.CrawlTool.main(CrawlTool.java:141)
>
> Why did this happen? Is there a solution available? Many thanks!

_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
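
P.S. If you would rather script the cleanup suggested at the top of this message, something like the following might do. It is only a minimal sketch in Python, assuming the crawl directory layout shown in the error message (a db directory holding webdb and webdb.old). The function name remove_stale_webdb is made up for illustration and is not part of Nutch. As an extra precaution that is my own assumption, it only deletes webdb.old when webdb is also present, which is the situation the exception complains about.

```python
import shutil
from pathlib import Path


def remove_stale_webdb(crawl_dir):
    """Delete db/webdb.old under crawl_dir, but only if db/webdb also exists.

    Hypothetical helper, not part of Nutch. Returns True if webdb.old was
    removed and False if nothing was deleted.
    """
    db_dir = Path(crawl_dir) / "db"
    current = db_dir / "webdb"    # the live web database
    stale = db_dir / "webdb.old"  # leftover from the interrupted update

    # Only act when both directories coexist, i.e. the "impossible
    # condition" reported by WebDBWriter. If webdb is missing, leave
    # webdb.old alone so no data is lost by accident.
    if current.is_dir() and stale.is_dir():
        shutil.rmtree(stale)
        return True
    return False
```

You would call it with the crawl directory, for example remove_stale_webdb(r"D:\cygwin\home\nutch-0.7.2\bin\fqjoke2"), and then rerun the same ./nutch crawl command as before.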
