I'm hoping that my emails actually reach other people, as they've been 
ignored so far.

I just ran a recrawl today to crawl a few injected URLs that I have.  At the 
end of the recrawl I received the following error:

060623 122916 merging segment indexes to: /home/honda/nutch-0.7.2/crawl/index
Exception in thread "main" java.io.IOException: /home/honda/nutch-0.7.2/crawl/segments/20060619230003/index not a directory
        at org.apache.lucene.store.FSDirectory.init(FSDirectory.java:180)
        at org.apache.lucene.store.FSDirectory.getDirectory(FSDirectory.java:141)
        at org.apache.nutch.indexer.IndexMerger.merge(IndexMerger.java:80)
        at org.apache.nutch.indexer.IndexMerger.main(IndexMerger.java:160)

As a result, none of the crawled segments made it into the index.
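
The exception says that the index path of one segment is not a directory, so a quick way to see which segments are affected is to scan the segments directory before merging. The following is only a minimal sketch, assuming a modern JDK (Java 7 or later, for java.nio.file) and the crawl/segments layout shown in the trace above. The class and method names are hypothetical and are not part of Nutch or Lucene.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Nutch: reports segments that the
// index merge would likely fail on.
public class SegmentIndexCheck {

    // Returns the sorted names of segment directories under segmentsDir
    // whose "index" entry is missing or is not a directory.
    static List<String> segmentsWithoutIndexDir(Path segmentsDir) throws IOException {
        List<String> bad = new ArrayList<>();
        try (DirectoryStream<Path> segments = Files.newDirectoryStream(segmentsDir)) {
            for (Path segment : segments) {
                if (!Files.isDirectory(segment)) {
                    continue; // ignore stray files next to the segments
                }
                if (!Files.isDirectory(segment.resolve("index"))) {
                    bad.add(segment.getFileName().toString());
                }
            }
        }
        Collections.sort(bad);
        return bad;
    }

    public static void main(String[] args) throws IOException {
        // The default path is an assumption; pass the real segments directory as an argument.
        Path segmentsDir = Paths.get(args.length > 0 ? args[0] : "crawl/segments");
        for (String name : segmentsWithoutIndexDir(segmentsDir)) {
            System.out.println(name + ": index is missing or not a directory");
        }
    }
}
```

Run against /home/honda/nutch-0.7.2/crawl/segments, this would be expected to list 20060619230003 if that segment was never indexed. It only identifies the affected segments and does not repair them.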

Can ANYONE tell me how to fix this?  I'm getting a bit discouraged with Nutch 
due to the large number of errors I keep receiving during crawls.  I do not 
want to have to recrawl my entire site list AGAIN just to fix this.

Anyone?

Matt 


_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
