I have tried a lot of different things, but I can't get Nutch to run a crawl
command.

I am using Cygwin on Windows 7.
I have the Java classpath set, and I get output when I run bin/nutch.
But running the crawl gives me an error:
Error running:
  /cygdrive/c/Users/User5/Documents/Nutch/apache-nutch-1.11/runtime/local/bin/nutch inject http://localhost:8983/solr//crawldb TestCrawl
Failed with exit value 127.

My command is:

$ bin/crawl -D C:/Users/User5/Documents/Nutch/apache-nutch-1.11/runtime/local/urls/seeds.txt TestCrawl http://localhost:8983/solr/ 2

The full output is:

Injecting seed URLs
/cygdrive/c/Users/User5/Documents/Nutch/apache-nutch-1.11/runtime/local/bin/nutch inject http://localhost:8983/solr//crawldb TestCrawl
Injector: starting at 2015-12-26 13:11:12
Injector: crawlDb: http://localhost:8983/solr/crawldb
Injector: urlDir: TestCrawl
Injector: Converting injected urls to crawl db entries.
Injector: java.lang.IllegalArgumentException: Wrong FS: http://localhost:8983/solr/crawldb, expected: file:///
        at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:643)
        at org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:79)
        at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:506)
        at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:724)
        at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:501)
        at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:397)
        at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1398)
        at org.apache.nutch.crawl.Injector.inject(Injector.java:298)
        at org.apache.nutch.crawl.Injector.run(Injector.java:379)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.nutch.crawl.Injector.main(Injector.java:369)

Error running:
  /cygdrive/c/Users/User5/Documents/Nutch/apache-nutch-1.11/runtime/local/bin/nutch inject http://localhost:8983/solr//crawldb TestCrawl
Failed with exit value 127.

Any help with this would be much appreciated!

