Hello - solrindex should no longer exist: the indexer no longer takes the Solr URL as a positional argument, so the http:// URL on your command line ends up being treated as an input path, which is exactly what "No FileSystem for scheme: http" is complaining about (note the stack trace fails in FileInputFormat.listStatus while resolving input paths). Anyway, you should use

  bin/nutch index -Dsolr.server.url=http://blablabla crawldb segmentpath
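For example, with the paths from your crawl below, that would be something like this (the segment name here is made up - substitute a real one from crawl/segments/, or pass -dir crawl/segments to index all segments at once):

  bin/nutch index -Dsolr.server.url=http://localhost:8983/solr/nutch_solr_data_core \
    crawl/crawldb/ -linkdb crawl/linkdb/ crawl/segments/20151224084407

You can also set solr.server.url once in conf/nutch-site.xml instead of passing it with -D every time. Afterwards you can ask the core for its document count to confirm the docs arrived:

  curl "http://localhost:8983/solr/nutch_solr_data_core/select?q=*:*&rows=0"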
-----Original message-----
> From: Guy McD <[email protected]>
> Sent: Thursday 24th December 2015 14:30
> To: [email protected]
> Subject: java.io.IOException: No FileSystem for scheme: http
>
> When running
>
> bin/nutch solrindex \
>   http://localhost:8983/solr/nutch_solr_data_core \
>   crawl/crawldb/ -linkdb crawl/linkdb/ $s1
>
> the Nutch results do not get indexed into Solr. Solr shows no docs in the core.
>
> The only thing that looks like an error message in the response to that command is:
> Indexer: java.io.IOException: No FileSystem for scheme: http
>
> 1. Is that the issue?
> I suspect it has more to do with the Java implementation than anything, but not sure where to go with that suspicion.
> 2. How do I fix it?
> Or at least point me in the right direction to figure it out for myself.
>
> Background:
>
> - Followed Nutch tutorial at Apache's Nutch page.
> - Everything appears to have worked to this point.
> - nutch_solr_data_core does exist as a core in Solr.
> - Data was definitely brought back by the crawl.
> - Searched mail archives and Google for the phrase "No FileSystem for scheme: http". No useful responses were found. Seems like whenever this question is asked, it doesn't get answered.
>
> Specs:
>
> - Nutch 1.11
> - Solr 5.4.0
> - Java default for Ubuntu 14.04 LTS
>
> Response Message in Entirety:
> root@Walleye:/nutch/nutch# bin/nutch solrindex \
>   http://localhost:8983/solr/nutch_solr_data_core \
>   crawl/crawldb/ -linkdb crawl/linkdb/ $s1
> Indexer: starting at 2015-12-24 08:44:07
> Indexer: deleting gone documents: false
> Indexer: URL filtering: false
> Indexer: URL normalizing: false
> Active IndexWriters :
> SolrIndexWriter
>     solr.server.type : Type of SolrServer to communicate with (default 'http' however options include 'cloud', 'lb' and 'concurrent')
>     solr.server.url : URL of the Solr instance (mandatory)
>     solr.zookeeper.url : URL of the Zookeeper URL (mandatory if 'cloud' value for solr.server.type)
>     solr.loadbalance.urls : Comma-separated string of Solr server strings to be used (mandatory if 'lb' value for solr.server.type)
>     solr.mapping.file : name of the mapping file for fields (default solrindex-mapping.xml)
>     solr.commit.size : buffer size when sending to Solr (default 1000)
>     solr.auth : use authentication (default false)
>     solr.auth.username : username for authentication
>     solr.auth.password : password for authentication
>
> Indexer: java.io.IOException: No FileSystem for scheme: http
>     at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2385)
>     at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2392)
>     at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:89)
>     at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2431)
>     at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2413)
>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:368)
>     at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
>     at org.apache.hadoop.mapred.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:256)
>     at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:228)
>     at org.apache.hadoop.mapred.SequenceFileInputFormat.listStatus(SequenceFileInputFormat.java:45)
>     at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:304)
>     at org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:520)
>     at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:512)
>     at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:394)
>     at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1285)
>     at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1282)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:415)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
>     at org.apache.hadoop.mapreduce.Job.submit(Job.java:1282)
>     at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:562)
>     at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:557)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:415)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
>     at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:557)
>     at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:548)
>     at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:833)
>     at org.apache.nutch.indexer.IndexingJob.index(IndexingJob.java:145)
>     at org.apache.nutch.indexer.IndexingJob.run(IndexingJob.java:222)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>     at org.apache.nutch.indexer.IndexingJob.main(IndexingJob.java:231)
>
> Guy McDowell
> [email protected]
> http://www.GuyMcDowell.com

