When running bin/nutch solrindex \ http://localhost:8983/solr/nutch_solr_data_core \ crawl/crawldb/ -linkdb crawl/linkdb/ $s1
the Nutch results do not get indexed into Solr. Solr shows no docs in the core. The only thing that looks like an error message in the response to that command is: Indexer: java.io.IOException: No FileSystem for scheme: http 1. Is that the issue? I suspect it has more to do with the Java implementation than anything, but not sure where to go with that suspicion. 2. How do I fix it? Or at least point me in the right direction to figure it out for myself. Background: - Followed Nutch tutorial at Apache's Nutch page. - Everything appears to have worked to this point. - nutch_solr_data_core does exist as a core in Solr. - Data was definitely brought back by the crawl. - Searched mail archives and Google for the phrase No FileSystem for scheme: http. No useful responses were found. Seems like whenever this question is asked, ti doesn't get answered. Specs: - Nutch 1.11 - Solr 5.4.0 - Java Default for Ubuntu 14.04 LTS Response Message in Entirety: root@Walleye:/nutch/nutch# bin/nutch solrindex \ http://localhost:8983/solr/nutch_solr_data_core \ crawl/crawldb/ -linkdb crawl/linkdb/ $s1 Indexer: starting at 2015-12-24 08:44:07 Indexer: deleting gone documents: false Indexer: URL filtering: false Indexer: URL normalizing: false Active IndexWriters : SolrIndexWriter solr.server.type : Type of SolrServer to communicate with (default 'http' however options include 'cloud', 'lb' and 'concurrent') solr.server.url : URL of the Solr instance (mandatory) solr.zookeeper.url : URL of the Zookeeper URL (mandatory if 'cloud' value for solr.server.type) solr.loadbalance.urls : Comma-separated string of Solr server strings to be used (madatory if 'lb' value for solr.server.type) solr.mapping.file : name of the mapping file for fields (default solrindex-mapping.xml) solr.commit.size : buffer size when sending to Solr (default 1000) solr.auth : use authentication (default false) solr.auth.username : username for authentication solr.auth.password : password for authentication Indexer: java.io.IOException: No FileSystem for scheme: http at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2385) at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2392) at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:89) at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2431) at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2413) at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:368) at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296) at org.apache.hadoop.mapred.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:256) at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:228) at org.apache.hadoop.mapred.SequenceFileInputFormat.listStatus(SequenceFileInputFormat.java:45) at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:304) at org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:520) at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:512) at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:394) at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1285) at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1282) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) at org.apache.hadoop.mapreduce.Job.submit(Job.java:1282) at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:562) at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:557) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:557) at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:548) at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:833) at org.apache.nutch.indexer.IndexingJob.index(IndexingJob.java:145) at org.apache.nutch.indexer.IndexingJob.run(IndexingJob.java:222) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) at org.apache.nutch.indexer.IndexingJob.main(IndexingJob.java:231) Guy McDowell [email protected] http://www.GuyMcDowell.com

