Aha! Thank you Markus. And now that's on the books for those that follow after me. Getting a path error now, but even I can figure that one out.
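For anyone who finds this thread later, here is a rough sketch of the invocation Markus is describing, filled in with the core URL and crawl directories from my original message below. The segment path is whatever your $s1 variable points at, so substitute your own; untested as written:

  bin/nutch index -Dsolr.server.url=http://localhost:8983/solr/nutch_solr_data_core \
    crawl/crawldb/ -linkdb crawl/linkdb/ crawl/segments/<your-segment>

The key change is that the Solr URL moves into the -Dsolr.server.url property instead of being passed as a positional argument, which is presumably why the old form handed http://... to Hadoop as if it were an input path.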
Merry Christmas, Happy Holidays, and a Prosperous, Contented New Year

Guy McDowell
[email protected]
http://www.GuyMcDowell.com


On Thu, Dec 24, 2015 at 9:55 AM, Markus Jelsma <[email protected]> wrote:

> Hello - solrindex should no longer exist. Anyway, you should use
>
> bin/nutch index -Dsolr.server.url=http://blablabla crawldb segmentpath
>
>
> -----Original message-----
> > From: Guy McD <[email protected]>
> > Sent: Thursday 24th December 2015 14:30
> > To: [email protected]
> > Subject: java.io.IOException: No FileSystem for scheme: http
> >
> > When running
> >
> > bin/nutch solrindex \
> >   http://localhost:8983/solr/nutch_solr_data_core \
> >   crawl/crawldb/ -linkdb crawl/linkdb/ $s1
> >
> > the Nutch results do not get indexed into Solr. Solr shows no docs in
> > the core.
> >
> > The only thing that looks like an error message in the response to that
> > command is:
> > Indexer: java.io.IOException: No FileSystem for scheme: http
> >
> > 1. Is that the issue?
> > I suspect it has more to do with the Java implementation than anything,
> > but I'm not sure where to go with that suspicion.
> > 2. How do I fix it?
> > Or at least point me in the right direction to figure it out for myself.
> >
> > Background:
> >
> > - Followed the Nutch tutorial at Apache's Nutch page.
> > - Everything appears to have worked to this point.
> > - nutch_solr_data_core does exist as a core in Solr.
> > - Data was definitely brought back by the crawl.
> > - Searched the mail archives and Google for the phrase "No FileSystem
> >   for scheme: http". No useful responses were found. Seems like
> >   whenever this question is asked, it doesn't get answered.
> >
> > Specs:
> >
> > - Nutch 1.11
> > - Solr 5.4.0
> > - Java: default for Ubuntu 14.04 LTS
> >
> >
> > Response Message in Entirety:
> >
> > root@Walleye:/nutch/nutch# bin/nutch solrindex \
> >   http://localhost:8983/solr/nutch_solr_data_core \
> >   crawl/crawldb/ -linkdb crawl/linkdb/ $s1
> > Indexer: starting at 2015-12-24 08:44:07
> > Indexer: deleting gone documents: false
> > Indexer: URL filtering: false
> > Indexer: URL normalizing: false
> > Active IndexWriters :
> > SolrIndexWriter
> >     solr.server.type : Type of SolrServer to communicate with (default
> >       'http' however options include 'cloud', 'lb' and 'concurrent')
> >     solr.server.url : URL of the Solr instance (mandatory)
> >     solr.zookeeper.url : URL of the Zookeeper URL (mandatory if 'cloud'
> >       value for solr.server.type)
> >     solr.loadbalance.urls : Comma-separated string of Solr server
> >       strings to be used (madatory if 'lb' value for solr.server.type)
> >     solr.mapping.file : name of the mapping file for fields (default
> >       solrindex-mapping.xml)
> >     solr.commit.size : buffer size when sending to Solr (default 1000)
> >     solr.auth : use authentication (default false)
> >     solr.auth.username : username for authentication
> >     solr.auth.password : password for authentication
> >
> >
> > Indexer: java.io.IOException: No FileSystem for scheme: http
> >     at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2385)
> >     at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2392)
> >     at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:89)
> >     at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2431)
> >     at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2413)
> >     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:368)
> >     at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
> >     at org.apache.hadoop.mapred.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:256)
> >     at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:228)
> >     at org.apache.hadoop.mapred.SequenceFileInputFormat.listStatus(SequenceFileInputFormat.java:45)
> >     at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:304)
> >     at org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:520)
> >     at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:512)
> >     at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:394)
> >     at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1285)
> >     at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1282)
> >     at java.security.AccessController.doPrivileged(Native Method)
> >     at javax.security.auth.Subject.doAs(Subject.java:415)
> >     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
> >     at org.apache.hadoop.mapreduce.Job.submit(Job.java:1282)
> >     at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:562)
> >     at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:557)
> >     at java.security.AccessController.doPrivileged(Native Method)
> >     at javax.security.auth.Subject.doAs(Subject.java:415)
> >     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
> >     at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:557)
> >     at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:548)
> >     at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:833)
> >     at org.apache.nutch.indexer.IndexingJob.index(IndexingJob.java:145)
> >     at org.apache.nutch.indexer.IndexingJob.run(IndexingJob.java:222)
> >     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
> >     at org.apache.nutch.indexer.IndexingJob.main(IndexingJob.java:231)
> >
> > Guy McDowell
> > [email protected]
> > http://www.GuyMcDowell.com
> >
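P.S. A quick way to confirm whether documents actually landed in the core, assuming Solr is still listening on localhost:8983 as above, is something like

  curl "http://localhost:8983/solr/nutch_solr_data_core/select?q=*:*&rows=0&wt=json"

which should report a non-zero numFound once indexing has succeeded.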

