I have gotten past the mistakes that were causing my earlier errors and reached the point in the tutorial where I am trying to index the crawled resources into Solr. I am not completely sure the Solr integration is set up correctly, but here is what I tried:
bin/nutch solrindex http://localhost:8983/solr crawl/crawldb/ -linkdb crawl/linkdb/ crawl/segments/20160727090259/ -filter - normalize

Segment dir is complete: crawl/segments/20160727090259.
The input path at - is not a segment... skipping
The input path at normalize is not a segment... skipping
Indexer: starting at 2016-07-27 09:50:58
Indexer: deleting gone documents: false
Indexer: URL filtering: true
Indexer: URL normalizing: false
Active IndexWriters :
SOLRIndexWriter
        solr.server.url : URL of the SOLR instance
        solr.zookeeper.hosts : URL of the Zookeeper quorum
        solr.commit.size : buffer size when sending to SOLR (default 1000)
        solr.mapping.file : name of the mapping file for fields (default solrindex-mapping.xml)
        solr.auth : use authentication (default false)
        solr.auth.username : username for authentication
        solr.auth.password : password for authentication
Indexing 1/1 documents
Deleting 0 documents
Indexing 1/1 documents
Deleting 0 documents
Indexer: java.io.IOException: Job failed!
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:836)
        at org.apache.nutch.indexer.IndexingJob.index(IndexingJob.java:145)
        at org.apache.nutch.indexer.IndexingJob.run(IndexingJob.java:228)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.nutch.indexer.IndexingJob.main(IndexingJob.java:237)

What do I need to do to get this to run?

Thanks,
Kris

~~~~~~~~~~~~~~~~~~~~~~~~~~
Kris T. Musshorn
FileMaker Developer - Contractor - Catapult Technology Inc.
US Army Research Lab Aberdeen Proving Ground
Application Management & Development Branch
410-278-7251
[email protected]
~~~~~~~~~~~~~~~~~~~~~~~~~~
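A note on the invocation above, based on the "input path at - is not a segment... skipping" and "input path at normalize is not a segment... skipping" lines: the -normalize option appears to have been split into two separate arguments by a stray space, which is why the output reports "Indexer: URL normalizing: false" and treats the extra tokens as segment paths. A minimal corrected invocation, assuming the same Solr URL, crawldb, linkdb, and segment paths as the command above, would look roughly like:

    # same paths as in the original command; -filter and -normalize written with no embedded space
    bin/nutch solrindex http://localhost:8983/solr crawl/crawldb/ -linkdb crawl/linkdb/ crawl/segments/20160727090259/ -filter -normalize

This only fixes the option parsing; the underlying cause of the "Indexer: java.io.IOException: Job failed!" is usually reported in more detail in logs/hadoop.log under the Nutch runtime directory, which is worth checking before changing anything else.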

