You are correct. I did some research and found it to be a Tika issue; it is fixed by setting the "title" field to multiValued in schema.xml. I think the default Nutch schema should be updated accordingly!
<field name="title" type="text_general" stored="true" indexed="true" multiValued="true"/>

Thnx

On Sat, May 3, 2014 at 8:27 PM, BlackIce <[email protected]> wrote:

> Bad Request
>
> request: http://localhost:8983/solr/update?wt=javabin&version=2
>     at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:430)
>     at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:244)
>     at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
>     at org.apache.nutch.indexwriter.solr.SolrIndexWriter.write(SolrIndexWriter.java:135)
>     at org.apache.nutch.indexer.IndexWriters.write(IndexWriters.java:88)
>     at org.apache.nutch.indexer.IndexerOutputFormat$1.write(IndexerOutputFormat.java:50)
>     at org.apache.nutch.indexer.IndexerOutputFormat$1.write(IndexerOutputFormat.java:41)
>     at org.apache.hadoop.mapred.ReduceTask$OldTrackingRecordWriter.write(ReduceTask.java:458)
>     at org.apache.hadoop.mapred.ReduceTask$3.collect(ReduceTask.java:500)
>     at org.apache.nutch.indexer.IndexerMapReduce.reduce(IndexerMapReduce.java:321)
>     at org.apache.nutch.indexer.IndexerMapReduce.reduce(IndexerMapReduce.java:53)
>     at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:522)
>     at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:421)
>     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:398)
> 2014-05-03 14:40:07,256 ERROR indexer.IndexingJob - Indexer: java.io.IOException: Job failed!
>     at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1357)
>     at org.apache.nutch.indexer.IndexingJob.index(IndexingJob.java:114)
>     at org.apache.nutch.indexer.IndexingJob.run(IndexingJob.java:176)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>     at org.apache.nutch.indexer.IndexingJob.main(IndexingJob.java:186)
>
> In Solr it says:
>
> Sat 03 May 2014 01:37:57 PM CEST ERROR SolrCore org.apache.solr.common.SolrException: ERROR: [doc=http://www.joomlatune.com/rss.xml] multiple values encountered for non multiValued field title: [JoomlaTune News, rss20.xml]
> Sat 03 May 2014 01:37:57 PM CEST ERROR SolrCore org.apache.solr.common.SolrException: ERROR: [doc=http://www.joomlatune.com/rss.xml] multiple values encountered for non multiValued field title: [JoomlaTune News, rss20.xml]
> Sat 03 May 2014 02:26:55 PM CEST ERROR SolrCore org.apache.solr.common.SolrException: ERROR: [doc=http://elfaroblogs.typepad.com/.a/6a019affb1d342970b019b05266c0c970d-300wi] multiple values encountered for non multiValued field title: [BLOGS-PRUEBA3, 6a019affb1d342970b019b05266c0c970d-300wi.jpg]
> Sat 03 May 2014 02:26:55 PM CEST ERROR SolrCore org.apache.solr.common.SolrException: ERROR: [doc=http://elfaroblogs.typepad.com/.a/6a019affb1d342970b019b05266c0c970d-300wi] multiple values encountered for non multiValued field title: [BLOGS-PRUEBA3, 6a019affb1d342970b019b05266c0c970d-300wi.jpg]
> Sat 03 May 2014 02:40:06 PM CEST ERROR SolrCore org.apache.solr.common.SolrException: ERROR: [doc=http://losblogs.elfaro.net/.a/6a019affb1d342970b019b05266c0c970d-800wi] multiple values encountered for non multiValued field title: [BLOGS-PRUEBA3, 6a019affb1d342970b019b05266c0c970d-800wi.jpg]
> Sat 03 May 2014 02:40:07 PM CEST ERROR SolrCore org.apache.solr.common.SolrException: ERROR: [doc=http://losblogs.elfaro.net/.a/6a019affb1d342970b019b05266c0c970d-800wi] multiple values encountered for non multiValued field title: [BLOGS-PRUEBA3, 6a019affb1d342970b019b05266c0c970d-800wi.jpg]
>
> Thnx
>
> On Sat, May 3, 2014 at 3:30 PM, remi tassing <[email protected]> wrote:
>
>> or RAM (Xmx/Xms)
>>
>> On Sat, May 3, 2014 at 9:29 PM, remi tassing <[email protected]> wrote:
>>
>> > Could you provide the complete stack trace? Probably add more debug info in.
>> >
>> > This could be due to some disk size issue...
>> >
>> > On Sat, May 3, 2014 at 8:51 PM, BlackIce <[email protected]> wrote:
>> >
>> >> HI, playing around with Nutch 1.8 in localmode on Solr 4.7..
>> >>
>> >> When indexing larger crawls 10k and up I get:
>> >>
>> >> Indexer: java.io.IOException: Job failed!
>> >>     at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1357)
>> >>     at org.apache.nutch.indexer.IndexingJob.index(IndexingJob.java:114)
>> >>     at org.apache.nutch.indexer.IndexingJob.run(IndexingJob.java:176)
>> >>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>> >>     at org.apache.nutch.indexer.IndexingJob.main(IndexingJob.java:186)
>> >>
>> >> is this related to:
>> >>
>> >> http://lucene.472066.n3.nabble.com/SolrIndex-java-io-IOException-Job-failed-td3585509.html
>> >>
>> >> ???
>> >>
>> >> If so what is the solution since there are no Solr jars, only Lucene 4.3?
>> >> Downgrading to Solr 4.3? (I really would like to avoid this and actually go
>> >> Solr 4.8)
>> >>
>> >> Thnx
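
For anyone hitting the same "multiple values encountered for non multiValued field title" error, here is a rough sketch of how one might check whether a schema.xml already declares the field as multiValued. It is only an illustration: the helper name is_multivalued, the sample schema wrapper, and the "url" field are made up for the example, and it assumes a missing multiValued attribute means "false" (it ignores any default set on the fieldType).

```python
import xml.etree.ElementTree as ET


def is_multivalued(schema_xml: str, field_name: str) -> bool:
    """Return True if the named <field> declares multiValued="true".

    Assumption: an absent attribute is treated as false; defaults
    inherited from the fieldType are not considered.
    """
    root = ET.fromstring(schema_xml)
    for field in root.iter("field"):
        if field.get("name") == field_name:
            return field.get("multiValued", "false").lower() == "true"
    raise KeyError(f"no <field name={field_name!r}> in schema")


# Hypothetical, trimmed-down schema; only the "title" line comes from this thread.
SAMPLE = """<schema name="nutch" version="1.5">
  <fields>
    <field name="title" type="text_general" stored="true" indexed="true" multiValued="true"/>
    <field name="url" type="string" stored="true" indexed="true"/>
  </fields>
</schema>"""

if __name__ == "__main__":
    for name in ("title", "url"):
        print(name, is_multivalued(SAMPLE, name))
```

To run it against a real file, read the contents of your own schema.xml into a string and pass that in place of SAMPLE.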

