You are correct. I did some research and found it to be a Tika issue; it
is fixed by setting the "title" field to multiValued in schema.xml. I think
the default Nutch schema should be updated accordingly!

<field name="title" type="text_general" stored="true" indexed="true"
multiValued="true"/>
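
In case it helps anyone checking their own schema.xml, here is a rough sketch (not part of Nutch or Solr) that lists which of the given fields are still single-valued. The function name and the sample schema are made up for illustration. It only looks at the multiValued attribute on the <field> element and treats a missing attribute as single-valued, which I am assuming matches your schema version; it ignores anything set on the fieldType.

```python
import xml.etree.ElementTree as ET


def single_valued_fields(schema_xml, names):
    """Return the names from `names` that the schema declares without multiValued="true".

    Assumption: a missing multiValued attribute means single-valued.
    Only <field> elements are inspected, not <fieldType> defaults.
    """
    root = ET.fromstring(schema_xml)
    wanted = set(names)
    result = []
    for field in root.iter("field"):
        name = field.get("name")
        if name in wanted and field.get("multiValued", "false").lower() != "true":
            result.append(name)
    return result


# Hypothetical, cut-down schema used only to demonstrate the check.
SAMPLE = """<schema name="nutch" version="1.5">
  <fields>
    <field name="title" type="text_general" stored="true" indexed="true"/>
    <field name="anchor" type="text_general" stored="true" indexed="true"
           multiValued="true"/>
  </fields>
</schema>"""

if __name__ == "__main__":
    # "title" is reported because it lacks multiValued="true" in the sample.
    print(single_valued_fields(SAMPLE, ["title", "anchor"]))
```

To run it against a real file, read the file contents into a string and pass that in place of SAMPLE.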

Thnx


On Sat, May 3, 2014 at 8:27 PM, BlackIce <[email protected]> wrote:

> Bad Request
>
> request: http://localhost:8983/solr/update?wt=javabin&version=2
>     at
> org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:430)
>     at
> org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:244)
>     at
> org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
>     at
> org.apache.nutch.indexwriter.solr.SolrIndexWriter.write(SolrIndexWriter.java:135)
>     at org.apache.nutch.indexer.IndexWriters.write(IndexWriters.java:88)
>     at
> org.apache.nutch.indexer.IndexerOutputFormat$1.write(IndexerOutputFormat.java:50)
>     at
> org.apache.nutch.indexer.IndexerOutputFormat$1.write(IndexerOutputFormat.java:41)
>     at
> org.apache.hadoop.mapred.ReduceTask$OldTrackingRecordWriter.write(ReduceTask.java:458)
>     at org.apache.hadoop.mapred.ReduceTask$3.collect(ReduceTask.java:500)
>     at
> org.apache.nutch.indexer.IndexerMapReduce.reduce(IndexerMapReduce.java:321)
>     at
> org.apache.nutch.indexer.IndexerMapReduce.reduce(IndexerMapReduce.java:53)
>     at
> org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:522)
>     at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:421)
>     at
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:398)
> 2014-05-03 14:40:07,256 ERROR indexer.IndexingJob - Indexer:
> java.io.IOException: Job failed!
>
>     at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1357)
>     at org.apache.nutch.indexer.IndexingJob.index(IndexingJob.java:114)
>     at org.apache.nutch.indexer.IndexingJob.run(IndexingJob.java:176)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>     at org.apache.nutch.indexer.IndexingJob.main(IndexingJob.java:186)
>
>
> In Solr it says:
>
> Sat 03 May 2014 01:37:57 PM CEST ERROR SolrCore
> org.apache.solr.common.SolrException:
> ERROR: [doc=http://www.joomlatune.com/rss.xml] multiple values encountered
> for non multiValued field title: [JoomlaTune News, rss20.xml]
>
> Sat 03 May 2014 01:37:57 PM CEST ERROR SolrCore
> org.apache.solr.common.SolrException:
> ERROR: [doc=http://www.joomlatune.com/rss.xml] multiple values encountered
> for non multiValued field title: [JoomlaTune News, rss20.xml]
>
> Sat 03 May 2014 02:26:55 PM CEST ERROR SolrCore
> org.apache.solr.common.SolrException:
> ERROR:
> [doc=http://elfaroblogs.typepad.com/.a/6a019affb1d342970b019b05266c0c970d-300wi]
> multiple values encountered for non multiValued field title:
> [BLOGS-PRUEBA3, 6a019affb1d342970b019b05266c0c970d-300wi.jpg]
>
> Sat 03 May 2014 02:26:55 PM CEST ERROR SolrCore
> org.apache.solr.common.SolrException:
> ERROR:
> [doc=http://elfaroblogs.typepad.com/.a/6a019affb1d342970b019b05266c0c970d-300wi]
> multiple values encountered for non multiValued field title:
> [BLOGS-PRUEBA3, 6a019affb1d342970b019b05266c0c970d-300wi.jpg]
>
> Sat 03 May 2014 02:40:06 PM CEST ERROR SolrCore
> org.apache.solr.common.SolrException:
> ERROR:
> [doc=http://losblogs.elfaro.net/.a/6a019affb1d342970b019b05266c0c970d-800wi]
> multiple values encountered for non multiValued field title:
> [BLOGS-PRUEBA3, 6a019affb1d342970b019b05266c0c970d-800wi.jpg]
>
> Sat 03 May 2014 02:40:07 PM CEST ERROR SolrCore
> org.apache.solr.common.SolrException:
> ERROR:
> [doc=http://losblogs.elfaro.net/.a/6a019affb1d342970b019b05266c0c970d-800wi]
> multiple values encountered for non multiValued field title:
> [BLOGS-PRUEBA3, 6a019affb1d342970b019b05266c0c970d-800wi.jpg]
>
> Thnx
>
>
> On Sat, May 3, 2014 at 3:30 PM, remi tassing <[email protected]> wrote:
>
>> or RAM (Xmx/Xms)
>>
>>
>> On Sat, May 3, 2014 at 9:29 PM, remi tassing <[email protected]>
>> wrote:
>>
>> > Could you provide the complete stack trace? Probably add more debug info
>> > in.
>> >
>> > This could be due to some disk size issue...
>> >
>> >
>> > On Sat, May 3, 2014 at 8:51 PM, BlackIce <[email protected]> wrote:
>> >
>> >> HI, playing around with Nutch 1.8 in localmode on Solr 4.7..
>> >>
>> >> When indexing larger crawls 10k and up I get:
>> >>
>> >> Indexer: java.io.IOException: Job failed!
>> >>     at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1357)
>> >>     at org.apache.nutch.indexer.IndexingJob.index(IndexingJob.java:114)
>> >>     at org.apache.nutch.indexer.IndexingJob.run(IndexingJob.java:176)
>> >>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>> >>     at org.apache.nutch.indexer.IndexingJob.main(IndexingJob.java:186)
>> >>
>> >>
>> >>  is this related to:
>> >>
>> >>
>> >> http://lucene.472066.n3.nabble.com/SolrIndex-java-io-IOException-Job-failed-td3585509.html
>> >>
>> >> ???
>> >>
>> >> If so what is the solution since there are no Solr jars, only Lucene 4.3?
>> >> Downgrading to Solr 4.3? (I really would like to avoid this and actually
>> >> go Solr 4.8)
>> >>
>> >> Thnx
>> >>
>> >
>> >
>>
>
>
