Re: Indexing to Solr4.2 with nutch 1.6

Lewis John Mcgibbney Tue, 09 Apr 2013 11:15:27 -0700

Before we do the upgrade we need to consolidate all of these use cases.
What criteria do we want to review and accept as the unique key? Will this
change between Nutch trunk and 2.x?


On Tuesday, April 9, 2013, Amit Sela <[email protected]> wrote:
> Well, as in our other correspondence, the only thing I did differently
> in my schema.xml (schema-solr4.xml) before rebuilding Nutch was to use
>  <uniqueKey>url</uniqueKey> instead of <uniqueKey>id</uniqueKey>.
>
> It all goes well until the dedup phase, where the MapReduce job throws:
>
> java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
> at java.util.ArrayList.rangeCheck(ArrayList.java:604)
> at java.util.ArrayList.get(ArrayList.java:382)
> at org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat$1.next(SolrDeleteDuplicates.java:268)
> at org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat$1.next(SolrDeleteDuplicates.java:241)
> at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:236)
> at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:216)
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
> at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
> at org.apache.hadoop.mapred.Child.main(Child.java:249)
>
> Thanks.
>
>
> On Mon, Apr 8, 2013 at 10:33 PM, Lewis John Mcgibbney <
> [email protected]> wrote:
>
>> It would probably be best to describe what you've tried here, possibly a
>> paste of your schema, what you've done (if anything) to the Nutch source to
>> get it working with Solr 4, etc.
>> The stack trace you get would also be beneficial.
>> Thank you
>> Lewis
>>
>>
>> On Mon, Apr 8, 2013 at 4:13 AM, Amit Sela <[email protected]> wrote:
>>
>> > Is it possible? I saw a Jira issue open about connecting to SolrCloud via
>> > ZooKeeper, but with a direct connection to one of the servers, is it
>> > possible to index with Nutch 1.6 into Solr 4.2 set up as a cloud with a
>> > ZooKeeper ensemble? I ask because I keep getting IndexOutOfBounds
>> > exceptions in the dedup M/R phase.
>> >
>> > Thanks.
>> >
>>
>>
>>
>> --
>> *Lewis*
>>
>

-- 
*Lewis*
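
Note on the trace quoted above: it shows ArrayList.get(0) being called on an
empty list from inside the record reader's next(). One way such a trace can
arise (an assumption, not something confirmed in this thread) is a reader that
trusts a reported document count while the list of documents actually returned
is shorter. The sketch below is not the Nutch SolrDeleteDuplicates code; the
class BatchCursor and its methods are hypothetical names used only to show the
failure mode and a bounds check that avoids it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not the actual SolrDeleteDuplicates code.
class BatchCursor {
    private final List<String> batch;   // documents actually returned
    private final long reportedTotal;   // count the server claimed to have
    private int current = 0;

    BatchCursor(List<String> batch, long reportedTotal) {
        this.batch = batch;
        this.reportedTotal = reportedTotal;
    }

    // Trusts only the reported total, so it can index past the real list.
    String nextUnguarded() {
        if (current >= reportedTotal) {
            return null;
        }
        return batch.get(current++);
    }

    // Also checks the size of the list that was really returned.
    String nextGuarded() {
        if (current >= reportedTotal || current >= batch.size()) {
            return null;
        }
        return batch.get(current++);
    }

    public static void main(String[] args) {
        // Server reported 3 documents, but the returned list is empty.
        BatchCursor cursor = new BatchCursor(new ArrayList<String>(), 3);
        System.out.println("guarded: " + cursor.nextGuarded());
        try {
            cursor.nextUnguarded();
        } catch (IndexOutOfBoundsException e) {
            System.out.println("unguarded threw " + e.getClass().getSimpleName());
        }
    }
}
```

Whether a count mismatch is what happens here with uniqueKey set to url would
need to be checked against the Nutch source and the documents Solr returns.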
