Well, as per our other correspondence, the only thing I did differently
in my schema.xml (schema-solr4.xml) before rebuilding Nutch was to use
<uniqueKey>url</uniqueKey> instead of <uniqueKey>id</uniqueKey>.

It all goes well until the dedup phase, where the MapReduce job throws:

java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
at java.util.ArrayList.rangeCheck(ArrayList.java:604)
at java.util.ArrayList.get(ArrayList.java:382)
at org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat$1.next(SolrDeleteDuplicates.java:268)
at org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat$1.next(SolrDeleteDuplicates.java:241)
at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:236)
at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:216)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
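
The top two frames show ArrayList.get(0) being called on a list of size 0,
i.e. the record reader asked for the first document of an empty result page.
The following is only a minimal sketch of that failure mode and of a
defensive check, not the actual Nutch code; the class and method names
(EmptyPageSketch, nextUnchecked, nextChecked) are hypothetical:

```java
import java.util.ArrayList;
import java.util.List;

public class EmptyPageSketch {
    // Hypothetical stand-in for a reader that assumes the page has documents.
    static String nextUnchecked(List<String> page, int indexInPage) {
        // Throws IndexOutOfBoundsException when the page is empty.
        return page.get(indexInPage);
    }

    // Defensive variant: returns null when there is no document at that position.
    static String nextChecked(List<String> page, int indexInPage) {
        if (indexInPage < 0 || indexInPage >= page.size()) {
            return null;
        }
        return page.get(indexInPage);
    }

    public static void main(String[] args) {
        // Assumption for illustration: the query returned no documents.
        List<String> emptyPage = new ArrayList<>();
        try {
            nextUnchecked(emptyPage, 0);
        } catch (IndexOutOfBoundsException e) {
            System.out.println("unchecked: " + e.getClass().getSimpleName());
        }
        System.out.println("checked: " + nextChecked(emptyPage, 0));
    }
}
```

Whether the page is empty here because of the uniqueKey change or because of
how the cloud setup answers the query is not something the trace alone shows.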

Thanks.


On Mon, Apr 8, 2013 at 10:33 PM, Lewis John Mcgibbney <
[email protected]> wrote:

> It would probably be best to describe what you've tried here, possibly a
> paste of your schema, what you've done (if anything) to the Nutch source to
> get it working with Solr 4, etc.
> The stack trace you get would also be beneficial.
> Thank you
> Lewis
>
>
> On Mon, Apr 8, 2013 at 4:13 AM, Amit Sela <[email protected]> wrote:
>
> > Is it possible? I saw a Jira issue open about connecting to SolrCloud via
> > ZooKeeper, but with a direct connection to one of the servers, is it
> > possible to index with Nutch 1.6 into Solr 4.2 set up as a cloud with a
> > ZooKeeper ensemble? I ask because I keep getting IndexOutOfBounds
> > exceptions in the dedup M/R phase.
> >
> > Thanks.
> >
>
>
>
> --
> *Lewis*
>
