Yes, I can reproduce it with the command bin/nutch updatedb. Sorry, I am
new to Nutch. Could you please advise how I can find out the
outlinks.size() for the URL?
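
As far as I can tell from the trace, the failure is a length check: a
4-byte value is being decoded as a float from an array that holds only
2 bytes. Here is a minimal sketch of that check, assuming a helper shaped
like Bytes.toInt(). The class BytesSketch and its methods are made up for
illustration and are not the actual Nutch code.

```java
import java.nio.ByteBuffer;

// Hypothetical stand-in for a Bytes-style helper; NOT the actual Nutch source.
class BytesSketch {
    static final int SIZEOF_INT = 4;

    // Decode a big-endian int, refusing arrays that are too short.
    static int toInt(byte[] bytes, int offset) {
        if (offset + SIZEOF_INT > bytes.length) {
            throw new IllegalArgumentException(
                "offset (" + offset + ") + length (" + SIZEOF_INT
                + ") exceed the capacity of the array: " + bytes.length);
        }
        return ByteBuffer.wrap(bytes, offset, SIZEOF_INT).getInt();
    }

    // A float is stored as the 4 bytes of its int bit pattern.
    static float toFloat(byte[] bytes) {
        return Float.intBitsToFloat(toInt(bytes, 0));
    }

    public static void main(String[] args) {
        // A well-formed 4-byte value decodes normally.
        byte[] ok = ByteBuffer.allocate(4).putFloat(1.0f).array();
        System.out.println(toFloat(ok));

        // A 2-byte value fails the length check, as in the trace.
        try {
            toFloat(new byte[2]);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

If this reading is right, the question is which stored value ends up with
only 2 bytes where a float is expected.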

Thanks



On Wed, Dec 19, 2012 at 2:58 PM, Lewis John Mcgibbney <
[email protected]> wrote:

> Hi,
>
> In short no.
>
> I see that just before we distribute the score to outlinks in line 70 of
> DbUpdateMapper.java [0] there is a TODO which reads
>
> // TODO: Outlink filtering (i.e. "only keep the first n outlinks")
>
> I wonder if this could be why the if condition is satisfied in the toInt()
> method (line 739) of Bytes.java [1]
>
> Can you reproduce this and explain a bit more about the outlinks.size() for
> the URL?
>
> Thanks
>
> Lewis
>
> [0]
>
> http://svn.apache.org/viewvc/nutch/branches/2.x/src/java/org/apache/nutch/crawl/DbUpdateMapper.java?view=markup
> [1]
>
> http://svn.apache.org/viewvc/nutch/branches/2.x/src/java/org/apache/nutch/util/Bytes.java?view=markup
>
> On Wed, Dec 19, 2012 at 11:35 AM, Stanislav Orlenko
> <[email protected]> wrote:
>
> > Hello
> > Has anyone faced such a problem?
> >
> > java.lang.IllegalArgumentException: offset (0) + length (4) exceed the capacity of the array: 2
> >         at org.apache.nutch.util.Bytes.explainWrongLengthOrOffset(Bytes.java:559)
> >         at org.apache.nutch.util.Bytes.toInt(Bytes.java:740)
> >         at org.apache.nutch.util.Bytes.toFloat(Bytes.java:611)
> >         at org.apache.nutch.util.Bytes.toFloat(Bytes.java:598)
> >         at org.apache.nutch.scoring.opic.OPICScoringFilter.distributeScoreToOutlinks(OPICScoringFilter.java:128)
> >         at org.apache.nutch.scoring.ScoringFilters.distributeScoreToOutlinks(ScoringFilters.java:117)
> >         at org.apache.nutch.crawl.DbUpdateMapper.map(DbUpdateMapper.java:70)
> >         at org.apache.nutch.crawl.DbUpdateMapper.map(DbUpdateMapper.java:37)
> >         at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
> >         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
> >         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
> >         at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
> >
> > Nutch version is 2.1.
> >
> > Thanks
> >
>
>
>
> --
> *Lewis*
>
