Yes, I can reproduce it with the command bin/nutch updatedb. Sorry, I am new to Nutch; could you please advise how I can get information about outlinks.size()?
Thanks

On Wed, Dec 19, 2012 at 2:58 PM, Lewis John Mcgibbney <[email protected]> wrote:

> Hi,
>
> In short no.
>
> I see that just before we distribute the score to outlinks in line 70 of
> DbUpdateMapper.java [0] there is a TODO which reads
>
> // TODO: Outlink filtering (i.e. "only keep the first n outlinks")
>
> I wonder if this could be why the if condition is satisfied in the toInt()
> method (line 739) of Bytes.java [1]
>
> Can you reproduce this and explain a bit more about the outlinks.size() for
> the URL?
>
> Thanks
>
> Lewis
>
> [0]
> http://svn.apache.org/viewvc/nutch/branches/2.x/src/java/org/apache/nutch/crawl/DbUpdateMapper.java?view=markup
> [1]
> http://svn.apache.org/viewvc/nutch/branches/2.x/src/java/org/apache/nutch/util/Bytes.java?view=markup
>
> On Wed, Dec 19, 2012 at 11:35 AM, Stanislav Orlenko
> <[email protected]> wrote:
>
> > Hello
> > Have anyone faced such a problem?
> >
> > java.lang.IllegalArgumentException: offset (0) + length (4) exceed the
> > capacity of the array: 2
> >         at org.apache.nutch.util.Bytes.explainWrongLengthOrOffset(Bytes.java:559)
> >         at org.apache.nutch.util.Bytes.toInt(Bytes.java:740)
> >         at org.apache.nutch.util.Bytes.toFloat(Bytes.java:611)
> >         at org.apache.nutch.util.Bytes.toFloat(Bytes.java:598)
> >         at org.apache.nutch.scoring.opic.OPICScoringFilter.distributeScoreToOutlinks(OPICScoringFilter.java:128)
> >         at org.apache.nutch.scoring.ScoringFilters.distributeScoreToOutlinks(ScoringFilters.java:117)
> >         at org.apache.nutch.crawl.DbUpdateMapper.map(DbUpdateMapper.java:70)
> >         at org.apache.nutch.crawl.DbUpdateMapper.map(DbUpdateMapper.java:37)
> >         at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
> >         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
> >         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
> >         at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
> >
> > Nutch version is 2.1.
> >
> > Thanks
>
> --
> *Lewis*
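For reference, the exception in the quoted trace says that a 4-byte float was requested from an array that holds only 2 bytes. The sketch below is not the Nutch Bytes class; it is a stand-alone illustration using java.nio.ByteBuffer, and the class name ShortFloatBytesDemo and its length check are my own stand-ins. It only shows why decoding a float from a 2-byte value has to fail, which is what I assume happens to the stored score value here.

```java
import java.nio.ByteBuffer;

// Hypothetical stand-alone demo; NOT org.apache.nutch.util.Bytes.
class ShortFloatBytesDemo {

    // Decodes a big-endian float, refusing arrays shorter than 4 bytes.
    // The message wording mimics the one in the stack trace for illustration only.
    static float toFloat(byte[] bytes) {
        if (bytes.length < Float.BYTES) {
            throw new IllegalArgumentException(
                "offset (0) + length (" + Float.BYTES
                + ") exceed the capacity of the array: " + bytes.length);
        }
        return ByteBuffer.wrap(bytes).getFloat();
    }

    public static void main(String[] args) {
        // A properly encoded float occupies exactly 4 bytes and decodes fine.
        byte[] good = ByteBuffer.allocate(Float.BYTES).putFloat(1.0f).array();
        System.out.println(toFloat(good));

        // A 2-byte value (assumed here to resemble the stored score) cannot be decoded.
        try {
            toFloat(new byte[2]);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

If this assumption holds, the thing to inspect would be the length of the stored score bytes for the failing URL rather than only the number of outlinks.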

