I have a pretty good guess. Even as we speak, I am testing an update to
Hadoop 0.20.203.0 in Mahout. The only difference that causes a problem is
that the newer Hadoop writes a "_SUCCESS" marker file into output dirs, which
confuses a few bits of Mahout code that don't properly ignore it. I've got a
change to fix that; if all goes well it will go in tonight.

I give it reasonable odds that this is your issue.
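For what it's worth, the fix amounts to skipping Hadoop's bookkeeping files (e.g. "_SUCCESS", "_logs", ".crc" files) before iterating over a job's output directory. A minimal sketch of that predicate in plain Java; in Hadoop proper it would be wrapped in an org.apache.hadoop.fs.PathFilter passed to FileSystem.listStatus(), and the class and method names here are illustrative, not Mahout's actual code:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class OutputFilter {
    // True for real data parts (e.g. "part-r-00000"); false for Hadoop's
    // bookkeeping files, which by convention start with "_" or ".".
    static boolean isDataFile(String name) {
        return !name.startsWith("_") && !name.startsWith(".");
    }

    public static void main(String[] args) {
        List<String> listing = Arrays.asList(
                "part-r-00000", "_SUCCESS", "_logs", ".part-r-00000.crc");
        List<String> data = listing.stream()
                .filter(OutputFilter::isDataFile)
                .collect(Collectors.toList());
        System.out.println(data); // prints [part-r-00000]
    }
}
```

Without such a filter, a SequenceFile reader that walks the whole directory ends up trying to read the empty "_SUCCESS" marker as a sequence file, which can surface as exactly this kind of low-level read error.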

2011/6/23 Patricio Echagüe <[email protected]>

> Hi all, I'm observing this exception using/integrating Brisk with Mahout.
> *Brisk* currently works perfectly with the whole Hadoop stack (Hadoop,
> Hive, Pig).
>
> I read a similar thread,
> http://comments.gmane.org/gmane.comp.apache.mahout.user/6757, which makes
> me think it can be Hadoop related. We are using *0.20.203* (Yahoo
> distribution).
>
> Does this exception look familiar to you all? It happens after running the
> 3 jobs for the example: *Clustering of Synthetic control data*.
>
>
> INFO [IPC Server handler 0 on 56077] 2011-06-22 17:24:40,599
> TaskTracker.java (line 2428) attempt_201106221720_0003_m_000000_0 0.0%
> INFO [IPC Server handler 5 on 8012] 2011-06-22 17:24:41,806
> TaskInProgress.java (line 551) Error from
> attempt_201106221720_0003_m_000000_0: java.lang.IndexOutOfBoundsException
>     at java.io.DataInputStream.readFully(DataInputStream.java:175)
>     at org.apache.hadoop.io.DataOutputBuffer$Buffer.write(DataOutputBuffer.java:63)
>     at org.apache.hadoop.io.DataOutputBuffer.write(DataOutputBuffer.java:101)
>     at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1930)
>     at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:2062)
>     at org.apache.hadoop.mapreduce.lib.input.SequenceFileRecordReader.nextKeyValue(SequenceFileRecordReader.java:68)
>     at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:531)
>     at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
>     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:369)
>     at org.apache.hadoop.mapred.Child$4.run(Child.java:259)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:396)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
>     at org.apache.hadoop.mapred.Child.main(Child.java:253)
>
>
> Thanks
>
