This came up a while back while I was testing on CDH, and the synthetic control clustering (all of the clustering, actually) accounts for this. It's not clear from the output which job is throwing the exception. We had a similar problem when _logs was introduced earlier.

-----Original Message-----
From: Patricio Echagüe [mailto:[email protected]]
Sent: Thursday, June 23, 2011 11:38 AM
To: [email protected]
Subject: java.lang.IndexOutOfBoundsException

Hi all,

I'm observing this exception using/integrating Brisk with Mahout. *Brisk* currently works perfectly with all hadoop stack (Hadoop, hive, pig).

I read a similar thread: http://comments.gmane.org/gmane.comp.apache.mahout.user/6757 which makes me think it can be hadoop related. We are using *0.20.203* (Yahoo distribution)

Does this exception look familiar to you all?

It happens after running the 3 jobs for the Example: *Clustering of Synthetic control data*.

INFO [IPC Server handler 0 on 56077] 2011-06-22 17:24:40,599 TaskTracker.java (line 2428) attempt_201106221720_0003_m_000000_0 0.0%
INFO [IPC Server handler 5 on 8012] 2011-06-22 17:24:41,806 TaskInProgress.java (line 551) Error from attempt_201106221720_0003_m_000000_0: java.lang.IndexOutOfBoundsException
    at java.io.DataInputStream.readFully(DataInputStream.java:175)
    at org.apache.hadoop.io.DataOutputBuffer$Buffer.write(DataOutputBuffer.java:63)
    at org.apache.hadoop.io.DataOutputBuffer.write(DataOutputBuffer.java:101)
    at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1930)
    at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:2062)
    at org.apache.hadoop.mapreduce.lib.input.SequenceFileRecordReader.nextKeyValue(SequenceFileRecordReader.java:68)
    at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:531)
    at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:369)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:259)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
    at org.apache.hadoop.mapred.Child.main(Child.java:253)

Thanks
