Hi all, I think I have found a bug in InputSampler under Hadoop 0.21.0. In InputSampler.java (package org.apache.hadoop.mapreduce.lib.partition), inside the function getSample, a record reader is created but never initialized. When the record reader is then used, an exception is thrown, because some of the objects it references have not been set up properly.
For example, near line 217:

    ......
    RecordReader<K,V> reader = inf.createRecordReader(splits.get(i),
        new TaskAttemptContextImpl(job.getConfiguration(), new TaskAttemptID()));
    while (reader.nextKeyValue()) {
      ......
    }

The reader should be initialized before calling "reader.nextKeyValue()".

Cheers
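
P.S. To make the create-then-initialize contract concrete, here is a minimal, self-contained sketch. It does not use the Hadoop classes: SimpleReader, ListReader and sample are hypothetical stand-ins I made up for illustration, and the "split" is just a list of strings. It only shows why iterating an uninitialized reader fails and how initializing it first avoids that.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class ReaderInitSketch {

    /** Hypothetical stand-in for a record reader: create, initialize, then iterate. */
    interface SimpleReader {
        void initialize(List<String> split);
        boolean nextKeyValue();
        String getCurrentKey();
    }

    /** Hypothetical reader over an in-memory "split" (a list of keys). */
    static class ListReader implements SimpleReader {
        private Iterator<String> it; // stays null until initialize() is called
        private String current;

        public void initialize(List<String> split) {
            it = split.iterator();
        }

        public boolean nextKeyValue() {
            if (it == null) {
                // Mirrors the reported failure: internal state was never set up.
                throw new IllegalStateException("reader not initialized");
            }
            if (!it.hasNext()) {
                return false;
            }
            current = it.next();
            return true;
        }

        public String getCurrentKey() {
            return current;
        }
    }

    /** Collects every key of the split; skips initialize() when asked, to show the bug. */
    static List<String> sample(List<String> split, boolean initialize) {
        SimpleReader reader = new ListReader();
        if (initialize) {
            reader.initialize(split); // the step missing in the reported code
        }
        List<String> keys = new ArrayList<>();
        while (reader.nextKeyValue()) {
            keys.add(reader.getCurrentKey());
        }
        return keys;
    }
}
```

In the real code, I would expect the equivalent fix to be keeping the TaskAttemptContext in a local variable and calling reader.initialize(splits.get(i), context) right after createRecordReader, since RecordReader declares initialize(InputSplit, TaskAttemptContext). I have not tested that change against 0.21.0.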