InputSampler.RandomSampler only accepts Text keys
-------------------------------------------------

                 Key: MAPREDUCE-2520
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2520
             Project: Hadoop Map/Reduce
          Issue Type: Bug
            Reporter: William McNeill
            Priority: Minor


I want to do a total sort on data whose key type is a Writable other than Text.
I wrote an InputSampler.RandomSampler following the example in the "Total Sort"
section of Hadoop: The Definitive Guide. When I call
InputSampler.writePartitionFile(), I get a ClassCastException at runtime because
my key type cannot be cast to Text. Specifically, the issue seems to be in the
following section of InputSampler.getSample():

    K key = reader.getCurrentKey();
    ....
    Text keyCopy = WritableUtils.<Text>clone((Text)key, job.getConfiguration());

As a result, a RandomSampler only works on data with Text keys, even though
InputSampler takes <Key, Value> generic parameters.

InputSampler.getSample() should be changed to cast the key to type K instead of 
type Text.
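
To illustrate the difference, here is a minimal, self-contained sketch that
does not use Hadoop at all. Key, TextKey, IntKey, and SamplerSketch are
hypothetical stand-ins invented for this example (they are not Hadoop's
Writable, Text, or InputSampler), and copy() stands in for whatever clone
utility the real code would use. It only shows why a hard-coded cast to the
text type fails for other key types, while a cast to K does not.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a copyable key type (not Hadoop's Writable).
interface Key {
    Key copy();
}

// Hypothetical stand-in for a text key (not Hadoop's Text).
final class TextKey implements Key {
    final String value;
    TextKey(String value) { this.value = value; }
    public Key copy() { return new TextKey(value); }
}

// Hypothetical non-text key, like the reporter's custom key type.
final class IntKey implements Key {
    final int value;
    IntKey(int value) { this.value = value; }
    public Key copy() { return new IntKey(value); }
}

final class SamplerSketch {
    // Mirrors the reported pattern: every key is cast to the text type,
    // so any other key type fails with ClassCastException at runtime.
    @SuppressWarnings("unchecked")
    static <K extends Key> List<K> sampleBuggy(List<K> keys) {
        List<K> samples = new ArrayList<>();
        for (K key : keys) {
            TextKey keyCopy = (TextKey) ((TextKey) key).copy();
            samples.add((K) keyCopy);
        }
        return samples;
    }

    // Proposed pattern: copy the key and cast the copy to K, not to text.
    @SuppressWarnings("unchecked")
    static <K extends Key> List<K> sampleFixed(List<K> keys) {
        List<K> samples = new ArrayList<>();
        for (K key : keys) {
            samples.add((K) key.copy());
        }
        return samples;
    }

    public static void main(String[] args) {
        List<IntKey> keys = new ArrayList<>();
        keys.add(new IntKey(7));
        try {
            sampleBuggy(keys);
        } catch (ClassCastException e) {
            System.out.println("buggy version failed with ClassCastException");
        }
        System.out.println("fixed version sampled " + sampleFixed(keys).size() + " key(s)");
    }
}
```

In this sketch, sampleBuggy works only when K happens to be TextKey, which is
the behavior described above; sampleFixed works for any key type.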

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
