I want to do a total sort on some data whose key type is a Writable other
than Text.  I set up an InputSampler.RandomSampler object following the
example in the "Total Sort" section of *Hadoop: The Definitive Guide*.  When I
call InputSampler.writePartitionFile() I get a ClassCastException because
my key type cannot be cast to Text.  Specifically, the issue seems to be the
following section of InputSampler.getSample():

    K key = reader.getCurrentKey();
    ....
    Text keyCopy = WritableUtils.<Text>clone((Text)key,
                                             job.getConfiguration());

From this source it does appear that you can only use a RandomSampler on
data with Text keys.  However, I'm confused because I don't see this
restriction mentioned in any documentation, and I would not expect it,
since InputSampler takes <Key, Value> generic specifications.
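
To illustrate why the generic parameters would not prevent this kind of
failure, here is a minimal self-contained sketch.  It does not use Hadoop at
all: TextKey, LongKey, sampleKeys and failsWithClassCast are hypothetical
names made up for the illustration, and sampleKeys only mimics the shape of
the excerpt above (a method generic in K whose body casts each key to one
concrete type).  The cast compiles for any K and only fails at runtime.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class GenericCastDemo {
    // Hypothetical stand-ins for two key types; NOT the real Hadoop classes.
    static class TextKey {
        final String value;
        TextKey(String value) { this.value = value; }
    }

    static class LongKey {
        final long value;
        LongKey(long value) { this.value = value; }
    }

    // Generic in K, like the quoted excerpt, but the body assumes TextKey.
    // The compiler accepts the cast for any K; it is only checked at runtime.
    static <K> List<TextKey> sampleKeys(List<K> keys) {
        List<TextKey> samples = new ArrayList<>();
        for (K key : keys) {
            TextKey copy = new TextKey(((TextKey) key).value);
            samples.add(copy);
        }
        return samples;
    }

    // Returns true if sampling the given keys throws ClassCastException.
    static boolean failsWithClassCast(List<?> keys) {
        try {
            sampleKeys(keys);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(failsWithClassCast(Arrays.asList(new TextKey("a"))));
        System.out.println(failsWithClassCast(Arrays.asList(new LongKey(1L))));
    }
}
```

If the real getSample() has the same structure, the <Key, Value> parameters
on InputSampler would give no compile-time protection, which would match the
exception I am seeing.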

   1. Does InputSampler.RandomSampler only work on data with Text key
   values?
   2. If so, what is the easiest way to generate a random sample for data
   with non-Text key values?  Is there example code anywhere?
