That sounds like a bug to me.

I think the easiest fix would be to modify InputSampler to handle non-Text keys.
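
A rough sketch of the idea, not the actual InputSampler source: copy each sampled key through its own serialization instead of casting it to Text. So that it compiles without Hadoop on the classpath, SimpleWritable below is a hypothetical stand-in for org.apache.hadoop.io.Writable, and LongKey, GenericKeySampler, cloneKey and sample are made-up names for illustration. In Hadoop itself, ReflectionUtils.copy(conf, src, dst) might fill the same role, but I have not checked that against your version.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical stand-in for org.apache.hadoop.io.Writable (assumption, not the real interface).
interface SimpleWritable {
    void write(DataOutput out) throws IOException;
    void readFields(DataInput in) throws IOException;
}

// Hypothetical non-Text key type, used only for illustration.
class LongKey implements SimpleWritable {
    long value;

    LongKey() {}

    LongKey(long value) { this.value = value; }

    public void write(DataOutput out) throws IOException { out.writeLong(value); }

    public void readFields(DataInput in) throws IOException { value = in.readLong(); }
}

class GenericKeySampler {
    // Copy a key by serializing it and reading the bytes into a fresh instance.
    // Nothing here casts to a concrete key class, so any key type K works.
    static <K extends SimpleWritable> K cloneKey(K key, Supplier<K> factory) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            key.write(new DataOutputStream(bytes));
            K copy = factory.get();
            copy.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Simulates a record reader that reuses one key object between records,
    // which is why every sampled key has to be copied before it is stored.
    static List<LongKey> sample(long[] values) {
        List<LongKey> samples = new ArrayList<>();
        LongKey reused = new LongKey();
        for (long v : values) {
            reused.value = v;
            samples.add(cloneKey(reused, LongKey::new));
        }
        return samples;
    }

    public static void main(String[] args) {
        for (LongKey k : sample(new long[] {3L, 1L, 2L})) {
            System.out.println(k.value);
        }
    }
}
```

The Supplier argument is just one way to get a fresh instance; the real code would more likely instantiate the key class by reflection using the job configuration.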

-Joey

On Wed, May 18, 2011 at 4:24 PM, W.P. McNeill <[email protected]> wrote:
> I want to do a total sort on some data whose key type is Writable but not
> Text.  I wrote an InputSampler.RandomSampler object following the example in
> the "Total Sort" section of *Hadoop: The Definitive Guide*.  When I
> call InputSampler.writePartitionFile() I get a class cast exception because
> my key type cannot be cast to Text.  Specifically the issue seems to be the
> following section of InputSampler.getSample():
>
>    K key = reader.getCurrentKey();
>    ....
>    Text keyCopy = WritableUtils.<Text>clone((Text)key, job.getConfiguration());
>
> From this source it does appear that you can only use a RandomSampler on
> data with Text keys.  However, I'm confused because I don't see this
> mentioned in any documentation, and I assume this wouldn't be the case
> because InputSampler takes <Key, Value> generic specifications.
>
>   1. Does InputSampler.RandomSampler only work on data with Text key
>   values?
>   2. If so, what is the easiest way to generate a random sample for data
>   with non-Text key values?  Is there example code anywhere?
>



-- 
Joseph Echeverria
Cloudera, Inc.
443.305.9434
