Re: Cluster By Algorithm?

Aaron McCurry Sun, 11 Apr 2010 14:21:39 -0700

Thanks a lot!  I figured it was that simple.

Aaron


On Sun, Apr 11, 2010 at 5:16 PM, Zheng Shao <[email protected]> wrote:

> It's as simple as taking the hash code of the key modulo the number of
> reducers. To get started, try any of the .q files in the clientpositive
> directory.
>
> On the code side, HiveKey.java has the implementation.
>
>
>
> Sent from my iPhone
>
>
> On Apr 11, 2010, at 2:48 PM, Aaron McCurry <[email protected]> wrote:
>
>> I have a search solution that is downstream of some Netezza data marts
>> that I'm replacing with a Hive solution.  We already partition the data for
>> the search solution 32 ways and I would like to take advantage of the data
>> clustering in Hive (buckets), so that I don't have to do any
>> post-processing.  Is there documentation that describes how the data is hashed or
>> how it's organized across the buckets?  Or could someone point me to a class
>> that implements it?  Thanks!
>>
>> Aaron
>>
>
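
For anyone reading this thread later, the hash-and-mod rule Zheng describes can be sketched roughly as below. This is only an illustration, not Hive's code: the class and method names are made up, Java's String.hashCode() stands in for whatever hash Hive computes for a given column type, and clearing the sign bit before the modulo is an assumption borrowed from the usual Hadoop hash-partitioning idiom. Check HiveKey.java and the surrounding code for the real behavior in your version.

```java
// Hypothetical sketch of "hash code of the key, mod number of reducers".
// Names here (BucketSketch, bucketFor) are illustrative, not Hive APIs.
class BucketSketch {

    // Map a precomputed hash code to a bucket in [0, numBuckets).
    // Assumption: the sign bit is cleared first so that negative hash
    // codes cannot produce a negative bucket index.
    static int bucketFor(int hashCode, int numBuckets) {
        if (numBuckets <= 0) {
            throw new IllegalArgumentException("numBuckets must be positive");
        }
        return (hashCode & Integer.MAX_VALUE) % numBuckets;
    }

    // Convenience overload for string keys. Assumption: String.hashCode()
    // is used as a stand-in; Hive's own hash may differ per column type.
    static int bucketFor(String key, int numBuckets) {
        return bucketFor(key.hashCode(), numBuckets);
    }

    public static void main(String[] args) {
        int numBuckets = 32; // the 32-way partitioning mentioned in the thread
        for (String key : new String[] {"alpha", "beta", "gamma"}) {
            System.out.println(key + " -> bucket " + bucketFor(key, numBuckets));
        }
    }
}
```

If a downstream system has to agree with Hive on bucket placement, it would need to reproduce Hive's exact hash for each key type, so it is worth comparing a handful of real rows against actual bucket files before relying on a sketch like this.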
