Cluster By Algorithm?

Aaron McCurry Sun, 11 Apr 2010 12:49:06 -0700

I have a search solution that is downstream of some Netezza data marts that
I'm replacing with a Hive solution.  We already partition the data for the
search solution 32 ways, and I would like to take advantage of the data
clustering in Hive (buckets) so that I don't have to do any
post-processing.  Is there documentation that describes how the data is hashed or
how it's organized across the buckets?  Or could someone point me to a class
that implements it?  Thanks!


Aaron
