On Mon, Aug 23, 2010 at 8:36 PM, Hien. To Trong <hie...@vng.com.vn> wrote:
> OrderPreservingPartitioner is efficient for range queries but can cause
> unevenly distributed data. Does anyone have an idea for a
> HybridPartitioner which takes advantage of both RandomPartitioner
> and OPP, or at least a partitioner that trades off between them?


What you are looking for is skew-adaptive partitioning, i.e. something
like a B+Tree, except distributable.
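
To make the B+Tree analogy concrete, here is a minimal single-process
sketch in Python. It keeps keys in sorted ranges and splits a range at
its median key when it grows too large, the way a B+Tree leaf splits.
The class name, method names, and the max_keys_per_range threshold are
all hypothetical and chosen for illustration; this is not Cassandra's
partitioner API, and it deliberately leaves out the distributed part
(moving data between nodes, agreeing on split points).

```python
import bisect


class AdaptiveRangePartitioner:
    """Toy, single-process model of skew-adaptive range partitioning."""

    def __init__(self, max_keys_per_range=4):
        self.max_keys = max_keys_per_range  # hypothetical split threshold
        self.split_points = []              # sorted boundaries between ranges
        self.ranges = [[]]                  # ranges[i] is a sorted list of keys

    def _index_for(self, key):
        # Range i holds keys k with split_points[i-1] <= k < split_points[i].
        return bisect.bisect_right(self.split_points, key)

    def insert(self, key):
        i = self._index_for(key)
        bucket = self.ranges[i]
        pos = bisect.bisect_left(bucket, key)
        if pos < len(bucket) and bucket[pos] == key:
            return  # key already present
        bucket.insert(pos, key)
        if len(bucket) > self.max_keys:
            self._split(i)

    def _split(self, i):
        # Split an overfull range at its median key, like a B+Tree leaf.
        bucket = self.ranges[i]
        mid = len(bucket) // 2
        left, right = bucket[:mid], bucket[mid:]
        self.ranges[i:i + 1] = [left, right]
        self.split_points.insert(i, right[0])

    def range_query(self, lo, hi):
        """Return all keys k with lo <= k < hi, in sorted order."""
        out = []
        i = self._index_for(lo)
        while i < len(self.ranges):
            for k in self.ranges[i]:
                if k >= hi:
                    return out
                if k >= lo:
                    out.append(k)
            i += 1
        return out
```

Because ranges split wherever keys pile up, a skewed key population
(say, many keys sharing one prefix) ends up spread over many small
ranges, while range queries still only walk adjacent ranges in order.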

A couple of different methods for doing something like this exist, but
you rarely see them and they have their own (different) tradeoffs. To
the best of my knowledge, implementation requires a fairly deep
architectural commitment: it is more involved than simply defining a
partitioning function, and the "adaptive" aspect must be
distribution-friendly. It is an active area of research in the
literature, with no obvious and simple solutions that can be lashed
onto a database engine "as is".

-- 
J. Andrew Rogers
