On Mon, Aug 23, 2010 at 8:36 PM, Hien. To Trong <hie...@vng.com.vn> wrote:
> OrderPreservingPartitioner is efficient for range queries but can cause
> unevenly distributed data. Does anyone have an idea for a
> HybridPartitioner which takes advantage of both RandomPartitioner
> and OPP, or at least a partitioner that trades off between them?

What you are looking for is skew-adaptive partitioning, i.e. something like a B+Tree, except distributable. A couple of different methods for doing this exist, but you rarely see them in practice, and each has its own (different) tradeoffs.

To the best of my knowledge, implementation requires a fairly deep architectural commitment: it is more involved than simply defining a partitioning function, and the "adaptive" aspect must be distribution-friendly. It is an active area of research in the literature, with no obvious and simple solutions that can be lashed onto a database engine "as is".

-- J. Andrew Rogers
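
To make the B+Tree analogy concrete, here is a minimal single-process sketch in Python of the "adaptive" half of the idea: keys stay in sorted order across partitions (so range queries touch only the overlapping partitions), and a partition that grows past a threshold is split at its median, much like a B+Tree leaf. Everything here is an illustrative assumption: the class name `AdaptiveRangePartitioner`, the `max_keys_per_partition` threshold, and the method names are hypothetical and are not part of Cassandra or any other database. The sketch deliberately leaves out the hard part described above, namely making splits distribution-friendly (moving data between nodes and getting them to agree on the new boundaries).

```python
import bisect


class AdaptiveRangePartitioner:
    """Toy skew-adaptive range partitioner (hypothetical, single process)."""

    def __init__(self, max_keys_per_partition=4):
        self.max_keys = max_keys_per_partition
        # boundaries[i] is the smallest key that partition i + 1 may hold.
        self.boundaries = []
        # One sorted key list per partition; we start with a single partition.
        self.partitions = [[]]

    def partition_for(self, key):
        # A key equal to a boundary belongs to the partition on its right.
        return bisect.bisect_right(self.boundaries, key)

    def insert(self, key):
        idx = self.partition_for(key)
        part = self.partitions[idx]
        pos = bisect.bisect_left(part, key)
        if pos < len(part) and part[pos] == key:
            return  # key already present
        part.insert(pos, key)
        if len(part) > self.max_keys:
            self._split(idx)

    def _split(self, idx):
        # Split an overfull partition at its median, like a B+Tree leaf.
        part = self.partitions[idx]
        mid = len(part) // 2
        left, right = part[:mid], part[mid:]
        self.partitions[idx] = left
        self.partitions.insert(idx + 1, right)
        self.boundaries.insert(idx, right[0])

    def range_query(self, lo, hi):
        """Return keys in [lo, hi], scanning only overlapping partitions."""
        first = self.partition_for(lo)
        last = self.partition_for(hi)
        out = []
        for part in self.partitions[first:last + 1]:
            start = bisect.bisect_left(part, lo)
            end = bisect.bisect_right(part, hi)
            out.extend(part[start:end])
        return out


# Skewed input: most keys share the prefix "a", yet no partition overfills.
p = AdaptiveRangePartitioner(max_keys_per_partition=4)
for k in ["a%03d" % i for i in range(20)] + ["m", "z"]:
    p.insert(k)
print(len(p.partitions), p.boundaries)
```

With a fixed order-preserving split, the "a"-prefixed keys would all pile into one partition; here the boundaries follow the data instead. In a real system each split would also have to be coordinated across nodes, which is where the architectural commitment mentioned above comes in.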