Sounds interesting... a B-tree on top of Cassandra ;)

On Sun, Jun 6, 2010 at 12:16 PM, David Boxenhorn <da...@lookin2.com> wrote:

> I'm still thinking about the problem of how to handle range queries on very
> large sets of data, using Random Partitioning.
>
> Has anyone used tree search to solve this? What do you think?
>
> More specifically, something like this:
>
> - Store a maximum of 1000 values per supercolumn (or some other fixed
> number)
> - Each supercolumn has a "greaterChild" and a "lessChild" in addition to
> the values
> - When the number of values in the supercolumn grows beyond the maximum,
> split it into 3 parts, with the top third going into "greaterChild" and the
> bottom third into "lessChild"
> - To find a value, look at "greaterChild" and "lessChild" to find out
> whether your key is within the current range, and if not, where to look next
> - Range searches mean finding the first value, then looking at
> "greaterChild" or "lessChild" (depending on the direction of your search)
> until you reach the end of the range.
>
> Super Column Family:
>
> index [ <columnFamilyId> [ "firstVal" : <val> ,
>                            "lastVal" : <val> ,
>                            <val> : <dataId>,
>                            "lessChild" : <columnFamilyId> ,
>                            "greaterChild" : <columnFamilyId> ] ]
>
>
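
For anyone who wants to play with the idea, here is a rough in-memory sketch of the scheme described above. It is only an illustration under several assumptions, not a Cassandra client: the `Node` class stands in for one supercolumn, `MAX_VALUES` is set to 4 instead of 1000 so that splits happen quickly, and the names `insert`, `find`, and `range` are made up for this example. The range search is done as a pruned in-order walk, which is one possible reading of the "follow lessChild/greaterChild" step.

```python
import bisect

MAX_VALUES = 4  # the post suggests ~1000; kept tiny here so splits are easy to see


class Node:
    """Hypothetical stand-in for one supercolumn: sorted values plus two child links."""

    def __init__(self):
        self.keys = []       # sorted values; keys[0] ~ "firstVal", keys[-1] ~ "lastVal"
        self.data = {}       # value -> dataId
        self.less = None     # "lessChild"
        self.greater = None  # "greaterChild"

    def insert(self, val, data_id):
        # Route outside-range values to an existing child; otherwise store here.
        if self.keys and val < self.keys[0] and self.less is not None:
            self.less.insert(val, data_id)
        elif self.keys and val > self.keys[-1] and self.greater is not None:
            self.greater.insert(val, data_id)
        else:
            if val not in self.data:
                bisect.insort(self.keys, val)
            self.data[val] = data_id
            if len(self.keys) > MAX_VALUES:
                self._split()

    def _split(self):
        # Keep the middle third here; push the bottom and top thirds to the children.
        n = len(self.keys)
        third = n // 3
        low, high = self.keys[:third], self.keys[n - third:]
        self.keys = self.keys[third:n - third]
        for k in low:
            if self.less is None:
                self.less = Node()
            self.less.insert(k, self.data.pop(k))
        for k in high:
            if self.greater is None:
                self.greater = Node()
            self.greater.insert(k, self.data.pop(k))

    def find(self, val):
        # Return the dataId for val, or None if it is not stored.
        if val in self.data:
            return self.data[val]
        if self.keys and val < self.keys[0] and self.less is not None:
            return self.less.find(val)
        if self.keys and val > self.keys[-1] and self.greater is not None:
            return self.greater.find(val)
        return None

    def range(self, lo, hi):
        # Yield (value, dataId) pairs with lo <= value <= hi, in ascending order.
        if self.less is not None and lo < self.keys[0]:
            yield from self.less.range(lo, hi)
        for k in self.keys:
            if lo <= k <= hi:
                yield k, self.data[k]
        if self.greater is not None and hi > self.keys[-1]:
            yield from self.greater.range(lo, hi)
```

In a real deployment each `Node` would be a row or supercolumn fetched by its id, so every step down the tree costs one read. Nothing here rebalances the tree, so inserting values in sorted order would degrade it toward a long chain of nodes.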
