sounds interesting... btree on top of cassandra ;)

On Sun, Jun 6, 2010 at 12:16 PM, David Boxenhorn <da...@lookin2.com> wrote:
> I'm still thinking about the problem of how to handle range queries on very
> large sets of data, using Random Partitioning.
>
> Has anyone used tree search to solve this? What do you think?
>
> More specifically, something like this:
>
> - Store a maximum of 1000 values per supercolumn (or some other fixed
>   number)
> - Each supercolumn has a "greaterChild" and a "lessChild" in addition to
>   the values
> - When the number of values in the supercolumn grows beyond the maximum,
>   split it into 3 parts, with the top third going into "greaterChild" and the
>   bottom third into "lessChild"
> - To find a value, look at "greaterChild" and "lessChild" to find out
>   whether your key is within the current range, and if not, where to look next
> - Range searches mean finding the first value, then looking at
>   "greaterChild" or "lessChild" (depending on the direction of your search)
>   until you reach the end of the range.
>
> Super Column Family:
>
> index [ <columnFamilyId> [ "firstVal" : <val> ,
>                            "lastVal" : <val> ,
>                            <val> : <dataId>,
>                            "lessChild" : <columnFamilyId> ,
>                            "greaterChild" : <columnFamilyId> ]
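
A rough in-memory sketch of the scheme quoted above, in Python. A plain dict stands in for the super column family, and `TreeIndex`, `Node`, `insert`, `find` and `range_query` are names made up for illustration; none of this is Cassandra API. The node limit is 4 rather than 1000 so that splits are easy to see. It also assumes that when a child already exists, the split-off third is simply re-inserted into that child's subtree, which the proposal does not spell out.

```python
import bisect
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Optional, Tuple


@dataclass
class Node:
    """One supercolumn: sorted values plus two optional child ids."""
    keys: List[int] = field(default_factory=list)       # sorted <val> names
    data: Dict[int, str] = field(default_factory=dict)  # <val> -> <dataId>
    less_child: Optional[int] = None     # subtree with values < keys[0]
    greater_child: Optional[int] = None  # subtree with values > keys[-1]


class TreeIndex:
    def __init__(self, max_values: int = 4):
        # The proposal suggests ~1000; at least 2 is needed for a 3-way split.
        assert max_values >= 2
        self.max_values = max_values
        self.nodes: Dict[int, Node] = {0: Node()}  # stand-in for the column family
        self.next_id = 1

    def _new_node(self) -> int:
        node_id = self.next_id
        self.nodes[node_id] = Node()
        self.next_id += 1
        return node_id

    def insert(self, val: int, data_id: str, node_id: int = 0) -> None:
        node = self.nodes[node_id]
        if node.keys:
            # Outside this node's [firstVal, lastVal] range: descend if possible.
            if val < node.keys[0] and node.less_child is not None:
                return self.insert(val, data_id, node.less_child)
            if val > node.keys[-1] and node.greater_child is not None:
                return self.insert(val, data_id, node.greater_child)
        if val not in node.data:
            bisect.insort(node.keys, val)
        node.data[val] = data_id
        if len(node.keys) > self.max_values:
            self._split(node)

    def _split(self, node: Node) -> None:
        # Bottom third goes to lessChild, top third to greaterChild.
        third = len(node.keys) // 3
        low, high = node.keys[:third], node.keys[-third:]
        node.keys = node.keys[third:-third]
        if node.less_child is None:
            node.less_child = self._new_node()
        if node.greater_child is None:
            node.greater_child = self._new_node()
        for v in low:
            self.insert(v, node.data.pop(v), node.less_child)
        for v in high:
            self.insert(v, node.data.pop(v), node.greater_child)

    def find(self, val: int, node_id: int = 0) -> Optional[str]:
        node = self.nodes[node_id]
        if val in node.data:
            return node.data[val]
        if node.keys and val < node.keys[0] and node.less_child is not None:
            return self.find(val, node.less_child)
        if node.keys and val > node.keys[-1] and node.greater_child is not None:
            return self.find(val, node.greater_child)
        return None

    def range_query(self, lo: int, hi: int, node_id: int = 0) -> Iterator[Tuple[int, str]]:
        """Yield (val, dataId) pairs with lo <= val <= hi, in ascending order."""
        node = self.nodes[node_id]
        if node.less_child is not None and lo < node.keys[0]:
            yield from self.range_query(lo, hi, node.less_child)
        start = bisect.bisect_left(node.keys, lo)
        end = bisect.bisect_right(node.keys, hi)
        for v in node.keys[start:end]:
            yield v, node.data[v]
        if node.greater_child is not None and hi > node.keys[-1]:
            yield from self.range_query(lo, hi, node.greater_child)


index = TreeIndex(max_values=4)
for i in range(20):
    v = (i * 7) % 20          # visits 0..19 once each, out of order
    index.insert(v, f"row{v}")
print(list(index.range_query(5, 9)))
```

Nothing here rebalances the tree, so insert order decides its depth. Against a real cluster each step down the tree would presumably be a separate read, so depth matters more than it does in this toy.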