Jonathan,

Thanks for your thoughts....
> On Sun, Feb 6, 2011 at 11:03 AM, Shaun Cutts <sh...@cuttshome.net> wrote:
>> What I think you should be doing is the following: open iterators on the
>> matching keys for each of the indexes; the inside loop would pick an
>> iterator at random, and pull a match from it. This would assure that the
>> expected number of entries examined is a small multiple (# of other indexes)
>> of the index with the most "precision".
>
> Isn't this a bad approximation of performing an intersection by
> iterating through each index's results (which, since they return in
> sorted order, can be done without horrible memory cost)?

I don't think so, unless I'm not understanding you. The point is: suppose you
have 1M matches on the key for one index, and 1 in the other (for a row among
the 1M matching the first), and you have no information about which is which.
If you arbitrarily pick one to go first, you loop 500K times on average. If
you do what I'm suggesting, you loop 2 times on average -- a big
difference :). (There's a rough sketch of the loop I have in mind at the end
of this mail.)

This is "breadth-first search", but the breadth is just the number of indexes.
If there are really thousands of indexes you could bound it to the X most
likely, but in that case getting the order right for an optimal depth-first
pass is quite unlikely, whereas, since this is an "AND", for even modest X you
lower your chances of a "wrong" choice exponentially.

Also, is the amount of memory for a "cursor" in an index really that large
compared with the memory you use for the query specification in any case?
Your current method does assume it has random access to the query.

I do think it's a flaw that something which can be done in constant time can
instead take time proportional to the number of rows, just by picking the
wrong index.

> I'm not opposed to doing that. But I have no idea how to guess when
> that is better than the current inner-loop method.

What I'm proposing is always at most a constant factor worse in the number of
iterations (ok -- a factor proportional to the number of indexes used), vs.
something that could be arbitrarily much worse. But you could adapt the
current heuristic to tune the probabilities, so that the worst case will be
asymptotically the same as the current method... but the average case will be
much better.

>
>> I know you have a new type of index in the works... but it doesn't look like
>> "trunk" has any modifications for "scan", and presumably the strategy I just
>> mentioned is pretty general (not depending on histograms, etc). Does it
>> sound like a good idea?
>
> I think you're thinking of
> https://issues.apache.org/jira/browse/CASSANDRA-1472, and my
> understanding is that the bitmap approach basically does what you want,
> only faster.
>

I'll have to study it. As far as I can tell, what I'm proposing is orthogonal
to histogram methods that don't actually keep stats on the joins (which is a
lot of stats to keep), and even in that case it can be used to refine "lower
resolution" statistics with the partial experience gained from sampling the
current query. It does introduce extra overhead for small queries, of course,
so if you had histograms that guaranteed the result was going to be very small
for a particular index, then it would be best to put that index first without
fiddling. However, typically you have exact info only for the common keys, not
the rare ones.

-- Shaun

> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of DataStax, the source for professional Cassandra support
> http://www.datastax.com
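
P.S. To make the proposal concrete, here is a rough sketch of the inner loop
I have in mind. The class and interface names (RandomizedIntersection,
KeyCursor, RemainingClauses) are made up for illustration -- this is not the
existing Cassandra scan code, just the shape of the randomized pull:

    import java.util.*;

    // Hypothetical sketch; interface names and types are invented for
    // illustration and are not part of the Cassandra codebase.
    public class RandomizedIntersection
    {
        /** A cursor over the row keys matching one index clause, in sorted order. */
        interface KeyCursor
        {
            boolean hasNext();
            long nextKey();
        }

        /** Checks a candidate row against all of the query's other clauses. */
        interface RemainingClauses
        {
            boolean matches(long rowKey);
        }

        /**
         * Instead of committing to one "driving" index up front, pull the next
         * candidate from a randomly chosen cursor each iteration and verify it
         * against the remaining clauses. If the most selective clause matches
         * only k rows, we expect to exhaust its cursor after roughly
         * (number of cursors * k) pulls in total, no matter how many rows the
         * other clauses match.
         */
        static Set<Long> intersect(List<KeyCursor> cursors,
                                   List<RemainingClauses> others,
                                   int limit,
                                   Random rng)
        {
            // LinkedHashSet dedupes candidates that arrive via several cursors.
            Set<Long> results = new LinkedHashSet<>();

            while (results.size() < limit)
            {
                int pick = rng.nextInt(cursors.size());
                KeyCursor cursor = cursors.get(pick);

                // Every row of the intersection appears in every per-index
                // match set, so once any cursor is exhausted we have already
                // tested every possible candidate and can stop.
                if (!cursor.hasNext())
                    break;

                long candidate = cursor.nextKey();
                if (others.get(pick).matches(candidate))
                    results.add(candidate);
            }
            return results;
        }
    }

The early break is what gives the bound: whichever cursor runs out first is
(up to the randomness of the picks) the most selective one, and every row of
the intersection has to appear in its match set, so nothing is missed.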