Thanks for the clarification, Nick. Karl
On Mar 23, 5:58 am, "Nick Johnson (Google)" <[email protected]> wrote: > Hi Karl, > > You're correct that it is indeed the cartesian product in this case - it > produces one index entry for every unique tuple of values from the indexed > columns. > > This gets slightly more complicated in the situation where the same column > is being indexed multiple times. In that situation, the naive cartesian > product is not used - instead, the additional constraints that the values > are sorted in non-descending order and that no value occurs more than once > in a tuple are added, so as to eliminate unnecessary duplicates. Eg, an > index on (foo, foo), with a value of [1,2,3] for foo will result in the > following entries: > > (1,2),(1,3),(2,3) > > Notably omitting the redundant entries: > > (2,1),(3,1),(3,2) > > And the invalid entries: > > (1,1),(2,2),(3,3) > > -Nick Johnson > > > > On Tue, Mar 23, 2010 at 2:22 AM, kostmo <[email protected]> wrote: > > OK, I just watched Brett Slatkin's I/O talk [1] and he mentions "cross > > product" a couple of times, so it seems that the use of the word > > "permutation" in the docs is incorrect; the number of index entries is > > indeed proportional to the Cartesian product, rather than > > "permutations" which would lead to a factorial (n!) number of index > > entries. > > > I've filed a doco issue here [2]. > > > Karl > > > [1] > >http://code.google.com/events/io/2009/sessions/BuildingScalableComple... > > [2]http://code.google.com/p/googleappengine/issues/detail?id=3003 > > > On Mar 22, 11:50 am, kostmo <[email protected]> wrote: > > > It is still unclear to me exactly how many index entries will be > > > produced by including multiple ListProperty's in an entity. In one > > > post [1], between the inquiring user and the Google rep, the > > > possibilities for the number of index entries included the power set, > > > Cartesian product, or the "number of unique combinations" (this was > > > the rep's answer, without mention of "permutations"). > > > > I'd like to verify that the correct combinatorics terminology is being > > > used in the documentation on exploding indexes [2]. > > > > Quote: > > > "the index table must include a row for every permutation of the > > > values of every property for the index" > > > This is followed by the example: > > > e2 = MyModel() > > > e2.x = ['red', 'blue'] > > > e2.y = [1, 2] > > > "the index must store 12 property values: 2 each for the built-in > > > indexes on x and y, and 2 for each of the 4 permutations of x and y in > > > the custom index". > > > > The example does not make me completely confident about what type of > > > combinatoric expression is truly being used, since the sum of the > > > number of permutations for each property happens to equal the > > > cardinality of the Cartesian product of the two properties. The > > > enumeration given in this post [3] seems to suggest that it is > > > actually the Cartesian product at work. > > > > To clarify this issue, we could use Python 2.6 to illustrate an > > > example:>>> from itertools import product, permutations > > > >>> x = ['red', 'blue', 'green'] > > > >>> y = [1, 2, 3] > > > >>> list(permutations(x)) + list(permutations(y)) > > > > [('red', 'blue', 'green'), ('red', 'green', 'blue'), ('blue', 'red', > > > 'green'), ('blue', 'green', 'red'), ('green', 'red', 'blue'), > > > ('green', 'blue', 'red'), (1, 2, 3), (1, 3, 2), (2, 1, 3), (2, 3, 1), > > > (3, 1, 2), (3, 2, 1)]>>> list(product(x, y)) > > > > [('red', 1), ('red', 2), ('red', 3), ('blue', 1), ('blue', 2), > > > ('blue', 3), ('green', 1), ('green', 2), ('green', 3)] > > > > Is the number of custom index entries that will be generated by an > > > entity with the two 3-valued properties given above proportional to > > > the sum of permutations (there are 12) or the Cartesian product (there > > > are 9)? > > > > Thanks, > > > Karl > > > > [1] "Efficient way to structure my data model" > >http://groups.google.com/group/google-appengine/browse_thread/thread/... > > > [2] > >http://code.google.com/appengine/docs/python/datastore/queriesandinde... > > > [3] "Size of index in case of exploding index" > >http://groups.google.com/group/google-appengine/browse_thread/thread/... > > > -- > > You received this message because you are subscribed to the Google Groups > > "Google App Engine" group. > > To post to this group, send email to [email protected]. > > To unsubscribe from this group, send email to > > [email protected]<google-appengine%[email protected]> > > . > > For more options, visit this group at > >http://groups.google.com/group/google-appengine?hl=en. > > -- > Nick Johnson, Developer Programs Engineer, App Engine > Google Ireland Ltd. :: Registered in Dublin, Ireland, Registration Number: > 368047 -- You received this message because you are subscribed to the Google Groups "Google App Engine" group. To post to this group, send email to [email protected]. To unsubscribe from this group, send email to [email protected]. For more options, visit this group at http://groups.google.com/group/google-appengine?hl=en.
