I doubt BigTable keeps track of index positions... it seems from its architecture (distributed sorted list of keys) that it would require updating every entity in the table on every insert to keep track of each entity's position. So, the question becomes: do you really need this information, or can your application be restructured to use the index key itself for whatever purpose you have in mind for the index key's position?
I can think of at least one example where you might really need this information: a ranking system that ranks a huge list of entities based on where they fall in the sorted list of some key (probably a numerical "strength" value calculated from the rest of the entity's properties). Let's say the entities represent Users on some competitive flash gaming site, and JoeNewUser, who has been a member of the site for 3 weeks and played 734 games of Tower Defense, wants to know what his rank is. Joe's rank is 73,248, but we don't want to have to look at the 73,247 people ahead of him (sorted by TotalScore DESC) to find that out. If your actual use case has nothing in common with this, let me know, but I think it at least parallels the general gist of what you are asking for. What you are looking for is a B-Tree, with the addition of subtree size in the subtree pointers: http://en.wikipedia.org/wiki/B-tree The data model for that might look something like this: class BTree(Model): totalScores = ListProperty(int) #a sorted list totalScores, up to some limit long user = ListProperty(Key(User)) #same length as totalScores, points to the User the score belongs to pointers = ListProperty(Key(BTree)) #pointers to child Nodes subTreeCount = ListProperty(int) #number of keys in all child Nodes below this pointer subTreeSum = ListProperty(int) #a prefix sum (running total) of subTreeCount +1 for each totalScore So it's really just a bunch of list properties. You set how big you want the list to get (bigger is better - fewer datastore calls for lookups - but keep in mind the 1 MB limit). Follow the procedures on the wiki page linked above, but on insert, add one to the subTreeCount for the each pointer you follow (and 1 to every subTreeSum from there to the right on each node). To find the index of an item you are searching for, keep track of the subTreeSum of all the subtrees to the left of the subtree you are following, plus the nodes you pass along the way, and add them all up to get the index of the final node. Don't forget to binary search the totalScores list - there's no need to look at every item in it, since it's already sorted. You may have to do some linear searching at the end if several users have the same totalScore, but you can avoid that by using fixed point numbers and appending the UserID to the key to ensure uniqueness while still sorting correctly (lexicographically). Split operations are relatively simple too, since you have all the data you need to populate the new subTreeCounts and subTreeSums... it's just bookkeeping. Looking at the complexity of this, if you have 100 keys per BTree node, and a million entities, you can find the one you are looking for and its index in 3 sequential datastore calls, while comparing to 8 keys per node. 1000 keys per page gets you down to 2 datastore lookups and 10 comparisons per page, but you start spending more and more CPU time serializing and deserializing the data in and out of the datastore, so you'll have to play around with it to find the Goldilocks number. Good luck, and let me know if you need a more detailed explanation (though it will probably take a few days). David On Nov 1, 3:18 am, de Witte <[email protected]> wrote: > Hi, > > I'm looking for the following: > > Is it possible to retrieve the position of an entity in an index table? > > Of course, within going through the entire index by retrieving keys. > > Ayy tipes would be welcome -- You received this message because you are subscribed to the Google Groups "Google App Engine" group. To post to this group, send email to [email protected]. To unsubscribe from this group, send email to [email protected]. For more options, visit this group at http://groups.google.com/group/google-appengine?hl=en.
