Anyone? Is manipulating the key structure so that keys are ordered properly in BigTable a good idea? How should I structure my queries to do prefix/range scans on those keys? Thanks!
Erem

On Oct 4, 12:57 pm, Erem <[email protected]> wrote:
> 3 questions about optimizing BigTable datastore queries on a
> timestamped class. I'm using JDO on the Java runtime.
>
> SOME CONTEXT
> My class, Event, requires speedy scans on time windows. An Event's
> timestamp is immutable.
>
> Given that (1) BigTable physically stores rows in lexicographic order
> by rowkey, and (2) what I specify as PrimaryKey directly maps to
> BigTable rowkey (after appending to appID, etc), I will set up my
> PrimaryKeys to be lexicographically ordered in time followed by an
> entity-unique ID.
>
> Some example keys, where each entity is identified by its UTCDate+millis
> +unique ID.
> UTCDATE   ::TIME        ::UNIQUE ID
> 2000-12-01::13:15:00.000::1 //Dec 01, 2000, at 1:15PM
> 2001-12-01::13:15:00.000::2 //Dec 01, 2001, same time
> 2002-12-01::13:15:00.000::3 //Dec 01, 2002, same time
>
> This key structure allows me to do useful, dense prefix- and range-
> scans directly on the entities table. For example
> (not syntactic GQL. See Question 2.)
> WHERE key MATCHES '2000*'       //all events from the year 2000
> WHERE key MATCHES '2000-12*'    //all from month of December, 2000
> WHERE key MATCHES '2000-12-01*' //all from Dec 01, 2000
> WHERE key > '2000-12-01'        //between Dec 1 and Christmas, 2000
>    && key < '2000-12-25'
>
> 3 QUESTIONS IN ORDER OF PRAGMATIC TO OBSCURE
> (1) Am I actually optimizing anything by doing this, or am I wasting
> time? Should I expect this to be super-fast because it's performing a
> dense read on 1 or 2 BigTable Tablets?
>
> (2) I need to use range- and prefix-scans against the key in order to
> take advantage of these optimizations. How do I use them in the Java
> API? Is range-scan as simple as "WHERE key > :bottomRange and key
> < :topRange"? What about prefix-scan?
>
> (3) BigTable guarantees that entities are stored in order by RowKey.
> Its specifications also say that SSTables, once written, are
> immutable. In this case, what happens when the following sequence
> occurs:
> (i) an SSTable is written that contains rowkeys = 1,2,3,5;
> (ii) I commit an entity with rowkey 4.
> Does the SSTable get deleted and recreated? Is guarantee 1 broken and
> my entity(key=4) written to a different SSTable? The answer to this
> affects how optimized this approach really is.
>
> thanks in advance for the advice you datastore ninjas you.
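
For reference, here is a minimal sketch of the idea behind question (2): on lexicographically ordered keys, a prefix scan can be expressed as a range scan from the prefix up to the prefix followed by the largest possible character. This is not datastore or JDO code. A java.util.TreeMap stands in for the sorted rowkey space, and the class KeyRangeScanSketch, its helpers (makeKey, rangeScan, prefixScan, sampleTable), and the fourth sample event are all hypothetical names and data made up for illustration. Whether the JDO layer accepts the same bounds on a primary key is exactly what the question asks and is not shown here.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.SortedMap;
import java.util.TreeMap;

// Illustration only: a TreeMap simulates a key space kept in lexicographic order.
class KeyRangeScanSketch {

    // Fixed-width, zero-padded fields, so string order matches time order.
    private static final DateTimeFormatter KEY_TIME =
            DateTimeFormatter.ofPattern("yyyy-MM-dd'::'HH:mm:ss.SSS");

    // Hypothetical helper: builds "UTCDATE::TIME::UNIQUE ID" like the example keys.
    static String makeKey(LocalDateTime utc, long uniqueId) {
        return KEY_TIME.format(utc) + "::" + uniqueId;
    }

    // Range scan: every key k with low <= k < high.
    static SortedMap<String, String> rangeScan(
            TreeMap<String, String> table, String low, String high) {
        return table.subMap(low, high);
    }

    // Prefix scan written as a range scan: [prefix, prefix + '\uffff').
    // Any key that starts with the prefix sorts inside that interval,
    // as long as the keys themselves never contain '\uffff'.
    static SortedMap<String, String> prefixScan(
            TreeMap<String, String> table, String prefix) {
        return table.subMap(prefix, prefix + Character.MAX_VALUE);
    }

    // The three example keys from the post, plus one made-up event (ID 4).
    static TreeMap<String, String> sampleTable() {
        TreeMap<String, String> table = new TreeMap<>();
        table.put(makeKey(LocalDateTime.of(2000, 12, 1, 13, 15, 0), 1), "event 1");
        table.put(makeKey(LocalDateTime.of(2001, 12, 1, 13, 15, 0), 2), "event 2");
        table.put(makeKey(LocalDateTime.of(2002, 12, 1, 13, 15, 0), 3), "event 3");
        // Hypothetical extra event, not in the original examples.
        table.put(makeKey(LocalDateTime.of(2000, 12, 20, 9, 0, 0), 4), "event 4");
        return table;
    }

    public static void main(String[] args) {
        TreeMap<String, String> table = sampleTable();
        System.out.println("year 2000:   " + prefixScan(table, "2000").keySet());
        System.out.println("Dec 1, 2000: " + prefixScan(table, "2000-12-01").keySet());
        System.out.println("Dec 1 to 25: "
                + rangeScan(table, "2000-12-01", "2000-12-25").keySet());
    }
}
```

One caveat with this key layout: the unique ID is not zero-padded, so two events in the same millisecond with IDs 2 and 10 would sort as 10 before 2. That does not affect the time-window scans, only the order within a single timestamp.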
