Brett Slatkin gave a fantastic talk on this type of problem and its solution: http://code.google.com/events/io/sessions/BuildingScalableComplexApps.html
He suggests using a ListProperty and exploding the date into it, so that a range can be selected by set membership. The video is worth every minute.

On Tue, Oct 6, 2009 at 10:56 AM, Erem <[email protected]> wrote:
>
> Anyone? Is manipulating key structure to order them properly in
> BigTable a good idea? How to structure my queries to do prefix/range
> scans on those keys?
> Thanks!
>
> Erem
>
> On Oct 4, 12:57 pm, Erem <[email protected]> wrote:
> > 3 questions about optimizing BigTable datastore queries on a
> > timestamped class. I'm using JDO on the Java runtime.
> >
> > SOME CONTEXT
> > My class, Event, requires speedy scans on time windows. An Event's
> > timestamp is immutable.
> >
> > Given that (1) BigTable physically stores rows in lexicographic order
> > by rowkey, and (2) what I specify as PrimaryKey directly maps to
> > BigTable rowkey (after appending to appID, etc), I will set up my
> > PrimaryKeys to be lexicographically ordered in time followed by an
> > entity-unique ID.
> >
> > Some example keys, where entity is identified by its UTCDate+millis
> > +unique ID.
> > UTCDATE   ::TIME        ::UNIQUE ID
> > 2000-12-01::13:15:00.000::1 //Dec 01, 2000, at 1:15PM
> > 2001-12-01::13:15:00.000::2 //Dec 01, 2001, same time
> > 2002-12-01::13:15:00.000::3 //Dec 01, 2002, same time
> >
> > This key structure allows me to do useful, dense prefix- and range-
> > scans directly on the entities table. For example
> > (not syntactic GQL. See Question 2.)
> > WHERE key MATCHES '2000*' //all events from the year 2000
> > WHERE key MATCHES '2000-12*' //all from month of December, 2000
> > WHERE key MATCHES '2000-12-01*' //all from Dec 01, 2000
> > WHERE key > '2000-12-01' //between Dec 1 and Christmas, 2000
> > && key < '2000-12-25'
> >
> > 3 QUESTIONS IN ORDER OF PRAGMATIC TO OBSCURE
> > (1) Am I actually optimizing anything by doing this, or am I wasting
> > time? Should I expect this to be super-fast because it's performing a
> > dense read on 1 or 2 BigTable Tablets?
> >
> > (2) I need to use range- and prefix- scan against the key in order to
> > take advantage of these optimizations. How do I use them in the Java
> > API? Is range-scan as simple as "WHERE key > :bottomRange and key
> > < :topRange"? What about prefix-scan?
> >
> > (3) BigTable guarantees that entities are stored in order by RowKey.
> > Its specifications also say that SSTables, once written, are
> > immutable. In this case, what happens when the following sequence
> > happens:
> > (i) an SSTable is written that contains rowkeys = 1,2,3,5;
> > (ii) I commit an entity with rowkey 4.
> > Does the SSTable get deleted and recreated? Is guarantee 1 broken and
> > my entity(key=4) written to a different SSTable? The answer to this
> > affects how optimized this approach really is.
> >
> > thanks in advance for the advice you datastore ninjas you.
> >

--
Kevin Pierce
Software Architect
VendAsta Technologies Inc.
[email protected]
(306)955.5512 ext 103
www.vendasta.com
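
For anyone who wants to see the exploded-date idea from the top of this reply in concrete form, here is a rough sketch. It is plain Python and does not touch the App Engine datastore API: an ordinary list stands in for the ListProperty, and a list comprehension stands in for the membership filter. The names `explode_date`, `date_tokens`, `events_matching`, `day_tokens`, and `events_in_any` are made up for illustration, and the choice of year/month/day tokens is my assumption about how the date would be exploded, not something taken from the talk.

```python
from datetime import date, datetime, timedelta


def explode_date(ts):
    """Explode a timestamp into year, month, and day tokens."""
    return [ts.strftime("%Y"), ts.strftime("%Y-%m"), ts.strftime("%Y-%m-%d")]


class Event(object):
    """Hypothetical stand-in for the Event entity in the question."""

    def __init__(self, name, ts):
        self.name = name
        self.ts = ts
        # In the datastore this list would live in a list-valued property.
        self.date_tokens = explode_date(ts)


def events_matching(events, token):
    """Membership test: keep events whose token list contains `token`."""
    return [e for e in events if token in e.date_tokens]


def day_tokens(start, end):
    """Day tokens for every day from `start` up to, but excluding, `end`."""
    days = (end - start).days
    return [(start + timedelta(days=i)).strftime("%Y-%m-%d") for i in range(days)]


def events_in_any(events, tokens):
    """Keep events that carry at least one of the given tokens."""
    wanted = set(tokens)
    return [e for e in events if wanted.intersection(e.date_tokens)]


EVENTS = [
    Event("a", datetime(2000, 12, 1, 13, 15)),
    Event("b", datetime(2000, 12, 24, 9, 0)),
    Event("c", datetime(2001, 12, 1, 13, 15)),
]

# All events from the year 2000, then all events from Dec 1, 2000.
in_2000 = events_matching(EVENTS, "2000")
on_dec_1 = events_matching(EVENTS, "2000-12-01")

# Events between Dec 1 and Christmas 2000, expressed as a set of day tokens.
before_christmas = events_in_any(
    EVENTS, day_tokens(date(2000, 12, 1), date(2000, 12, 25)))
```

The trade-off in this sketch is that each event stores a few extra tokens so that "all of 2000" or "all of December 2000" becomes a single equality test instead of a range scan. A window that does not line up with a year, month, or day boundary needs several tokens, as in the last query.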
