Hi Ben,

I've dealt with several apps whose datastores are growing by millions of entities per day; they have billions of entities. The performance remains the same, provided you're not returning an ever-increasing number of results. If you're just asking to see all data between two timestamps for a given user, you'll be fine. Unless there is some other reason to, I personally would not spend the resources to batch and compress the data. If it is about reducing storage cost, you're liable to spend more money in CPU time mapping the data to bundle, compress, and index it than you would by just paying for the storage space of the entities.
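
To illustrate why result size matters more than total size, here is a rough sketch in plain Python. It is a toy in-memory model of a sorted composite index on (user, timestamp), not how the datastore is actually implemented, and the names `build_index`, `range_query`, and `make_rows` are made up for the example. The point is only that a range lookup on a sorted index does two binary searches and then reads the matching rows, so the work is dominated by how many rows you return.

```python
import bisect


def build_index(rows):
    """Sort (user_id, timestamp) pairs, mimicking a composite index (toy model)."""
    return sorted(rows)


def range_query(index, user_id, start_ts, end_ts):
    """Return timestamps for user_id with start_ts <= ts <= end_ts.

    Two binary searches locate the ends of the range. After that, the cost
    is proportional to the number of matches, not to len(index).
    """
    lo = bisect.bisect_left(index, (user_id, start_ts))
    hi = bisect.bisect_right(index, (user_id, end_ts))
    return [ts for _, ts in index[lo:hi]]


def make_rows(num_users, points_per_user):
    """Hypothetical sample data: each user has timestamps 0..points_per_user-1."""
    return [(u, t) for u in range(num_users) for t in range(points_per_user)]


small = build_index(make_rows(10, 500))    # 5,000 entries
large = build_index(make_rows(200, 500))   # 100,000 entries

# Same user, same time window: same rows come back from either index.
small_result = range_query(small, 7, 100, 199)
large_result = range_query(large, 7, 100, 199)
```

Both queries read the same 100 rows; only the binary-search step sees the larger index, and that step grows logarithmically with the number of entries.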
What you might want to think about, if you're not already, is making sure you use very short kind and property names in the datastore. Storing the property name millions of times can add up, especially if you've got long property names on indexed properties.

I've been using the namespace (multitenant) feature in some apps. It works well when there is a very clear and distinct boundary. It can make things like sharing data outside the namespace more difficult though, since you cannot query across namespaces.

Robert

On Thu, May 12, 2011 at 11:26, Benjamin <[email protected]> wrote:
> Hi Guys,
>
> My app has taken off recently, and it's been very exciting to see so
> much traffic. I am trying now to deal with a very high volume of
> infrequently accessed / archival datastore entries and making sure I'm
> not going to hit a performance / cost bottleneck.
>
> Let's say I have a store of entities without any relationships that
> grows by 100,000 a day. The entries are accessed by a query for three
> properties (a foreign key and timestamps that are between two provided
> dates), usually returning several thousand values at a time. Any
> advice for the best way to store these values?
>
> I have been considering splitting the data I'm storing into tenants,
> using a multitenant architecture, using a background task to compress
> chunks of data into blobs and putting them in the store, and so on,
> but I'm wondering: if the values are indexed, is it smarter to just
> leave them alone and let the store grow to billions of entries? Do
> queries slow down by N as the datastore / index grows?
>
> Thoughts? Is there a performance benefit with a multi-tenant
> architecture?
>
> Ben
>
> My app is an online data historian called Nimbits www.nimbits.com -
> it's free and open source on Google Code and allows developers to feed
> sensor data into data points to do calculations, alerts, relay etc -
> Internet of things sort of thing.
>
> --
> You received this message because you are subscribed to the Google Groups
> "Google App Engine" group.
> To post to this group, send email to [email protected].
> To unsubscribe from this group, send email to
> [email protected].
> For more options, visit this group at
> http://groups.google.com/group/google-appengine?hl=en.
