Thanks Robert. Keeping field names small is good advice. I like to do
that and use an interface for the more readable names outside the DAO
in my domain model.
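
A minimal sketch of that pattern, in plain Java with no datastore
dependency. Every name here (RecordedValue, ValueEntity, and the p/t/v
fields) is hypothetical and chosen only for illustration. The short field
names stand in for whatever the persistence layer would actually write as
property names.

```java
import java.util.Date;

// Readable interface that the rest of the domain model codes against.
// (Hypothetical names; an illustration of the idea only.)
interface RecordedValue {
    long getPointId();
    Date getTimestamp();
    double getValue();
}

// DAO-side entity. Field names are kept to one letter because the
// datastore stores the property name again with every entity.
class ValueEntity implements RecordedValue {
    private final long p;   // point id (the foreign key)
    private final long t;   // timestamp, milliseconds since the epoch
    private final double v; // the recorded value

    ValueEntity(long pointId, Date timestamp, double value) {
        this.p = pointId;
        this.t = timestamp.getTime();
        this.v = value;
    }

    @Override public long getPointId() { return p; }
    @Override public Date getTimestamp() { return new Date(t); }
    @Override public double getValue() { return v; }
}
```

Code outside the DAO only ever sees RecordedValue, so the terse names
never leak into the rest of the app.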

Every scenario I have come up with for splitting up the data just costs
more CPU, so it's better to leave it be.

On May 13, 1:29 am, Robert Kluin <[email protected]> wrote:
> Hi Ben,
>   I've dealt with several apps whose datastores are growing by
> millions of entities per day; they have billions of entities.  The
> performance remains the same, provided you're not returning an ever
> increasing number of results.  If you're just asking to see all data
> between two timestamps for a given user, you'll be fine.  Unless there
> is some other reason to, I would probably not personally waste the
> resources to batch and compress the data.  If it is about reducing
> storage cost, you're liable to spend more money in CPU time mapping
> the data to bundle, compress, and index it than just paying for the
> storage space of the entities.
>
>   What you might want to think about, if you're not already, is making
> sure you use very short kind and property names in the datastore.
> Storing the property name millions of times can add up, especially if
> you've got long property names on indexed properties.
>
>   I've been using the namespace (multitenant) feature in some apps.
> It works well when there is a very clear and distinct boundary.  It
> can make things like sharing data outside the namespace more difficult
> though, since you can not query across namespaces.
>
> Robert
>
> On Thu, May 12, 2011 at 11:26, Benjamin <[email protected]> wrote:
> > Hi Guys,
>
> > My app has taken off recently, and it's been very exciting to see so
> > much traffic. I am now trying to deal with a very high volume of
> > infrequently accessed archival datastore entries and to make sure I'm
> > not going to hit a performance or cost bottleneck.
>
> > Let's say I have a store of entities without any relationships that
> > grows by 100,000 a day. The entries are accessed by a query for three
> > properties (a foreign key and timestamps that are between two provided
> > dates), usually returning several thousand values at a time. Any
> > advice for the best way to store these values?
>
> > I have been considering splitting the data I'm storing into tenants
> > using a multitenant architecture, using a background task to compress
> > chunks of data into blobs and put them in the store, and so on.
> > But if the values are indexed, I'm wondering whether it is smarter to
> > just leave them alone and let the store grow to billions of entries.
> > Do queries slow down by N as the datastore / index grows?
>
> > Thoughts? Is there a performance benefit with a multi-tenant
> > architecture?
>
> > Ben
>
> > My app is an online data historian called Nimbits (www.nimbits.com).
> > It's free and open source on Google Code and allows developers to feed
> > sensor data into data points to do calculations, alerts, relay, etc.
> > It's an Internet of Things sort of thing.
>
> > --
> > You received this message because you are subscribed to the Google Groups 
> > "Google App Engine" group.
> > To post to this group, send email to [email protected].
> > To unsubscribe from this group, send email to 
> > [email protected].
> > For more options, visit this group at
> > http://groups.google.com/group/google-appengine?hl=en.
