I am using a "mixed" approach. When I receive data I process it (by validating inputs and update related entity's statistics), then when I compute aggregates I create and store a JSON object containing individual records with them.
Compared to individual entities, I have seen only a negligible decrease in write performance (a few extra CPU cycles used and no change in API time). I see a huge increase on read performance since 95% of the time I can directly use the entities with the JSON data. I only use the individual records for "advanced" searching and an infrequent report. Even when needing to display individual records I have found deserializing the JSON and writing fields out to be as quick (or faster!) than fetching the the individual records from the datastore. I have about 5 individual records bundled into 1 JSON object. I tested both methods performance using several hundred records of each format (identical data in the proper formats for each method). Robert On Tue, Nov 10, 2009 at 3:14 PM, Paul Kinlan <[email protected]> wrote: > Hi, > > I am doing something similar at the moment on http://www.ahoyo.com. We > parse feeds and aggregate them into a canonical JSON form that can be read > directly by our client applications. Pre-aggregating the data-feed as soon > as we poll it or receive a pubsubhubbub notification rather than compute it > when the client requests the data allows us to have a very speedy http > handler (it is important because this is the touch point for our users). We > aren't memcaching the data at the moment, but it is very simple to add in > and should save us a lot of datastore time on popular client applications. > > There is very little processing effort required to give the data to our > clients so the cost should be predictable per user of our system, if we > didn't precompute the data the performance of the client applications is > datastore query, quantity of data and sort dependent (and for popular apps > it would end up costing us a lot of money). > > There are downsides, none of this data that is json formatted is > searchable, but if you can live with that your solution is pretty much what > we do. > > Paul > > 2009/11/10 davidgm <[email protected]> > > >> Hi again, >> >> I just figured out that using JSON to store my main data in datastore >> could be a good idea. >> I would like to share this view and have your opinion about this: >> >> My app usually returns data to client in JSON format. >> This data comes from different db classes, created almost as a >> relational db (maybe a newbie mistake). >> Instead of having those multiple records, fetching them and building >> the aggregate structure and converting it to JSON... I just wonder if >> it would be better to store the JSON string into a single db class. >> >> Updates will need JSON parsing back and forth, but they are few >> updates compared to read queries. >> And db lock contention will be limited since only a few dozens of >> clients are interested in one data chunk. >> >> I think this trick can cut down CPU usage and datastore API calls. >> >> But here comes another point: JSON string will be into memcache also. >> This cache already reduces CPU and datastore calls. Maybe the >> difference between a classic approach (close to relational db) and >> this JSON string storage will be unnoticeable. >> >> The intermediate solution is to store a complex aggregate into one db >> class, but I didn't yet figure out how to do it. Precision: I use >> Python. >> >> So, my question is: >> >> In this context, what do you think of using JSON strings to store data >> into datastore? >> >> Thanks for any advice >> >> > > > > --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "Google App Engine" group. To post to this group, send email to [email protected] To unsubscribe from this group, send email to [email protected] For more options, visit this group at http://groups.google.com/group/google-appengine?hl=en -~----------~----~----~----~------~----~------~--~---
