The 2nd method makes sense to me. The 1st method looks to me like playing
Russian roulette with my data: it is twice as fast when everything is
working, but what if memcached crashes? memcached is not fault tolerant, and
it was not designed to be; after all, it's just a cache. I think there are
applications that can tolerate hiccups like that, but most of the
applications I have worked on can't.
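To make the risk concrete, here is a minimal sketch of the safer write-through pattern, with plain dicts standing in for the datastore and memcache (the real App Engine APIs are not used so the example runs anywhere): the datastore is always written first, and the cache only accelerates reads, so a cache wipe loses nothing.

```python
# Write-through sketch: the "datastore" dict is the authoritative store,
# the "cache" dict stands in for memcache and may be wiped at any time.

datastore = {}
cache = {}

def put(key, value):
    datastore[key] = value   # authoritative write happens first
    cache[key] = value       # cache update is only an optimization

def get(key):
    if key in cache:         # fast path: serve from cache
        return cache[key]
    value = datastore[key]   # slow path: authoritative read
    cache[key] = value       # repopulate the cache for next time
    return value

put("x", 1)
cache.clear()                # simulate a memcache flush or crash
assert get("x") == 1         # the data survives; only speed was lost
```

With this layout a memcache failure costs latency, never data, which is why it tolerates the unpredictable eviction behavior described below.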


On Sat, Sep 26, 2009 at 8:37 PM, Jeroen <[email protected]> wrote:

>
> Biggest problem: the datastore is dead slow and uses insane amounts of
> CPU. I found 2 ways around it, backwards ones IMHO, but if it works,
> it works.
> Maybe my use case is unique, as it involves frequent updates to the
> stored data (10k records).
>
> 1st solution:
> Only update the datastore on every 2nd update of the data, storing
> intermediate data in memcache. (e.g.: 1) store in datastore & put in
> cache, 2) fetch from cache, update cache (if not in cache, update
> datastore), 3) store in datastore and update cache, 4) fetch from cache,
> update cache (if not in cache, update datastore), 5) datastore, 6)
> cache, 7) ... etc.)
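A runnable sketch of the alternating scheme above, with dicts standing in for the datastore and memcache (the actual App Engine calls would be db.put and memcache.set, omitted so the example is self-contained). Note that an update held only in the cache is lost if the cache is flushed before the next persisted write, which is exactly the risk the reply points out.

```python
# Alternating write scheme: even-numbered updates (or any cache miss)
# are persisted to the "datastore"; odd-numbered updates go to the
# "cache" only, trading durability for speed.

datastore = {}
cache = {}
write_count = {}   # per-key counter used to alternate destinations

def update(key, value):
    n = write_count.get(key, 0)
    if n % 2 == 0 or key not in cache:
        datastore[key] = value   # persist (and refresh the cache)
        cache[key] = value
    else:
        cache[key] = value       # cache only: at risk until next persist
    write_count[key] = n + 1

update("x", 1)   # persisted
update("x", 2)   # cache only
update("x", 3)   # persisted again
assert datastore["x"] == 3
```

If the cache is wiped after the 2nd update, the datastore still holds the value 1, so roughly every other update can be lost under eviction.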
>
> 2nd solution:
> Store non-indexed data (about 10 fields) in one big blob, which you
> serialize when storing data and deserialize when reading.
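A minimal sketch of the blob approach: the non-indexed fields are packed into a single pickled blob that would occupy one BlobProperty on the entity (a dict stands in for the entity here so the example runs anywhere).

```python
import pickle

# The ~10 non-indexed fields, packed into one serialized blob instead
# of ten separate entity properties.
fields = {"name": "widget", "color": "blue", "count": 7}

entity = {
    "indexed_key": "widget-1",        # stays a real, indexed property
    "blob": pickle.dumps(fields),     # everything else in one blob
}

restored = pickle.loads(entity["blob"])   # deserialize on read
assert restored == fields
```

Fewer properties means fewer index writes per put, which is where the CPU savings come from; the cost is that blob fields cannot be queried.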
>
> Both work fairly well (combining both methods reduced CPU usage by
> over 50%), but are crippled by App Engine.
>
> The 1st method needs a more reliable memcache; at least its limits need
> to be clear. (There have been moments it was only able to hold 8k
> items (about 10 MB of data), and moments it would hold 20k (adding up to
> about 30 MB); when it only holds 8k, data gets lost. Of course the
> nature of a cache is that it can lose data, but it would be nice if
> it behaved in a predictable way.)
>
> The 2nd method needs a well-performing serialization mechanism. For
> Python the obvious choice is pickle (which I'm using), but in all its
> wisdom Google decided not to include cPickle, so performance is
> terrible. (YAML yielded even worse results, as the C extension needed
> to speed things up isn't available.) (Another option might be protocol
> buffers, but those don't work on App Engine; the google package in
> which the Python code resides is locked or something.)
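A common hedge against the missing cPickle module is to try the C implementation and fall back to the pure-Python one. On App Engine (at the time) this always falls back, but the same code runs unchanged anywhere cPickle is available, so nothing is lost by writing it this way.

```python
# Prefer the fast C extension when present (CPython 2), otherwise use
# the pure-Python module; both expose the same dumps/loads interface.
try:
    import cPickle as pickle
except ImportError:
    import pickle

data = {"a": 1, "b": [2, 3]}
blob = pickle.dumps(data, -1)   # -1 = highest available protocol
assert pickle.loads(blob) == data
```

Using the highest pickle protocol also helps: the binary protocols are noticeably faster and more compact than the default text protocol of that era.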
>
> All this gives me the feeling that I'm forced to pay CPU costs that
> shouldn't be there:
> - I didn't ask for a dead-slow Bigtable datastore (it really is that
> damned datastore that's still eating half my CPU usage)
> - I try to optimize, but the tools for it are crippled
>
> I fully understand the success story related to serving static content.
> But for dynamic content, for future projects, I'll happily not try
> using App Engine anymore.
>


-- 
.......__o
.......\<,
....( )/ ( )...

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"Google App Engine" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to 
[email protected]
For more options, visit this group at 
http://groups.google.com/group/google-appengine?hl=en
-~----------~----~----~----~------~----~------~--~---
