Re: [google-appengine] Help needed with CPU Time

Eli Jones Wed, 29 Sep 2010 17:09:00 -0700

Are you doing the Gets in a batch all at once or individually?  (Naturally,
my recommendation would be to get everything in a batch if they are from the
same Model)

What exactly are the objects that you are getting?  Since you are getting by
keyname, you could modify your code fairly easily to update memcache
whenever one of these objects is updated, and have your code check
memcache first when doing the gets, hitting the datastore only if the
object was not found (and, of course, updating memcache at that point).
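
A rough sketch of that memcache-first pattern, in Python.  Plain dicts stand
in for memcache and the datastore so it runs anywhere, and every name here
(get_entities, put_entity, datastore_get_multi) is made up for illustration,
not an App Engine API:

```python
# In-memory stand-ins (assumptions): on App Engine these roles would be
# played by memcache and the datastore.
cache = {}
datastore = {"game:1": {"turn": 3}, "game:2": {"turn": 7}}
datastore_reads = []  # records each batch of keys fetched from the datastore


def datastore_get_multi(keys):
    """Hypothetical batch get: one call for all keys."""
    datastore_reads.append(list(keys))
    return {k: datastore[k] for k in keys if k in datastore}


def get_entities(keys):
    """Check the cache first; fetch only the misses, then cache them."""
    found = {k: cache[k] for k in keys if k in cache}
    missing = [k for k in keys if k not in found]
    if missing:
        fetched = datastore_get_multi(missing)
        cache.update(fetched)
        found.update(fetched)
    return found


def put_entity(key, entity):
    """Write to the datastore and refresh the cached copy in the same step."""
    datastore[key] = entity
    cache[key] = entity
```

The second call to get_entities for the same keys never touches the
datastore, and put_entity keeps the cached copy from going stale.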

Doing batch puts, batch gets and looking into using memcache should give you
the best improvement in CPU and API time.  (At the very least, you should
test these approaches out to see what the benefit is.)

If the changes above don't make a significant impact, I'd say you could
look into using an alternative to a transaction on the put.

Is the put creating new entities or is it updating existing ones?

If it's creating new ones, then you could try an approach like this:

1. Store the list of keys that are being put. (not sure if you are manually
setting the entity keys)
2. Attempt batch put (with some process for retrying once or twice in the
event of failure)
3. If batch put fails, issue a delete against the list of keys. (or fire off
a background task that tries the delete offline.. and maybe logs or emails
you info about the error).
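
The three steps above could be sketched roughly like this in Python.  The
batch_put and batch_delete callables, the PutError class, and the in-memory
store are all hypothetical stand-ins for the real datastore calls:

```python
class PutError(Exception):
    """Hypothetical error standing in for a datastore write failure."""


def put_with_cleanup(entities, batch_put, batch_delete, retries=2):
    """Non-transactional batch put; on final failure, delete the keys.

    `entities` maps key -> entity.  Returns True if the put succeeded,
    False if every attempt failed and the cleanup delete was issued.
    """
    keys = list(entities)            # step 1: remember the keys being put
    for _ in range(1 + retries):     # step 2: put, retrying on failure
        try:
            batch_put(entities)
            return True
        except PutError:
            pass
    batch_delete(keys)               # step 3: remove any half-put entities
    return False


# In-memory stand-ins (assumptions) to exercise the failure path.
store = {}


def flaky_put(entities):
    """Writes the first entity, then fails, leaving a half-put batch."""
    first_key = next(iter(entities))
    store[first_key] = entities[first_key]
    raise PutError("simulated failure after a partial write")


def delete_keys(keys):
    """Delete each key, ignoring keys that were never written."""
    for key in keys:
        store.pop(key, None)
```

Calling put_with_cleanup(new_entities, flaky_put, delete_keys) returns False
and leaves the store empty.  This only makes sense when the put creates new
entities; deleting on failure would destroy existing data if the put were an
update.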

Granted, that may be a little too Wild West.. since the put could fail.. and
then the delete or the task could fail.. or new entities could be put for
the keynames before the task completed its cleanup of the half-put entities.

But, if the error doesn't happen that often.. and the cleanup process can
handle 99% of the errors you might encounter.. this could save you a lot of
CPU time..

Also, would it be possible to have the code detect when datastore data
related to this batch put was in an error state?  e.g.  Your code issues a
get against datastore entities that it expects to find.. if the data comes
back incomplete (with an entity missing), it has some process to handle
this.
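
Something along these lines, as a sketch only.  The batch_get and
on_incomplete callables are hypothetical hooks, and what "handling" means
(retry, cleanup task, logging) is up to your app:

```python
def find_missing(expected_keys, fetched):
    """Return the expected keys that the batch get did not return."""
    return [k for k in expected_keys if fetched.get(k) is None]


def load_group(expected_keys, batch_get, on_incomplete):
    """Fetch a group of related entities; report it if any are missing.

    `batch_get` takes a list of keys and returns a dict of the entities
    it found.  `on_incomplete` is called with the list of missing keys.
    Returns the fetched dict, or None if the group was incomplete.
    """
    fetched = batch_get(expected_keys)
    missing = find_missing(expected_keys, fetched)
    if missing:
        on_incomplete(missing)
        return None
    return fetched
```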

Anyway, that might not be as useful to you.. especially if the batch put is
updating entities.. since you won't be able to tell if the data is
inconsistent when doing a get.  And, maybe the batch putting and/or batch
memcache getting could give you performance close enough to what you are
looking for..

On Wed, Sep 29, 2010 at 6:43 PM, Michael <[email protected]> wrote:

> For the request you describe here, what modules are you importing?
> *Nothing extra that I know of besides what is necessary for Java/JDO to
> work with App Engine.  What's the best way to figure this out?*
>
> Are the Gets by keyname or are they filtered queries?
> *The gets are done with keys.*
>
> Are you doing the puts all at once (for Python, that would mean putting all
> the entities into an entityList and calling db.put(entityList)) or are you
> doing separate puts for each entity?
> *The puts are done separately, but I was going to look into batching them
> later today.  Doesn't this type of batching reduce realtime, and not cpu/api
> time?*
>
> Have you looked into using memcache for the gets?  (if you can't use
> memcache, why not? etc)
> *We haven't looked at memcache yet.  What's the overhead like if you don't
> hit the cache often?  For our game, each user will have their own list of
> games that they will be making moves in.  So there isn't a lot of overlap in
> data between users or between games, but they might access a couple of the
> same objects in a matter of a few minutes.  Does that seem like a data
> access pattern that could benefit from memcache?*
>
> Are the CPU Times you are reporting the average for a cold instance of the
> request or a hot instance (If you look in the logs for the request, it
> should tell you "This request caused a new process to be started.." if it is
> a cold instance)?
> *Hot instance.*
>
> How many indexes are defined on the Model types that are being put to the
> datastore?
> *There are 3 different types of objects being saved.  Two of them have 1
> index each and the other has 3 indices.  I don't know if it really needs all
> those anymore since I think some of them were auto-generated during
> development.*
>
> Is there any way around using a Transaction? (not really worth exploring
> unless you've tested the process without a transaction.. and it performs
> drastically better)
> *We can't really get rid of the transaction.  App Engine threw an
> exception once while writing the objects and one was saved while one wasn't,
> and that screwed things up.  So the transaction protects against that.*
>
> Have you looked into manually logging cpu time in the code?  (e.g.
> recording cpu time before the gets.. and then after.. to see what CPU they
> use.. then doing the same for the Puts)
> *We've mostly been relying on the timings that are already in the log and
> in AppStats.  But we can manually log the times too.  What is the difference
> between what that would tell us and the stats we're already looking at?*
>
>  --
> You received this message because you are subscribed to the Google Groups
> "Google App Engine" group.
> To post to this group, send email to [email protected].
> To unsubscribe from this group, send email to
> [email protected].
> For more options, visit this group at
> http://groups.google.com/group/google-appengine?hl=en.
>

