Steve,

No problem. Hopefully some of our questions are beneficial. :) By the
way, as an economist I totally agree with your last remark; I've had
the same thoughts (re performance) many times. In this case I think
Google's incentive is competition -- predominantly from AWS, and
possibly from Azure. If the platform is too unstable or performs too
poorly, people will find a better solution.
If you are using the total from the counter as your id, you might want
to rethink using a sharded counter to generate your id values. If you
get two requests sufficiently close together you will get a duplicate
id. Each shard is in its own entity group, and you cannot read from
other groups within a transaction. That means you are getting your id
value outside a transaction. Of course, if you are actually using
shard_id + the shard's current value as a key_name, you're fine,
provided you get the value in a transaction.

Robert

On Thu, Nov 11, 2010 at 15:36, stevep <[email protected]> wrote:
>
> Robert,
>
> Overall let me say thanks. Your comments really helped for this
> subject.
>
>> allocate_ids can be used to generate a single id, just like doing
>> SomeKind().put() will generate an id automatically. I am not aware of
>> any recommended way to use sharded counters to generate a unique
>> sequence of sequential numbers.
>
> We are using this code (suggested for incrementing counters, but it
> works for sequential key values as well):
> http://code.google.com/appengine/articles/sharding_counters.html
>
> To be honest, I was not aware of allocate_ids when setting this up
> (started coding as noobie_levelZero, now have advanced to
> noobie_levelOne :-)
>
> There seems to be very little overhead in this approach, so I will
> stick with it for now. We download this model's data for some
> analytics processing. Sequential numeric keys are not necessary for
> that, but they are helpful whenever we eyeball the analytics. I'm not
> sure whether allocate_ids key values would be numerically sequenced.
> Once everything else is done, I'll look deeper into comparing these
> two approaches, and will post a thread at that time. That's surely a
> "don't hold your breath" schedule though.
>
>> Eli brought up a good point regarding this issue.
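To make Robert's race concrete, here is a minimal sketch in plain Python (an in-memory dict stands in for the datastore shards; `total` and `next_key_name` are invented illustration names, not App Engine APIs). It shows why a counter total read across shards can hand two requests the same id, while a shard_id + per-shard-value key_name cannot collide:

```python
# Illustrative stand-in for a sharded counter: each shard holds its own count.
shards = {"shard-0": 3, "shard-1": 2}  # total = 5

def total():
    """Sum across shards. On App Engine each shard is its own entity
    group, so this read cannot happen inside a single transaction."""
    return sum(shards.values())

# Two near-simultaneous requests both read the total before either
# one increments a shard: both see 5 and both use 5 as the new id.
id_request_a = total()
id_request_b = total()
assert id_request_a == id_request_b == 5  # duplicate id!

def next_key_name(shard_id):
    """Per-shard scheme: increment one shard and build the key_name from
    shard_id + that shard's own value. This touches a single entity
    group, so it can run inside a transaction."""
    shards[shard_id] += 1
    return "%s-%d" % (shard_id, shards[shard_id])

# Even if two requests hit different shards at the same moment, the
# key_names cannot collide because each embeds its shard id.
k1 = next_key_name("shard-0")
k2 = next_key_name("shard-1")
assert k1 != k2
print(k1, k2)  # shard-0-4 shard-1-3
```

The trade-off is that these key_names are strings and are only ordered within a shard, which is why the thread contrasts them with globally sequential numeric ids.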
>> I assume the reason
>> for this 'complicated' process is to return an id to the client as
>> quickly as possible, so that if the client re-submits the request you
>> do not get duplicated data?
>
> The bigger issue (as per my response to Eli) is to avoid throttling of
> the client response handler.
>
> Unless we screw up the handling of the generated key value from the
> initial POST call, a resend will simply cause a duplicate put(),
> costing us cpu cycles, but not a duplicate record.
>
>> I was more curious about the case when you make a request and then
>> the internet connection fails (or the user hits 'submit' twice really
>> fast). The server will still successfully complete the write, but
>> the client will not know. So how do you prevent the client from
>> re-submitting the request, which might result in a duplicate record?
>
> Right now we're using an Adobe AIR app for the client, with keys
> generated by the client. The process first posts the data to the local
> AIR sqlite DB; then, if the user is online, it attempts the GAE POST.
> If the response works, the returned key value goes into the sqlite DB
> as a reference field. The GUI is disabled while this happens (with an
> appropriate dialog showing). This is all going to change with the new
> browser-only version, so these issues will need to be addressed.
>
>> I totally agree with this approach; I think it is very similar to my
>> process. I do not use the task queue to write the initial record,
>> though; it is put during the request I return the key in. Your
>> approach should be safe because you return the key in the first
>> request, but there could be cases that result in lots of unneeded
>> re-submits -- for instance, if the task queue is backed up.
>
> Thanks -- nice to know I am not missing something obvious.
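The point about a resend costing cpu cycles but not a duplicate record can be sketched in a few lines of plain Python (a dict stands in for the datastore, and `save_record` is a hypothetical handler name, not the poster's actual code). A put() keyed by an explicit, client-generated key_name is an overwrite, so replaying the request is harmless:

```python
# In-memory stand-in for the datastore: key_name -> entity.
datastore = {}

def save_record(key_name, payload):
    """Handler sketch: a put() with an explicit key_name overwrites the
    existing entity, so replaying the same request burns cpu cycles but
    cannot create a second record."""
    datastore[key_name] = payload  # idempotent by key
    return key_name

# The client generates the key, then resends after a dropped response.
key = save_record("client-uuid-42", {"note": "first attempt"})
key_again = save_record("client-uuid-42", {"note": "first attempt"})

assert key == key_again
assert len(datastore) == 1  # a duplicate put(), not a duplicate record
```

This is the same property that makes the AIR client's scheme safe: as long as the client reuses the key it recorded in sqlite, a retry can never fork the data.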
> Writing the
> new rec during the initial POST call is much preferred, as I noted in
> my response to Eli, but I just think we would have too much "double
> whammy" risk when GAE infrastructure is under load -- see the end of
> my response to Eli.**
>
> My thanks again,
> stevep
>
> ** Wouldn't it be nice if the throttling limit were dynamic according
> to how well GAE infrastructure was running -- something we cannot
> control, so why do we end up paying for it? There is IMHO a perverse,
> reverse incentive in the current setup, where profit and revenues
> maximize at the point where GAE infrastructure investments are
> minimized to yield maximum "under load" conditions without causing
> customers to decide other cloud services are clearly superior. Note
> that this also applies to cold-start cpu overhead costs. A dynamic
> throttling algo and a standard cold-start charge (based on standard
> infrastructure performance, not varying due to load) would go a very
> long way toward correcting this perversion. Here's an interesting link
> about the importance of incentives that made me think of GAE's current
> setup when I read it (weird but true):
> http://www.npr.org/blogs/money/2010/09/09/129757852/pop-quiz-how-do-you-stop-sea-captains-from-killing-their-passengers

--
You received this message because you are subscribed to the Google Groups
"Google App Engine" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to
[email protected].
For more options, visit this group at
http://groups.google.com/group/google-appengine?hl=en.
