Re: [google-appengine] High-performance, high-precision sharded counters in the HRD

Robert Kluin Sat, 20 Aug 2011 23:57:18 -0700

Hey Jeff,
  If I remember correctly, I've used the memcache incr function without
issue at maybe a couple hundred ops / second.  I've got some processing that
is quite similar to what you're describing, I periodically see keys get in a
sort of "stuck" state for brief time --  maybe 30 seconds to a minute --
where they won't return or take a value. You'd want to be sure your logic
and process can handle that because I see it daily.

  Stephen has a neat idea with using pull-queue tasks as a marker.  It would
be really great if you could batch the tasks in some way, I think there is
an issue for this (maybe subqueues).  The reliability of inserting tasks has
been excellent from what I've seen, as good as or better than any of the
other services, so using them as a marker might be a really good way to
ensure you don't oversell an item.  In your case, you may not need to care
if you aren't immediately able to lease a task.  Just using them as markers
might be sufficient to know if the purchase is OK or not.

  I started writing a process using pull queues for aggregations like this
in April.  I use the pull tasks as markers, then I've got a dispatcher
process that leases them, groups them, and inserts aggregator tasks to
process each batch.  I try to insert a dispatcher task any time something is
added to the pull queue.  I use a simple system of batch numbers and rounded
timestamps so I don't insert a new dispatcher for every item, made more
efficient by memcache markers.

   As for the shared counter, I would use the predictable names + fetch by
key method.  That will give you the best chance of not missing something,
just keep the number of shards as low as you can.  Each key requires a
separate transaction to fetch the entity, true whether using a query or
fetch by key, so each additional entity increases the chance a fetched
entity has already been updated. This is a great spot to use multiple async
batch fetches to minimize the actual wall-clock time.  You should be able to
come up with a nice way of adding / removing shards to keep this as
efficient as possible.

Robert

On Aug 20, 2011 7:49 PM, "Jeff Schnitzer" <[email protected]> wrote:
>
> On Sat, Aug 20, 2011 at 2:34 PM, Stephen <[email protected]> wrote:
> > On Sat, Aug 20, 2011 at 9:50 PM, Jeff Schnitzer <[email protected]>
wrote:
> >>
> >> Second:  The key is being able to get a strongly consistent sum of the
> >> sharded counters.
> >
> > You only really need to know whether widgets are in stock or out of
stock.
> >
> > While widgets are in stock, record the customers intent to purchase by
> > placing the order in a pull-queue. Bulk process the queue in batches
> > of 1000 requests, deducting 1000 from a total-widgets counter. Flip
> > the out of stock bit when you run out.
> >
> > All orders received get an entity in the datastore marked with
> > success, or failure for the stragglers which entered the queue before
> > the out of stock bit could be flipped.
> >
> > Ajax form submission on the front end, receive a ticket immediately,
> > poll the ticket until success or failure.
>
> This is a really interesting idea, especially combined with the
> channel api to push notification to the client.  But I'm worried about
> a couple things:
>
> 1) Are there periods when pull task queues go down?  There are for
> push queues.  And they back up occasionally.  I would hate to have the
> experience that the user clicks "buy" and then waits for significant
> times (more than a second or two) before they get an ok to proceed.
>
> 2) Wouldn't this require continual looping to see if there is anything
> on the task queue?  If it checks every 2 seconds, that's potentially 2
> seconds of user waiting.  Or if this is done by kicking off a push
> task, we're back to long waits or failures when the push task system
> is backed up or disabled.
>
> 3) I'm a little concerned about getting enough parallelism with this.
> If the task processor leases 1,000 tasks, they'll be a mishmosh of
> different kinds.  At the very least it will require writing 1,000
> purchase records, which will probably take a few seconds even with
> async.  Maybe the answer is to do smaller chunks and run multiple
> processors in parallel, but now we get back to potential collision and
> sharding issues.  And the total throughput of the system will be 1,000
> per (however long it takes) rather than 1,000 per purchasable item per
> (however long it takes).  This feels solveable but it might be a
> balancing act rather than straightforward engineering.
>
> Just to add a few bits of complexity that I didn't mention the first time:
>
>  * The UI must display the # of units left
>
>  * There's a shopping-cart of sorts whereby the product is deducted
> from inventory when it gets added to a cart.  It gets returned to the
> pool if the transaction hasn't been completed in 15 mins (similar to
> the way eventbrite works).
>
> However, I don't think this extra complexity is too hard to add to a
> basic purchase system that can already handle restricting sales to a
> specific # of units.
>
> Jeff
>
> --
> You received this message because you are subscribed to the Google Groups
"Google App Engine" group.
> To post to this group, send email to [email protected].
> To unsubscribe from this group, send email to
[email protected].
> For more options, visit this group at
http://groups.google.com/group/google-appengine?hl=en.
>

-- 
You received this message because you are subscribed to the Google Groups 
"Google App Engine" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to 
[email protected].
For more options, visit this group at 
http://groups.google.com/group/google-appengine?hl=en.

Re: [google-appengine] High-performance, high-precision sharded counters in the HRD

Reply via email to