(And if I had read your entire email, I would have realized that you had already figured out that you can do a batch get by key.)
--
Ikai Lan
Developer Programs Engineer, Google App Engine
plus.ikailan.com | twitter.com/ikai

On Mon, Aug 22, 2011 at 10:37 AM, Ikai Lan (Google) <[email protected]> wrote:

> Jeff,
>
> 1. I don't know the exact number, but this method has been used in apps
> that get thousands of QPS.
> 2. No, because there is also a __key__ index. There's no entity group root
> to "transact" on to see whether we have the newest version of the data. If
> you did a batch get with the keys themselves, it would be strongly
> consistent. Here's my example that does that:
>
> http://io-bootcamp-datastore.appspot.com/
>
> My example uses pregenerated keys with a prefix and a batch get by key for
> strong consistency, though the write-behind counter cache (used for
> sorting) is eventually consistent.
>
> --
> Ikai Lan
> Developer Programs Engineer, Google App Engine
> plus.ikailan.com | twitter.com/ikai
>
>
> On Sat, Aug 20, 2011 at 1:50 PM, Jeff Schnitzer <[email protected]> wrote:
>
>> The short version, boiled down to two questions:
>>
>> 1) What is the maximum realistic throughput of memcache increment()
>> on a single counter?
>>
>> 2) The HRD documentation says that queries across entity groups are
>> eventually consistent. Does this extend to __key__ queries as well?
>> For example, will this be eventually or strongly consistent?
>>
>> SELECT * FROM Thing WHERE __key__ >= KEY('Thing', 10) AND __key__ <= KEY('Thing', 20)
>>
>> -----
>>
>> The long version, to hopefully open a discussion:
>>
>> I'm selling various types of virtual goods, each of which has a fixed
>> number available. The constraints for each product are:
>>
>> * I can sell up to exactly N items
>> * I cannot under any circumstances sell N+1
>> * For some products, N might be hundreds of thousands
>> * The peak purchase rate is expected to be hundreds per second when a
>> highly anticipated product goes on sale. Maybe thousands per second.
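The strongly consistent "batch get by key" that Ikai describes can be illustrated with a toy simulation. This is plain Python with invented names, not the App Engine API: it only models why reading entities directly by predictable key names sidesteps index lag, while an index-backed query can return stale results on the HRD.

```python
# Hypothetical sketch (invented names, NOT the App Engine API): models
# why a batch get by predictable key names is strongly consistent while
# an index-backed query is only eventually consistent.

class FakeDatastore:
    """Entities are visible to get() immediately after a put(); the index
    that backs queries is updated asynchronously (here: only when we say so)."""

    def __init__(self):
        self.entities = {}   # key -> value, always current
        self.index = set()   # keys visible to queries; may lag behind

    def put(self, key, value):
        self.entities[key] = value   # visible to get() at once
        # NOTE: index update deliberately deferred, modeling HRD behavior

    def catch_up(self):
        self.index = set(self.entities)   # the index eventually catches up

    def batch_get(self, keys):
        # Strongly consistent: reads entities directly by key.
        return {k: self.entities[k] for k in keys if k in self.entities}

    def key_query(self, prefix):
        # Eventually consistent: goes through the (possibly stale) index.
        return {k: self.entities[k] for k in self.index if k.startswith(prefix)}


ds = FakeDatastore()
for i in range(1, 6):
    ds.put("XYZ-shard%d" % i, 10)

# Predictable names mean we can ask for every *possible* shard key;
# shards that don't exist simply don't come back in the result.
wanted = ["XYZ-shard%d" % i for i in range(1, 101)]
strong_sum = sum(ds.batch_get(wanted).values())   # correct even before catch_up()
stale_sum = sum(ds.key_query("XYZ-").values())    # may be stale until catch_up()
```

The `FakeDatastore` class and its lag model are purely illustrative; the real point is the access pattern: enumerate the shard keys yourself and fetch them in one batch get, rather than asking an index which shards exist.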
>>
>> I have thought of several strategies for building this system, but the
>> best one requires a counter that can:
>>
>> 1) Be updated hundreds of times per second (thousands?)
>> 2) Give me a transactionally safe count, or at least temporarily err
>> on the side of overcounting (not undercounting!)
>>
>> Between memcache and the datastore, I can build this:
>>
>> * Memcache increment() gives me a current atomic count of purchases.
>> * Sharded counters give me a scalable persistent record of purchases
>> (the purchase record can be parented by the counter shard and updated
>> in the same transaction).
>> * Memcache putIfUntouched() lets me safely initialize the memcache
>> value, as long as I can get a strongly consistent count from the shards.
>> * By ordering memcache increments *before* shard increments for
>> purchases, and shard decrements *before* memcache decrements for
>> refunds, I can guarantee that datastore failures will at worst cause
>> overcounts, which is fine. A timeout on the memcache value means it
>> will eventually become accurate again.
>>
>> First question: what is the practical throughput of
>> memcache.increment() on a single counter? Thousands per second?
>>
>> Second: the key is being able to get a strongly consistent sum of the
>> sharded counters. The sharded-counter example in the App Engine
>> documentation uses a query to fetch the shards, which will only work
>> on the M/S datastore.
>>
>> Since get() is strongly consistent, the obvious solution is to give
>> the shards predictable names and use a batch get() for XYZ-shard1,
>> XYZ-shard2, ... XYZ-shard100. OK, but it's not quite as nice as being
>> able to query for __key__ >= XYZ-shard1 AND __key__ <= XYZ-shard100,
>> because some of the shards might not exist... or I might want to
>> decrease the number of shards for a running system. Should I just not
>> care about this case?
>>
>> I guess I've probably answered question #2...
>> but if anyone else has any suggestions for how to build a system like
>> this, I'd love to hear your thoughts.
>>
>> Thanks,
>> Jeff
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "Google App Engine" group.
>> To post to this group, send email to [email protected].
>> To unsubscribe from this group, send email to
>> [email protected].
>> For more options, visit this group at
>> http://groups.google.com/group/google-appengine?hl=en.
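Jeff's ordering rule -- memcache increment *before* the persistent shard write for purchases, so failures can only leave the cached count too high, never too low -- can be sketched with another toy simulation. Again, every name here is invented (this is not the App Engine memcache or datastore API); the shard write is a plain dict update standing in for a transactional datastore put.

```python
# Hypothetical sketch (invented names, NOT the App Engine API): models the
# "overcount, never undercount" ordering for a limited-inventory counter.

class FakeMemcache:
    """Stands in for an atomic memcache counter."""

    def __init__(self):
        self.counts = {}

    def increment(self, key, delta=1, initial_value=0):
        self.counts[key] = self.counts.get(key, initial_value) + delta
        return self.counts[key]


def try_purchase(memcache, shards, product, limit, shard_id, fail=False):
    """Reserve a unit in memcache first, then persist it to a counter shard.
    On a datastore failure, roll back the reservation; if even the rollback
    were lost, the cache would merely overcount -- the safe direction."""
    count = memcache.increment(product, 1, initial_value=0)
    if count > limit:
        memcache.increment(product, -1)   # sold out: release the reservation
        return False
    try:
        if fail:
            raise IOError("datastore timeout")        # simulated failure
        shards[shard_id] = shards.get(shard_id, 0) + 1  # persistent record
    except IOError:
        # Best-effort rollback; skipping it would only overcount, never
        # oversell, and a timeout on the cached value heals it later.
        memcache.increment(product, -1)
        return False
    return True


mc = FakeMemcache()
shards = {}
results = [try_purchase(mc, shards, "sword", 3, i % 2) for i in range(5)]
# Exactly 3 purchases succeed; attempts 4 and 5 are refused.
```

The invariant the ordering buys is that the persistent count can never exceed the cached count, so the cached count can be trusted as the gatekeeper against selling item N+1.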
