Re: [google-appengine] Re: Announcing the High Replication Datastore for App Engine

Ikai Lan (Google) Thu, 13 Jan 2011 14:01:05 -0800

Hey Waldemar, sorry for missing your original question. Okay, let me attempt
to answer these questions:

1. According to the docs Get/Delete/Put are strongly consistent, so in order
to get up-to-date entity data can I run a keys_only query and then use
db.get() to fetch the latest results?

No. The problem is that indexes across multiple entity groups are eventually
consistent, so though db.get will return the newest version of that entity
(it is within a transaction), the indexes themselves may be out of date and
only return a partial or incorrect result set.
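
A toy model may help show why the keys_only-then-get approach falls short. This is only a sketch in plain Python: ToyDatastore and its methods are hypothetical stand-ins written for this illustration, not the App Engine db API, and the index lag is simulated by hand.

```python
# Toy model (NOT the App Engine API): gets by key always see the latest
# entity, while the query index is updated later and may lag behind.

class ToyDatastore(object):
    def __init__(self):
        self.entities = {}   # key -> properties, always current
        self.index = {}      # key -> indexed 'status' value, may be stale
        self.pending = []    # index updates that have not been applied yet

    def put(self, key, props):
        self.entities[key] = dict(props)
        self.pending.append((key, props["status"]))

    def apply_index_updates(self):
        # Simulates the index eventually catching up.
        for key, status in self.pending:
            self.index[key] = status
        self.pending = []

    def keys_only_query(self, status):
        return sorted(k for k, s in self.index.items() if s == status)

    def get(self, keys):
        return [self.entities.get(k) for k in keys]


def query_then_get(store, status):
    keys = store.keys_only_query(status)
    return dict(zip(keys, store.get(keys)))


store = ToyDatastore()
store.put("a", {"status": "open"})
store.put("b", {"status": "open"})
store.apply_index_updates()            # index knows a and b are open

store.put("b", {"status": "closed"})   # index still lists b as open
store.put("c", {"status": "open"})     # index does not know about c yet

stale = query_then_get(store, "open")  # run before the index catches up
store.apply_index_updates()
fresh = query_then_get(store, "open")  # run after the index catches up
```

In this model the stale result contains "b", whose freshly fetched entity no longer matches the filter, and misses "c" entirely. The get fixes the entity data but cannot fix which keys the index returned.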

2. Is get() only consistent for single-key calls or also for a list of keys?
Both. In another thread, a user demonstrated that a batch get of 500 root
entities was significantly more expensive than with the master-slave
datastore, because 500 transactions are needed (one per entity group). If
you enable eventual consistency, transactions are not used and performance
is much closer to master-slave.

3. OK. So according to 2 and 3, almost always (99+%) you will hit a
master server, so specifying eventual consistency makes no difference.
So for the MS datastore this flag affects robustness not performance,
right?
I'm not entirely clear what is being asked here. Can you provide an example?
If you see my answer to question #2 in this post, it may clarify the
benefits of the eventual consistency setting.
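
The eventually consistent read path quoted further down in this thread (fire an RPC at the master, wait a grace period, then race a second RPC against the slave) can be sketched roughly as below. This is plain Python 3 with threads, not App Engine code: hedged_read, the stand-in reader functions, and the 50 ms grace period are all assumptions made for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait, FIRST_COMPLETED


def hedged_read(read_master, read_slave, grace_seconds):
    """Return the master's answer if it arrives within the grace period,
    otherwise whichever of master and slave answers first."""
    pool = ThreadPoolExecutor(max_workers=2)
    try:
        master = pool.submit(read_master)
        done, _ = wait([master], timeout=grace_seconds)
        if done:
            return master.result()
        # Grace period exceeded: start the slave read and race the two.
        slave = pool.submit(read_slave)
        done, _ = wait([master, slave], return_when=FIRST_COMPLETED)
        winner = master if master in done else slave
        return winner.result()
    finally:
        pool.shutdown(wait=False)


# Hypothetical stand-ins for the two datastore RPCs.
def fast_master():
    return ("master", "v2")


def slow_master():
    time.sleep(0.5)           # simulates a master that is having trouble
    return ("master", "v2")


def slave():
    return ("slave", "v1")    # the slave may hold older data


normal = hedged_read(fast_master, slave, grace_seconds=0.05)
degraded = hedged_read(slow_master, slave, grace_seconds=0.05)
```

In the normal case the master answers inside the grace period and the slave is never contacted. Only in the degraded case does the possibly older slave result win, which is why the flag matters mostly when the master is slow or unavailable.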

4. What about the HR datastore -- are queries across entity groups also
likely to be 99+% consistent, or is the figure different?
We don't provide a % consistency, as this makes no sense. If you just made a
massive data update prior to the query, the data will likely not be
consistent. If you made a small number of changes, the data likely will be.

The 99% I quoted earlier refers to the fraction of the time that queries
should be consistent after, say, 100ms. 99.99(9?)% of the time, we will
replicate data fast enough that queries are fairly consistent, but in the
remaining fraction of a fraction of a percent of the time, a large number of
queries will be inconsistent.

5. As cross-entity group HR queries are eventually consistent by default,
is there ever a reason to use the eventually consistent flag, such as
when querying a single entity group?
If you don't want transaction overhead when querying by key you may want to
enable this flag, but for most intents and purposes you probably don't need
the eventually consistent flag.

6. Is there any difference between queries, gets, puts or deletes related
to consistency, HR or MS datastore, or do the rules apply equally no
matter the datastore operation?
I'm going to avoid answering this question since this thread should answer
most of your questions. If there's anything you're still unclear about,
please post a follow-up.

--
Ikai Lan
Developer Programs Engineer, Google App Engine
Blogger: http://googleappengine.blogspot.com
Reddit: http://www.reddit.com/r/appengine
Twitter: http://twitter.com/app_engine

On Thu, Jan 13, 2011 at 1:27 PM, Stephen <[email protected]> wrote:

> On Fri, Jan 7, 2011 at 7:57 PM, Ikai Lan (Google)
> <[email protected]> wrote:
> > Thanks for the update, Stephen (I am now realizing that there are 2
> > Stephens in this thread and that you were not having a dialogue in public
> > with yourself)! I misspoke earlier: the numbers I cited were latency
> > numbers in the event of a failure in the Slave datastore. The actual
> > numbers for the majority (99+%) of requests are still a few hundred
> > milliseconds. In most cases, replication is even synchronous. It's in the
> > error cases where the average replication lag is 3 minutes.
> >
> > - The number of errors is less than a fraction of a fraction of a fraction
> > of a percent. I don't think a percent number makes sense here: it's the
> > fact that the errors are not evenly distributed. That is - a request that
> > blocks due to unavailability is likely to be followed by several requests
> > and lead to a sudden spike of instability in your application that may
> > depend on many small, fast calls.
> >
> > To answer your last two questions, it makes more sense to explain how
> > eventually consistent reads work:
> >
> > 1. Fire off an RPC to the datastore service internally.
> > 2. Wait a grace period for this RPC to return. This grace period is in the
> > tens of milliseconds. Most requests (99+%) will return in a time that is
> > far, far under this grace period.
> > 3. If the grace period has been surpassed, we send out an RPC to the slave
> > datastore. Now - we just wait to see who returned first: the original RPC
> > to the master datastore or the second RPC to the slave datastore. Again,
> > in most cases, the master datastore is likely to respond first simply
> > because it was given a head start.
>
>
> OK. So according to 2 and 3, almost always (99+%) you will hit a
> master server, so specifying eventual consistency makes no difference.
> So for the MS datastore this flag affects robustness not performance,
> right?
>
> Couple more questions...
>
> What about the HR datastore -- are queries across entity groups also
> likely to be 99+% consistent, or is the figure different?
>
> As cross-entity group HR queries are eventually consistent by default,
> is there ever a reason to use the eventually consistent flag, such as
> when querying a single entity group?
>
> Is there any difference between queries, gets, puts or deletes related
> to consistency, HR or MS datastore, or do the rules apply equally no
> matter the datastore operation?
>
> Thanks for taking the time to explain this so far -- much appreciated.
>
> --
> You received this message because you are subscribed to the Google Groups
> "Google App Engine" group.
> To post to this group, send email to [email protected].
> To unsubscribe from this group, send email to
> [email protected].
> For more options, visit this group at
> http://groups.google.com/group/google-appengine?hl=en.
>
>

