Hi Patrick,

Good questions!

On Tue, Apr 20, 2010 at 12:57 AM, Patrick Twohig
<[email protected]> wrote:

> Hi All,
>
> As I understand it, the process of performing a single fetch (call to
> get()) from the datastore using a key basically involves finding the host
> housing the entity, opening a socket, fetching the data, and then cleaning
> up the connection.  So to fetch something like 30 entities from the
> datastore, you're repeating the process 30 times over in serial, each time
> incurring whatever overhead is involved.  I also read that if you perform
> bulk fetches (i.e. passing multiple keys at once), you can eliminate a great
> deal of that overhead.  In one of the videos I watched from Google I/O 2009,
> the presenter (whose name I forget - d'oh) said that performing a bulk fetch
> actually performs the fetches in parallel from the datastore and you should
> see requests complete noticeably faster.
>
> Currently I have a few situations where the app performs many fetches from
> the datastore serially, rather than in bulk, and I believe that is why
> these requests are extremely slow and CPU intensive.  Where possible, I
> have put in place as many bulk fetches as I can, but I'm a little
> stuck in a few places.
>
> I'm basing the fetch latency on today's numbers --
> http://code.google.com/status/appengine/detail/datastore/2010/04/19.
> Anomalies aside, it looks like the get latency is somewhere between 80ms and
> 160ms; let's split the difference and just say that it's 120 milliseconds.
> Additionally, the query latency is somewhere between 250ms and 500ms.
> Splitting the difference, that's 375ms.  I'm just going to use those numbers
> as a ballpark estimate for fetching multiple entities from the datastore;
> feel free to correct me if any of my reasoning is flawed or incorrect.
>

The figures shown by the status site seem to be on the high side at the
moment - they represent worst cases. In my own apps, gets are observed to be
more on the order of 10-20ms, while queries vary widely depending on
returned data, but average about 100-300ms.
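
To see how much the choice of figures matters, here is a rough
back-of-the-envelope comparison. It assumes (as the question does) that a
batch get costs about one get's worth of latency no matter how many keys it
carries, which is a simplification. The function names are made up for
illustration and are not part of any App Engine API.

```python
def serial_gets_ms(n_entities, get_ms):
    """Latency of fetching each entity with its own get() call."""
    return n_entities * get_ms


def batched_gets_ms(n_rounds, get_ms):
    """Latency of n_rounds batch gets, ASSUMING a batch costs one get."""
    return n_rounds * get_ms


# Worst-case get latency from the status site, as used in the question (ms).
status_serial = serial_gets_ms(7, 160)    # 7 individual gets -> 1120
status_batched = batched_gets_ms(3, 160)  # A, then B+C, then D..G -> 480
status_query = 375                        # one ancestor query

# Upper end of the 10-20ms get latency quoted above (ms).
observed_serial = serial_gets_ms(7, 20)    # -> 140
observed_batched = batched_gets_ms(3, 20)  # -> 60

print(status_serial, status_batched, status_query)
print(observed_serial, observed_batched)
```

With the status-site figures, one query (375ms) beats three batched rounds
(480ms). With gets at 20ms, even seven serial gets (140ms) land at the low
end of the 100-300ms query range, so the query is no longer an obvious win.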


> Example 1: http://imagepaste.nullnetwork.net/viewimage.php?id=830
>
> Given the above example, I'm assuming that if I performed an ancestor query
> with Foo("A") as the ancestor it would effectively bulk-fetch the entire
> entity group.  I could then use the result of that query to get the data I
> need.  That would make the fetch from the datastore one query, 375
> milliseconds versus (7 entities * 160ms) or 1120ms.  So long as you need 3
> or more entities (3 * 160ms) it would stand to reason that you're better
> off just fetching the whole thing.  In some simple tests I did, that seemed
> to be the case: the query approach was faster, and that's great if you know
> everything is in the same entity group.
>
> Example 2:  http://imagepaste.nullnetwork.net/viewimage.php?id=831
>
> Given the above example, none of the entities are in the same entity group,
> but I would want to try to perform bulk fetches wherever possible.  I would
> first fetch Foo("A").  I would then see that it has two key properties
> pointing to Bar("B") and Bar("C"), and fetch those two entities at
> once.  Finally, I would see that Bar("B") and Bar("C") each reference two
> more entities -- Baz("D"), Baz("E"), Baz("F"), and Baz("G") for a total of
> four.  In the worst case, I would fetch each entity individually, taking,
> once again, 1120ms.  In the best case, where I perform 3 fetches (fetch A
> first, then fetch B and C, then lastly fetch D, E, F, and G), it would be
> more in the neighborhood of 480 milliseconds.  It's still an improvement
> over fetching each entity individually, but not much.
>

Very similar to this is the 'ReferenceProperty prefetching' pattern - see
http://blog.notdot.net/2010/01/ReferenceProperty-prefetching-in-App-Engine
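
The level-by-level approach from Example 2 can be sketched as below. This is
a minimal illustration and not the code from the blog post. In the Python
API, db.get() accepts a list of keys; the FakeDatastore class here is only an
in-memory stand-in for that call so the round trips can be counted, and the
"refs" field and the fetch_graph name are assumptions made for the example.

```python
class FakeDatastore(object):
    """In-memory stand-in for the datastore that counts round trips."""

    def __init__(self, entities):
        self.entities = entities
        self.round_trips = 0

    def get(self, keys):
        # One call == one round trip, however many keys are passed.
        self.round_trips += 1
        return [self.entities.get(key) for key in keys]


def fetch_graph(store, root_key):
    """Fetch everything reachable from root_key, one batch get per level."""
    fetched = {}
    seen = set([root_key])
    frontier = [root_key]
    while frontier:
        entities = store.get(frontier)
        next_frontier = []
        for key, entity in zip(frontier, entities):
            fetched[key] = entity
            if entity is None:
                continue  # dangling reference; nothing to follow
            for ref in entity["refs"]:
                if ref not in seen:
                    seen.add(ref)
                    next_frontier.append(ref)
        frontier = next_frontier
    return fetched


# The graph from Example 2: A -> B, C; B -> D, E; C -> F, G.
store = FakeDatastore({
    "A": {"refs": ["B", "C"]},
    "B": {"refs": ["D", "E"]},
    "C": {"refs": ["F", "G"]},
    "D": {"refs": []}, "E": {"refs": []},
    "F": {"refs": []}, "G": {"refs": []},
})
graph = fetch_graph(store, "A")
print(len(graph), store.round_trips)
```

For this graph the sketch fetches all seven entities in three round trips,
one per level, rather than seven.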


>
> So I was thinking of ways to improve this, the second example in
> particular, because I have a few places in my app where that exact thing is
> happening.  Right now it's actually implemented with individual fetches, but
> it is backed by memcache in many circumstances so that definitely helps.
>
> So given that, here are my questions...
>
>    - When serializing the objects, would it be worthwhile adding some sort
>    of metadata in the entity that would tell me what other entities it
>    references (either directly or indirectly) so that I could fetch the whole
>    thing with one or two API calls?  I was thinking that an entity could have
>    child entities with all the keys it references directly or indirectly.
>    This would be a huge pain to implement, and I'm not sure it would make a
>    noticeable performance boost.
>
>
Certainly, if you experience serial gets as a significant problem that isn't
solved with simple prefetching, this could be worth doing. I would avoid
using child entities, however, and simply have a list of keys instead.
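
A minimal sketch of the list-of-keys idea, under the same assumptions as any
in-memory model: FakeDatastore stands in for a batch get by key, and the
"refs" and "all_refs" fields and both helper names are hypothetical. The
list is computed at write time, so a read needs one get for the root and one
batch get for everything else.

```python
class FakeDatastore(object):
    """In-memory stand-in for the datastore that counts round trips."""

    def __init__(self, entities):
        self.entities = entities
        self.round_trips = 0

    def get(self, keys):
        # One call == one round trip, however many keys are passed.
        self.round_trips += 1
        return [self.entities.get(key) for key in keys]


def collect_refs(entities, root_key):
    """At write time, list every key reachable from root_key."""
    ordered, seen, stack = [], set([root_key]), [root_key]
    while stack:
        key = stack.pop()
        for ref in entities[key]["refs"]:
            if ref not in seen:
                seen.add(ref)
                ordered.append(ref)
                stack.append(ref)
    return ordered


def fetch_with_key_list(store, root_key):
    """Read the root, then fetch all referenced entities in one batch."""
    root = store.get([root_key])[0]
    keys = root["all_refs"]
    others = store.get(keys) if keys else []
    result = {root_key: root}
    result.update(zip(keys, others))
    return result


entities = {
    "A": {"refs": ["B", "C"]},
    "B": {"refs": ["D", "E"]},
    "C": {"refs": ["F", "G"]},
    "D": {"refs": []}, "E": {"refs": []},
    "F": {"refs": []}, "G": {"refs": []},
}
# Store the flattened key list on the root when it is written.
entities["A"]["all_refs"] = collect_refs(entities, "A")

store = FakeDatastore(entities)
result = fetch_with_key_list(store, "A")
print(len(result), store.round_trips)
```

The trade-off is that the list has to be kept up to date whenever any
entity in the graph changes what it references, which is the "huge pain"
part of the question.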


>    - Is there something "under the covers" of the API that actually makes
>    more efficient usage of resources that I don't know about?
>
>
There's a lot 'under the covers' - what specifically are you thinking of?

-Nick Johnson


>    - Is there something in the API that I don't know about that could make
>    the second example faster w/o much effort?
>    - Is my design just bad and I should figure out a better way of doing
>    it?  If so, how would I go about doing that?
>
> Alright, that's all for now.
>
> Thanks,
> Patrick.
>
> --
> Patrick H. Twohig.
>
> Namazu Studios
> P.O. Box 34161
> San Diego, CA 92163-4161
>
>  --
> You received this message because you are subscribed to the Google Groups
> "Google App Engine" group.
> To post to this group, send email to [email protected].
> To unsubscribe from this group, send email to
> [email protected].
> For more options, visit this group at
> http://groups.google.com/group/google-appengine?hl=en.
>



-- 
Nick Johnson, Developer Programs Engineer, App Engine
Google Ireland Ltd. :: Registered in Dublin, Ireland, Registration Number:
368047
