On Mon, May 23, 2011 at 2:06 PM, Bing <[email protected]> wrote:
>
> In the google io talk, data join is implemented by Append method. But
> it seems the Append method is only to append lists together. Is that
> Append method just a high-level concept or is there an implementation?
>
> Also, join can be implemented by using referenceProperty. It is not
> necessary to do the map first, and append sets of the map results
> together.

Append is in here:

http://code.google.com/p/appengine-pipeline/source/browse/trunk/src/pipeline/common.py

The idea is you would append the inputs together, then run Shuffle on
them in combination. Shuffle is what actually does the join. You can
find Shuffle here:

http://code.google.com/p/appengine-mapreduce/source/browse/trunk/python/src/mapreduce/shuffler.py
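
To make the append-then-shuffle idea concrete, here is a minimal
pure-Python sketch. It does not use the Pipeline or MapReduce library
APIs; the function names `append` and `shuffle`, the record layout, and
the sample data are all assumptions made up for illustration. It only
shows why grouping the combined inputs by key amounts to a join.

```python
from collections import defaultdict
from itertools import chain


def append(*inputs):
    """Concatenate several lists of (key, value) records into one list."""
    return list(chain.from_iterable(inputs))


def shuffle(records):
    """Group values by key, as a shuffle step does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in records:
        grouped[key].append(value)
    return dict(grouped)


# Hypothetical map outputs from two different entity kinds, both keyed
# by the join key (a user id here). Each value is tagged with its source.
users = [("u1", ("user", "Alice")), ("u2", ("user", "Bob"))]
orders = [
    ("u1", ("order", "book")),
    ("u1", ("order", "pen")),
    ("u2", ("order", "lamp")),
]

# Append the inputs, then shuffle them in combination. Every key now
# carries the values from both sources, which is the joined result a
# reducer would receive.
joined = shuffle(append(users, orders))
```

Tagging each value with its source ("user" or "order") is one way to
let the reducer tell the two sides of the join apart once they arrive
under the same key.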

As for the ReferenceProperty approach, I'm not sure exactly what you
mean, but generally if you have to run explicit queries in a map or
reduce phase you are going to hit scalability problems. The job needs
to run at full speed, with minimal latency for each map or reduce
input.

-Brett
