On Mon, May 23, 2011 at 2:06 PM, Bing <[email protected]> wrote:
>
> In the google io talk, data join is implemented by Append method. But
> it seems the Append method is only to append lists together. Is that
> Append method just a high-level concept or is there an implementation?
>
> Also, join can be implemented by using referenceProperty. It is not
> necessary to do the map first, and append sets of the map results
> together.

Append is in here:

http://code.google.com/p/appengine-pipeline/source/browse/trunk/src/pipeline/common.py

The idea is that you append the inputs together, then run Shuffle on the combined result. Shuffle is what actually does the join. You can find Shuffle here:

http://code.google.com/p/appengine-mapreduce/source/browse/trunk/python/src/mapreduce/shuffler.py

As for the referenceProperty thing, I'm not sure exactly what you mean, but generally if you have to do any explicit queries in a map or reduce phase, you are going to hit scalability problems. The job needs to run at full speed, with minimal latency for each map or reduce input.

-Brett
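
To make the append-then-shuffle idea concrete, here is a minimal in-memory sketch in plain Python. It does not use the actual pipeline or mapreduce APIs linked above; the append and shuffle functions, the record layout, and the sample data are all hypothetical stand-ins chosen only to illustrate why grouping the combined input by key amounts to a join.

```python
from collections import defaultdict
from itertools import chain


def append(*inputs):
    """Concatenate several lists of (key, value) records into one list."""
    return list(chain.from_iterable(inputs))


def shuffle(records):
    """Group values by key, similar in spirit to a shuffle phase."""
    grouped = defaultdict(list)
    for key, value in records:
        grouped[key].append(value)
    return dict(grouped)


# Hypothetical mapper outputs: each value is tagged with its source
# so the reducer can tell the two sides of the join apart.
users = [("u1", ("user", "Alice")), ("u2", ("user", "Bob"))]
orders = [
    ("u1", ("order", "book")),
    ("u1", ("order", "pen")),
    ("u2", ("order", "lamp")),
]

# Append the two inputs, then shuffle the combined list by key.
joined = shuffle(append(users, orders))

for key in sorted(joined):
    print(key, joined[key])
```

Each key ends up with the values from both inputs side by side, so a reduce step can combine them without issuing any datastore queries of its own.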
