If you need access to all message values in vprog, there's nothing wrong with building up an array in mergeMsg (option #1). This is what org.apache.spark.graphx.lib.TriangleCount does, though with sets instead of arrays. There will be a performance penalty because of the communication, but it sounds like that's unavoidable here.
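As a rough sketch of option #1, assuming the vertex attributes and messages are Doubles (an assumption, since the thread doesn't say), the two functions could look something like this. The names MedianPregel and median are made up for illustration, and the code is written without Spark imports so it can be tried on its own:

```scala
// Sketch only: MedianPregel and median are illustrative names, not part of GraphX.
object MedianPregel {
  // Message type: every value received so far for one vertex.
  type Msg = Array[Double]

  // mergeMsg: concatenate instead of reducing, so vprog sees every value.
  def mergeMsg(a: Msg, b: Msg): Msg = a ++ b

  // Median of a non-empty array (mean of the two middle values for even length).
  def median(xs: Array[Double]): Double = {
    val s = xs.sorted
    val n = s.length
    if (n % 2 == 1) s(n / 2) else (s(n / 2 - 1) + s(n / 2)) / 2.0
  }

  // vprog: replace the vertex value with the median of the incoming values.
  // An empty array (e.g. the initial message) leaves the value unchanged.
  def vprog(id: Long, attr: Double, msgs: Msg): Double =
    if (msgs.isEmpty) attr else median(msgs)
}
```

Since VertexId is an alias for Long, these should plug into graph.pregel(Array.empty[Double])(MedianPregel.vprog, sendMsg, MedianPregel.mergeMsg), with a sendMsg that wraps each outgoing value in a one-element array, e.g. Iterator((triplet.dstId, Array(triplet.srcAttr))). What sendMsg should actually send depends on your algorithm.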
Ankur <http://www.ankurdave.com/>

On Wed, Apr 23, 2014 at 8:20 PM, Ryan Compton <compton.r...@gmail.com> wrote:
> 1. a hacky mergeMsg (i.e. combine a,b -> Array(a,b) and then do the
> median in vprog)