Thanks, that's what I thought. The biggest thing that would make it really good for distributed matrices is broadcasting a message to several nodes (I think the original Pregel has that). Oh well, I guess I will wait for GraphX -- that one should have it, I suppose :)

On Thu, Dec 19, 2013 at 2:51 PM, Matei Zaharia <[email protected]> wrote:

> No, it’s just that the latency is not always predictable :)
>
> But yes, Bagel uses hashing operations like groupByKey and reduceByKey
> (depending on whether you have a combiner). You could implement the same
> thing by hand. The Bagel source code is only about 200 lines actually.
>
> Matei
>
> On Dec 19, 2013, at 11:46 AM, Dmitriy Lyubimov <[email protected]> wrote:
>
> I guess Bagel-related questions are ignored, possibly because Bagel is
> slated for retirement?
>
>
> On Tue, Dec 17, 2013 at 10:35 AM, Dmitriy Lyubimov <[email protected]> wrote:
>
>> Hello,
>>
>> i have a quick question:
>>
>> It just recently occurred to me that in Spark group-by is not
>> shuffle-and-sort but rather "shuffle-and-hash", i.e. there's no sorting
>> phase. Right?
>>
>> In that light, a single bagel iteration should really cost just as much
>> as message grouping with the regular "group by key" thing.
>>
>> Right?
>>
>> Thank you in advance for the clarification.
>> -Dmitriy
>>
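
For reference, the "shuffle-and-hash" versus "shuffle-and-sort" distinction from the quoted thread can be sketched in plain Python, without Spark. This is only an illustration of the idea, not Bagel's or Spark's actual code: the function names `group_by_hash` and `group_by_sort` and the sample `messages` are made up for the example. The hash version does one pass over the messages and never sorts; with a combiner it folds values as they arrive (roughly the reduceByKey case), and without one it collects them into lists (roughly the groupByKey case).

```python
from collections import defaultdict
from itertools import groupby
from operator import itemgetter


def group_by_hash(pairs, combiner=None):
    """Group (key, value) pairs with a hash table; there is no sort step.

    Without a combiner, values are collected into a list per key.
    With a combiner, values are folded together as they arrive.
    """
    if combiner is None:
        groups = defaultdict(list)
        for key, value in pairs:
            groups[key].append(value)
        return dict(groups)

    combined = {}
    for key, value in pairs:
        if key in combined:
            combined[key] = combiner(combined[key], value)
        else:
            combined[key] = value
    return combined


def group_by_sort(pairs):
    """Group (key, value) pairs by sorting on the key first, then scanning runs."""
    ordered = sorted(pairs, key=itemgetter(0))  # the extra O(n log n) phase
    return {
        key: [value for _, value in run]
        for key, run in groupby(ordered, key=itemgetter(0))
    }


# Hypothetical messages addressed to vertex ids 1..3.
messages = [(3, 1.0), (1, 2.0), (3, 0.5), (2, 4.0), (1, 1.0)]

grouped = group_by_hash(messages)
summed = group_by_hash(messages, combiner=lambda a, b: a + b)
grouped_sorted = group_by_sort(messages)
```

Both approaches produce the same groups; the hash-based one simply skips the sorting phase, which is the point made in the original question.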
