Hello, I have a quick question:
It just recently occurred to me that in Spark, group-by is not shuffle-and-sort but rather "shuffle-and-hash", i.e. there's no sorting phase. Right? In that light, a single Bagel iteration should really cost just about as much as grouping the messages with the regular "group by key" thing. Right? Thank you in advance for the clarification. -Dmitriy
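
P.S. To make concrete what I mean by "shuffle-and-hash", here is a rough sketch in plain Python. It is not Spark's actual code, only my mental model of it: the function names (hash_partition, group_partition, shuffle_and_hash_group) and the sample vertex ids are made up for illustration. The point is that records are routed to a partition by key hash and then grouped in a hash map, with no sort anywhere.

```python
from collections import defaultdict


def hash_partition(pairs, num_partitions):
    """'Shuffle' step: route each (key, value) pair to a partition by key hash."""
    parts = [[] for _ in range(num_partitions)]
    for key, value in pairs:
        parts[hash(key) % num_partitions].append((key, value))
    return parts


def group_partition(partition):
    """'Hash' step: group values per key in a hash map, no sorting involved."""
    groups = defaultdict(list)
    for key, value in partition:
        groups[key].append(value)
    return dict(groups)


def shuffle_and_hash_group(pairs, num_partitions=2):
    """Group by key across all partitions (hypothetical stand-in for group-by-key)."""
    result = {}
    for part in hash_partition(pairs, num_partitions):
        # Each key lands in exactly one partition, so the maps never collide.
        result.update(group_partition(part))
    return result


# Hypothetical messages addressed to vertex ids, as in one Bagel-style superstep.
messages = [("v2", "b"), ("v1", "a"), ("v2", "c"), ("v3", "d"), ("v1", "e")]
grouped = shuffle_and_hash_group(messages)
```

If that model is right, each record costs one hash on the way out and one hash-map insert on the way in, which is why I would expect grouping the messages in an iteration to cost about the same as a plain group by key.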
