On Mon, Dec 29, 2014 at 2:11 PM, Stephen Haberman <stephen.haber...@gmail.com> wrote:
> Yeah...I was trying to poke around, are the Iterables that Spark passes
> into cogroup already materialized (e.g. the bug was making a copy of
> an already-in-memory list) or are the Iterables streaming?

The result of cogroup has values that are Iterable, yes. I don't recall
whether there has been a change to make them actually lazy or not. This
wasn't changed here in any event.

But the result of a flatMap(Values) does not have to produce an
Iterable. If you run a for loop over Iterators instead, you get an
Iterator over the result.

>> I think this may also be a case where Scala's lazy collections (with
>> .view) could be useful?
>
> Probably? Do you mean within user code, or that Spark would pass in an
> already-lazy collection?

Within Spark. I think the effect would be similar, in that elements are
lazily materialized.
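
To illustrate the for-loop point, here is a rough sketch in plain Scala with no Spark involved. The names LazyCogroupSketch, tracked and pairs are made up for this example, and the pulled counter exists only to show when elements are actually read from the source. It assumes the two sides of a cogroup value can be treated as an Iterator and a Seq; it is not how Spark itself is implemented.

```scala
object LazyCogroupSketch {
  // Illustration only: counts how many elements have been pulled from the left side.
  var pulled = 0

  // Hypothetical stand-in for one side of a cogroup value: an Iterator that
  // bumps the counter each time an element is handed out.
  def tracked(xs: Seq[Int]): Iterator[Int] =
    xs.iterator.map { x => pulled += 1; x }

  // A for comprehension over Iterators desugars to Iterator.flatMap/map,
  // so the result is itself an Iterator and no intermediate collection is built.
  def pairs(left: Iterator[Int], right: Seq[Int]): Iterator[(Int, Int)] =
    for (l <- left; r <- right.iterator) yield (l, r)

  def main(args: Array[String]): Unit = {
    val result = pairs(tracked(Seq(1, 2, 3)), Seq(10, 20))
    println(s"pulled before consuming: $pulled")

    // Only the elements needed for the first three pairs are read from the left side.
    val firstThree = result.take(3).toList
    println(s"first three pairs: $firstThree")
    println(s"pulled after take(3): $pulled")
  }
}
```

If the same function is written over Iterables (or ends in .toList), the whole cross product is built in memory before anything downstream sees it, which is the copy the lazy version avoids.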