On Mon, Dec 29, 2014 at 2:11 PM, Stephen Haberman
<stephen.haber...@gmail.com> wrote:
> Yeah...I was trying to poke around, are the Iterables that Spark passes
> into cogroup already materialized (e.g. the bug was making a copy of
> an already-in-memory list) or are the Iterables streaming?

The result of cogroup has values that are Iterable, yes. I don't
recall whether there has been a change to make them actually lazy or
not. This wasn't changed here in any event.

But the function passed to flatMap (or flatMapValues) does not have
to return an Iterable. If you run a for comprehension over Iterators
instead, you get an Iterator over the result, and nothing is copied
into a collection up front.
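
To make the Iterator point concrete, here is a minimal sketch in plain
Scala with no Spark dependency. The object and method names (LazyPairs,
pairs) are made up for illustration; the two arguments stand in for the
two Iterables that cogroup yields for a single key.

```scala
object LazyPairs {
  // Hypothetical helper: pairs every left value with every right value.
  // The for comprehension desugars to Iterator.flatMap and Iterator.map,
  // so the result is an Iterator and pairs are produced only on demand.
  def pairs[A, B](left: Iterable[A], right: Iterable[B]): Iterator[(A, B)] =
    for {
      a <- left.iterator
      b <- right.iterator // a fresh iterator over `right` for each `a`
    } yield (a, b)
}
```

A function like this could plausibly be handed to flatMapValues on a
cogroup result, assuming the Spark version in use accepts an Iterator
there; that part is not checked here.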


>> I think this may also be a case where Scala's lazy collections (with
>> .view) could be useful?
>
> Probably? Do you mean within user code, or that Spark would pass in an
> already-lazy collection?

Within Spark. I think the effect would be similar, in that elements
are lazily materialized.
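
As a rough illustration of what "lazily materialized" means with
.view, the sketch below compares a view-based map with a strict one.
It is plain Scala, not Spark code, and the names (ViewSketch,
firstMapped, firstMappedStrict) are hypothetical.

```scala
object ViewSketch {
  // Maps through a view: f runs only for the elements actually consumed.
  def firstMapped[A, B](xs: Seq[A], n: Int)(f: A => B): List[B] =
    xs.view.map(f).take(n).toList

  // Strict version: f runs for every element before take(n) is applied.
  def firstMappedStrict[A, B](xs: Seq[A], n: Int)(f: A => B): List[B] =
    xs.map(f).take(n).toList
}
```

One caveat: a view does not cache its results, so traversing it a
second time runs the mapping function again.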
