Hey, I saw this commit go by and found it fairly fascinating:

https://github.com/apache/spark/commit/c233ab3d8d75a33495298964fe73dbf7dd8fe305

For two reasons:

1) We have a report that is bogging down in exactly this kind of .join with lots of elements, so I'm glad to see the fix.

2) More interesting, I think: if such a subtle bug was lurking in spark-core, it leaves me worried that every time we use .map in our own cogroup code, we'll be committing the same perf error.

Has anyone thought more deeply about whether this is a big deal or not? Should ".iterator.map" be strongly preferred over ".map" as a best practice in cogroup code?

Thanks,
Stephen
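
P.S. To make the distinction concrete, here is a minimal, Spark-free sketch of the general Scala semantics I'm asking about (my own toy example, not the code from the commit): .map on a strict collection eagerly runs the function for every element and allocates a whole new collection of the same size, while .iterator.map is lazy and transforms elements only as they are pulled, with no intermediate collection.

    // Toy example (not Spark code): strict .map vs. lazy .iterator.map
    object MapVsIteratorMap {
      def main(args: Array[String]): Unit = {
        val values = Vector.fill(3)(1)

        // Strict: the function runs for every element immediately, and a
        // new Vector of the same size is allocated, even though the code
        // below only ever looks at the first element.
        val strict = values.map { v => println(s"strict: $v"); v + 1 }
        println(strict.head)

        // Lazy: nothing runs until the iterator is consumed, and only the
        // elements actually pulled are transformed -- no intermediate
        // collection is built.
        val lazyOnes = values.iterator.map { v => println(s"lazy: $v"); v + 1 }
        println(lazyOnes.next())
      }
    }

Running this prints "strict:" three times but "lazy:" only once, which is the allocation/work difference I'm worried about when the collections are large.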