I take a look at HashOuterJoin and it's building a Hashtable for both sides.
This consumes quite a lot of memory when the partition is big. And it doesn't reduce the iteration on streamed relation, right? Thanks! --------------------------------------------------------------------- To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org