Hi Patrick,

> Just wondering, how many reducers are you using in this shuffle? By
> 7,000 partitions, I'm assuming you mean the map side of the shuffle.
> What about the reduce side?
7,000 on that side as well. We're loading about a month's worth of data into one RDD with ~7,000 partitions and cogrouping it with another RDD that has 50 partitions; the resulting RDD also has 7,000 partitions. (Since we don't have spark.default.parallelism set, the defaultPartitioner logic chooses the max of [50, 7,000] as the next partition count.) I believe that's what you're asking by number of reducers: the number of partitions in the post-cogroup ShuffledRDD?

Also, AFAICT, I don't believe we ever get to the reduce/ShuffledRDD side of this cogroup--it bogs down after 2,000-3,000 ShuffleMapTasks on the map side.

Thanks,
Stephen
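For anyone following along, here's a rough sketch of the partition-count rule described above. This is not Spark's actual code, just an illustration of the behavior when spark.default.parallelism is unset: the cogroup result inherits the largest partition count among its parents, so the 7,000-partition RDD wins over the 50-partition one.

```python
# Hypothetical illustration of Spark's defaultPartitioner behavior when
# spark.default.parallelism is NOT set: the output partition count of a
# cogroup is the max partition count among the parent RDDs.

def cogroup_num_partitions(*parent_partition_counts):
    """Return the partition count the cogrouped RDD would end up with."""
    return max(parent_partition_counts)

# A month's worth of data (~7,000 partitions) cogrouped with a 50-partition RDD:
print(cogroup_num_partitions(7000, 50))  # 7000
```

So setting spark.default.parallelism (or passing an explicit Partitioner to cogroup) is the lever if a smaller reduce side is wanted.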
