I get the exception "Can't zip RDDs with unequal numbers of partitions" when I 
apply any action (reduce, collect) to a dataset created by zipping two datasets 
of 10 million entries each. The problem occurs regardless of the number of 
partitions I specify, and also when I let Spark create the partitions.
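
As I understand it, zip expects both RDDs to have the same number of 
partitions and the same number of elements in each partition. Below is a 
plain-Python sketch of that check, only to make the expectation concrete. It 
is not Spark code and not my actual job; zip_partitioned is a name I made up, 
and each dataset is modelled as a list of partitions.

```python
def zip_partitioned(left, right):
    """Zip two datasets given as lists of partitions (lists of lists).

    Hypothetical helper that models the documented precondition of zip:
    same number of partitions, same number of elements per partition.
    """
    if len(left) != len(right):
        raise ValueError("Can't zip RDDs with unequal numbers of partitions")
    zipped = []
    for index, (a, b) in enumerate(zip(left, right)):
        if len(a) != len(b):
            raise ValueError(
                "partition %d has %d vs %d elements" % (index, len(a), len(b))
            )
        # Pair up elements position by position within the partition.
        zipped.append(list(zip(a, b)))
    return zipped


# Same layout on both sides: zips cleanly.
ok = zip_partitioned([[1, 2], [3]], [["a", "b"], ["c"]])

# Same data, but split into a different number of partitions: rejected.
try:
    zip_partitioned([[1, 2], [3]], [["a", "b", "c"]])
    mismatch_error = None
except ValueError as exc:
    mismatch_error = str(exc)
```

In my case both inputs have the same number of entries, so I would not expect 
the first check to fail.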

Interestingly enough, I have no problem zipping datasets of 1 and 2.5 
million entries.
A similar problem was reported on this board with 0.8, but I don't remember 
whether it was fixed.

Any idea? Any workaround?

I would appreciate any help.
