Re: Issue with zip and partitions

2014-04-02 Thread Xiangrui Meng
From API docs: Zips this RDD with another one, returning key-value pairs with the first element in each RDD, second element in each RDD, etc. Assumes that the two RDDs have the *same number of partitions* and the *same number of elements in each partition* (e.g. one was made through a map on the

Issue with zip and partitions

2014-04-01 Thread Patrick_Nicolas
Dell - Internal Use - Confidential I got an exception can't zip RDDs with unusual numbers of Partitions when I apply any action (reduce, collect) of dataset created by zipping two dataset of 10 million entries each. The problem occurs independently of the number of partitions or when I let