[ 
https://issues.apache.org/jira/browse/SPARK-10112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703229#comment-14703229
 ] 

Sean Owen commented on SPARK-10112:
-----------------------------------

Do the partitions each have the same number of elements? Despite the wording 
of the message, this is also required, IIRC.
Again, more detail about what exactly is in the RDDs would help, such as the 
exact output of printing the number of partitions, and perhaps the contents of 
each partition printed separately with foreachPartition.
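
A minimal sketch of that check, assuming rdd1 and rdd2 are the RDDs from the 
report. The helper names partition_sizes and zip_mismatches are hypothetical, 
not part of PySpark. It uses glom() rather than foreachPartition so that the 
per-partition sizes come back to the driver instead of being printed wherever 
the executors run:

```python
def partition_sizes(rdd):
    """Return the element count of each partition, in partition order."""
    # glom() turns every partition into one list, so len() gives its size.
    return rdd.glom().map(len).collect()


def zip_mismatches(sizes_a, sizes_b):
    """Return the reasons two RDDs look incompatible for zip() (empty if none)."""
    problems = []
    if len(sizes_a) != len(sizes_b):
        problems.append(
            "partition counts differ: %d vs %d" % (len(sizes_a), len(sizes_b))
        )
    # Compare element counts partition by partition (hypothetical helper logic).
    for i, (a, b) in enumerate(zip(sizes_a, sizes_b)):
        if a != b:
            problems.append("partition %d: %d vs %d elements" % (i, a, b))
    return problems


# Usage on a live SparkContext (rdd1 and rdd2 as in the report):
#   print(rdd1.getNumPartitions(), rdd2.getNumPartitions())
#   print(zip_mismatches(partition_sizes(rdd1), partition_sizes(rdd2)))
```

An empty list from zip_mismatches would suggest the partition layout is not 
the cause, and the output itself is the kind of detail that would help here.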

> ValueError: Can only zip with RDD which has the same number of partitions on 
> one machine but not on another
> -----------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-10112
>                 URL: https://issues.apache.org/jira/browse/SPARK-10112
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>         Environment: Ubuntu 14.04.2 LTS
>            Reporter: Abhinav Mishra
>
> I have a piece of code that works fine on one machine, but when I run it 
> on another machine I get the error "ValueError: Can only zip with RDD which 
> has the same number of partitions". My code is:
> rdd2 = sc.parallelize(list1) 
> assert rdd1.getNumPartitions() == rdd2.getNumPartitions()
> rdd3 = rdd1.zip(rdd2).map(lambda ((x1,x2,x3,x4), y): (y,x2, x3, x4))
> list = rdd3.collect()
> My rdd1 has this structure - [(1,2,3),(4,5,6)....]. My rdd2 has this 
> structure - [1,2,3....]
>  
> Both of my RDDs, rdd1 and rdd2, have the same number of elements and the 
> same number of partitions (both have 1 partition). I tried using 
> repartition() as well, but it does not resolve the issue.
> The above code works fine on one machine but throws the error on another. I 
> tried to look for an explanation but couldn't find any specific reason for 
> this behavior. I have Spark 1.3 on the machine where it runs without any 
> error and Spark 1.4 on the machine where the error occurs.



