[ https://issues.apache.org/jira/browse/SPARK-5385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Pedro Rodriguez closed SPARK-5385.
----------------------------------
    Resolution: Fixed

Indeed, not a bug. Fixed by calling textFile first, then passing the resulting RDD's partitions.size to parallelize as the number of slices.

> Calling textFile, parallelize, zip, then partitions causes failure on some local[*]
> -----------------------------------------------------------------------------------
>
>                 Key: SPARK-5385
>                 URL: https://issues.apache.org/jira/browse/SPARK-5385
>             Project: Spark
>          Issue Type: Bug
>            Reporter: Pedro Rodriguez
>
> There is a bug in Spark core which produces the exception: "Can't zip RDDs with unequal numbers of partitions"
> General steps to reproduce:
> 1. Run sc.textFile
> 2. Run sc.parallelize
> 3. Zip the results of the two calls above
> 4. Call partitions on the result of zip
> 5. Run for local, local[2], local[3], ...
> 6. My machine (MacBook Air) fails on local[3].
> GitHub repository with code example: https://github.com/EntilZha/spark-zip-bug
> Steps to run: execute "sbt run", wait for failure
> Stack trace:
> java.lang.IllegalArgumentException: Can't zip RDDs with unequal numbers of partitions
>     at org.apache.spark.rdd.ZippedPartitionsBaseRDD.getPartitions(ZippedPartitionsRDD.scala:57)
>     at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:205)
>     at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:203)
>     at scala.Option.getOrElse(Option.scala:120)
>     at org.apache.spark.rdd.RDD.partitions(RDD.scala:203)
>     at App$.main(App.scala:33)
>     at App.main(App.scala)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:483)
> I am looking into the relevant classes, but insight would be appreciated.
> This ticket may also be related to https://issues.apache.org/jira/browse/SPARK-2823 and https://issues.apache.org/jira/browse/SPARK-5351
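
For readers landing on this thread, the resolution in the closing comment (read the partition count from the textFile RDD and hand it to parallelize) can be sketched roughly as below. This is not the code from the linked repository: the object name, the input path, the app name, and the range of integers are placeholders chosen for illustration. A likely explanation for the local[3] failure, as far as I understand the defaults, is that parallelize falls back to the default parallelism while textFile derives its partition count from the input splits, so the two counts can disagree.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical object name; not taken from the spark-zip-bug repository.
object ZipPartitionsSketch {
  def main(args: Array[String]): Unit = {
    // local[3] is the master setting the report says failed; the app name is arbitrary.
    val conf = new SparkConf().setMaster("local[3]").setAppName("zip-partitions-sketch")
    val sc = new SparkContext(conf)
    try {
      // Placeholder input path: first command-line argument, else "input.txt".
      val path = args.headOption.getOrElse("input.txt")
      val lines = sc.textFile(path)

      // Ask parallelize for exactly as many slices as textFile produced,
      // rather than relying on the default number of slices.
      val numSlices = lines.partitions.size
      val ids = sc.parallelize(0 until 1000, numSlices)

      // With matching partition counts, asking the zipped RDD for its
      // partitions should no longer raise IllegalArgumentException.
      val zipped = lines.zip(ids)
      println(s"zipped partitions: ${zipped.partitions.size}")
    } finally {
      sc.stop()
    }
  }
}
```

Note that this only addresses the partition-count check hit in the stack trace above. RDD.zip also assumes each pair of partitions holds the same number of elements, so running an action on the zipped RDD can still fail unless the data is laid out accordingly.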