[
https://issues.apache.org/jira/browse/SPARK-5973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Joseph K. Bradley updated SPARK-5973:
-------------------------------------
Assignee: Davies Liu
> zip two rdd with AutoBatchedSerializer will fail
> ------------------------------------------------
>
> Key: SPARK-5973
> URL: https://issues.apache.org/jira/browse/SPARK-5973
> Project: Spark
> Issue Type: Bug
> Components: PySpark
> Affects Versions: 1.3.0, 1.2.1
> Reporter: Davies Liu
> Assignee: Davies Liu
> Priority: Blocker
>
> Zipping two RDDs with AutoBatchedSerializer will fail; this bug was introduced by
> SPARK-4841.
> {code}
> >> a.zip(b).count()
> 15/02/24 12:11:56 ERROR PythonRDD: Python worker exited unexpectedly (crashed)
> org.apache.spark.api.python.PythonException: Traceback (most recent call last):
>   File "/Users/davies/work/spark/python/pyspark/worker.py", line 101, in main
>     process()
>   File "/Users/davies/work/spark/python/pyspark/worker.py", line 96, in process
>     serializer.dump_stream(func(split_index, iterator), outfile)
>   File "/Users/davies/work/spark/python/pyspark/rdd.py", line 2249, in pipeline_func
>     return func(split, prev_func(split, iterator))
>   File "/Users/davies/work/spark/python/pyspark/rdd.py", line 2249, in pipeline_func
>     return func(split, prev_func(split, iterator))
>   File "/Users/davies/work/spark/python/pyspark/rdd.py", line 270, in func
>     return f(iterator)
>   File "/Users/davies/work/spark/python/pyspark/rdd.py", line 933, in <lambda>
>     return self.mapPartitions(lambda i: [sum(1 for _ in i)]).sum()
>   File "/Users/davies/work/spark/python/pyspark/rdd.py", line 933, in <genexpr>
>     return self.mapPartitions(lambda i: [sum(1 for _ in i)]).sum()
>   File "/Users/davies/work/spark/python/pyspark/serializers.py", line 306, in load_stream
>     " in pair: (%d, %d)" % (len(keys), len(vals)))
> ValueError: Can not deserialize RDD with different number of items in pair: (123, 64)
> {code}
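> To illustrate the failure mode (this sketch is not part of the original report and is not PySpark code): the pair deserializer used by zip reads one batch of keys and one batch of values at a time and requires them to be the same length. AutoBatchedSerializer sizes batches dynamically per RDD, so the two sides of a zip can end up with different batch boundaries (123 vs 64 in the traceback above). A minimal pure-Python sketch of that length check, with hypothetical batch sizes chosen for illustration:
> {code}
> def batched(items, batch_size):
>     """Group items into lists of at most batch_size, like a batched serializer."""
>     batch = []
>     for item in items:
>         batch.append(item)
>         if len(batch) == batch_size:
>             yield batch
>             batch = []
>     if batch:
>         yield batch
>
> def load_pair_stream(key_batches, val_batches):
>     """Mimic the pair deserializer: corresponding batches must line up 1:1."""
>     for keys, vals in zip(key_batches, val_batches):
>         if len(keys) != len(vals):
>             raise ValueError(
>                 "Can not deserialize RDD with different number of items"
>                 " in pair: (%d, %d)" % (len(keys), len(vals)))
>         for pair in zip(keys, vals):
>             yield pair
>
> # Different batch boundaries on each side reproduce the error above.
> a = batched(range(200), 123)  # hypothetical batch size
> b = batched(range(200), 64)   # hypothetical batch size
> try:
>     list(load_pair_stream(a, b))
> except ValueError as e:
>     print(e)  # ... different number of items in pair: (123, 64)
> {code}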
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)