[ https://issues.apache.org/jira/browse/SPARK-4489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14222262#comment-14222262 ]
Josh Rosen commented on SPARK-4489:
-----------------------------------
This still looks like a legitimate issue. The underlying bug comes from how
the Java API handles ClassTags, compounded by incomplete test coverage for the
Java API. Regarding the ClassTag workaround in the gist, I think you might be
able to use the {{retag()}} method that I added in the fix for SPARK-1040 as a
quick fix. I may be able to take a look at this reproduction later, but I'm
going to leave this unassigned for now, since it would be a great starter task
for someone to pick up.
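
For readers unfamiliar with the failure mode in the quoted trace, the sketch below reproduces the same kind of ClassCastException in plain Java, without Spark. It assumes the cause is an array built with an erased element type (Object) and later cast to a specific array type, as the exception message in the report suggests. The class and method names are hypothetical, and the Class argument only stands in for the role a ClassTag plays in Scala; this is not Spark's code.

```java
import java.lang.reflect.Array;
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical demo class; none of these names come from Spark.
public class ErasedArrayDemo {

    // Copies a list into an array whose runtime element type is chosen by the
    // caller. The Class argument stands in for what a ClassTag supplies in Scala.
    @SuppressWarnings("unchecked")
    static <T> T[] toArray(List<T> items, Class<?> elementClass) {
        T[] out = (T[]) Array.newInstance(elementClass, items.size());
        return items.toArray(out);
    }

    static List<Map.Entry<String, Integer>> samplePairs() {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        pairs.add(new SimpleEntry<>("a", 1));
        pairs.add(new SimpleEntry<>("b", 2));
        return pairs;
    }

    // With an erased element type the backing array is an Object[], so the
    // implicit cast to Map.Entry[] at the assignment throws.
    static boolean erasedArrayCastFails() {
        try {
            Map.Entry<String, Integer>[] arr = toArray(samplePairs(), Object.class);
            return false; // not reached: the assignment above throws
        } catch (ClassCastException e) {
            return true;
        }
    }

    // With the real element class the backing array is a Map.Entry[], so the
    // same assignment succeeds.
    static int typedArrayLength() {
        Map.Entry<String, Integer>[] arr = toArray(samplePairs(), Map.Entry.class);
        return arr.length;
    }

    public static void main(String[] args) {
        System.out.println("erased array cast fails: " + erasedArrayCastFails());
        System.out.println("typed array length: " + typedArrayLength());
    }
}
```

The elements are identical in both calls; only the runtime type of the array differs. That is why supplying the correct element class, which is what retagging is meant to do, would be expected to avoid the cast failure.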
> JavaPairRDD.collectAsMap from checkpoint RDD may fail with ClassCastException
> -----------------------------------------------------------------------------
>
> Key: SPARK-4489
> URL: https://issues.apache.org/jira/browse/SPARK-4489
> Project: Spark
> Issue Type: Bug
> Components: Java API
> Affects Versions: 1.1.0
> Reporter: Christopher Ng
>
> Calling collectAsMap() on a JavaPairRDD reconstructed from a checkpoint fails
> with a ClassCastException:
> Exception in thread "main" java.lang.ClassCastException: [Ljava.lang.Object; cannot be cast to [Lscala.Tuple2;
>     at org.apache.spark.rdd.PairRDDFunctions.collectAsMap(PairRDDFunctions.scala:595)
>     at org.apache.spark.api.java.JavaPairRDD.collectAsMap(JavaPairRDD.scala:569)
>     at org.facboy.spark.CheckpointBug.main(CheckpointBug.java:46)
> Code sample reproducing the issue:
> https://gist.github.com/facboy/8387e950ffb0746a8272