[ https://issues.apache.org/jira/browse/SPARK-2737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14903542#comment-14903542 ]

Glenn Strycker commented on SPARK-2737:
---------------------------------------

I am getting a similar error in Spark 1.3.0; see the new ticket I created:
https://issues.apache.org/jira/browse/SPARK-10762

> ClassCastExceptions when collect()ing JavaRDDs' underlying Scala RDDs
> ---------------------------------------------------------------------
>
>                 Key: SPARK-2737
>                 URL: https://issues.apache.org/jira/browse/SPARK-2737
>             Project: Spark
>          Issue Type: Bug
>          Components: Java API
>    Affects Versions: 0.8.0, 0.9.0, 1.0.0
>            Reporter: Josh Rosen
>            Assignee: Josh Rosen
>             Fix For: 1.1.0
>
>
> The Java API's use of fake ClassTags doesn't seem to cause any problems for 
> Java users, but it can lead to issues when passing JavaRDDs' underlying RDDs 
> to Scala code (e.g. in the MLlib Java API wrapper code).  If we call 
> {{collect()}} on a Scala RDD with an incorrect ClassTag, this causes 
> ClassCastExceptions when we try to allocate an array of the wrong type (for 
> example, see SPARK-2197).
>
> There are a few possible fixes here.  An API-breaking fix would be to 
> completely remove the fake ClassTags and require Java API users to pass 
> {{java.lang.Class}} instances to all {{parallelize()}} calls and add 
> {{returnClass}} fields to all {{Function}} implementations.  This would be 
> extremely verbose.
>
> Instead, I propose that we add internal APIs to "repair" a Scala RDD with an 
> incorrect ClassTag by wrapping it and overriding its ClassTag.  This should 
> be okay for cases where the Scala code that calls {{collect()}} knows what 
> type of array should be allocated, which is the case in the MLlib wrappers.
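
For illustration, here is a minimal Scala sketch of the failure mode described above, independent of Spark: array allocation is driven entirely by the ClassTag, so a fake {{AnyRef}} tag produces an {{Object[]}} at runtime and the cast at the call site fails.

{code:scala}
import scala.reflect.ClassTag

// A fake tag: claims java.lang.Double but actually carries AnyRef, analogous
// to the fake ClassTags the Java API supplies internally.
val fakeTag = ClassTag.AnyRef.asInstanceOf[ClassTag[java.lang.Double]]

// newArray allocates based on the tag's runtime class, so this is an
// Object[] at runtime; the checkcast inserted at the call site (static
// type Array[java.lang.Double]) throws a ClassCastException.
val arr: Array[java.lang.Double] = fakeTag.newArray(3)
{code}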
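And a hedged sketch of the proposed "repair" wrapper: pass the elements through an identity {{mapPartitions}} while supplying the correct ClassTag explicitly, so that {{collect()}} on the wrapped RDD allocates the right array type. The helper name {{retagged}} is illustrative and not necessarily the internal API that was ultimately added.

{code:scala}
import scala.reflect.ClassTag
import org.apache.spark.rdd.RDD

// Rewrap an RDD so the result carries the caller-supplied ClassTag.
// mapPartitions takes an implicit ClassTag for its result type, so an
// identity pass-through (preserving partitioning) is enough to replace
// the fake tag without touching the data.
def retagged[T: ClassTag](rdd: RDD[T]): RDD[T] =
  rdd.mapPartitions(iter => iter, preservesPartitioning = true)

// Hypothetical usage from Scala wrapper code, given some
// javaRDD: JavaRDD[java.lang.Double] built through the Java API:
//   val fixed = retagged[java.lang.Double](javaRDD.rdd)
//   val values = fixed.collect()  // allocates Double[], not Object[]
{code}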


