[ 
https://issues.apache.org/jira/browse/SPARK-1857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14002275#comment-14002275
 ] 

Matei Zaharia commented on SPARK-1857:
--------------------------------------

The problem is that running an action on an RDD from within another RDD 
operation is not currently supported. For example, you couldn't do 
a.map(_ => m.count()) either. The error message could probably be improved. 
I'd also like to see lookup() support being called within an operation in the 
future, but the current architecture does not support it.
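
Until then, the lookup has to be restructured so that no action on m runs 
inside a's closure. The following is a minimal sketch of two ways to do that, 
not a definitive implementation. The object and method names 
(LookupInsideMap, viaBroadcast, viaJoin) and the local[2] master are 
hypothetical and chosen only for illustration. The broadcast variant assumes 
m is small enough to collect on the driver and that every key in a is present 
in m. The join variant is the workaround from the issue description.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.SparkContext._ // pair-RDD implicits; required on older releases such as 0.9
import org.apache.spark.rdd.RDD

// Hypothetical names, used only for illustration.
object LookupInsideMap {

  // Assumes m fits in driver memory: collect it once on the driver,
  // then ship the resulting read-only map to the executors.
  def viaBroadcast(sc: SparkContext, a: RDD[Int], m: RDD[(Int, Int)]): Array[Int] = {
    val table = sc.broadcast(m.collectAsMap())
    // table.value(k) throws NoSuchElementException if k is missing from m.
    a.map(k => table.value(k)).collect()
  }

  // Workaround from the issue description: express the lookup as a join.
  // Keys of a that are missing from m are dropped, and a key that occurs
  // several times in m yields several output rows.
  def viaJoin(a: RDD[Int], m: RDD[(Int, Int)]): Array[Int] =
    a.map((_, 0)).join(m).map(_._2._2).collect()

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("SPARK-1857-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      val a = sc.parallelize(Array(11))
      val m = sc.parallelize(Array((11, 21)))

      // Not supported: lookup() is an action on m, called inside a's map closure.
      // a.map(m.lookup(_)(0)).collect

      println(viaBroadcast(sc, a, m).mkString(","))
      println(viaJoin(a, m).mkString(","))
    } finally {
      sc.stop()
    }
  }
}
```

Both variants keep every action on the driver. The broadcast form avoids a 
shuffle but only suits a small m, while the join form scales with m at the 
cost of a shuffle.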

> map() with lookup() causes exception
> ------------------------------------
>
>                 Key: SPARK-1857
>                 URL: https://issues.apache.org/jira/browse/SPARK-1857
>             Project: Spark
>          Issue Type: Bug
>    Affects Versions: 0.9.0
>            Reporter: Michael Malak
>
> Using map() and lookup() in conjunction throws an exception
> {noformat}
> val a = sc.parallelize(Array(11))
> val m = sc.parallelize(Array((11,21)))
> a.map(m.lookup(_)(0)).collect
> 14/05/14 15:03:35 ERROR Executor: Exception in task ID 23
> scala.MatchError: null
> at org.apache.spark.rdd.PairRDDFunctions.lookup(PairRDDFunctions.scala:551)
> {noformat}
> A workaround is:
> {noformat}
> a.map((_,0)).join(m).map(_._2._2).collect
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)
