[ https://issues.apache.org/jira/browse/SPARK-1857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14002275#comment-14002275 ]
Matei Zaharia commented on SPARK-1857:
--------------------------------------

The problem is that it's not currently supported to run actions on an RDD within another RDD operation. For example you couldn't do a.map(_ => m.count()) either. The error message could probably be improved. I'd also like to see lookup() support being called within an operation in the future but it's not supported by the current architecture.

> map() with lookup() causes exception
> ------------------------------------
>
>                 Key: SPARK-1857
>                 URL: https://issues.apache.org/jira/browse/SPARK-1857
>             Project: Spark
>          Issue Type: Bug
>    Affects Versions: 0.9.0
>            Reporter: Michael Malak
>
> Using map() and lookup() in conjunction throws an exception
> {noformat}
> val a = sc.parallelize(Array(11))
> val m = sc.parallelize(Array((11,21)))
> a.map(m.lookup(_)(0)).collect
> 14/05/14 15:03:35 ERROR Executor: Exception in task ID 23
> scala.MatchError: null
>     at org.apache.spark.rdd.PairRDDFunctions.lookup(PairRDDFunctions.scala:551)
> {noformat}
> A workaround is:
> {noformat}
> a.map((_,0)).join(m).map(_._2._2).collect
> {noformat}
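
Since the restriction is on running an RDD action inside another RDD operation, a second workaround besides the join may be to run the action on m from the driver first and ship the result to the tasks as a broadcast variable. The sketch below is not from the issue. It assumes m is small enough to collect into driver memory and that each key appears only once (collectAsMap keeps a single value per key, whereas lookup returns all of them). The object name LookupWorkaround, the helper run, and the local[2] master are illustrative choices.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.SparkContext._ // implicit pair-RDD functions on older Spark versions

// Hypothetical example object; not part of Spark or of the original report.
object LookupWorkaround {

  // Looks up every element of `a` in `m` without touching an RDD inside the closure.
  def run(sc: SparkContext): Array[Int] = {
    val a = sc.parallelize(Array(11))
    val m = sc.parallelize(Array((11, 21)))

    // The action on m runs here, on the driver, outside any RDD operation.
    // Assumption: m fits in driver memory.
    val table = sc.broadcast(m.collectAsMap())

    // The closure reads a plain Map on the executors instead of the RDD m.
    // table.value(k) throws NoSuchElementException if k is missing from m.
    a.map(k => table.value(k)).collect()
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("LookupWorkaround").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      println(run(sc).mkString(","))
    } finally {
      sc.stop()
    }
  }
}
```

When m is too large to collect, the join-based workaround from the report is probably the better fit, since it keeps both datasets distributed.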