I see. What about using a broadcast variable in place of variable a?
http://spark.apache.org/docs/latest/programming-guide.html#broadcast-variables
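
Something along these lines is what I have in mind. It is only a rough sketch to paste into spark-shell, where `sc` is already defined; the name `aBc` is made up for illustration, and I have not verified that it avoids the StackOverflowError on 1.2.1. Ashish mentions below that broadcasting did not help, so treat it as something to try rather than a confirmed fix.

```scala
// Paste into spark-shell; `sc` (the SparkContext) is provided by the shell.
import scala.collection.mutable.MutableList

// Same data as in the failing example quoted below.
val lst = MutableList[(String, String, Double)]()
Range(0, 10000).foreach(i => lst += (("10", "10", i.toDouble)))

// Wrap the int in a broadcast variable instead of referencing `a` directly.
// `aBc` is a hypothetical name, not something from the original code.
val a = 10
val aBc = sc.broadcast(a)

// Inside the closure, read the value through the broadcast handle.
val rdd = sc.makeRDD(lst).map(_ => if (aBc.value == 10) 1 else 0)

// Nothing runs until an action is called, so trigger one.
rdd.count()
```

If this still overflows, it would suggest the problem lies in what else gets serialized along with the closure, not in the int itself.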

Cheers

On Sun, Aug 30, 2015 at 8:54 AM, Ashish Shrowty <ashish.shro...@gmail.com> wrote:

> @Sean - Agree that there is no action, but I still get the
> stackoverflowerror, its very weird
>
> @Ted - Variable a is just an int - val a = 10 ... The error happens when
> I try to pass a variable into the closure. The example you have above works
> fine since there is no variable being passed into the closure from the
> shell.
>
> -Ashish
>
> On Sun, Aug 30, 2015 at 9:55 AM Ted Yu <yuzhih...@gmail.com> wrote:
>
>> Using Spark shell :
>>
>> scala> import scala.collection.mutable.MutableList
>> import scala.collection.mutable.MutableList
>>
>> scala> val lst = MutableList[(String,String,Double)]()
>> lst: scala.collection.mutable.MutableList[(String, String, Double)] =
>> MutableList()
>>
>> scala> Range(0,10000).foreach(i=>lst+=(("10","10",i:Double)))
>>
>> scala> val rdd=sc.makeRDD(lst).map(i=> if(a==10) 1 else 0)
>> <console>:27: error: not found: value a
>> val rdd=sc.makeRDD(lst).map(i=> if(a==10) 1 else 0)
>>                                    ^
>>
>> scala> val rdd=sc.makeRDD(lst).map(i=> if(i._1==10) 1 else 0)
>> rdd: org.apache.spark.rdd.RDD[Int] = MapPartitionsRDD[1] at map at
>> <console>:27
>>
>> scala> rdd.count()
>> ...
>> 15/08/30 06:53:40 INFO DAGScheduler: Job 0 finished: count at
>> <console>:30, took 0.478350 s
>> res1: Long = 10000
>>
>> Ashish:
>> Please refine your example to mimic more closely what your code actually
>> did.
>>
>> Thanks
>>
>> On Sun, Aug 30, 2015 at 12:24 AM, Sean Owen <so...@cloudera.com> wrote:
>>
>>> That can't cause any error, since there is no action in your first
>>> snippet. Even calling count on the result doesn't cause an error. You
>>> must be executing something different.
>>>
>>> On Sun, Aug 30, 2015 at 4:21 AM, ashrowty <ashish.shro...@gmail.com>
>>> wrote:
>>> > I am running the Spark shell (1.2.1) in local mode and I have a simple
>>> > RDD[(String,String,Double)] with about 10,000 objects in it. I get a
>>> > StackOverFlowError each time I try to run the following code (the code
>>> > itself is just representative of other logic where I need to pass in a
>>> > variable). I tried broadcasting the variable too, but no luck ..
>>> > missing something basic here -
>>> >
>>> > val rdd = sc.makeRDD(List(<Data read from file>)
>>> > val a=10
>>> > rdd.map(r => if (a==10) 1 else 0)
>>> > This throws -
>>> >
>>> > java.lang.StackOverflowError
>>> >     at java.io.ObjectStreamClass.lookup(ObjectStreamClass.java:318)
>>> >     at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1133)
>>> >     at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1547)
>>> >     at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1508)
>>> >     at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1431)
>>> >     at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1177)
>>> >     at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1547)
>>> >     at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1508)
>>> >     at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1431)
>>> >     ...
>>> >     ...
>>> >
>>> > More experiments .. this works -
>>> >
>>> > val lst = Range(0,10000).map(i=>("10","10",i:Double)).toList
>>> > sc.makeRDD(lst).map(i=> if(a==10) 1 else 0)
>>> >
>>> > But below doesn't and throws the StackoverflowError -
>>> >
>>> > val lst = MutableList[(String,String,Double)]()
>>> > Range(0,10000).foreach(i=>lst+=(("10","10",i:Double)))
>>> > sc.makeRDD(lst).map(i=> if(a==10) 1 else 0)
>>> >
>>> > Any help appreciated!
>>> >
>>> > Thanks,
>>> > Ashish
>>> >
>>> > --
>>> > View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-shell-and-StackOverFlowError-tp24508.html
>>> > Sent from the Apache Spark User List mailing list archive at Nabble.com.
>>> >
>>> > ---------------------------------------------------------------------
>>> > To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>>> > For additional commands, e-mail: user-h...@spark.apache.org