I'm not sure how to reproduce it; this code does not produce an error on master.
On Sun, Aug 30, 2015 at 7:26 PM, Ashish Shrowty <ashish.shro...@gmail.com> wrote:
> Do you think I should create a JIRA?
>
> On Sun, Aug 30, 2015 at 12:56 PM Ted Yu <yuzhih...@gmail.com> wrote:
>>
>> I got StackOverflowError as well :-(
>>
>> On Sun, Aug 30, 2015 at 9:47 AM, Ashish Shrowty <ashish.shro...@gmail.com> wrote:
>>>
>>> Yep .. I tried that too earlier. It doesn't make a difference. Are you
>>> able to replicate on your side?
>>>
>>> On Sun, Aug 30, 2015 at 12:08 PM Ted Yu <yuzhih...@gmail.com> wrote:
>>>>
>>>> I see.
>>>>
>>>> What about using the following in place of variable a?
>>>>
>>>> http://spark.apache.org/docs/latest/programming-guide.html#broadcast-variables
>>>>
>>>> Cheers
>>>>
>>>> On Sun, Aug 30, 2015 at 8:54 AM, Ashish Shrowty <ashish.shro...@gmail.com> wrote:
>>>>>
>>>>> @Sean - Agreed that there is no action, but I still get the
>>>>> StackOverflowError; it's very weird.
>>>>>
>>>>> @Ted - Variable a is just an int - val a = 10 ... The error happens
>>>>> when I try to pass a variable into the closure. The example you have
>>>>> above works fine, since there is no variable being passed into the
>>>>> closure from the shell.
>>>>>
>>>>> -Ashish
>>>>>
>>>>> On Sun, Aug 30, 2015 at 9:55 AM Ted Yu <yuzhih...@gmail.com> wrote:
>>>>>>
>>>>>> Using the Spark shell:
>>>>>>
>>>>>> scala> import scala.collection.mutable.MutableList
>>>>>> import scala.collection.mutable.MutableList
>>>>>>
>>>>>> scala> val lst = MutableList[(String,String,Double)]()
>>>>>> lst: scala.collection.mutable.MutableList[(String, String, Double)] = MutableList()
>>>>>>
>>>>>> scala> Range(0,10000).foreach(i=>lst+=(("10","10",i:Double)))
>>>>>>
>>>>>> scala> val rdd=sc.makeRDD(lst).map(i=> if(a==10) 1 else 0)
>>>>>> <console>:27: error: not found: value a
>>>>>>        val rdd=sc.makeRDD(lst).map(i=> if(a==10) 1 else 0)
>>>>>>                                           ^
>>>>>>
>>>>>> scala> val rdd=sc.makeRDD(lst).map(i=> if(i._1==10) 1 else 0)
>>>>>> rdd: org.apache.spark.rdd.RDD[Int] = MapPartitionsRDD[1] at map at <console>:27
>>>>>>
>>>>>> scala> rdd.count()
>>>>>> ...
>>>>>> 15/08/30 06:53:40 INFO DAGScheduler: Job 0 finished: count at <console>:30, took 0.478350 s
>>>>>> res1: Long = 10000
>>>>>>
>>>>>> Ashish:
>>>>>> Please refine your example to mimic more closely what your code
>>>>>> actually did.
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>> On Sun, Aug 30, 2015 at 12:24 AM, Sean Owen <so...@cloudera.com> wrote:
>>>>>>>
>>>>>>> That can't cause any error, since there is no action in your first
>>>>>>> snippet. Even calling count on the result doesn't cause an error.
>>>>>>> You must be executing something different.
>>>>>>>
>>>>>>> On Sun, Aug 30, 2015 at 4:21 AM, ashrowty <ashish.shro...@gmail.com> wrote:
>>>>>>> > I am running the Spark shell (1.2.1) in local mode and I have a
>>>>>>> > simple RDD[(String,String,Double)] with about 10,000 objects in
>>>>>>> > it. I get a StackOverflowError each time I try to run the
>>>>>>> > following code (the code itself is just representative of other
>>>>>>> > logic where I need to pass in a variable). I tried broadcasting
>>>>>>> > the variable too, but no luck ..
>>>>>>> > missing something basic here -
>>>>>>> >
>>>>>>> > val rdd = sc.makeRDD(List(<Data read from file>))
>>>>>>> > val a = 10
>>>>>>> > rdd.map(r => if (a==10) 1 else 0)
>>>>>>> >
>>>>>>> > This throws -
>>>>>>> >
>>>>>>> > java.lang.StackOverflowError
>>>>>>> >   at java.io.ObjectStreamClass.lookup(ObjectStreamClass.java:318)
>>>>>>> >   at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1133)
>>>>>>> >   at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1547)
>>>>>>> >   at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1508)
>>>>>>> >   at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1431)
>>>>>>> >   at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1177)
>>>>>>> >   at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1547)
>>>>>>> >   at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1508)
>>>>>>> >   at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1431)
>>>>>>> >   ...
>>>>>>> >
>>>>>>> > More experiments .. this works -
>>>>>>> >
>>>>>>> > val lst = Range(0,10000).map(i=>("10","10",i:Double)).toList
>>>>>>> > sc.makeRDD(lst).map(i=> if(a==10) 1 else 0)
>>>>>>> >
>>>>>>> > But the version below doesn't, and throws the StackOverflowError -
>>>>>>> >
>>>>>>> > val lst = MutableList[(String,String,Double)]()
>>>>>>> > Range(0,10000).foreach(i=>lst+=(("10","10",i:Double)))
>>>>>>> > sc.makeRDD(lst).map(i=> if(a==10) 1 else 0)
>>>>>>> >
>>>>>>> > Any help appreciated!
>>>>>>> >
>>>>>>> > Thanks,
>>>>>>> > Ashish
>>>>>>> >
>>>>>>> > --
>>>>>>> > View this message in context:
>>>>>>> > http://apache-spark-user-list.1001560.n3.nabble.com/Spark-shell-and-StackOverFlowError-tp24508.html
>>>>>>> > Sent from the Apache Spark User List mailing list archive at Nabble.com.
>>>>>>> >
>>>>>>> > ---------------------------------------------------------------------
>>>>>>> > To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>>>>>>> > For additional commands, e-mail: user-h...@spark.apache.org
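A note on what may be going on in the experiments above (an inference from the trace, not a confirmed diagnosis): scala.collection.mutable.MutableList is a singly linked structure, and once the closure references `a`, the Spark REPL has to serialize the interpreter line objects enclosing it, which also hold `lst`. Plain Java serialization walks a linked list one node at a time, recursing per element, so ~10,000 elements can overflow the stack - which would fit the repeated ObjectOutputStream frames in the trace, and would fit the immutable `toList` variant working, since scala.collection.immutable.List uses custom serialization logic. If that is the cause, building the data as an immutable, indexed collection from the start should sidestep it:

```scala
// Sketch, assuming a running spark-shell with `sc` in scope:
// build the data as an immutable Vector instead of a linked
// MutableList, so nothing deeply linked hangs off the REPL line
// objects that get serialized along with the closure.
val lst = Range(0, 10000).map(i => ("10", "10", i.toDouble)).toVector
val a = 10
val rdd = sc.makeRDD(lst).map(t => if (a == 10) 1 else 0)
rdd.count()
```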
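For completeness, the broadcast-variable approach Ted links to would look roughly like this in the shell (a sketch only; Ashish reports above that broadcasting did not help in his case, which is consistent with the captured `lst`, rather than `a` itself, being what fails to serialize):

```scala
// Sketch, assuming a running spark-shell with `sc` and `lst` in scope:
// ship the value through a broadcast variable and reference only the
// broadcast handle inside the closure.
val aBc = sc.broadcast(10)
val rdd = sc.makeRDD(lst).map(t => if (aBc.value == 10) 1 else 0)
```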