I'm not sure how to reproduce it; this code does not produce an error on master.
On Sun, Aug 30, 2015 at 7:26 PM, Ashish Shrowty <ashish.shro...@gmail.com> wrote:
> Do you think I should create a JIRA?
>
> On Sun, Aug 30, 2015 at 12:56 PM Ted Yu <yuzhih...@gmail.com> wrote:
>>
>> I got StackOverflowError as well :-(
>>
>> On Sun, Aug 30, 2015 at 9:47 AM, Ashish Shrowty <ashish.shro...@gmail.com> wrote:
>>>
>>> Yep .. I tried that too earlier. It doesn't make a difference. Are you
>>> able to replicate on your side?
>>>
>>> On Sun, Aug 30, 2015 at 12:08 PM Ted Yu <yuzhih...@gmail.com> wrote:
>>>>
>>>> I see.
>>>>
>>>> What about using the following in place of variable a?
>>>>
>>>> http://spark.apache.org/docs/latest/programming-guide.html#broadcast-variables
>>>>
>>>> Cheers
>>>>
>>>> On Sun, Aug 30, 2015 at 8:54 AM, Ashish Shrowty <ashish.shro...@gmail.com> wrote:
>>>>>
>>>>> @Sean - Agreed that there is no action, but I still get the
>>>>> StackOverflowError; it's very weird.
>>>>>
>>>>> @Ted - Variable a is just an int - val a = 10 ... The error happens
>>>>> when I try to pass a variable into the closure. The example you have
>>>>> above works fine, since there is no variable being passed into the
>>>>> closure from the shell.
>>>>>
>>>>> -Ashish
>>>>>
>>>>> On Sun, Aug 30, 2015 at 9:55 AM Ted Yu <yuzhih...@gmail.com> wrote:
>>>>>>
>>>>>> Using the Spark shell:
>>>>>>
>>>>>> scala> import scala.collection.mutable.MutableList
>>>>>> import scala.collection.mutable.MutableList
>>>>>>
>>>>>> scala> val lst = MutableList[(String,String,Double)]()
>>>>>> lst: scala.collection.mutable.MutableList[(String, String, Double)] = MutableList()
>>>>>>
>>>>>> scala> Range(0,10000).foreach(i=>lst+=(("10","10",i:Double)))
>>>>>>
>>>>>> scala> val rdd=sc.makeRDD(lst).map(i=> if(a==10) 1 else 0)
>>>>>> <console>:27: error: not found: value a
>>>>>>        val rdd=sc.makeRDD(lst).map(i=> if(a==10) 1 else 0)
>>>>>>                                           ^
>>>>>>
>>>>>> scala> val rdd=sc.makeRDD(lst).map(i=> if(i._1==10) 1 else 0)
>>>>>> rdd: org.apache.spark.rdd.RDD[Int] = MapPartitionsRDD[1] at map at <console>:27
>>>>>>
>>>>>> scala> rdd.count()
>>>>>> ...
>>>>>> 15/08/30 06:53:40 INFO DAGScheduler: Job 0 finished: count at <console>:30, took 0.478350 s
>>>>>> res1: Long = 10000
>>>>>>
>>>>>> Ashish:
>>>>>> Please refine your example to mimic more closely what your code
>>>>>> actually did.
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>> On Sun, Aug 30, 2015 at 12:24 AM, Sean Owen <so...@cloudera.com> wrote:
>>>>>>>
>>>>>>> That can't cause any error, since there is no action in your first
>>>>>>> snippet. Even calling count on the result doesn't cause an error.
>>>>>>> You must be executing something different.
>>>>>>>
>>>>>>> On Sun, Aug 30, 2015 at 4:21 AM, ashrowty <ashish.shro...@gmail.com> wrote:
>>>>>>> > I am running the Spark shell (1.2.1) in local mode and I have a
>>>>>>> > simple RDD[(String,String,Double)] with about 10,000 objects in
>>>>>>> > it. I get a StackOverflowError each time I try to run the
>>>>>>> > following code (the code itself is just representative of other
>>>>>>> > logic where I need to pass in a variable). I tried broadcasting
>>>>>>> > the variable too, but no luck ..
>>>>>>> > missing something basic here -
>>>>>>> >
>>>>>>> > val rdd = sc.makeRDD(List(<Data read from file>))
>>>>>>> > val a = 10
>>>>>>> > rdd.map(r => if (a==10) 1 else 0)
>>>>>>> >
>>>>>>> > This throws -
>>>>>>> >
>>>>>>> > java.lang.StackOverflowError
>>>>>>> >   at java.io.ObjectStreamClass.lookup(ObjectStreamClass.java:318)
>>>>>>> >   at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1133)
>>>>>>> >   at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1547)
>>>>>>> >   at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1508)
>>>>>>> >   at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1431)
>>>>>>> >   at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1177)
>>>>>>> >   at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1547)
>>>>>>> >   at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1508)
>>>>>>> >   at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1431)
>>>>>>> >   ...
>>>>>>> >
>>>>>>> > More experiments .. this works -
>>>>>>> >
>>>>>>> > val lst = Range(0,10000).map(i=>("10","10",i:Double)).toList
>>>>>>> > sc.makeRDD(lst).map(i=> if(a==10) 1 else 0)
>>>>>>> >
>>>>>>> > But the version below doesn't, and throws the StackOverflowError -
>>>>>>> >
>>>>>>> > val lst = MutableList[(String,String,Double)]()
>>>>>>> > Range(0,10000).foreach(i=>lst+=(("10","10",i:Double)))
>>>>>>> > sc.makeRDD(lst).map(i=> if(a==10) 1 else 0)
>>>>>>> >
>>>>>>> > Any help appreciated!
>>>>>>> >
>>>>>>> > Thanks,
>>>>>>> > Ashish
>>>>>>> >
>>>>>>> > --
>>>>>>> > View this message in context:
>>>>>>> > http://apache-spark-user-list.1001560.n3.nabble.com/Spark-shell-and-StackOverFlowError-tp24508.html
>>>>>>> > Sent from the Apache Spark User List mailing list archive at Nabble.com.
>>>>>>> >
>>>>>>> > ---------------------------------------------------------------------
>>>>>>> > To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>>>>>>> > For additional commands, e-mail: user-h...@spark.apache.org
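A note on what may be going on in the experiments above (an inference from the trace, not a confirmed diagnosis): scala.collection.mutable.MutableList is a singly linked structure, and once the closure references `a`, the Spark REPL has to serialize the interpreter line objects enclosing it, which also hold `lst`. Plain Java serialization walks a linked list one node at a time, recursing per element, so ~10,000 elements can overflow the stack - which would fit the repeated ObjectOutputStream frames in the trace, and would fit the immutable `toList` variant working, since scala.collection.immutable.List uses custom serialization logic. If that is the cause, building the data as an immutable, indexed collection from the start should sidestep it:

```scala
// Sketch, assuming a running spark-shell with `sc` in scope:
// build the data as an immutable Vector instead of a linked
// MutableList, so nothing deeply linked hangs off the REPL line
// objects that get serialized along with the closure.
val lst = Range(0, 10000).map(i => ("10", "10", i.toDouble)).toVector
val a = 10
val rdd = sc.makeRDD(lst).map(t => if (a == 10) 1 else 0)
rdd.count()
```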
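For completeness, the broadcast-variable approach Ted links to would look roughly like this in the shell (a sketch only; Ashish reports above that broadcasting did not help in his case, which is consistent with the captured `lst`, rather than `a` itself, being what fails to serialize):

```scala
// Sketch, assuming a running spark-shell with `sc` and `lst` in scope:
// ship the value through a broadcast variable and reference only the
// broadcast handle inside the closure.
val aBc = sc.broadcast(10)
val rdd = sc.makeRDD(lst).map(t => if (aBc.value == 10) 1 else 0)
```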