Thanks Sean... let me get the latest code. Do you know which PR it was?

But will the executors run fine with, say, 32 gb or 64 gb of memory? Doesn't
the JVM start showing issues when the max heap goes beyond a certain limit?

Also, the failure is a GC overhead limit hit coming from jblas... I was
thinking jblas would call native malloc, right? Maybe 64 gb is not a big deal
then... I will try increasing to 32 gb and then 64 gb.
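As far as I can tell (hedging here, since I have not dug into the jblas source), DoubleMatrix keeps its values in a plain Java double[] (its data field) and only hands that array to native BLAS through JNI, so zeros() allocates on the executor heap rather than via native malloc. A rough back-of-the-envelope sketch with purely hypothetical sizes, mirroring the Array.fill / DoubleMatrix.zeros pattern in the trace below:

import org.jblas.DoubleMatrix

object JblasHeapSketch {
  def main(args: Array[String]): Unit = {
    val rank = 50               // hypothetical ALS rank
    val rowsInBlock = 500000    // hypothetical number of rows in one user/product block

    // Each zeros(rank) builds a DoubleMatrix backed by a new double[rank] on the heap,
    // so the whole block of factors lives inside the executor JVM.
    val factors = Array.fill(rowsInBlock)(DoubleMatrix.zeros(rank))

    val dataBytes = rowsInBlock.toLong * rank * 8L
    println(s"~${dataBytes >> 20} MB of raw double[] data on the heap, plus " +
      s"per-object overhead for ${factors.length} DoubleMatrix instances")
  }
}

If that reading is right, then raising spark.executor.memory should directly cover these allocations.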

java.lang.OutOfMemoryError (java.lang.OutOfMemoryError: GC overhead limit
exceeded)
org.jblas.DoubleMatrix.<init>(DoubleMatrix.java:323)
org.jblas.DoubleMatrix.zeros(DoubleMatrix.java:471)
org.jblas.DoubleMatrix.zeros(DoubleMatrix.java:476)
com.verizon.bigdata.mllib.recommendation.ALSQR$$anonfun$17.apply(ALSQR.scala:366)
com.verizon.bigdata.mllib.recommendation.ALSQR$$anonfun$17.apply(ALSQR.scala:366)
scala.Array$.fill(Array.scala:267)
com.verizon.bigdata.mllib.recommendation.ALSQR.updateBlock(ALSQR.scala:366)
com.verizon.bigdata.mllib.recommendation.ALSQR$$anonfun$com$verizon$bigdata$mllib$recommendation$ALSQR$$updateFeatures$2.apply(ALSQR.scala:346)
com.verizon.bigdata.mllib.recommendation.ALSQR$$anonfun$com$verizon$bigdata$mllib$recommendation$ALSQR$$updateFeatures$2.apply(ALSQR.scala:345)
org.apache.spark.rdd.MappedValuesRDD$$anonfun$compute$1.apply(MappedValuesRDD.scala:32)
org.apache.spark.rdd.MappedValuesRDD$$anonfun$compute$1.apply(MappedValuesRDD.scala:32)
scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
org.apache.spark.rdd.CoGroupedRDD$$anonfun$compute$4.apply(CoGroupedRDD.scala:149)
org.apache.spark.rdd.CoGroupedRDD$$anonfun$compute$4.apply(CoGroupedRDD.scala:147)
scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
org.apache.spark.rdd.CoGroupedRDD.compute(CoGroupedRDD.scala:147)
org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:242)
org.apache.spark.rdd.RDD.iterator(RDD.scala:233)
org.apache.spark.rdd.MappedValuesRDD.compute(MappedValuesRDD.scala:32)
org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:242)
org.apache.spark.rdd.RDD.iterator(RDD.scala:233)
org.apache.spark.rdd.FlatMappedValuesRDD.compute(FlatMappedValuesRDD.scala:32)
org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:242)
org.apache.spark.rdd.RDD.iterator(RDD.scala:233)
org.apache.spark.rdd.FlatMappedRDD.compute(FlatMappedRDD.scala:33)
org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:242)
org.apache.spark.rdd.RDD.iterator(RDD.scala:233)
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:161)
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:102)
org.apache.spark.scheduler.Task.run(Task.scala:53)
org.apache.spark.executor.Executor$TaskRunner$$anonfun$run$1.apply$mcV$sp(Executor.scala:213)
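In terms of actually bumping the memory, here is roughly what I plan to try. This is a sketch only, assuming the 0.9 SparkConf API and a standalone cluster; the master URL, app name, and memory values are placeholders:

import org.apache.spark.{SparkConf, SparkContext}

object AlsMemorySketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setMaster("spark://master:7077")      // hypothetical master URL
      .setAppName("ALSQR")
      .set("spark.executor.memory", "32g")   // per-executor heap; try 32g, then 64g

    // For more executors per node in standalone mode, my understanding is that
    // conf/spark-env.sh on each worker is the place to do it, e.g.
    //   SPARK_WORKER_INSTANCES=6
    //   SPARK_WORKER_MEMORY=16g
    // which would give 6 x 16 gb executor JVMs on a 96 gb node instead of one big heap.

    val sc = new SparkContext(conf)
    // ... build the ratings RDD and run ALSQR as before ...
    sc.stop()
  }
}

Several smaller heaps per node should also keep GC pauses more manageable than a single 96 gb JVM.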



On Sun, Mar 16, 2014 at 11:42 AM, Sean Owen <so...@cloudera.com> wrote:

> Are you using HEAD or 0.9.0? I know there was a memory issue fixed a few
> weeks ago that made ALS use a lot more memory than it needed.
>
> https://github.com/apache/incubator-spark/pull/629
>
> Try the latest code.
>
> --
> Sean Owen | Director, Data Science | London
>
>
> On Sun, Mar 16, 2014 at 11:40 AM, Debasish Das
> <debasish.da...@gmail.com> wrote:
>
>> Hi,
>>
>> I gave my Spark job 16 gb of memory and it is running on 8 executors.
>>
>> The job needs more memory due to ALS requirements (a 20M x 1M matrix).
>>
>> On each node I have 96 gb of memory and I am using only 16 gb of it. I
>> want to increase the memory, but I am not sure of the right way to do
>> that...
>>
>> With 8 executors, if I give 96 gb each it might be an issue due to GC...
>>
>> Ideally, on 8 nodes I would run 48 executors (6 per node), each getting
>> 16 gb of memory... 48 JVMs in total...
>>
>> Is it possible to increase the number of executors per node?
>>
>> Thanks.
>> Deb
>>
>
>
