On 27 Feb 2014, at 07:22, Aaron Davidson <[email protected]> wrote:

> Setting spark.executor.memory is indeed the correct way to do this. If you 
> want to configure this in spark-env.sh, you can use
> export SPARK_JAVA_OPTS=" -Dspark.executor.memory=20g"
> (make sure to append the variable if you've been using SPARK_JAVA_OPTS 
> previously)
> 
> 
> On Wed, Feb 26, 2014 at 7:50 PM, Bryn Keller <[email protected]> wrote:
> Hi Mohit,
> 
> You can still set SPARK_MEM in spark-env.sh, but that is deprecated. This is 
> from SparkContext.scala:
> 
> if (!conf.contains("spark.executor.memory") && sys.env.contains("SPARK_MEM")) 
> {
>     logWarning("Using SPARK_MEM to set amount of memory to use per executor 
> process is " +
>       "deprecated, instead use spark.executor.memory")
>   }
> 
> Thanks,
> Bryn
> 
> 
> On Wed, Feb 26, 2014 at 6:28 PM, Mohit Singh <[email protected]> wrote:
> Hi Bryn,
>   Thanks for responding. Is there a way I can permanently configure this 
> setting?
> like SPARK_EXECUTOR_MEMORY or somethign like that?
> 
> 
> 
> On Wed, Feb 26, 2014 at 2:56 PM, Bryn Keller <[email protected]> wrote:
> Hi Mohit,
> 
> Try increasing the executor memory instead of the worker memory - the most 
> appropriate place to do this is actually when you're creating your 
> SparkContext, something like:
> 
> conf = pyspark.SparkConf()
>                        .setMaster("spark://master:7077")
>                        .setAppName("Example")
>                        .setSparkHome("/your/path/to/spark")
>                        .set("spark.executor.memory", "20G")
>                        .set("spark.logConf", "true")
> sc = pyspark.SparkConf(conf = conf)
> 
> Hope that helps,
> Bryn
> 
> 
> 
> On Wed, Feb 26, 2014 at 2:39 PM, Mohit Singh <[email protected]> wrote:
> Hi,
>   I am experimenting with pyspark lately...
> Every now and then, I see this error bieng streamed to pyspark shell .. and 
> most of the times.. the computation/operation completes.. and sometimes, it 
> just gets stuck...
> My setup is 8 node cluster.. with loads of ram(256GB's) and space( TB's) per 
> node.
> This enviornment is shared by general hadoop and hadoopy stuff..with recent 
> spark addition...
> 
> java.lang.OutOfMemoryError: Java heap space
>     at 
> com.ning.compress.BufferRecycler.allocEncodingBuffer(BufferRecycler.java:59)
>     at com.ning.compress.lzf.ChunkEncoder.<init>(ChunkEncoder.java:93)
>     at 
> com.ning.compress.lzf.impl.UnsafeChunkEncoder.<init>(UnsafeChunkEncoder.java:40)
>     at 
> com.ning.compress.lzf.impl.UnsafeChunkEncoderLE.<init>(UnsafeChunkEncoderLE.java:13)
>     at 
> com.ning.compress.lzf.impl.UnsafeChunkEncoders.createEncoder(UnsafeChunkEncoders.java:31)
>     at 
> com.ning.compress.lzf.util.ChunkEncoderFactory.optimalInstance(ChunkEncoderFactory.java:44)
>     at com.ning.compress.lzf.LZFOutputStream.<init>(LZFOutputStream.java:61)
>     at 
> org.apache.spark.io.LZFCompressionCodec.compressedOutputStream(CompressionCodec.scala:60)
>     at 
> org.apache.spark.storage.BlockManager.wrapForCompression(BlockManager.scala:803)
>     at 
> org.apache.spark.storage.BlockManager$$anonfun$5.apply(BlockManager.scala:471)
>     at 
> org.apache.spark.storage.BlockManager$$anonfun$5.apply(BlockManager.scala:471)
>     at 
> org.apache.spark.storage.DiskBlockObjectWriter.open(BlockObjectWriter.scala:117)
>     at 
> org.apache.spark.storage.DiskBlockObjectWriter.write(BlockObjectWriter.scala:174)
>     at 
> org.apache.spark.scheduler.ShuffleMapTask$$anonfun$runTask$1.apply(ShuffleMapTask.scala:164)
>     at 
> org.apache.spark.scheduler.ShuffleMapTask$$anonfun$runTask$1.apply(ShuffleMapTask.scala:161)
>     at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>     at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>     at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:161)
>     at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:102)
>     at org.apache.spark.scheduler.Task.run(Task.scala:53)
>     at 
> org.apache.spark.executor.Executor$TaskRunner$$anonfun$run$1.apply$mcV$sp(Executor.scala:213)
>     at 
> org.apache.spark.deploy.SparkHadoopUtil.runAsUser(SparkHadoopUtil.scala:49)
>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:178)
>     at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:744)
> 
> 
> 
> Most of the settings in spark are default.. So i was wondering if maybe, 
> there is some configuration that needs to happen?
> There is this one config I have addded to spark_env file
> SPARK_WORKER_MEMORY=20g
> 
> Also, I see tons of these errors as well..
> 14/02/26 14:33:17 INFO TaskSetManager: Loss was due to 
> java.lang.OutOfMemoryError: Java heap space [duplicate 1]
> 14/02/26 14:33:17 INFO TaskSetManager: Starting task 996.0:278 as TID 1792 on 
> executor 9: node02 (PROCESS_LOCAL)
> 14/02/26 14:33:17 INFO TaskSetManager: Serialized task 996.0:278 as 4070 
> bytes in 0 ms
> 14/02/26 14:33:17 WARN TaskSetManager: Lost TID 1488 (task 996.0:184)
> 14/02/26 14:33:17 INFO TaskSetManager: Loss was due to 
> java.lang.OutOfMemoryError: Java heap space [duplicate 2]
> 14/02/26 14:33:17 INFO TaskSetManager: Starting task 996.0:247 as TID 1793 on 
> executor 9: node02 (PROCESS_LOCAL)
> 14/02/26 14:33:17 INFO TaskSetManager: Serialized task 996.0:247 as 4070 
> bytes in 0 ms
> 14/02/26 14:33:17 WARN TaskSetManager: Lost TID 1484 (task 996.0:82)
> 14/02/26 14:33:17 INFO TaskSetManager: Loss was due to 
> java.lang.OutOfMemoryError: Java heap space [duplicate 3]
> 14/02/26 14:33:17 INFO TaskSetManager: Starting task 996.0:116 as TID 1794 on 
> executor 9: node02 (PROCESS_LOCAL)
> 14/02/26 14:33:17 INFO TaskSetManager: Serialized task 996.0:116 as 4070 
> bytes in 1 ms
> 14/02/26 14:33:17 WARN TaskSetManager: Lost TID 1475 (task 996.0:157)
> 14/02/26 14:33:17 INFO TaskSetManager: Loss was due to 
> java.lang.OutOfMemoryError: Java heap space [duplicate 4]
> 14/02/26 14:33:17 INFO TaskSetManager: Starting task 996.0:98 as TID 1795 on 
> executor 9: node02 (PROCESS_LOCAL)
> 14/02/26 14:33:17 INFO TaskSetManager: Serialized task 996.0:98 as 4070 bytes 
> in 1 ms
> 14/02/26 14:33:17 WARN TaskSetManager: Lost TID 1492 (task 996.0:17)
> 
> 
> and then...
> 
> 14/02/26 14:33:20 WARN TaskSetManager: Lost TID 1649 (task 996.0:115)
> 14/02/26 14:33:20 WARN TaskSetManager: Lost TID 1666 (task 996.0:32)
> 14/02/26 14:33:20 WARN TaskSetManager: Lost TID 1675 (task 996.0:160)
> 14/02/26 14:33:20 WARN TaskSetManager: Lost TID 1657 (task 996.0:349)
> 14/02/26 14:33:20 WARN TaskSetManager: Lost TID 1660 (task 996.0:141)
> 14/02/26 14:33:20 WARN TaskSetManager: Lost TID 1651 (task 996.0:55)
> 14/02/26 14:33:20 WARN TaskSetManager: Lost TID 1669 (task 996.0:126)
> 14/02/26 14:33:20 WARN TaskSetManager: Lost TID 1678 (task 996.0:173)
> 14/02/26 14:33:20 WARN TaskSetManager: Lost TID 1663 (task 996.0:128)
> 14/02/26 14:33:20 WARN TaskSetManager: Lost TID 1672 (task 996.0:28)
> 14/02/26 14:33:20 WARN TaskSetManager: Lost TID 1654 (task 996.0:96)
> 14/02/26 14:33:20 WARN TaskSetManager: Lost TID 1699 (task 996.0:294)
> 14/02/26 14:33:20 INFO DAGScheduler: Executor lost: 12 (epoch 16)
> 14/02/26 14:33:20 INFO BlockManagerMasterActor: Trying to remove executor 12 
> from BlockManagerMaster.
> 14/02/26 14:33:20 INFO BlockManagerMaster: Removed 12 successfully in 
> removeExecutor
> 14/02/26 14:33:20 INFO Stage: Stage 996 is now unavailable on executor 12 
> (0/379, false)
> 
> 
> which looks like warnings..
> 
> 
> The code I tried to run was:
> subs_count = complex_key.map( lambda x: (x[0],int(x[1])).reduceByKey(lambda 
> a,b:a+b))
> subs_count.take(20)
> 
> Thanks
> 
> -- 
> Mohit
> 
> "When you want success as badly as you want the air, then you will get it. 
> There is no other secret of success."
> -Socrates
> 
> 
> 
> 
> -- 
> Mohit
> 
> "When you want success as badly as you want the air, then you will get it. 
> There is no other secret of success."
> -Socrates
> 
> 

Reply via email to