http://spark.apache.org/docs/latest/tuning.html#level-of-parallelism
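The parallelism advice in the linked tuning guide can be sketched roughly as follows. This is a sketch only, assuming Spark 1.x on the classpath; the HDFS path and the partition count of 200 are illustrative values, not from the thread:

```scala
// Sketch: ask for more partitions up front so each task holds a smaller
// slice of the ~5 GB input. (Path and count are hypothetical examples.)
import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf().setMaster("local[2]").setAppName("ApproxStrMatch")
val sc = new SparkContext(conf)

// Second argument is the minimum number of partitions; without it the
// partition count follows the HDFS block layout of the file.
val lines = sc.textFile("hdfs:///data/pairs.tsv", 200)

// Or spread an existing RDD over more partitions before a wide operation:
val spread = lines.repartition(200)
```

More, smaller partitions mean each task serializes and buffers less data at once, which is exactly where the OOM below is being thrown.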
On Fri, Aug 1, 2014 at 1:29 PM, Haiyang Fu <haiyangfu...@gmail.com> wrote:
> Hi,
> Here are two tips for you:
> 1. Increase the parallelism level.
> 2. Increase the driver memory.
>
> On Fri, Aug 1, 2014 at 12:58 AM, Sameer Tilak <ssti...@live.com> wrote:
>
>> Hi everyone,
>>
>> I have the following configuration. I am currently running my app in
>> local mode.
>>
>> val conf = new SparkConf()
>>   .setMaster("local[2]")
>>   .setAppName("ApproxStrMatch")
>>   .set("spark.executor.memory", "3g")
>>   .set("spark.storage.memoryFraction", "0.1")
>>
>> I am getting the following error. I tried setting spark.executor.memory
>> and the memory fraction, but my UI does not show the increase and I
>> still get these errors. I am loading a TSV file from HDFS (around 5 GB).
>> Does this mean I should update these settings and add more memory, or is
>> it something else? The Spark master has 24 GB of physical memory and the
>> workers have 16 GB, but we are running other services (CDH 5.1) on these
>> nodes as well.
>>
>> 14/07/31 09:48:09 INFO BlockFetcherIterator$BasicBlockFetcherIterator: Getting 2 non-empty blocks out of 2 blocks
>> 14/07/31 09:48:09 INFO BlockFetcherIterator$BasicBlockFetcherIterator: Getting 2 non-empty blocks out of 2 blocks
>> 14/07/31 09:48:09 INFO BlockFetcherIterator$BasicBlockFetcherIterator: Started 0 remote fetches in 6 ms
>> 14/07/31 09:48:09 INFO BlockFetcherIterator$BasicBlockFetcherIterator: Started 0 remote fetches in 6 ms
>> 14/07/31 09:48:09 INFO BlockFetcherIterator$BasicBlockFetcherIterator: maxBytesInFlight: 50331648, targetRequestSize: 10066329
>> 14/07/31 09:48:09 INFO BlockFetcherIterator$BasicBlockFetcherIterator: maxBytesInFlight: 50331648, targetRequestSize: 10066329
>> 14/07/31 09:48:09 INFO BlockFetcherIterator$BasicBlockFetcherIterator: Getting 2 non-empty blocks out of 2 blocks
>> 14/07/31 09:48:09 INFO BlockFetcherIterator$BasicBlockFetcherIterator: Getting 2 non-empty blocks out of 2 blocks
>> 14/07/31 09:48:09 INFO BlockFetcherIterator$BasicBlockFetcherIterator: Started 0 remote fetches in 1 ms
>> 14/07/31 09:48:09 INFO BlockFetcherIterator$BasicBlockFetcherIterator: Started 0 remote fetches in 1 ms
>> 14/07/31 09:48:17 ERROR Executor: Exception in task ID 5
>> java.lang.OutOfMemoryError: Java heap space
>>     at java.util.Arrays.copyOf(Arrays.java:2271)
>>     at java.io.ByteArrayOutputStream.toByteArray(ByteArrayOutputStream.java:178)
>>     at org.apache.spark.serializer.JavaSerializerInstance.serialize(JavaSerializer.scala:73)
>>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:197)
>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>     at java.lang.Thread.run(Thread.java:744)
>> 14/07/31 09:48:17 ERROR ExecutorUncaughtExceptionHandler: Uncaught exception in thread Thread[Executor task launch worker-3,5,main]
>> java.lang.OutOfMemoryError: Java heap space
>>     at java.util.Arrays.copyOf(Arrays.java:2271)
>>     at java.io.ByteArrayOutputStream.toByteArray(ByteArrayOutputStream.java:178)
>>     at org.apache.spark.serializer.JavaSerializerInstance.serialize(JavaSerializer.scala:73)
>>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:197)
>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>     at java.lang.Thread.run(Thread.java:744)
>> 14/07/31 09:48:17 WARN TaskSetManager: Lost TID 5 (task 1.0:0)
>> 14/07/31 09:48:17 WARN TaskSetManager: Loss was due to java.lang.OutOfMemoryError
>> java.lang.OutOfMemoryError: Java heap space
>>     at java.util.Arrays.copyOf(Arrays.java:2271)
>>     at java.io.ByteArrayOutputStream.toByteArray(ByteArrayOutputStream.java:178)
>>     at org.apache.spark.serializer.JavaSerializerInstance.serialize(JavaSerializer.scala:73)
>>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:197)
>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>     at java.lang.Thread.run(Thread.java:744)
>> 14/07/31 09:48:17 ERROR TaskSetManager: Task 1.0:0 failed 1 times; aborting job
>> 14/07/31 09:48:17 INFO TaskSchedulerImpl: Cancelling stage 1
>> 14/07/31 09:48:17 INFO DAGScheduler: Failed to run collect at ComputeScores.scala:76
>> 14/07/31 09:48:17 INFO Executor: Executor is trying to kill task 6
>> 14/07/31 09:48:17 INFO TaskSchedulerImpl: Stage 1 was cancelled
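One likely reason the UI showed no change: in local mode the executor runs inside the driver JVM, so `spark.executor.memory` set on the `SparkConf` after that JVM has already started has no effect. The heap has to be sized at launch time, e.g. via spark-submit (a sketch; the jar name and class are hypothetical, and 6g is an illustrative size):

```shell
# Sketch: give the single local-mode JVM a larger heap at launch time.
spark-submit --driver-memory 6g --master "local[2]" \
  --class ApproxStrMatch approxstrmatch.jar
```

As a general tip not raised in the thread: the stack trace shows the OOM inside `JavaSerializerInstance.serialize`, and switching to Kryo (`conf.set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")`) often reduces serialization memory and size considerably.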