http://spark.apache.org/docs/latest/tuning.html#level-of-parallelism
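
The stack trace shows the heap filling up while a task result is being
serialized (JavaSerializerInstance.serialize inside
Executor$TaskRunner.run), so with only two partitions each task is
buffering a very large result in one go. More, smaller partitions keep
those per-task buffers small. A rough sketch of tip 1 below (untested;
the HDFS path and the partition count of 64 are placeholders -- a
common rule of thumb is 2-4 tasks per core):

  import org.apache.spark.{SparkConf, SparkContext}

  val conf = new SparkConf()
    .setMaster("local[2]")
    .setAppName("ApproxStrMatch")
    .set("spark.default.parallelism", "64") // default partition count for shuffles
  val sc = new SparkContext(conf)

  // Ask for more splits when loading, or repartition afterwards,
  // so each task serializes a smaller slice of the 5 GB TSV:
  val lines = sc.textFile("hdfs:///path/to/data.tsv", 64)
  // or: sc.textFile("hdfs:///path/to/data.tsv").repartition(64)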



On Fri, Aug 1, 2014 at 1:29 PM, Haiyang Fu <haiyangfu...@gmail.com> wrote:

> Hi,
> here are two tips for you:
> 1. increase the parallelism level
> 2. increase the driver memory
>
>
> On Fri, Aug 1, 2014 at 12:58 AM, Sameer Tilak <ssti...@live.com> wrote:
>
>> Hi everyone,
>> I have the following configuration. I am currently running my app in
>> local mode.
>>
>>   val conf = new SparkConf()
>>     .setMaster("local[2]")
>>     .setAppName("ApproxStrMatch")
>>     .set("spark.executor.memory", "3g")
>>     .set("spark.storage.memoryFraction", "0.1")
>>
>> I am getting the following error. I tried setting spark.executor.memory
>> and the storage memory fraction, but my UI does not show the increase and
>> I still get these errors. I am loading a TSV file from HDFS (around 5 GB).
>> Does this mean I should update these settings and add more memory, or is
>> it something else? The Spark master has 24 GB of physical memory and the
>> workers have 16 GB, but we are running other services (CDH 5.1) on these
>> nodes as well.
>>
>> 14/07/31 09:48:09 INFO BlockFetcherIterator$BasicBlockFetcherIterator: Getting 2 non-empty blocks out of 2 blocks
>> 14/07/31 09:48:09 INFO BlockFetcherIterator$BasicBlockFetcherIterator: Getting 2 non-empty blocks out of 2 blocks
>> 14/07/31 09:48:09 INFO BlockFetcherIterator$BasicBlockFetcherIterator: Started 0 remote fetches in 6 ms
>> 14/07/31 09:48:09 INFO BlockFetcherIterator$BasicBlockFetcherIterator: Started 0 remote fetches in 6 ms
>> 14/07/31 09:48:09 INFO BlockFetcherIterator$BasicBlockFetcherIterator: maxBytesInFlight: 50331648, targetRequestSize: 10066329
>> 14/07/31 09:48:09 INFO BlockFetcherIterator$BasicBlockFetcherIterator: maxBytesInFlight: 50331648, targetRequestSize: 10066329
>> 14/07/31 09:48:09 INFO BlockFetcherIterator$BasicBlockFetcherIterator: Getting 2 non-empty blocks out of 2 blocks
>> 14/07/31 09:48:09 INFO BlockFetcherIterator$BasicBlockFetcherIterator: Getting 2 non-empty blocks out of 2 blocks
>> 14/07/31 09:48:09 INFO BlockFetcherIterator$BasicBlockFetcherIterator: Started 0 remote fetches in 1 ms
>> 14/07/31 09:48:09 INFO BlockFetcherIterator$BasicBlockFetcherIterator: Started 0 remote fetches in 1 ms
>> 14/07/31 09:48:17 ERROR Executor: Exception in task ID 5
>> java.lang.OutOfMemoryError: Java heap space
>>   at java.util.Arrays.copyOf(Arrays.java:2271)
>>   at java.io.ByteArrayOutputStream.toByteArray(ByteArrayOutputStream.java:178)
>>   at org.apache.spark.serializer.JavaSerializerInstance.serialize(JavaSerializer.scala:73)
>>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:197)
>>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>   at java.lang.Thread.run(Thread.java:744)
>> 14/07/31 09:48:17 ERROR ExecutorUncaughtExceptionHandler: Uncaught exception in thread Thread[Executor task launch worker-3,5,main]
>> java.lang.OutOfMemoryError: Java heap space
>>   at java.util.Arrays.copyOf(Arrays.java:2271)
>>   at java.io.ByteArrayOutputStream.toByteArray(ByteArrayOutputStream.java:178)
>>   at org.apache.spark.serializer.JavaSerializerInstance.serialize(JavaSerializer.scala:73)
>>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:197)
>>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>   at java.lang.Thread.run(Thread.java:744)
>> 14/07/31 09:48:17 WARN TaskSetManager: Lost TID 5 (task 1.0:0)
>> 14/07/31 09:48:17 WARN TaskSetManager: Loss was due to java.lang.OutOfMemoryError
>> java.lang.OutOfMemoryError: Java heap space
>>   at java.util.Arrays.copyOf(Arrays.java:2271)
>>   at java.io.ByteArrayOutputStream.toByteArray(ByteArrayOutputStream.java:178)
>>   at org.apache.spark.serializer.JavaSerializerInstance.serialize(JavaSerializer.scala:73)
>>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:197)
>>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>   at java.lang.Thread.run(Thread.java:744)
>> 14/07/31 09:48:17 ERROR TaskSetManager: Task 1.0:0 failed 1 times; aborting job
>> 14/07/31 09:48:17 INFO TaskSchedulerImpl: Cancelling stage 1
>> 14/07/31 09:48:17 INFO DAGScheduler: Failed to run collect at ComputeScores.scala:76
>> 14/07/31 09:48:17 INFO Executor: Executor is trying to kill task 6
>> 14/07/31 09:48:17 INFO TaskSchedulerImpl: Stage 1 was cancelled
>>
>
>
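
One more note on tip 2: in local mode there are no separate executor
JVMs -- tasks run inside the driver process -- so spark.executor.memory
set in SparkConf has no effect, which is presumably why the UI never
shows the increase. The driver heap is fixed when the JVM starts, so
raise it at launch time (e.g. spark-submit --driver-memory 4g, or
-Xmx4g if you start the JVM yourself) rather than from inside the
program. And since the failing stage is the collect at
ComputeScores.scala:76, collecting anything close to the full 5 GB to
the driver will OOM no matter how much you tune; prefer saveAsTextFile,
or take(n) for a sample, if the full result isn't needed on the driver.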
