Hi Sean, thx for the tip.  I'm just running my app via spark-submit on the CLI,
i.e. >spark-submit --class X --master local[*] assembly.jar, so I'll now pass
the setting as a CLI arg instead, i.e.: spark-submit --class X --master local[*]
--driver-memory 8g assembly.jar
etc.
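
(For reference, I believe a spark.driver.memory line in conf/spark-defaults.conf
works as well, since spark-submit reads that file before launching the JVM:

spark.driver.memory    8g

but --driver-memory is the documented option for client/local mode.)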

Unless I have this wrong?
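
I'll also add a quick sanity check at the top of the app to confirm the setting
actually took. Just a sketch, assuming the SparkSession is already built and
bound to a val named spark:

// Max heap the driver JVM will actually use; should be close to 8 GB if
// --driver-memory took effect, and near the ~1 GB default if it didn't.
val maxHeapGb = Runtime.getRuntime.maxMemory.toDouble / (1024 * 1024 * 1024)
println(f"Driver max heap: $maxHeapGb%.1f GB")
// What Spark thinks the setting is (may be unset if it never got through).
println("spark.driver.memory = " + spark.conf.get("spark.driver.memory", "<unset>"))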

Thx


On Thu, Jul 1, 2021 at 1:43 PM Sean Owen <sro...@gmail.com> wrote:

> You need to set driver memory before the driver starts, on the CLI or
> however you run your app, not in the app itself. By the time the driver
> starts to run your app, its heap is already set.
>
> On Thu, Jul 1, 2021 at 12:10 AM javaguy Java <javagu...@gmail.com> wrote:
>
>> Hi,
>>
>> I'm getting Java OOM errors even though I'm setting my driver memory to
>> 24g and I'm executing against local[*].
>>
>> I was wondering if anyone can give me any insight. The server this job
>> is running on has more than enough memory, as does the Spark driver.
>>
>> The final result only writes 3 CSV files of about 300MB each, so there's
>> no way it's coming close to the 24g.
>>
>> From the OOM alone, I don't know enough about Spark's internals to tell
>> where it's failing or how I should refactor or change anything.
>>
>> Would appreciate any advice on how I can resolve this.
>>
>> Thx
>>
>>
>> Parameters here:
>>
>> val spark = SparkSession
>>   .builder
>>   .master("local[*]")
>>   .appName("OOM")
>>   .config("spark.driver.host", "localhost")
>>   .config("spark.driver.maxResultSize", "0")
>>   .config("spark.sql.caseSensitive", "false")
>>   .config("spark.sql.adaptive.enabled", "true")
>>   .config("spark.sql.adaptive.coalescePartitions.enabled", "true")
>>   .config("spark.driver.memory", "24g")
>>   .getOrCreate()
>>
>>
>> My OOM errors are below:
>>
>> driver): java.lang.OutOfMemoryError: Java heap space
>>     at java.io.BufferedOutputStream.<init>(BufferedOutputStream.java:76)
>>     at org.apache.spark.storage.DiskBlockObjectWriter$ManualCloseBufferedOutputStream$1.<init>(DiskBlockObjectWriter.scala:109)
>>     at org.apache.spark.storage.DiskBlockObjectWriter.initialize(DiskBlockObjectWriter.scala:110)
>>     at org.apache.spark.storage.DiskBlockObjectWriter.open(DiskBlockObjectWriter.scala:118)
>>     at org.apache.spark.storage.DiskBlockObjectWriter.write(DiskBlockObjectWriter.scala:245)
>>     at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:158)
>>     at org.apache.spark.shuffle.ShuffleWriteProcessor.write(ShuffleWriteProcessor.scala:59)
>>     at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99)
>>     at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:52)
>>     at org.apache.spark.scheduler.Task.run(Task.scala:127)
>>     at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:446)
>>     at org.apache.spark.executor.Executor$TaskRunner$$Lambda$1792/1058609963.apply(Unknown Source)
>>     at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1377)
>>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:449)
>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>>     at java.lang.Thread.run(Thread.java:748)
>>
>>
>> driver): java.lang.OutOfMemoryError: Java heap space
>>     at net.jpountz.lz4.LZ4BlockOutputStream.<init>(LZ4BlockOutputStream.java:102)
>>     at org.apache.spark.io.LZ4CompressionCodec.compressedOutputStream(CompressionCodec.scala:145)
>>     at org.apache.spark.serializer.SerializerManager.wrapForCompression(SerializerManager.scala:158)
>>     at org.apache.spark.serializer.SerializerManager.wrapStream(SerializerManager.scala:133)
>>     at org.apache.spark.storage.DiskBlockObjectWriter.open(DiskBlockObjectWriter.scala:122)
>>     at org.apache.spark.storage.DiskBlockObjectWriter.write(DiskBlockObjectWriter.scala:245)
>>     at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:158)
>>     at org.apache.spark.shuffle.ShuffleWriteProcessor.write(ShuffleWriteProcessor.scala:59)
>>     at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99)
>>     at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:52)
>>     at org.apache.spark.scheduler.Task.run(Task.scala:127)
>>     at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:446)
>>     at org.apache.spark.executor.Executor$TaskRunner$$Lambda$1792/249605067.apply(Unknown Source)
>>     at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1377)
>>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:449)
>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>>     at java.lang.Thread.run(Thread.java:748)
>>