Re: Will it lead to OOM error?

2022-06-22 Thread Sid
> ...a total of 100 TB RAM and 100 TB disk. So if I do something like this: spark.read.option("header","true").csv(filepath).show(false) Will it lead to an OOM error since it doesn't have enough memory, or will it spill the data onto disk and process it? Thanks, Sid

Re: Will it lead to OOM error?

2022-06-22 Thread Yong Walt
> ...csv(filepath).show(false) Will it lead to an OOM error since it doesn't have enough memory, or will it spill the data onto disk and process it? Thanks, Sid

Re: Will it lead to OOM error?

2022-06-22 Thread Enrico Minack
...might lead to an OOM error? Thanks, Sid. On Wed, Jun 22, 2022 at 6:40 PM Enrico Minack wrote: The RAM and disk memory consumption depends on what you do with the data after reading it. Your particular action will read 20 lines from the first partition and show them. So it will not ...
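To make Enrico's point concrete, here is a minimal sketch with a hypothetical file path: show() materializes only a handful of rows from the first partition(s), while a full action such as count() has to scan every partition of the input.

```scala
import org.apache.spark.sql.SparkSession

object ShowVsFullScan {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("show-vs-full-scan")
      .getOrCreate()

    // Hypothetical path; the file in the thread is a 150 TB CSV.
    val df = spark.read.option("header", "true").csv("/data/huge.csv")

    // Reads only enough partitions to produce 20 rows (usually just the first),
    // so memory pressure stays minimal even for a huge input.
    df.show(false)

    // By contrast, count() scans every partition of the file; this is where
    // executor memory and spill-to-disk behaviour start to matter.
    println(df.count())

    spark.stop()
  }
}
```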

Re: Will it lead to OOM error?

2022-06-22 Thread Sid
Hi Enrico, Thanks for the insights. Could you please help me understand, with one example, the case of compressed files where the file wouldn't be split into partitions, would put the load on a single partition, and might lead to an OOM error? Thanks, Sid. On Wed, Jun 22, 2022 at 6:40 PM Enrico Minack ...
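A sketch of the scenario being asked about, under the assumption of a gzip-compressed CSV: gzip is not a splittable codec, so the whole file lands in a single partition and one task has to decompress and process it, which is where the memory pressure comes from. The path and partition count below are illustrative only.

```scala
import org.apache.spark.sql.SparkSession

object GzipSinglePartition {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("gzip-single-partition").getOrCreate()

    // gzip is not splittable, so this DataFrame starts life as a single
    // partition, no matter how large the file is.
    val df = spark.read.option("header", "true").csv("/data/huge.csv.gz")
    println(s"partitions after read: ${df.rdd.getNumPartitions}") // typically 1

    // Redistributing after the read spreads subsequent work across the cluster,
    // but the initial decompression is still done by one task.
    val spread = df.repartition(200)
    println(s"partitions after repartition: ${spread.rdd.getNumPartitions}")

    spark.stop()
  }
}
```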

Re: Will it lead to OOM error?

2022-06-22 Thread Enrico Minack
> ...option("header","true").csv(filepath).show(false) Will it lead to an OOM error since it doesn't have enough memory, or will it spill the data onto disk and process it? Thanks, Sid -- Thanks Deepak www.bigdatabig.com www.keosha.net

Re: Will it lead to OOM error?

2022-06-22 Thread Deepak Sharma
> ...csv(filepath).show(false) Will it lead to an OOM error since it doesn't have enough memory, or will it spill the data onto disk and process it? Thanks, Sid -- Thanks Deepak www.bigdatabig.com www.keosha.net

Will it lead to OOM error?

2022-06-22 Thread Sid
I have a 150 TB CSV file. I have a total of 100 TB RAM and 100 TB disk. So if I do something like this: spark.read.option("header","true").csv(filepath).show(false) Will it lead to an OOM error since it doesn't have enough memory, or will it spill the data onto disk and process it? Thanks, Sid

Re: OOM Error

2019-09-07 Thread Ankit Khettry
... Ankit Khettry. On Sat, 7 Sep 2019, 6:52 AM Upasana Sharma <028upasana...@gmail.com> wrote: > Is it a streaming job? ...

Re: OOM Error

2019-09-07 Thread Sunil Kalra
> I have a Spark job that consists of a large number of Window operations and hence involves large shuffles. I have roughly 900 GiBs of ...

Re: OOM Error

2019-09-07 Thread Chris Teoh
> I have a Spark job that consists of a large number of Window operations and hence involves large shuffles. I have roughly 900 GiBs of data, although I am using a large enough cluster ...

Re: OOM Error

2019-09-07 Thread Chris Teoh
> ...have tried various other combinations without any success. spark.yarn.driver.memoryOverhead 6g, spark.storage.memoryFraction 0.1, spark.executor.cores 6, spark.executor.memory 36g ...

Re: OOM Error

2019-09-07 Thread Ankit Khettry
> ...success. spark.yarn.driver.memoryOverhead 6g, spark.storage.memoryFraction 0.1, spark.executor.cores 6, spark.executor.memory 36g, spark.memory.offHeap.size 8g ...

Re: OOM Error

2019-09-07 Thread Chris Teoh
> ...spark.executor.cores 6, spark.executor.memory 36g, spark.memory.offHeap.size 8g, spark.memory.offHeap.enabled true, spark.executor.instances 10, spark.driver.memory 14g, spark.yarn.executor.memoryOverhead 10g ...

Re: OOM Error

2019-09-06 Thread Ankit Khettry
> ...spark.executor.instances 10, spark.driver.memory 14g, spark.yarn.executor.memoryOverhead 10g. I keep running into the following OOM error: org.apache.spark.memory.SparkOutOfMemoryError: Unable to acquire 16384 bytes of memory, got 0 at org.ap...

Re: OOM Error

2019-09-06 Thread Upasana Sharma
> spark.memory.offHeap.size 8g, spark.memory.offHeap.enabled true, spark.executor.instances 10, spark.driver.memory 14g, spark.yarn.executor.memoryOverhead 10g. I keep running into the following OOM error: org.apache.spark.memory.SparkOutOfMemoryError...

OOM Error

2019-09-06 Thread Ankit Khettry
spark.yarn.executor.memoryOverhead 10g. I keep running into the following OOM error: org.apache.spark.memory.SparkOutOfMemoryError: Unable to acquire 16384 bytes of memory, got 0 at org.apache.spark.memory.MemoryConsumer.throwOom(MemoryConsumer.java:157) at org.apache.spark.memory.MemoryConsumer.allocateArray...
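For reference, here is one way the settings quoted throughout this thread could be expressed in code; this is a sketch, not the poster's actual application. The driver and spark.yarn.* values are left out of the SparkConf because they normally need to be supplied at submit time (the driver JVM is already running by the time application code executes), and spark.storage.memoryFraction is a legacy knob superseded by unified memory management from Spark 1.6 onward.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

object ThreadConfigSketch {
  def main(args: Array[String]): Unit = {
    // Values are the ones quoted in the thread; whether they are appropriate
    // depends on the cluster. Driver memory and YARN overhead settings belong
    // on the spark-submit command line rather than here.
    val conf = new SparkConf()
      .set("spark.executor.cores", "6")
      .set("spark.executor.memory", "36g")
      .set("spark.executor.instances", "10")
      .set("spark.memory.offHeap.enabled", "true")
      .set("spark.memory.offHeap.size", "8g")
      .set("spark.yarn.executor.memoryOverhead", "10g")

    val spark = SparkSession.builder().config(conf).getOrCreate()
    // ... job with many Window operations and large shuffles ...
    spark.stop()
  }
}
```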

Re: Spark 2.0.0 OOM error at beginning of RDD map on AWS

2016-08-24 Thread Arun Luthra
...these maps into a two-level Map, i.e. Map[String, Map[String, Int]]? Or would this still count against me? What if I manually split them up into numerous Map variables? On Mon, Aug 15, 2016 at 2:12 PM, Arun Luthra wrote: > I got t...
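A small sketch of the restructuring being discussed, with made-up keys and values: whether the lookup table is one flat Map keyed on a composite key, a nested Map[String, Map[String, Int]], or several separate Map variables, the entries still have to fit on the heap of the JVM that holds them, so the total footprint is roughly the same.

```scala
object MapLayoutSketch {
  def main(args: Array[String]): Unit = {
    // Flat layout: one map keyed on a composite "outer|inner" key.
    val flat: Map[String, Int] = Map(
      "us|web" -> 10,
      "us|mobile" -> 20,
      "eu|web" -> 5
    )

    // Two-level layout: the same data held as Map[String, Map[String, Int]].
    val nested: Map[String, Map[String, Int]] = flat
      .groupBy { case (key, _) => key.split('|')(0) }
      .map { case (outer, entries) =>
        outer -> entries.map { case (key, value) => key.split('|')(1) -> value }
      }

    println(flat("us|web"))      // 10
    println(nested("us")("web")) // 10, same data, same overall heap usage
  }
}
```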

Re: Spark 2.0.0 OOM error at beginning of RDD map on AWS

2016-08-23 Thread Arun Luthra
...variables? On Mon, Aug 15, 2016 at 2:12 PM, Arun Luthra wrote: > I got this OOM error in Spark local mode. The error seems to have been at the start of a stage (all of the stages on the UI showed as complete; there were more stages to do but they had not sho...

Re: Spark 2.0.0 OOM error at beginning of RDD map on AWS

2016-08-18 Thread Arun Luthra
...against me? What if I manually split them up into numerous Map variables? On Mon, Aug 15, 2016 at 2:12 PM, Arun Luthra wrote: > I got this OOM error in Spark local mode. The error seems to have been at the start of a stage (all of the stages on the UI showed as complete; there were ...

Spark 2.0.0 OOM error at beginning of RDD map on AWS

2016-08-15 Thread Arun Luthra
I got this OOM error in Spark local mode. The error seems to have been at the start of a stage (all of the stages on the UI showed as complete; there were more stages to do but they had not shown up on the UI yet). There appears to be ~100G of free memory at the time of the error. Spark 2.0.0, 200G ...
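Not a claim about what fixed this particular job, but one property of local mode that often matters in this situation: all tasks run inside the single driver JVM, so free machine memory is irrelevant if the driver heap itself is small, and the heap size has to be fixed before that JVM starts. A minimal sketch, with an illustrative heap size:

```scala
import org.apache.spark.sql.SparkSession

object LocalModeHeapSketch {
  def main(args: Array[String]): Unit = {
    // In local mode, tasks execute inside this one JVM, so the relevant limit
    // is the driver heap, not the machine's free memory. Setting
    // spark.driver.memory here would be too late (the JVM is already running);
    // it must be passed at launch, e.g.:
    //   spark-submit --driver-memory 100g --master "local[*]" ...
    val spark = SparkSession.builder()
      .appName("local-mode-heap-sketch")
      .master("local[*]")
      .getOrCreate()

    println(s"driver max heap: ${Runtime.getRuntime.maxMemory() / (1024 * 1024)} MiB")
    spark.stop()
  }
}
```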

OOM error in Spark worker

2015-10-01 Thread varun sharma
My workers are going OOM over time. I am running a streaming job on Spark 1.4.0. Here is the heap dump of the workers: 16,802 instances of "org.apache.spark.deploy.worker.ExecutorRunner", loaded by "sun.misc.Launcher$AppClassLoader @ 0xdff94088", occupy 488,249,688 (95.80%) bytes. These instances ...

OOM error in Spark worker

2015-09-29 Thread varun sharma
... from a Kafka topic and are pending scheduling because of a delay in processing... If I force-kill the streaming job, will I lose the data that has not yet been scheduled? Please help ASAP. -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/OOM-error-in-Spark-worker

Re: OOM error with GMMs on 4GB dataset

2015-05-05 Thread Xiangrui Meng
> ...("spark.default.parallelism", "300").set("spark.serializer", "org.apache.spark.serializer.KryoSerializer").set("spark.kryoserializer.buffer.mb", "500").set("spark.akka.frameSize", "256").set("spark.akka.timeout", "300") Howev...

OOM error with GMMs on 4GB dataset

2015-05-04 Thread Vinay Muttineni
...).set("spark.akka.frameSize", "256").set("spark.akka.timeout", "300") However, at the aggregate step (Line 168) val sums = breezeData.aggregate(ExpectationSum.zero(k, d))(compute.value, _ += _) I get an OOM error and the application hangs indefinitely ...
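For context, a sketch of the kind of job the post describes, using MLlib's GaussianMixture with the serializer settings quoted above (the spark.akka.* keys and the *.buffer.mb form belong to the Spark 1.x era). The input path and k are illustrative. The EM iterations inside run() perform the per-partition ExpectationSum aggregation mentioned in the post, which is where the OOM surfaced.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.clustering.GaussianMixture
import org.apache.spark.mllib.linalg.Vectors

object GmmTrainingSketch {
  def main(args: Array[String]): Unit = {
    // Settings quoted in the thread (Spark 1.x era).
    val conf = new SparkConf()
      .setAppName("gmm-training-sketch")
      .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
      .set("spark.kryoserializer.buffer.mb", "500")
      .set("spark.akka.frameSize", "256")
      .set("spark.akka.timeout", "300")
    val sc = new SparkContext(conf)

    // Hypothetical input: one space-separated feature vector per line.
    val data = sc.textFile("/data/features.txt")
      .map(line => Vectors.dense(line.trim.split(' ').map(_.toDouble)))
      .cache()

    // run() drives the EM loop; each iteration aggregates per-partition
    // expectation sums across the cluster.
    val model = new GaussianMixture().setK(10).run(data)
    model.gaussians.zipWithIndex.foreach { case (g, i) =>
      println(s"component $i: mu=${g.mu}")
    }

    sc.stop()
  }
}
```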

Re: OOM error

2015-02-17 Thread Harshvardhan Chauhan
Thanks for the pointer; it led me to http://spark.apache.org/docs/1.2.0/tuning.html, and increasing parallelism resolved the issue. On Mon, Feb 16, 2015 at 11:57 PM, Akhil Das wrote: > Increase your executor memory. You can also play around with increasing the number of partitions/parallelism, etc.
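For readers landing here later, a sketch of what "increasing parallelism" usually means in practice, following the tuning guide linked above; the property value, path, and partition count are illustrative and not taken from the original application.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object ParallelismSketch {
  def main(args: Array[String]): Unit = {
    // Raise the default number of tasks used for shuffles and parallel
    // operations; a common rule of thumb is 2-3 tasks per CPU core.
    val conf = new SparkConf()
      .setAppName("parallelism-sketch")
      .set("spark.default.parallelism", "200")
    val sc = new SparkContext(conf)

    val lines = sc.textFile("/data/events.log")

    // Individual datasets can also request more partitions explicitly, which
    // keeps each task's working set small enough to avoid OOM.
    val rebalanced = lines.repartition(200)
    println(rebalanced.getNumPartitions)

    sc.stop()
  }
}
```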

Re: OOM error

2015-02-16 Thread Akhil Das
Increase your executor memory. You can also play around with increasing the number of partitions/parallelism, etc. Thanks, Best Regards. On Tue, Feb 17, 2015 at 3:39 AM, Harshvardhan Chauhan wrote: > Hi All, I need some help with Out Of Memory errors in my application. I am using Spark 1.1...

OOM error

2015-02-16 Thread Harshvardhan Chauhan
Hi All, I need some help with Out Of Memory errors in my application. I am using Spark 1.1.0 and my application uses the Java API. I am running my app on 25 EC2 m3.xlarge (4 cores, 15 GB memory) instances. The app only fails sometimes; lots of mapToPair tasks are failing. My app is configured to ru...