Saurav,
We have the same issue. Our application runs fine on 32 nodes with 4 cores
each and 256 partitions but gives an OOM on the driver when run on 64 nodes
with 512 partitions. Did you find out the reason behind this behavior, or
the relation between the number of partitions and driver RAM usage?
cache() defaults to MEMORY_ONLY. Can you try different storage levels,
i.e., MEMORY_ONLY_SER or even DISK_ONLY? You may want to use persist()
instead of cache().
Or there is an experimental storage level OFF_HEAP which might also help.
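A minimal sketch of what this looks like (assuming an existing RDD named `rdd`; the variable names are illustrative):

```scala
import org.apache.spark.storage.StorageLevel

// cache() is shorthand for persist(StorageLevel.MEMORY_ONLY)
// MEMORY_ONLY_SER stores serialized objects: slower to access, but more compact
val serCached = rdd.persist(StorageLevel.MEMORY_ONLY_SER)

// DISK_ONLY avoids heap pressure entirely, at the cost of disk I/O
// val diskCached = rdd.persist(StorageLevel.DISK_ONLY)
```

Note that a persisted RDD cannot change its storage level without unpersist() first.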
On Tue, Jul 19, 2016 at 11:08 PM, Saurav Sinha
wrote:
Hi,
I have set driver memory to 10 GB and the job ran with an intermediate
failure, which Spark recovered from.
But I still want to know: if the number of partitions increases, does the
driver RAM need to be increased, and what is the ratio of number of
partitions to driver RAM?
@RK: I am using cache on the RDD. Is this the reason for the high RAM
utilization?
Just want to see if this helps.
Are you doing heavy collects and persisting that result? If so, you might
want to parallelize that collection by converting it to an RDD.
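For illustration, a sketch of handing an already-collected array back to the cluster instead of processing it on the driver (the names `sc`, `collected`, and the no-op processing are placeholders):

```scala
import org.apache.spark.SparkContext

// Heavy pattern: results.collect() pulls every element into driver memory
// and can OOM the driver. If the data must be processed further, spread it
// back across the executors:
def redistribute(sc: SparkContext, collected: Seq[String]): Unit = {
  val rdd = sc.parallelize(collected)          // distribute across executors
  rdd.foreachPartition(_.foreach(record => ())) // work runs on executors, not the driver
}
```

Better still is to avoid the collect() entirely and keep the whole pipeline as RDD transformations.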
Thanks,
RK
On Tue, Jul 19, 2016 at 12:09 AM, Saurav Sinha
wrote:
Hi Mich,
1. In what mode are you running Spark: standalone, yarn-client, yarn
cluster, etc.?
Ans: Spark standalone.
2. You have 4 nodes with each executor having 10G. How many actual
executors do you see in the UI (port 4040 by default)?
Ans: There are 4 executors, on which I am using 8 cores.
Can you please clarify:
1. In what mode are you running Spark: standalone, yarn-client, yarn
cluster, etc.?
2. You have 4 nodes with each executor having 10G. How many actual
executors do you see in the UI (port 4040 by default)?
3. What is master memory? Are you referring to driver memory?
I have set --driver-memory 5g. I need to understand whether driver memory
needs to be increased as the number of partitions increases, and what the
best ratio of number of partitions to driver memory would be.
On Mon, Jul 18, 2016 at 4:07 PM, Zhiliang Zhu wrote:
Try to set --driver-memory xg, with x as large as can be set.
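As a concrete example, driver memory is set on the spark-submit command line; the master URL, class name, and jar below are placeholders:

```shell
spark-submit \
  --master spark://master-host:7077 \
  --driver-memory 10g \
  --executor-memory 10g \
  --class com.example.WriteToKafka \
  app.jar
```

The same setting can also be given as --conf spark.driver.memory=10g; it cannot be set from inside the application after the driver JVM has started.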
On Monday, July 18, 2016 6:31 PM, Saurav Sinha
wrote:
Hi,
I am running a Spark job.
Master memory: 5G
Executor memory: 10G (running on 4 nodes)
My job is getting killed as the number of partitions increases to 20K.
16/07/18 14:53:13 INFO DAGScheduler: Got job 17 (foreachPartition at
WriteToKafka.java:45) with 13524 output partitions (allowLocal=false)
16/07/18