Hi,
In one of the applications we built, which involved no caching at all, we
set spark.storage.memoryFraction very low, and yes, that did give us a
performance benefit.
Regarding the broadcast issue, you should also look at the data you are
trying to broadcast; sometimes building that data structure on the
executors themselves, as a lazily initialized singleton, helps.
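A minimal sketch of that per-executor singleton pattern, in plain Python so it runs anywhere (the names and the loader function are illustrative, not from this thread): a module-level object is built lazily on first use in each executor process, instead of being shipped from the driver as a broadcast.

```python
# Sketch: build a large lookup structure lazily, once per executor process,
# instead of broadcasting it from the driver. The loader below is a
# stand-in for reading the real data (e.g. from shared storage).

_lookup = None  # module-level cache; one instance per executor process


def _build_lookup():
    # Hypothetical loader; replace with the real data source.
    return {i: i * i for i in range(1000)}


def get_lookup():
    """Return the singleton, building it on first call in this process."""
    global _lookup
    if _lookup is None:
        _lookup = _build_lookup()
    return _lookup


def enrich(record):
    # Intended for use inside a map/mapPartitions task: the first task in
    # each executor pays the build cost, later tasks reuse the singleton.
    return record, get_lookup().get(record)
```

Used as something like rdd.map(enrich), the structure never goes through the broadcast path, so it never has to be deserialized in one piece onto the heap.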
Thanks,
On Tue, Apr 14, 2015 at 12:23 PM, Akhil Das ak...@sigmoidanalytics.com
wrote:
You could try leaving all the configuration values at their defaults and
running your application to see if you still hit the heap issue. If so, try
adding swap space to the machines, which should help. Another way would be
to set the heap size manually (export _JAVA_OPTIONS=-Xmx5g).
Thanks
Best Regards
On Wed, Apr 8, 2015 at 12:45 AM, Shuai Zheng szheng.c...@gmail.com
wrote:
Hi All,
I am a bit confused about spark.storage.memoryFraction. This setting
reserves an area of the heap for RDD storage; does that mean only cached
and persisted RDDs? If my program caches no RDDs at all (i.e., I never call
.cache() or .persist() on any RDD), can I set spark.storage.memoryFraction
to a very small number, or even zero?
I am writing a program that consumes a lot of memory (broadcast values,
runtime data, etc.), but it has no cached RDDs, so should I just set
spark.storage.memoryFraction to 0 (which might help me improve
performance)?
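For context, the arithmetic behind that question can be sketched as follows, using the documented Spark 1.x (legacy memory model) defaults: spark.storage.memoryFraction is 0.6, and a safety fraction of 0.9 is applied on top of it, so only the remainder of the heap is left for everything else (broadcast deserialization, task execution, user objects).

```python
# Sketch of the Spark 1.x executor heap split, to show why lowering
# spark.storage.memoryFraction frees heap for non-cached workloads.
# 0.6 and 0.9 are the documented Spark 1.x defaults.

def storage_pool(heap_bytes, memory_fraction=0.6, safety_fraction=0.9):
    """Bytes reserved for cached/persisted RDD blocks."""
    return heap_bytes * memory_fraction * safety_fraction

heap = 4 * 1024**3  # a hypothetical 4 GB executor heap

default_pool = storage_pool(heap)                       # ~2.16 GB reserved
reduced_pool = storage_pool(heap, memory_fraction=0.1)  # ~0.36 GB reserved

print(f"default storage pool: {default_pool / 1024**3:.2f} GB")
print(f"reduced storage pool: {reduced_pool / 1024**3:.2f} GB")
```

With no cached RDDs, shrinking the storage pool from roughly 2.16 GB to 0.36 GB in this example leaves about 1.8 GB more of the heap available for broadcast values and task execution.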
And I have another issue with broadcast: when I try to read a broadcast
value, it throws an out-of-memory error. Which part of memory should I
allocate more to (given that I can't increase my overall memory size)?
java.lang.OutOfMemoryError: Java heap space
        at com.esotericsoftware.kryo.serializers.DefaultArraySerializers$DoubleArraySerializer.read(DefaultArraySerializers.java:218)
        at com.esotericsoftware.kryo.serializers.DefaultArraySerializers$DoubleArraySerializer.read(DefaultArraySerializers.java:200)
        at com.esotericsoftware.kryo.Kryo.readObjectOrNull(Kryo.java:699)
        at com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.read(FieldSerializer.java:611)
        at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:221)
        at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:648)
        at com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.read(FieldSerializer.java:605)
        at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:221)
        at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:729)
        at org.apache.spark.serializer.KryoDeserializationStream.readObject(KryoSerializer.scala:138)
        at org.apache.spark.serializer.DeserializationStream$$anon$1.getNext(Serializer.scala:133)
        at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:71)
        at org.apache.spark.storage.MemoryStore.unrollSafely(MemoryStore.scala:248)
        at org.apache.spark.storage.MemoryStore.putIterator(MemoryStore.scala:136)
        at org.apache.spark.storage.BlockManager.doGetLocal(BlockManager.scala:549)
        at org.apache.spark.storage.BlockManager.getLocal(BlockManager.scala:431)
        at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$readBroadcastBlock$1.apply(TorrentBroadcast.scala:167)
        at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1152)
        at org.apache.spark.broadcast.TorrentBroadcast.readBroadcastBlock(TorrentBroadcast.scala:164)
        at org.apache.spark.broadcast.TorrentBroadcast._value$lzycompute(TorrentBroadcast.scala:64)
        at org.apache.spark.broadcast.TorrentBroadcast._value(TorrentBroadcast.scala:64)
        at org.apache.spark.broadcast.TorrentBroadcast.getValue(TorrentBroadcast.scala:87)
Regards,
Shuai