Hi,

*In a reduce operation I am trying to accumulate a list of SparseVectors.
The code is given below:*
                val WNode = trainingData.reduce{ (node1: Node, node2: Node) =>
                  val wNode = new Node(num1, num2)
                  wNode.WList ++= node1.WList
                  wNode.WList ++= node2.WList
                  wNode
                }
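For anyone trying to reproduce the shape of the computation locally, here is a
self-contained sketch of the same merge step. A plain case class stands in for
my Node and Array[Double] stands in for SparseVector; the field names and the
two constructor arguments are stand-ins, not the real types:

```scala
import scala.collection.mutable.ListBuffer

// Stand-in for the real Node class: holds a growable list of vectors.
case class Node(a: Int, b: Int) {
  val WList: ListBuffer[Array[Double]] = ListBuffer.empty
}

// The merge function used in reduce: it allocates a fresh Node and
// copies BOTH input lists into it, so each merge carries the combined
// data of its two inputs in a single task result.
def merge(node1: Node, node2: Node): Node = {
  val wNode = Node(node1.a, node2.b)
  wNode.WList ++= node1.WList
  wNode.WList ++= node2.WList
  wNode
}

val n1 = Node(0, 0); n1.WList += Array(1.0, 2.0)
val n2 = Node(0, 0); n2.WList += Array(3.0, 4.0)
val out = Seq(n1, n2).reduce(merge)
println(out.WList.size) // 2
```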

where WList is a list of SparseVectors. The average size of a
SparseVector is 21000, and the final list at the end of the reduce
operation holds roughly 20 to 100 elements.
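As a rough sanity check on the raw data volume (a back-of-the-envelope sketch;
the 12 bytes per entry, a 4-byte Int index plus an 8-byte Double value, is my
assumption and ignores JVM object and serialization overhead):

```scala
val entriesPerVector = 21000L  // average SparseVector size from above
val bytesPerEntry    = 12L     // assumed: 4-byte Int index + 8-byte Double
val vectorsInList    = 100L    // upper end of the final list size

val bytesPerVector = entriesPerVector * bytesPerEntry  // 252,000 bytes
val totalBytes     = bytesPerVector * vectorsInList    // 25,200,000 bytes

println(f"~${totalBytes / 1e6}%.1f MB of raw payload per fully merged list")
```

So even the raw payload of one fully merged list is tens of MB before any
object or serialization overhead is counted.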

*However, at run time I am getting the following error messages from some of
the executor machines.*
14/10/20 22:38:41 INFO BlockManagerInfo: Added taskresult_30 in memory on
cse-hadoop-113:34602 (size: 789.0 MB, free: 22.2 GB)
14/10/20 22:38:41 INFO TaskSetManager: Starting task 1.0:12 as TID 34 on
executor 6: cse-hadoop-113 (PROCESS_LOCAL)
14/10/20 22:38:41 INFO TaskSetManager: Serialized task 1.0:12 as 2170 bytes
in 2 ms
14/10/20 22:38:41 INFO SendingConnection: Initiating connection to
[cse-hadoop-113/192.168.0.113:34602]
14/10/20 22:38:41 INFO SendingConnection: Connected to
[cse-hadoop-113/192.168.0.113:34602], 1 messages pending
14/10/20 22:38:41 INFO ConnectionManager: Accepted connection from
[cse-hadoop-113/192.168.0.113]
Exception in thread "pool-5-thread-3" java.lang.OutOfMemoryError: Java heap
space
        at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57)
        at java.nio.ByteBuffer.allocate(ByteBuffer.java:331)
        at org.apache.spark.network.Message$.create(Message.scala:88)
        at
org.apache.spark.network.ReceivingConnection$Inbox.org$apache$spark$network$ReceivingConnection$Inbox$$createNewMessage$1(Connection.scala:438)
        at
org.apache.spark.network.ReceivingConnection$Inbox$$anonfun$1.apply(Connection.scala:448)
        at
org.apache.spark.network.ReceivingConnection$Inbox$$anonfun$1.apply(Connection.scala:448)
        at
scala.collection.mutable.MapLike$class.getOrElseUpdate(MapLike.scala:189)
        at scala.collection.mutable.AbstractMap.getOrElseUpdate(Map.scala:91)
        at
org.apache.spark.network.ReceivingConnection$Inbox.getChunk(Connection.scala:448)
        at 
org.apache.spark.network.ReceivingConnection.read(Connection.scala:525)
        at
org.apache.spark.network.ConnectionManager$$anon$6.run(ConnectionManager.scala:176)
        at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:744)
*Please help.*



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/java-lang-OutOfMemoryError-Java-heap-space-during-reduce-operation-tp16835.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
