[jira] [Assigned] (GIRAPH-104) Save half of maximum memory used from messaging

2011-12-13 Thread Avery Ching (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/GIRAPH-104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Avery Ching reassigned GIRAPH-104:
--

Assignee: Avery Ching

 Save half of maximum memory used from messaging
 ---

 Key: GIRAPH-104
 URL: https://issues.apache.org/jira/browse/GIRAPH-104
 Project: Giraph
  Issue Type: Improvement
Reporter: Avery Ching
Assignee: Avery Ching
Priority: Critical

 Currently, Giraph uses a very large amount of memory for messaging.  This 
 JIRA cuts peak messaging memory in half and adds periodic memory reports 
 for debugging.  Details are below:

 Refactored RandomMessageBenchmark to use an internal vertex class.  Added 
 aggregators to RandomMessageBenchmark to track the bytes, messages, and time 
 spent on messaging.  Moved the postSuperstep() call to after the flush() 
 for more accurate timings.

 Added periodic (once-a-minute) updates during message flushing, which can 
 take a while, especially on the memory benchmark.  These show how the flush 
 is progressing and give an ETA.

 Memory optimizations include:
 - Clear the message list after computation 
 - Free vertex messages on the source while the flush is in progress 
 - TreeMap -> HashMap for VertexMutations
 - Size the ArrayList properly in transientInMessages

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




Review Request: Save half of maximum memory used from messaging

2011-12-13 Thread Avery Ching

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3175/
---

Review request for giraph.


Summary
---

Currently, Giraph uses a very large amount of memory for messaging. This 
JIRA cuts peak messaging memory in half and adds periodic memory reports 
for debugging. Details are below:

Refactored RandomMessageBenchmark to use an internal vertex class. Added 
aggregators to RandomMessageBenchmark to track the bytes, messages, and time 
spent on messaging. Moved the postSuperstep() call to after the flush() for 
more accurate timings.

Added periodic (once-a-minute) updates during message flushing, which can 
take a while, especially on the memory benchmark. These show how the flush 
is progressing and give an ETA.
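
To make the idea of periodic flush updates concrete, here is a minimal sketch. It is not the code from this patch: FlushProgressReporter, etaMillis, memorySummary, and maybeReport are hypothetical names, the clock is passed in by the caller, and the ETA assumes a constant flush rate. Only the java.lang.Runtime and java.util.concurrent.TimeUnit calls are real library APIs.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper (not Giraph code): emits a status line at most once per interval. */
final class FlushProgressReporter {
    private final long totalItems;
    private final long intervalNanos;
    private final long startNanos;
    private long lastReportNanos;

    FlushProgressReporter(long totalItems, long interval, TimeUnit unit, long nowNanos) {
        this.totalItems = totalItems;
        this.intervalNanos = unit.toNanos(interval);
        this.startNanos = nowNanos;
        this.lastReportNanos = nowNanos;
    }

    /** Remaining time in ms assuming a constant rate; 0 when finished, -1 when unknown. */
    static long etaMillis(long done, long total, long elapsedMillis) {
        if (done >= total) {
            return 0;
        }
        if (done <= 0) {
            return -1;
        }
        return elapsedMillis * (total - done) / done;
    }

    /** One-line heap summary built from the JVM's Runtime counters. */
    static String memorySummary() {
        Runtime rt = Runtime.getRuntime();
        long mb = 1024L * 1024L;
        return String.format("mem max=%dM total=%dM free=%dM",
            rt.maxMemory() / mb, rt.totalMemory() / mb, rt.freeMemory() / mb);
    }

    /** Returns a status line if the interval has elapsed since the last report, else null. */
    String maybeReport(long done, long nowNanos) {
        if (nowNanos - lastReportNanos < intervalNanos) {
            return null;
        }
        lastReportNanos = nowNanos;
        long elapsedMillis = TimeUnit.NANOSECONDS.toMillis(nowNanos - startNanos);
        return String.format("flush: %d/%d done, elapsed %d ms, ETA %d ms, %s",
            done, totalItems, elapsedMillis,
            etaMillis(done, totalItems, elapsedMillis), memorySummary());
    }

    public static void main(String[] args) {
        // Simulated clock: one minute has passed and half of the items are flushed.
        FlushProgressReporter reporter =
            new FlushProgressReporter(100, 1, TimeUnit.MINUTES, 0L);
        System.out.println(reporter.maybeReport(50, TimeUnit.MINUTES.toNanos(1)));
    }
}
```

A flush loop would call maybeReport(done, System.nanoTime()) after each item and log any non-null result, so the reporting cost stays negligible between intervals.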

Memory optimizations include:

- Clear the message list after computation
- Free vertex messages on the source while the flush is in progress
- TreeMap -> HashMap for VertexMutations
- Size the ArrayList properly in transientInMessages
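
The list above can be illustrated with a small sketch using plain java.util collections. This is not the code in the diff: MessageMemorySketch and its methods are hypothetical, String stands in for the message type, and Long for the vertex id. It shows sizing a list to the batch it will hold, clearing a message list once compute is done, and removing entries from the source map while flushing so they become collectable before the flush ends.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of the memory optimizations; not Giraph's classes. */
final class MessageMemorySketch {
    /** Copies a batch into a list whose capacity matches the batch size exactly. */
    static <M> List<M> copyPresized(List<M> batch) {
        List<M> out = new ArrayList<>(batch.size());
        out.addAll(batch);
        return out;
    }

    /** Stands in for compute(): counts the messages, then drops the references. */
    static <M> int computeThenClear(List<M> messages) {
        int seen = messages.size();
        messages.clear(); // the message objects are now eligible for garbage collection
        return seen;
    }

    /** Hands each entry to the destination and removes it from the source right away. */
    static <M> int flushAndFree(Map<Long, List<M>> outgoing,
                                Map<Long, List<M>> destination) {
        int sent = 0;
        Iterator<Map.Entry<Long, List<M>>> it = outgoing.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<Long, List<M>> entry = it.next();
            destination.put(entry.getKey(), copyPresized(entry.getValue()));
            sent += entry.getValue().size();
            it.remove(); // free the source copy during the flush, not after it
        }
        return sent;
    }

    public static void main(String[] args) {
        // HashMap is used because sorted iteration is not needed in this sketch.
        Map<Long, List<String>> outgoing = new HashMap<>();
        outgoing.put(1L, new ArrayList<>(Arrays.asList("a", "b")));
        outgoing.put(2L, new ArrayList<>(Arrays.asList("c")));
        Map<Long, List<String>> destination = new HashMap<>();
        System.out.println("sent=" + flushAndFree(outgoing, destination)
            + " sourceEmpty=" + outgoing.isEmpty());
    }
}
```

Whether the TreeMap to HashMap change was motivated by per-entry overhead or by lookup cost is not stated in the request; the sketch only assumes that sorted order is not required.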


This addresses bug GIRAPH-104.
https://issues.apache.org/jira/browse/GIRAPH-104


Diffs
---

  http://svn.apache.org/repos/asf/incubator/giraph/trunk/src/main/java/org/apache/giraph/benchmark/RandomMessageBenchmark.java 1213849
  http://svn.apache.org/repos/asf/incubator/giraph/trunk/src/main/java/org/apache/giraph/comm/BasicRPCCommunications.java 1213849
  http://svn.apache.org/repos/asf/incubator/giraph/trunk/src/main/java/org/apache/giraph/examples/LongSumAggregator.java 1213849
  http://svn.apache.org/repos/asf/incubator/giraph/trunk/src/main/java/org/apache/giraph/graph/BspServiceWorker.java 1213849
  http://svn.apache.org/repos/asf/incubator/giraph/trunk/src/main/java/org/apache/giraph/graph/GraphMapper.java 1213849
  http://svn.apache.org/repos/asf/incubator/giraph/trunk/src/main/java/org/apache/giraph/graph/WorkerContext.java 1213849
  http://svn.apache.org/repos/asf/incubator/giraph/trunk/src/main/java/org/apache/giraph/utils/MemoryUtils.java PRE-CREATION

Diff: https://reviews.apache.org/r/3175/diff


Testing
---

Passed local and Hadoop unit tests.  RandomMessageBenchmark was run at scale 
on a real cluster.


Thanks,

Avery



[jira] [Updated] (GIRAPH-104) Save half of maximum memory used from messaging

2011-12-13 Thread Avery Ching (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/GIRAPH-104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Avery Ching updated GIRAPH-104:
---

Attachment: GIRAPH-104.diff





[jira] [Commented] (GIRAPH-104) Save half of maximum memory used from messaging

2011-12-13 Thread Claudio Martella (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13168794#comment-13168794
 ] 

Claudio Martella commented on GIRAPH-104:
-

Supposing the messaging pattern doesn't change between superstep 6 and 
superstep 8 :)

This looks like a great improvement, great work. I went through the review, 
frankly quite quickly, and it looks very good.

I'll look at it more closely tomorrow and will +1.





[jira] [Assigned] (GIRAPH-57) Provide PutMsgs RPC call

2011-12-13 Thread Avery Ching (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/GIRAPH-57?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Avery Ching reassigned GIRAPH-57:
-

Assignee: Avery Ching

 Provide PutMsgs RPC call
 

 Key: GIRAPH-57
 URL: https://issues.apache.org/jira/browse/GIRAPH-57
 Project: Giraph
  Issue Type: Improvement
Reporter: Jakob Homan
Assignee: Avery Ching

 Right now messages are sent to a vertex one at a time.  It would be good to 
 have a putMsgs call that could send messages to multiple vertices (all hosted 
 on the same worker) at once.  We'd save a huge number of individual RPC calls 
 at the cost of making fewer calls with larger payloads.
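
 One way to picture the proposed batching is the sketch below. It is not an
 existing Giraph API: PutMsgsSketch, WorkerRpc, and the modulo-based
 workerFor() placement are hypothetical, and String stands in for the message
 type. Messages are buffered per destination worker and grouped by vertex, so
 a flush makes one putMsgs call per worker instead of one call per message.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of batching messages per worker; not Giraph's RPC layer. */
final class PutMsgsSketch {
    /** Assumed RPC surface: one call carries messages for many vertices on one worker. */
    interface WorkerRpc {
        void putMsgs(Map<Long, List<String>> messagesByVertex);
    }

    private final int workerCount;
    // worker index -> (vertex id -> buffered messages)
    private final Map<Integer, Map<Long, List<String>>> buffers = new HashMap<>();

    PutMsgsSketch(int workerCount) {
        this.workerCount = workerCount;
    }

    /** Assumed placement rule for illustration only: vertex id modulo worker count. */
    int workerFor(long vertexId) {
        return (int) Math.floorMod(vertexId, (long) workerCount);
    }

    /** Buffers a message locally instead of issuing an RPC immediately. */
    void send(long vertexId, String message) {
        buffers.computeIfAbsent(workerFor(vertexId), w -> new HashMap<>())
               .computeIfAbsent(vertexId, v -> new ArrayList<>())
               .add(message);
    }

    /** Issues one putMsgs call per worker with buffered messages; returns the call count. */
    int flush(Map<Integer, WorkerRpc> workers) {
        int calls = 0;
        for (Map.Entry<Integer, Map<Long, List<String>>> entry : buffers.entrySet()) {
            workers.get(entry.getKey()).putMsgs(entry.getValue());
            calls++;
        }
        buffers.clear();
        return calls;
    }

    public static void main(String[] args) {
        PutMsgsSketch sketch = new PutMsgsSketch(2);
        sketch.send(0L, "a");
        sketch.send(2L, "b");
        sketch.send(4L, "c");
        sketch.send(1L, "d");
        Map<Integer, WorkerRpc> workers = new HashMap<>();
        workers.put(0, batch -> System.out.println("worker 0 got " + batch.size() + " vertices"));
        workers.put(1, batch -> System.out.println("worker 1 got " + batch.size() + " vertices"));
        System.out.println("rpc calls: " + sketch.flush(workers));
    }
}
```

 In this run four messages travel in two calls. The trade-off is the one named
 in the issue: fewer RPCs, each carrying a larger payload that has to be held
 in memory until it is flushed.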
