-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/5987/
-----------------------------------------------------------
(Updated Aug. 15, 2012, 4:49 p.m.)


Review request for giraph and Avery Ching.


Description
-------

I replaced the HashMap that stores partitions in a worker with a WorkerPartitionMap. A WorkerPartitionMap has a normal in-memory map, plus the ability to store entire partitions on the local FS when memory is low.

To provide the normal views of a map's contents, we operate lazily, loading the out-of-core partitions as we iterate. We always add the requested partition to the in-memory map (moving another one to disk to make room) so that it can be modified.

The option "giraph.outOfCoreGraph" controls whether we use WorkerPartitionMap or a normal HashMap as before. "giraph.minFreeMemoryRatio" controls how much free memory we want to preserve, as a ratio of the maximum memory available to the program. If out-of-core is enabled and the memory limit is exceeded, we start spilling partitions to disk.

My main concern now is with input splitting: we may have to reorganize that phase so that messages are not all accumulated in temporary in-memory storage before they are transferred to the (disk-backed) WorkerPartitionMap.

See http://bit.ly/NwwFGI and subsequent comments for more context.


This addresses bug GIRAPH-249.
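For readers who have not opened the diff, the spilling behaviour outlined in the Description can be sketched roughly as follows. This is a hypothetical illustration and not the code in the patch: the class name SpillingPartitionMap, the maxInMemory cap, the least-recently-used eviction order, and the use of Java serialization to a temporary directory are all assumptions made for the example. Only the idea of keeping a requested partition in memory, moving another one to disk to make room, and checking a minimum free-memory ratio comes from the Description.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashSet;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch, NOT the WorkerPartitionMap from the patch.
class SpillingPartitionMap<P extends Serializable> {
    private final int maxInMemory;     // assumed cap, keeps the demo deterministic
    private final double minFreeRatio; // plays the role of giraph.minFreeMemoryRatio
    private final Path spillDir;       // assumed spill location (a temp directory)
    // Access-ordered map: iteration starts at the least recently used partition.
    private final LinkedHashMap<Integer, P> inMemory =
        new LinkedHashMap<>(16, 0.75f, true);
    private final Set<Integer> onDisk = new HashSet<>();

    SpillingPartitionMap(int maxInMemory, double minFreeRatio) {
        this.maxInMemory = maxInMemory;
        this.minFreeRatio = minFreeRatio;
        try {
            this.spillDir = Files.createTempDirectory("partitions");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** True when free heap, as a ratio of the maximum heap, is below minFreeRatio. */
    static boolean memoryLow(double minFreeRatio) {
        Runtime rt = Runtime.getRuntime();
        long used = rt.totalMemory() - rt.freeMemory();
        return (double) (rt.maxMemory() - used) / rt.maxMemory() < minFreeRatio;
    }

    void put(int id, P partition) {
        try {
            if (onDisk.remove(id)) {
                Files.deleteIfExists(fileFor(id)); // drop the stale disk copy
            }
            if (!inMemory.containsKey(id)) {
                makeRoom();
            }
            inMemory.put(id, partition);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns the partition, first loading it back into memory if it was spilled. */
    P get(int id) {
        P partition = inMemory.get(id);
        if (partition != null || !onDisk.contains(id)) {
            return partition;
        }
        try {
            partition = load(id);
            makeRoom(); // move another partition to disk to make room
            inMemory.put(id, partition);
            return partition;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    int inMemoryCount() { return inMemory.size(); }

    int onDiskCount() { return onDisk.size(); }

    private Path fileFor(int id) { return spillDir.resolve("partition-" + id); }

    private void makeRoom() throws IOException {
        while (!inMemory.isEmpty() && inMemory.size() >= maxInMemory) {
            evictEldest();
        }
        if (!inMemory.isEmpty() && memoryLow(minFreeRatio)) {
            evictEldest(); // spill one more partition when the heap is nearly full
        }
    }

    private void evictEldest() throws IOException {
        Iterator<Map.Entry<Integer, P>> it = inMemory.entrySet().iterator();
        Map.Entry<Integer, P> eldest = it.next();
        try (ObjectOutputStream out =
                 new ObjectOutputStream(Files.newOutputStream(fileFor(eldest.getKey())))) {
            out.writeObject(eldest.getValue());
        }
        onDisk.add(eldest.getKey());
        it.remove();
    }

    @SuppressWarnings("unchecked")
    private P load(int id) throws IOException {
        Path file = fileFor(id);
        P partition;
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(file))) {
            partition = (P) in.readObject();
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
        Files.delete(file);
        onDisk.remove(id);
        return partition;
    }
}
```

With a cap of two partitions and a ratio of 0.0 (which disables the heap check), putting three partitions spills the least recently used one, and a later get on it swaps it back in while another partition goes to disk. The lazy iteration over out-of-core partitions mentioned in the Description is not covered by this sketch.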
    https://issues.apache.org/jira/browse/GIRAPH-249


Diffs (updated)
-----

  http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/bsp/CentralizedServiceWorker.java 1373331
  http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/comm/BasicRPCCommunications.java 1373331
  http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/comm/NettyWorkerClientServer.java 1373331
  http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/comm/NettyWorkerServer.java 1373331
  http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/comm/SendVertexRequest.java 1373331
  http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/comm/ServerData.java 1373331
  http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/comm/WorkerServer.java 1373331
  http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/graph/BspServiceWorker.java 1373331
  http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/graph/GiraphJob.java 1373331
  http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/graph/GraphMapper.java 1373331
  http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/graph/partition/DiskBackedPartitionStore.java PRE-CREATION
  http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/graph/partition/HashWorkerPartitioner.java 1373331
  http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/graph/partition/Partition.java 1373331
  http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/graph/partition/PartitionStore.java PRE-CREATION
  http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/graph/partition/SimplePartitionStore.java PRE-CREATION
  http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/graph/partition/WorkerGraphPartitioner.java 1373331
  http://svn.apache.org/repos/asf/giraph/trunk/src/test/java/org/apache/giraph/TestMutateGraphVertex.java 1373331
  http://svn.apache.org/repos/asf/giraph/trunk/src/test/java/org/apache/giraph/comm/ConnectionTest.java 1373331
  http://svn.apache.org/repos/asf/giraph/trunk/src/test/java/org/apache/giraph/comm/RequestTest.java 1373331
  http://svn.apache.org/repos/asf/giraph/trunk/src/test/java/org/apache/giraph/graph/partition/TestPartitionStores.java PRE-CREATION

Diff: https://reviews.apache.org/r/5987/diff/


Testing
-------

- Unit test.
- "mvn verify" with different settings in order to trigger the out-of-core mechanism.
- Plan on doing some benchmarking.


Thanks,

Alessandro Presta
