-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/9278/#review16069
-----------------------------------------------------------

This looks great. I see that the data structures I was concerned about are only created as one set per disk store rather than one set per partition, and the test numbers show this does not have too adverse an effect on our heap usage. In fact, I'm very impressed that the in-memory run and the heavily disk-cached run produce similar numbers for runtime and resource usage. Great work!

- Eli Reisman


On Feb. 3, 2013, 2:54 p.m., Claudio Martella wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/9278/
> -----------------------------------------------------------
> 
> (Updated Feb. 3, 2013, 2:54 p.m.)
> 
> 
> Review request for giraph.
> 
> 
> Description
> -------
> 
> Currently, the out-of-core partitions are assigned to memory or to disk
> statically. Using an LRU cache should help keep in memory only the
> partitions that are actively accessed, given a job that does not access
> the whole graph at each superstep (e.g. traversals) and a good
> (non-random) data partitioning.
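
To make the LRU idea in the description concrete, here is a minimal sketch of how such a cache could behave. It is not the code under review: the class name `LruPartitionCache`, its methods, and the in-memory `spilled` map that stands in for the on-disk store are all assumptions made for illustration, and the real `DiskBackedPartitionStore` may be organized quite differently.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Hypothetical illustration of an LRU partition cache.
 * Not Giraph's DiskBackedPartitionStore; names are made up for this sketch.
 */
class LruPartitionCache<P> {
  /** Maximum number of partitions kept in memory. */
  private final int maxInMemory;
  /** Stand-in for the on-disk store (a real store would serialize to files). */
  private final Map<Integer, P> spilled = new HashMap<>();
  /** Access-ordered map: iteration order goes from least to most recently used. */
  private final LinkedHashMap<Integer, P> inMemory;

  LruPartitionCache(int maxInMemory) {
    this.maxInMemory = maxInMemory;
    this.inMemory = new LinkedHashMap<Integer, P>(16, 0.75f, true) {
      @Override
      protected boolean removeEldestEntry(Map.Entry<Integer, P> eldest) {
        if (size() > LruPartitionCache.this.maxInMemory) {
          // "Spill" the least recently used partition before dropping it.
          spilled.put(eldest.getKey(), eldest.getValue());
          return true;
        }
        return false;
      }
    };
  }

  /** Adds or replaces a partition; may evict the least recently used one. */
  synchronized void put(int id, P partition) {
    spilled.remove(id);
    inMemory.put(id, partition);
  }

  /** Returns the partition, loading it back from the spill area if needed. */
  synchronized P get(int id) {
    P partition = inMemory.get(id); // a hit also marks the entry as recently used
    if (partition == null && spilled.containsKey(id)) {
      partition = spilled.remove(id);
      inMemory.put(id, partition); // may evict another partition
    }
    return partition;
  }

  /** True if the partition is currently held in memory (does not touch LRU order). */
  synchronized boolean isInMemory(int id) {
    return inMemory.containsKey(id);
  }
}
```

With a capacity of 2, for example, touching partition 0 and then adding partition 2 would spill partition 1, since it is the least recently used. A job that revisits only a few partitions per superstep would therefore keep those in memory, which is the behaviour the description is aiming for.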
> 
> 
> Diffs
> -----
> 
>   giraph-core/src/main/java/org/apache/giraph/bsp/CentralizedServiceWorker.java 30d4462
>   giraph-core/src/main/java/org/apache/giraph/comm/netty/NettyWorkerServer.java e2866fd
>   giraph-core/src/main/java/org/apache/giraph/graph/ComputeCallable.java 042fd47
>   giraph-core/src/main/java/org/apache/giraph/partition/DiskBackedPartitionStore.java 09e5d75
>   giraph-core/src/main/java/org/apache/giraph/partition/PartitionStore.java 3e8dda9
>   giraph-core/src/main/java/org/apache/giraph/partition/SimplePartitionStore.java 7bd0bb1
>   giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java f542344
>   giraph-core/src/test/java/org/apache/giraph/comm/RequestTest.java 7187928
>   giraph-core/src/test/java/org/apache/giraph/partition/TestPartitionStores.java b02ed3a
> 
> Diff: https://reviews.apache.org/r/9278/diff/
> 
> 
> Testing
> -------
> 
> passes mvn verify.
> 
> hadoop jar giraph-0.2-SNAPSHOT-for-hadoop-1.0.2-jar-with-dependencies.jar
> org.apache.giraph.benchmark.PageRankBenchmark -w 60 -c 2 -e 100 -V 10000000
> -v -s 10
> 
> trunk:
> 13/01/29 20:40:53 INFO mapred.JobClient: Giraph Timers
> 13/01/29 20:40:53 INFO mapred.JobClient: Total (milliseconds)=492403
> 13/01/29 20:40:53 INFO mapred.JobClient: Superstep 3 (milliseconds)=40243
> 13/01/29 20:40:53 INFO mapred.JobClient: Superstep 4 (milliseconds)=45430
> 13/01/29 20:40:53 INFO mapred.JobClient: Superstep 10 (milliseconds)=713
> 13/01/29 20:40:53 INFO mapred.JobClient: Setup (milliseconds)=20832
> 13/01/29 20:40:53 INFO mapred.JobClient: Shutdown (milliseconds)=56
> 13/01/29 20:40:53 INFO mapred.JobClient: Superstep 7 (milliseconds)=36753
> 13/01/29 20:40:53 INFO mapred.JobClient: Superstep 9 (milliseconds)=36363
> 13/01/29 20:40:53 INFO mapred.JobClient: Superstep 0 (milliseconds)=39558
> 13/01/29 20:40:53 INFO mapred.JobClient: Superstep 8 (milliseconds)=44548
> 13/01/29 20:40:53 INFO mapred.JobClient: Input superstep (milliseconds)=59184
> 13/01/29 20:40:53 INFO mapred.JobClient: Superstep 6 (milliseconds)=40777
> 13/01/29 20:40:53 INFO mapred.JobClient: Superstep 5 (milliseconds)=43962
> 13/01/29 20:40:53 INFO mapred.JobClient: Superstep 2 (milliseconds)=37325
> 13/01/29 20:40:53 INFO mapred.JobClient: Superstep 1 (milliseconds)=46655
> 13/01/29 20:40:53 INFO mapred.JobClient: Giraph Stats
> 13/01/29 20:40:53 INFO mapred.JobClient: Aggregate edges=1000000000
> 13/01/29 20:40:53 INFO mapred.JobClient: Superstep=11
> 13/01/29 20:40:53 INFO mapred.JobClient: Last checkpointed superstep=0
> 13/01/29 20:40:53 INFO mapred.JobClient: Current workers=60
> 13/01/29 20:40:53 INFO mapred.JobClient: Current master task partition=0
> 13/01/29 20:40:53 INFO mapred.JobClient: Sent messages=0
> 13/01/29 20:40:53 INFO mapred.JobClient: Aggregate finished vertices=10000000
> 13/01/29 20:40:53 INFO mapred.JobClient: Aggregate vertices=10000000
> 13/01/29 20:40:53 INFO mapred.JobClient: File Output Format Counters
> 13/01/29 20:40:53 INFO mapred.JobClient: Bytes Written=0
> 13/01/29 20:40:53 INFO mapred.JobClient: FileSystemCounters
> 13/01/29 20:40:53 INFO mapred.JobClient: HDFS_BYTES_READ=2684
> 13/01/29 20:40:53 INFO mapred.JobClient: FILE_BYTES_WRITTEN=1388228
> 13/01/29 20:40:53 INFO mapred.JobClient: File Input Format Counters
> 13/01/29 20:40:53 INFO mapred.JobClient: Bytes Read=0
> 13/01/29 20:40:53 INFO mapred.JobClient: Map-Reduce Framework
> 13/01/29 20:40:53 INFO mapred.JobClient: Map input records=61
> 13/01/29 20:40:53 INFO mapred.JobClient: Physical memory (bytes) snapshot=71703965696
> 13/01/29 20:40:53 INFO mapred.JobClient: Spilled Records=0
> 13/01/29 20:40:53 INFO mapred.JobClient: CPU time spent (ms)=15141630
> 13/01/29 20:40:53 INFO mapred.JobClient: Total committed heap usage (bytes)=58151337984
> 13/01/29 20:40:53 INFO mapred.JobClient: Virtual memory (bytes) snapshot=371313995776
> 13/01/29 20:40:53 INFO mapred.JobClient: Map output records=0
> 13/01/29 20:40:53 INFO mapred.JobClient: SPLIT_RAW_BYTES=2684
> 
> GIRAPH-439:
> in memory:
> 13/01/29 19:35:53 INFO mapred.JobClient: Giraph Timers
> 13/01/29 19:35:53 INFO mapred.JobClient: Total (milliseconds)=427511
> 13/01/29 19:35:53 INFO mapred.JobClient: Superstep 3 (milliseconds)=37341
> 13/01/29 19:35:53 INFO mapred.JobClient: Superstep 4 (milliseconds)=35458
> 13/01/29 19:35:53 INFO mapred.JobClient: Superstep 10 (milliseconds)=852
> 13/01/29 19:35:53 INFO mapred.JobClient: Setup (milliseconds)=24825
> 13/01/29 19:35:53 INFO mapred.JobClient: Shutdown (milliseconds)=50
> 13/01/29 19:35:53 INFO mapred.JobClient: Superstep 7 (milliseconds)=37557
> 13/01/29 19:35:53 INFO mapred.JobClient: Superstep 9 (milliseconds)=33961
> 13/01/29 19:35:53 INFO mapred.JobClient: Superstep 0 (milliseconds)=33048
> 13/01/29 19:35:53 INFO mapred.JobClient: Superstep 8 (milliseconds)=36345
> 13/01/29 19:35:53 INFO mapred.JobClient: Input superstep (milliseconds)=44420
> 13/01/29 19:35:53 INFO mapred.JobClient: Superstep 6 (milliseconds)=33635
> 13/01/29 19:35:53 INFO mapred.JobClient: Superstep 5 (milliseconds)=41885
> 13/01/29 19:35:53 INFO mapred.JobClient: Superstep 2 (milliseconds)=35046
> 13/01/29 19:35:53 INFO mapred.JobClient: Superstep 1 (milliseconds)=33083
> 13/01/29 19:35:53 INFO mapred.JobClient: Giraph Stats
> 13/01/29 19:35:53 INFO mapred.JobClient: Aggregate edges=1000000000
> 13/01/29 19:35:53 INFO mapred.JobClient: Superstep=11
> 13/01/29 19:35:53 INFO mapred.JobClient: Last checkpointed superstep=0
> 13/01/29 19:35:53 INFO mapred.JobClient: Current workers=60
> 13/01/29 19:35:53 INFO mapred.JobClient: Current master task partition=0
> 13/01/29 19:35:53 INFO mapred.JobClient: Sent messages=0
> 13/01/29 19:35:53 INFO mapred.JobClient: Aggregate finished vertices=10000000
> 13/01/29 19:35:53 INFO mapred.JobClient: Aggregate vertices=10000000
> 13/01/29 19:35:53 INFO mapred.JobClient: File Output Format Counters
> 13/01/29 19:35:53 INFO mapred.JobClient: Bytes Written=0
> 13/01/29 19:35:53 INFO mapred.JobClient: FileSystemCounters
> 13/01/29 19:35:53 INFO mapred.JobClient: HDFS_BYTES_READ=2684
> 13/01/29 19:35:53 INFO mapred.JobClient: FILE_BYTES_WRITTEN=1388228
> 13/01/29 19:35:53 INFO mapred.JobClient: File Input Format Counters
> 13/01/29 19:35:53 INFO mapred.JobClient: Bytes Read=0
> 13/01/29 19:35:53 INFO mapred.JobClient: Map-Reduce Framework
> 13/01/29 19:35:53 INFO mapred.JobClient: Map input records=61
> 13/01/29 19:35:53 INFO mapred.JobClient: Physical memory (bytes) snapshot=71627419648
> 13/01/29 19:35:53 INFO mapred.JobClient: Spilled Records=0
> 13/01/29 19:35:53 INFO mapred.JobClient: CPU time spent (ms)=15020990
> 13/01/29 19:35:53 INFO mapred.JobClient: Total committed heap usage (bytes)=57611911168
> 13/01/29 19:35:53 INFO mapred.JobClient: Virtual memory (bytes) snapshot=371123154944
> 13/01/29 19:35:53 INFO mapred.JobClient: Map output records=0
> 13/01/29 19:35:53 INFO mapred.JobClient: SPLIT_RAW_BYTES=2684
> 
> ooh graph (2 partitions in memory out of 49):
> 13/01/29 19:54:57 INFO mapred.JobClient: Giraph Timers
> 13/01/29 19:54:57 INFO mapred.JobClient: Total (milliseconds)=508004
> 13/01/29 19:54:57 INFO mapred.JobClient: Superstep 3 (milliseconds)=38085
> 13/01/29 19:54:57 INFO mapred.JobClient: Superstep 4 (milliseconds)=40789
> 13/01/29 19:54:57 INFO mapred.JobClient: Superstep 10 (milliseconds)=811
> 13/01/29 19:54:57 INFO mapred.JobClient: Setup (milliseconds)=25612
> 13/01/29 19:54:57 INFO mapred.JobClient: Shutdown (milliseconds)=699
> 13/01/29 19:54:57 INFO mapred.JobClient: Superstep 7 (milliseconds)=44806
> 13/01/29 19:54:57 INFO mapred.JobClient: Superstep 9 (milliseconds)=41873
> 13/01/29 19:54:57 INFO mapred.JobClient: Superstep 0 (milliseconds)=46329
> 13/01/29 19:54:57 INFO mapred.JobClient: Superstep 8 (milliseconds)=46272
> 13/01/29 19:54:57 INFO mapred.JobClient: Input superstep (milliseconds)=52395
> 13/01/29 19:54:57 INFO mapred.JobClient: Superstep 6 (milliseconds)=44337
> 13/01/29 19:54:57 INFO mapred.JobClient: Superstep 5 (milliseconds)=39379
> 13/01/29 19:54:57 INFO mapred.JobClient: Superstep 2 (milliseconds)=40452
> 13/01/29 19:54:57 INFO mapred.JobClient: Superstep 1 (milliseconds)=46155
> 13/01/29 19:54:57 INFO mapred.JobClient: Giraph Stats
> 13/01/29 19:54:57 INFO mapred.JobClient: Aggregate edges=1000000000
> 13/01/29 19:54:57 INFO mapred.JobClient: Superstep=11
> 13/01/29 19:54:57 INFO mapred.JobClient: Last checkpointed superstep=0
> 13/01/29 19:54:57 INFO mapred.JobClient: Current workers=60
> 13/01/29 19:54:57 INFO mapred.JobClient: Current master task partition=0
> 13/01/29 19:54:57 INFO mapred.JobClient: Sent messages=0
> 13/01/29 19:54:57 INFO mapred.JobClient: Aggregate finished vertices=10000000
> 13/01/29 19:54:57 INFO mapred.JobClient: Aggregate vertices=10000000
> 13/01/29 19:54:57 INFO mapred.JobClient: File Output Format Counters
> 13/01/29 19:54:57 INFO mapred.JobClient: Bytes Written=0
> 13/01/29 19:54:57 INFO mapred.JobClient: FileSystemCounters
> 13/01/29 19:54:57 INFO mapred.JobClient: HDFS_BYTES_READ=2684
> 13/01/29 19:54:57 INFO mapred.JobClient: FILE_BYTES_WRITTEN=1388228
> 13/01/29 19:54:57 INFO mapred.JobClient: File Input Format Counters
> 13/01/29 19:54:57 INFO mapred.JobClient: Bytes Read=0
> 13/01/29 19:54:57 INFO mapred.JobClient: Map-Reduce Framework
> 13/01/29 19:54:57 INFO mapred.JobClient: Map input records=61
> 13/01/29 19:54:57 INFO mapred.JobClient: Physical memory (bytes) snapshot=71368736768
> 13/01/29 19:54:57 INFO mapred.JobClient: Spilled Records=0
> 13/01/29 19:54:57 INFO mapred.JobClient: CPU time spent (ms)=15289390
> 13/01/29 19:54:57 INFO mapred.JobClient: Total committed heap usage (bytes)=57278595072
> 13/01/29 19:54:57 INFO mapred.JobClient: Virtual memory (bytes) snapshot=370911342592
> 13/01/29 19:54:57 INFO mapred.JobClient: Map output records=0
> 13/01/29 19:54:57 INFO mapred.JobClient: SPLIT_RAW_BYTES=2684
> 
> in memory (2 compute threads per worker):
> 13/01/29 20:30:49 INFO mapred.JobClient: Giraph Timers
> 13/01/29 20:30:49 INFO mapred.JobClient: Total (milliseconds)=487379
> 13/01/29 20:30:49 INFO mapred.JobClient: Superstep 3 (milliseconds)=46092
> 13/01/29 20:30:49 INFO mapred.JobClient: Superstep 4 (milliseconds)=44840
> 13/01/29 20:30:49 INFO mapred.JobClient: Superstep 10 (milliseconds)=745
> 13/01/29 20:30:49 INFO mapred.JobClient: Setup (milliseconds)=23013
> 13/01/29 20:30:49 INFO mapred.JobClient: Shutdown (milliseconds)=126
> 13/01/29 20:30:49 INFO mapred.JobClient: Superstep 7 (milliseconds)=40620
> 13/01/29 20:30:49 INFO mapred.JobClient: Superstep 9 (milliseconds)=39630
> 13/01/29 20:30:49 INFO mapred.JobClient: Superstep 0 (milliseconds)=38221
> 13/01/29 20:30:49 INFO mapred.JobClient: Superstep 8 (milliseconds)=40406
> 13/01/29 20:30:49 INFO mapred.JobClient: Input superstep (milliseconds)=49762
> 13/01/29 20:30:49 INFO mapred.JobClient: Superstep 6 (milliseconds)=45054
> 13/01/29 20:30:49 INFO mapred.JobClient: Superstep 5 (milliseconds)=40220
> 13/01/29 20:30:49 INFO mapred.JobClient: Superstep 2 (milliseconds)=40817
> 13/01/29 20:30:49 INFO mapred.JobClient: Superstep 1 (milliseconds)=37830
> 13/01/29 20:30:49 INFO mapred.JobClient: Giraph Stats
> 13/01/29 20:30:49 INFO mapred.JobClient: Aggregate edges=1000000000
> 13/01/29 20:30:49 INFO mapred.JobClient: Superstep=11
> 13/01/29 20:30:49 INFO mapred.JobClient: Last checkpointed superstep=0
> 13/01/29 20:30:49 INFO mapred.JobClient: Current workers=60
> 13/01/29 20:30:49 INFO mapred.JobClient: Current master task partition=0
> 13/01/29 20:30:49 INFO mapred.JobClient: Sent messages=0
> 13/01/29 20:30:49 INFO mapred.JobClient: Aggregate finished vertices=10000000
> 13/01/29 20:30:49 INFO mapred.JobClient: Aggregate vertices=10000000
> 13/01/29 20:30:49 INFO mapred.JobClient: File Output Format Counters
> 13/01/29 20:30:49 INFO mapred.JobClient: Bytes Written=0
> 13/01/29 20:30:49 INFO mapred.JobClient: FileSystemCounters
> 13/01/29 20:30:49 INFO mapred.JobClient: HDFS_BYTES_READ=2684
> 13/01/29 20:30:49 INFO mapred.JobClient: FILE_BYTES_WRITTEN=1388228
> 13/01/29 20:30:49 INFO mapred.JobClient: File Input Format Counters
> 13/01/29 20:30:49 INFO mapred.JobClient: Bytes Read=0
> 13/01/29 20:30:49 INFO mapred.JobClient: Map-Reduce Framework
> 13/01/29 20:30:49 INFO mapred.JobClient: Map input records=61
> 13/01/29 20:30:49 INFO mapred.JobClient: Physical memory (bytes) snapshot=71895678976
> 13/01/29 20:30:49 INFO mapred.JobClient: Spilled Records=0
> 13/01/29 20:30:49 INFO mapred.JobClient: CPU time spent (ms)=15134650
> 13/01/29 20:30:49 INFO mapred.JobClient: Total committed heap usage (bytes)=57982255104
> 13/01/29 20:30:49 INFO mapred.JobClient: Virtual memory (bytes) snapshot=371448213504
> 13/01/29 20:30:49 INFO mapred.JobClient: Map output records=0
> 13/01/29 20:30:49 INFO mapred.JobClient: SPLIT_RAW_BYTES=2684
> 
> ooh graph (2 partitions in memory out of 49, 2 compute threads per worker):
> 13/01/29 20:11:28 INFO mapred.JobClient: Giraph Timers
> 13/01/29 20:11:28 INFO mapred.JobClient: Total (milliseconds)=506380
> 13/01/29 20:11:28 INFO mapred.JobClient: Superstep 3 (milliseconds)=41677
> 13/01/29 20:11:28 INFO mapred.JobClient: Superstep 4 (milliseconds)=41285
> 13/01/29 20:11:28 INFO mapred.JobClient: Superstep 10 (milliseconds)=764
> 13/01/29 20:11:28 INFO mapred.JobClient: Setup (milliseconds)=24574
> 13/01/29 20:11:28 INFO mapred.JobClient: Shutdown (milliseconds)=82
> 13/01/29 20:11:28 INFO mapred.JobClient: Superstep 7 (milliseconds)=43183
> 13/01/29 20:11:28 INFO mapred.JobClient: Superstep 9 (milliseconds)=46654
> 13/01/29 20:11:28 INFO mapred.JobClient: Superstep 0 (milliseconds)=50955
> 13/01/29 20:11:28 INFO mapred.JobClient: Superstep 8 (milliseconds)=40413
> 13/01/29 20:11:28 INFO mapred.JobClient: Input superstep (milliseconds)=43584
> 13/01/29 20:11:28 INFO mapred.JobClient: Superstep 6 (milliseconds)=46638
> 13/01/29 20:11:28 INFO mapred.JobClient: Superstep 5 (milliseconds)=46107
> 13/01/29 20:11:28 INFO mapred.JobClient: Superstep 2 (milliseconds)=39321
> 13/01/29 20:11:28 INFO mapred.JobClient: Superstep 1 (milliseconds)=41139
> 13/01/29 20:11:28 INFO mapred.JobClient: Giraph Stats
> 13/01/29 20:11:28 INFO mapred.JobClient: Aggregate edges=1000000000
> 13/01/29 20:11:28 INFO mapred.JobClient: Superstep=11
> 13/01/29 20:11:28 INFO mapred.JobClient: Last checkpointed superstep=0
> 13/01/29 20:11:28 INFO mapred.JobClient: Current workers=60
> 13/01/29 20:11:28 INFO mapred.JobClient: Current master task partition=0
> 13/01/29 20:11:28 INFO mapred.JobClient: Sent messages=0
> 13/01/29 20:11:28 INFO mapred.JobClient: Aggregate finished vertices=10000000
> 13/01/29 20:11:28 INFO mapred.JobClient: Aggregate vertices=10000000
> 13/01/29 20:11:28 INFO mapred.JobClient: File Output Format Counters
> 13/01/29 20:11:28 INFO mapred.JobClient: Bytes Written=0
> 13/01/29 20:11:28 INFO mapred.JobClient: FileSystemCounters
> 13/01/29 20:11:28 INFO mapred.JobClient: HDFS_BYTES_READ=2684
> 13/01/29 20:11:28 INFO mapred.JobClient: FILE_BYTES_WRITTEN=1388228
> 13/01/29 20:11:28 INFO mapred.JobClient: File Input Format Counters
> 13/01/29 20:11:28 INFO mapred.JobClient: Bytes Read=0
> 13/01/29 20:11:28 INFO mapred.JobClient: Map-Reduce Framework
> 13/01/29 20:11:28 INFO mapred.JobClient: Map input records=61
> 13/01/29 20:11:28 INFO mapred.JobClient: Physical memory (bytes) snapshot=71620620288
> 13/01/29 20:11:28 INFO mapred.JobClient: Spilled Records=0
> 13/01/29 20:11:28 INFO mapred.JobClient: CPU time spent (ms)=15279810
> 13/01/29 20:11:28 INFO mapred.JobClient: Total committed heap usage (bytes)=57294782464
> 13/01/29 20:11:28 INFO mapred.JobClient: Virtual memory (bytes) snapshot=370988941312
> 13/01/29 20:11:28 INFO mapred.JobClient: Map output records=0
> 13/01/29 20:11:28 INFO mapred.JobClient: SPLIT_RAW_BYTES=2684
> 
> 
> Thanks,
> 
> Claudio Martella
> 
> 
