Re: Review Request 23140: Fix checkpointing
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/23140/#review47902 --- Ship it! Thanks Sergey, +1, I'll commit it! - Maja Kabiljo On July 16, 2014, 3:59 a.m., Sergey Edunov wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/23140/ --- (Updated July 16, 2014, 3:59 a.m.) Review request for giraph. Repository: giraph-git Description --- This fix merely makes checkpointing work again. Diffs - giraph-core/src/main/java/org/apache/giraph/aggregators/AggregatorWrapper.java 9613805 giraph-core/src/main/java/org/apache/giraph/bsp/BspService.java 2e35373 giraph-core/src/main/java/org/apache/giraph/comm/ServerData.java 85bfe04 giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java ab0570f giraph-core/src/main/java/org/apache/giraph/master/BspServiceMaster.java 0275395 giraph-core/src/main/java/org/apache/giraph/partition/BasicPartitionOwner.java 545d1af giraph-core/src/main/java/org/apache/giraph/partition/HashMasterPartitioner.java 240687e giraph-core/src/main/java/org/apache/giraph/partition/HashWorkerPartitioner.java d833895 giraph-core/src/main/java/org/apache/giraph/partition/MasterGraphPartitioner.java 50c750a giraph-core/src/main/java/org/apache/giraph/partition/PartitionBalancer.java 3454d62 giraph-core/src/main/java/org/apache/giraph/partition/PartitionOwner.java 0ac74da giraph-core/src/main/java/org/apache/giraph/partition/SimpleMasterPartitioner.java f128f34 giraph-core/src/main/java/org/apache/giraph/partition/SimpleWorkerPartitioner.java 3c0de44 giraph-core/src/main/java/org/apache/giraph/partition/WorkerGraphPartitioner.java 004ea81 giraph-core/src/main/java/org/apache/giraph/utils/InternalVertexRunner.java 2c4606f giraph-core/src/main/java/org/apache/giraph/utils/io/ExtendedDataInputOutput.java af45426 giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java de7af28 giraph-core/src/main/java/org/apache/giraph/worker/WorkerContext.java 
29835c5 giraph-core/src/test/java/org/apache/giraph/partition/SimpleRangePartitionFactoryTest.java 96bd5d7 giraph-examples/src/test/java/org/apache/giraph/TestCheckpointing.java PRE-CREATION Diff: https://reviews.apache.org/r/23140/diff/ Testing --- I tested it running multiple different jobs. I ran PageRank on 2*10^9 vertices on 200 workers and it seems to work just fine. It only takes 2 minutes to save a checkpoint. Thanks, Sergey Edunov
Re: Review Request 23140: Fix checkpointing
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/23140/ --- (Updated July 15, 2014, 9:08 p.m.) Review request for giraph. Changes --- Fixed CR issues Repository: giraph-git Description --- This fix merely makes checkpointing work again. Diffs (updated) - giraph-core/src/main/java/org/apache/giraph/aggregators/AggregatorWrapper.java 9613805 giraph-core/src/main/java/org/apache/giraph/bsp/BspService.java 2e35373 giraph-core/src/main/java/org/apache/giraph/comm/ServerData.java 85bfe04 giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java ab0570f giraph-core/src/main/java/org/apache/giraph/master/BspServiceMaster.java 0275395 giraph-core/src/main/java/org/apache/giraph/partition/BasicPartitionOwner.java 545d1af giraph-core/src/main/java/org/apache/giraph/partition/HashMasterPartitioner.java 240687e giraph-core/src/main/java/org/apache/giraph/partition/HashWorkerPartitioner.java d833895 giraph-core/src/main/java/org/apache/giraph/partition/MasterGraphPartitioner.java 50c750a giraph-core/src/main/java/org/apache/giraph/partition/PartitionBalancer.java 3454d62 giraph-core/src/main/java/org/apache/giraph/partition/PartitionOwner.java 0ac74da giraph-core/src/main/java/org/apache/giraph/partition/SimpleMasterPartitioner.java f128f34 giraph-core/src/main/java/org/apache/giraph/partition/SimpleWorkerPartitioner.java 3c0de44 giraph-core/src/main/java/org/apache/giraph/partition/WorkerGraphPartitioner.java 004ea81 giraph-core/src/main/java/org/apache/giraph/utils/InternalVertexRunner.java 2c4606f giraph-core/src/main/java/org/apache/giraph/utils/io/ExtendedDataInputOutput.java af45426 giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java de7af28 giraph-core/src/main/java/org/apache/giraph/worker/WorkerContext.java 29835c5 giraph-core/src/test/java/org/apache/giraph/partition/SimpleRangePartitionFactoryTest.java 96bd5d7 Diff: https://reviews.apache.org/r/23140/diff/ Testing --- I tested it running 
multiple different jobs. I ran PageRank on 2*10^9 vertices on 200 workers and it seems to work just fine. It only takes 2 minutes to save a checkpoint. Thanks, Sergey Edunov
Re: Review Request 23140: Fix checkpointing
On July 2, 2014, 1:53 a.m., Maja Kabiljo wrote:
giraph-examples/src/test/java/org/apache/giraph/master/TestAggregatorsHandling.java, line 19
https://reviews.apache.org/r/23140/diff/2/?file=622266#file622266line19
Why did you move this file?

On July 2, 2014, 1:53 a.m., Maja Kabiljo wrote:
giraph-core/src/main/java/org/apache/giraph/master/BspServiceMaster.java, lines 817-818
https://reviews.apache.org/r/23140/diff/2/?file=622249#file622249line817
Interesting, where do we rely on this?

I don't remember it right now, will run some experiments later

- Sergey

--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/23140/#review47169 ---

On July 15, 2014, 9:08 p.m., Sergey Edunov wrote:
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/23140/ ---
(Updated July 15, 2014, 9:08 p.m.)
Review request for giraph.
Repository: giraph-git
Description --- This fix merely makes checkpointing work again.
Diffs - giraph-core/src/main/java/org/apache/giraph/aggregators/AggregatorWrapper.java 9613805 giraph-core/src/main/java/org/apache/giraph/bsp/BspService.java 2e35373 giraph-core/src/main/java/org/apache/giraph/comm/ServerData.java 85bfe04 giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java ab0570f giraph-core/src/main/java/org/apache/giraph/master/BspServiceMaster.java 0275395 giraph-core/src/main/java/org/apache/giraph/partition/BasicPartitionOwner.java 545d1af giraph-core/src/main/java/org/apache/giraph/partition/HashMasterPartitioner.java 240687e giraph-core/src/main/java/org/apache/giraph/partition/HashWorkerPartitioner.java d833895 giraph-core/src/main/java/org/apache/giraph/partition/MasterGraphPartitioner.java 50c750a giraph-core/src/main/java/org/apache/giraph/partition/PartitionBalancer.java 3454d62 giraph-core/src/main/java/org/apache/giraph/partition/PartitionOwner.java 0ac74da giraph-core/src/main/java/org/apache/giraph/partition/SimpleMasterPartitioner.java f128f34
giraph-core/src/main/java/org/apache/giraph/partition/SimpleWorkerPartitioner.java 3c0de44 giraph-core/src/main/java/org/apache/giraph/partition/WorkerGraphPartitioner.java 004ea81 giraph-core/src/main/java/org/apache/giraph/utils/InternalVertexRunner.java 2c4606f giraph-core/src/main/java/org/apache/giraph/utils/io/ExtendedDataInputOutput.java af45426 giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java de7af28 giraph-core/src/main/java/org/apache/giraph/worker/WorkerContext.java 29835c5 giraph-core/src/test/java/org/apache/giraph/partition/SimpleRangePartitionFactoryTest.java 96bd5d7 Diff: https://reviews.apache.org/r/23140/diff/ Testing --- I tested it running multiple different jobs. I ran PageRank on 2*10^9 vertices on 200 workers and it seems to work just fine. It only takes 2 minutes to save a checkpoint. Thanks, Sergey Edunov
Re: Review Request 23140: Fix checkpointing
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/23140/#review47838 ---

Looks great, a few final comments about the test.

giraph-examples/src/test/java/org/apache/giraph/TestCheckpointing.java
https://reviews.apache.org/r/23140/#comment84071
I'm a bit concerned that this test would have passed even if the restart from checkpoint didn't actually happen and the app simply ran from the beginning. Can we somehow ensure it did?

giraph-examples/src/test/java/org/apache/giraph/TestCheckpointing.java
https://reviews.apache.org/r/23140/#comment84066
Can you reuse the same conf and just add one setting (or at least create a method which creates the conf)?

giraph-examples/src/test/java/org/apache/giraph/TestCheckpointing.java
https://reviews.apache.org/r/23140/#comment84068
You can extend DefaultWorkerContext to avoid overriding empty methods.

- Maja Kabiljo

On July 15, 2014, 11:33 p.m., Sergey Edunov wrote:
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/23140/ ---
(Updated July 15, 2014, 11:33 p.m.)
Review request for giraph.
Repository: giraph-git
Description --- This fix merely makes checkpointing work again.
Diffs - giraph-core/src/main/java/org/apache/giraph/aggregators/AggregatorWrapper.java 9613805 giraph-core/src/main/java/org/apache/giraph/bsp/BspService.java 2e35373 giraph-core/src/main/java/org/apache/giraph/comm/ServerData.java 85bfe04 giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java ab0570f giraph-core/src/main/java/org/apache/giraph/master/BspServiceMaster.java 0275395 giraph-core/src/main/java/org/apache/giraph/partition/BasicPartitionOwner.java 545d1af giraph-core/src/main/java/org/apache/giraph/partition/HashMasterPartitioner.java 240687e giraph-core/src/main/java/org/apache/giraph/partition/HashWorkerPartitioner.java d833895 giraph-core/src/main/java/org/apache/giraph/partition/MasterGraphPartitioner.java 50c750a giraph-core/src/main/java/org/apache/giraph/partition/PartitionBalancer.java 3454d62 giraph-core/src/main/java/org/apache/giraph/partition/PartitionOwner.java 0ac74da giraph-core/src/main/java/org/apache/giraph/partition/SimpleMasterPartitioner.java f128f34 giraph-core/src/main/java/org/apache/giraph/partition/SimpleWorkerPartitioner.java 3c0de44 giraph-core/src/main/java/org/apache/giraph/partition/WorkerGraphPartitioner.java 004ea81 giraph-core/src/main/java/org/apache/giraph/utils/InternalVertexRunner.java 2c4606f giraph-core/src/main/java/org/apache/giraph/utils/io/ExtendedDataInputOutput.java af45426 giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java de7af28 giraph-core/src/main/java/org/apache/giraph/worker/WorkerContext.java 29835c5 giraph-core/src/test/java/org/apache/giraph/partition/SimpleRangePartitionFactoryTest.java 96bd5d7 giraph-examples/src/test/java/org/apache/giraph/TestCheckpointing.java PRE-CREATION Diff: https://reviews.apache.org/r/23140/diff/ Testing --- I tested it running multiple different jobs. I ran PageRank on 2*10^9 vertices on 200 workers and it seems to work just fine. It only takes 2 minutes to save a checkpoint. Thanks, Sergey Edunov
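One way to address the concern raised in the review above (that the test could pass even without a real restart) is to assert on where the restarted run begins, not only on its final result. The following is a minimal, self-contained sketch under stated assumptions: `RestartCheck`, `run`, and `resumedFromCheckpoint` are hypothetical names for illustration and are not Giraph APIs, and the loop only simulates a job.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; none of these names come from Giraph. */
final class RestartCheck {
  /** Simulates a job and records every superstep it actually executes. */
  static List<Long> run(long firstSuperstep, long lastSuperstep) {
    List<Long> executed = new ArrayList<>();
    for (long s = firstSuperstep; s <= lastSuperstep; s++) {
      executed.add(s);
    }
    return executed;
  }

  /** True only if the first executed superstep is the checkpointed one. */
  static boolean resumedFromCheckpoint(List<Long> executed, long checkpointSuperstep) {
    return !executed.isEmpty() && executed.get(0) == checkpointSuperstep;
  }

  public static void main(String[] args) {
    long checkpointSuperstep = 2;
    List<Long> restarted = run(checkpointSuperstep, 4); // genuine restart
    List<Long> fromScratch = run(0, 4);                 // silent rerun from the beginning

    // Both runs end at the same superstep, so comparing only the end state
    // cannot tell a restart apart from a full rerun.
    Long lastRestarted = restarted.get(restarted.size() - 1);
    Long lastFromScratch = fromScratch.get(fromScratch.size() - 1);
    System.out.println(lastRestarted.equals(lastFromScratch));

    // Checking the first executed superstep does distinguish the two runs.
    System.out.println(resumedFromCheckpoint(restarted, checkpointSuperstep));
    System.out.println(resumedFromCheckpoint(fromScratch, checkpointSuperstep));
  }
}
```

In a real test the list of executed supersteps would have to come from the job itself (for example, recorded by a worker context); how that is wired up is left open here.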
Re: Review Request 23140: Fix checkpointing
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/23140/ --- (Updated July 16, 2014, 3:59 a.m.) Review request for giraph. Repository: giraph-git Description --- This fix merely makes checkpointing work again. Diffs (updated) - giraph-core/src/main/java/org/apache/giraph/aggregators/AggregatorWrapper.java 9613805 giraph-core/src/main/java/org/apache/giraph/bsp/BspService.java 2e35373 giraph-core/src/main/java/org/apache/giraph/comm/ServerData.java 85bfe04 giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java ab0570f giraph-core/src/main/java/org/apache/giraph/master/BspServiceMaster.java 0275395 giraph-core/src/main/java/org/apache/giraph/partition/BasicPartitionOwner.java 545d1af giraph-core/src/main/java/org/apache/giraph/partition/HashMasterPartitioner.java 240687e giraph-core/src/main/java/org/apache/giraph/partition/HashWorkerPartitioner.java d833895 giraph-core/src/main/java/org/apache/giraph/partition/MasterGraphPartitioner.java 50c750a giraph-core/src/main/java/org/apache/giraph/partition/PartitionBalancer.java 3454d62 giraph-core/src/main/java/org/apache/giraph/partition/PartitionOwner.java 0ac74da giraph-core/src/main/java/org/apache/giraph/partition/SimpleMasterPartitioner.java f128f34 giraph-core/src/main/java/org/apache/giraph/partition/SimpleWorkerPartitioner.java 3c0de44 giraph-core/src/main/java/org/apache/giraph/partition/WorkerGraphPartitioner.java 004ea81 giraph-core/src/main/java/org/apache/giraph/utils/InternalVertexRunner.java 2c4606f giraph-core/src/main/java/org/apache/giraph/utils/io/ExtendedDataInputOutput.java af45426 giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java de7af28 giraph-core/src/main/java/org/apache/giraph/worker/WorkerContext.java 29835c5 giraph-core/src/test/java/org/apache/giraph/partition/SimpleRangePartitionFactoryTest.java 96bd5d7 giraph-examples/src/test/java/org/apache/giraph/TestCheckpointing.java PRE-CREATION Diff: 
https://reviews.apache.org/r/23140/diff/ Testing --- I tested it running multiple different jobs. I ran PageRank on 2*10^9 vertices on 200 workers and it seems to work just fine. It only takes 2 minutes to save a checkpoint. Thanks, Sergey Edunov
Re: Review Request 23140: Fix checkpointing
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/23140/ --- (Updated July 2, 2014, 12:57 a.m.) Review request for giraph. Changes --- I removed aggregators serialization from MasterCompute and workers. Repository: giraph-git Description --- This fix merely makes checkpointing work again. Diffs (updated) - giraph-core/src/main/java/org/apache/giraph/aggregators/AggregatorWrapper.java 9613805 giraph-core/src/main/java/org/apache/giraph/bsp/BspService.java 2e35373 giraph-core/src/main/java/org/apache/giraph/comm/ServerData.java f0ecca2 giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java 7d7ceb2 giraph-core/src/main/java/org/apache/giraph/master/BspServiceMaster.java ad7e045 giraph-core/src/main/java/org/apache/giraph/master/MasterAggregatorHandler.java 325d91f giraph-core/src/main/java/org/apache/giraph/partition/BasicPartitionOwner.java 545d1af giraph-core/src/main/java/org/apache/giraph/partition/HashMasterPartitioner.java 240687e giraph-core/src/main/java/org/apache/giraph/partition/HashWorkerPartitioner.java d833895 giraph-core/src/main/java/org/apache/giraph/partition/MasterGraphPartitioner.java 50c750a giraph-core/src/main/java/org/apache/giraph/partition/PartitionBalancer.java 3454d62 giraph-core/src/main/java/org/apache/giraph/partition/PartitionOwner.java 0ac74da giraph-core/src/main/java/org/apache/giraph/partition/SimpleMasterPartitioner.java f128f34 giraph-core/src/main/java/org/apache/giraph/partition/SimpleWorkerPartitioner.java 3c0de44 giraph-core/src/main/java/org/apache/giraph/partition/WorkerGraphPartitioner.java 004ea81 giraph-core/src/main/java/org/apache/giraph/utils/InternalVertexRunner.java 09dd46d giraph-core/src/main/java/org/apache/giraph/utils/io/ExtendedDataInputOutput.java af45426 giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java 8dcf19a giraph-core/src/main/java/org/apache/giraph/worker/WorkerContext.java 17347db 
giraph-core/src/test/java/org/apache/giraph/partition/SimpleRangePartitionFactoryTest.java 96bd5d7 giraph-examples/src/test/java/org/apache/giraph/aggregators/TestAggregatorsHandling.java e2b611b giraph-examples/src/test/java/org/apache/giraph/master/TestAggregatorsHandling.java PRE-CREATION Diff: https://reviews.apache.org/r/23140/diff/ Testing --- I tested it running multiple different jobs. I ran PageRank on 2*10^9 vertices on 200 workers and it seems to work just fine. It only takes 2 minutes to save a checkpoint. Thanks, Sergey Edunov
Re: Review Request 23140: Fix checkpointing
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/23140/#review47169 ---

Thanks, much shorter now. Should we add some tests to make sure things don't get broken again?

giraph-core/src/main/java/org/apache/giraph/bsp/BspService.java
https://reviews.apache.org/r/23140/#comment82778
Why ignore superstep 0? For example, there might be a lot of filtering going on during the input superstep, and it's cheaper to restart from a checkpoint than to read all the data again.

giraph-core/src/main/java/org/apache/giraph/master/BspServiceMaster.java
https://reviews.apache.org/r/23140/#comment82781
Interesting, where do we rely on this?

giraph-core/src/main/java/org/apache/giraph/utils/io/ExtendedDataInputOutput.java
https://reviews.apache.org/r/23140/#comment82777
Nice bug ;-)

giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java
https://reviews.apache.org/r/23140/#comment82787
This is what output threads are called, please name these differently.

giraph-core/src/main/java/org/apache/giraph/worker/WorkerContext.java
https://reviews.apache.org/r/23140/#comment82775
We are not using Serializable - what's transient here for?

giraph-examples/src/test/java/org/apache/giraph/master/TestAggregatorsHandling.java
https://reviews.apache.org/r/23140/#comment82772
Why did you move this file?

- Maja Kabiljo

On July 2, 2014, 12:57 a.m., Sergey Edunov wrote:
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/23140/ ---
(Updated July 2, 2014, 12:57 a.m.)
Review request for giraph.
Repository: giraph-git
Description --- This fix merely makes checkpointing work again.
Diffs - giraph-core/src/main/java/org/apache/giraph/aggregators/AggregatorWrapper.java 9613805 giraph-core/src/main/java/org/apache/giraph/bsp/BspService.java 2e35373 giraph-core/src/main/java/org/apache/giraph/comm/ServerData.java f0ecca2 giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java 7d7ceb2 giraph-core/src/main/java/org/apache/giraph/master/BspServiceMaster.java ad7e045 giraph-core/src/main/java/org/apache/giraph/master/MasterAggregatorHandler.java 325d91f giraph-core/src/main/java/org/apache/giraph/partition/BasicPartitionOwner.java 545d1af giraph-core/src/main/java/org/apache/giraph/partition/HashMasterPartitioner.java 240687e giraph-core/src/main/java/org/apache/giraph/partition/HashWorkerPartitioner.java d833895 giraph-core/src/main/java/org/apache/giraph/partition/MasterGraphPartitioner.java 50c750a giraph-core/src/main/java/org/apache/giraph/partition/PartitionBalancer.java 3454d62 giraph-core/src/main/java/org/apache/giraph/partition/PartitionOwner.java 0ac74da giraph-core/src/main/java/org/apache/giraph/partition/SimpleMasterPartitioner.java f128f34 giraph-core/src/main/java/org/apache/giraph/partition/SimpleWorkerPartitioner.java 3c0de44 giraph-core/src/main/java/org/apache/giraph/partition/WorkerGraphPartitioner.java 004ea81 giraph-core/src/main/java/org/apache/giraph/utils/InternalVertexRunner.java 09dd46d giraph-core/src/main/java/org/apache/giraph/utils/io/ExtendedDataInputOutput.java af45426 giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java 8dcf19a giraph-core/src/main/java/org/apache/giraph/worker/WorkerContext.java 17347db giraph-core/src/test/java/org/apache/giraph/partition/SimpleRangePartitionFactoryTest.java 96bd5d7 giraph-examples/src/test/java/org/apache/giraph/aggregators/TestAggregatorsHandling.java e2b611b giraph-examples/src/test/java/org/apache/giraph/master/TestAggregatorsHandling.java PRE-CREATION Diff: https://reviews.apache.org/r/23140/diff/ Testing --- I tested it running multiple 
different jobs. I ran PageRank on 2*10^9 vertices on 200 workers and it seems to work just fine. It only takes 2 minutes to save a checkpoint. Thanks, Sergey Edunov
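On the `transient` question in the review above: the modifier only affects Java's built-in `java.io.Serializable` mechanism, while a hand-written `write(DataOutput)` method in the Hadoop `Writable` style decides explicitly which fields to emit. The sketch below illustrates the difference. `TransientDemo` and `State` are hypothetical names (this is not Giraph's `WorkerContext`), and `State` implements `Serializable` purely for the sake of comparison.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

/** Hypothetical illustration only; not taken from the patch under review. */
final class TransientDemo {
  /** State holder with a transient field and a hand-written write method. */
  static final class State implements Serializable {
    private static final long serialVersionUID = 1L;
    transient int counter;

    State(int counter) {
      this.counter = counter;
    }

    /** Writable-style serialization: the transient modifier plays no role here. */
    void write(DataOutput out) throws IOException {
      out.writeInt(counter);
    }
  }

  /** Round-trips through Java serialization and returns the restored counter. */
  static int viaJavaSerialization(State state) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
        out.writeObject(state);
      }
      try (ObjectInputStream in =
          new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
        return ((State) in.readObject()).counter;
      }
    } catch (IOException | ClassNotFoundException e) {
      throw new IllegalStateException(e);
    }
  }

  /** Round-trips through the explicit write method and returns the counter read back. */
  static int viaExplicitWrite(State state) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (DataOutputStream out = new DataOutputStream(bytes)) {
        state.write(out);
      }
      try (DataInputStream in =
          new DataInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
        return in.readInt();
      }
    } catch (IOException e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    State state = new State(7);
    // Java serialization skips the transient field, so it is restored as the default 0.
    System.out.println(viaJavaSerialization(state));
    // The explicit write method emits the field regardless of the modifier.
    System.out.println(viaExplicitWrite(state));
  }
}
```

If a class is only ever persisted through explicit `write`/`readFields`-style methods, `transient` has no effect on what ends up in the checkpoint, which is the point the reviewer is making.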
Re: Review Request 23140: Fix checkpointing
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/23140/#review47023 ---

I see a lot of the changes are related to aggregators, and you now write them from master, worker and MasterCompute - can't we write them just once and go through the normal path of distributing them at the beginning of the superstep?

- Maja Kabiljo

On June 27, 2014, 8:48 p.m., Sergey Edunov wrote:
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/23140/ ---
(Updated June 27, 2014, 8:48 p.m.)
Review request for giraph.
Repository: giraph-git
Description --- This fix merely makes checkpointing work again.
Diffs - giraph-core/src/main/java/org/apache/giraph/aggregators/Aggregator.java 514e470 giraph-core/src/main/java/org/apache/giraph/aggregators/AggregatorHandler.java PRE-CREATION giraph-core/src/main/java/org/apache/giraph/aggregators/AggregatorWrapper.java 9613805 giraph-core/src/main/java/org/apache/giraph/aggregators/BasicAggregator.java 07a4100 giraph-core/src/main/java/org/apache/giraph/bsp/BspService.java 2e35373 giraph-core/src/main/java/org/apache/giraph/comm/ServerData.java f0ecca2 giraph-core/src/main/java/org/apache/giraph/comm/aggregators/AllAggregatorServerData.java 177e738 giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java 7d7ceb2 giraph-core/src/main/java/org/apache/giraph/master/BspServiceMaster.java ad7e045 giraph-core/src/main/java/org/apache/giraph/master/DefaultMasterCompute.java bfb6f0e giraph-core/src/main/java/org/apache/giraph/master/MasterAggregatorHandler.java 325d91f giraph-core/src/main/java/org/apache/giraph/master/MasterCompute.java d77a9b5 giraph-core/src/main/java/org/apache/giraph/master/WritableMasterAggregatorUsage.java PRE-CREATION giraph-core/src/main/java/org/apache/giraph/partition/BasicPartitionOwner.java 545d1af giraph-core/src/main/java/org/apache/giraph/partition/HashMasterPartitioner.java 240687e
giraph-core/src/main/java/org/apache/giraph/partition/HashWorkerPartitioner.java d833895 giraph-core/src/main/java/org/apache/giraph/partition/MasterGraphPartitioner.java 50c750a giraph-core/src/main/java/org/apache/giraph/partition/PartitionBalancer.java 3454d62 giraph-core/src/main/java/org/apache/giraph/partition/PartitionOwner.java 0ac74da giraph-core/src/main/java/org/apache/giraph/partition/SimpleMasterPartitioner.java f128f34 giraph-core/src/main/java/org/apache/giraph/partition/SimpleWorkerPartitioner.java 3c0de44 giraph-core/src/main/java/org/apache/giraph/partition/WorkerGraphPartitioner.java 004ea81 giraph-core/src/main/java/org/apache/giraph/utils/InternalVertexRunner.java 09dd46d giraph-core/src/main/java/org/apache/giraph/utils/io/ExtendedDataInputOutput.java af45426 giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java 8dcf19a giraph-core/src/main/java/org/apache/giraph/worker/WorkerAggregatorHandler.java 9bfd7b5 giraph-core/src/main/java/org/apache/giraph/worker/WorkerContext.java 17347db giraph-core/src/main/java/org/apache/giraph/worker/WorkerThreadAggregatorUsage.java 194127e giraph-core/src/main/java/org/apache/giraph/worker/WritableWorkerAggregatorUsage.java PRE-CREATION giraph-core/src/test/java/org/apache/giraph/partition/SimpleRangePartitionFactoryTest.java 96bd5d7 giraph-examples/src/test/java/org/apache/giraph/aggregators/TestAggregatorsHandling.java e2b611b Diff: https://reviews.apache.org/r/23140/diff/ Testing --- I tested it running multiple different jobs. I ran PageRank on 2*10^9 vertices on 200 workers and it seems to work just fine. It only takes 2 minutes to save a checkpoint. Thanks, Sergey Edunov
Review Request 23140: Fix checkpointing
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/23140/ --- Review request for giraph. Repository: giraph-git Description --- This fix merely makes checkpointing work again. Diffs - giraph-core/src/main/java/org/apache/giraph/aggregators/Aggregator.java 514e470 giraph-core/src/main/java/org/apache/giraph/aggregators/AggregatorHandler.java PRE-CREATION giraph-core/src/main/java/org/apache/giraph/aggregators/AggregatorWrapper.java 9613805 giraph-core/src/main/java/org/apache/giraph/aggregators/BasicAggregator.java 07a4100 giraph-core/src/main/java/org/apache/giraph/bsp/BspService.java 2e35373 giraph-core/src/main/java/org/apache/giraph/comm/ServerData.java f0ecca2 giraph-core/src/main/java/org/apache/giraph/comm/aggregators/AllAggregatorServerData.java 177e738 giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java 7d7ceb2 giraph-core/src/main/java/org/apache/giraph/master/BspServiceMaster.java ad7e045 giraph-core/src/main/java/org/apache/giraph/master/DefaultMasterCompute.java bfb6f0e giraph-core/src/main/java/org/apache/giraph/master/MasterAggregatorHandler.java 325d91f giraph-core/src/main/java/org/apache/giraph/master/MasterCompute.java d77a9b5 giraph-core/src/main/java/org/apache/giraph/master/WritableMasterAggregatorUsage.java PRE-CREATION giraph-core/src/main/java/org/apache/giraph/partition/BasicPartitionOwner.java 545d1af giraph-core/src/main/java/org/apache/giraph/partition/HashMasterPartitioner.java 240687e giraph-core/src/main/java/org/apache/giraph/partition/HashWorkerPartitioner.java d833895 giraph-core/src/main/java/org/apache/giraph/partition/MasterGraphPartitioner.java 50c750a giraph-core/src/main/java/org/apache/giraph/partition/PartitionBalancer.java 3454d62 giraph-core/src/main/java/org/apache/giraph/partition/PartitionOwner.java 0ac74da giraph-core/src/main/java/org/apache/giraph/partition/SimpleMasterPartitioner.java f128f34 
giraph-core/src/main/java/org/apache/giraph/partition/SimpleWorkerPartitioner.java 3c0de44 giraph-core/src/main/java/org/apache/giraph/partition/WorkerGraphPartitioner.java 004ea81 giraph-core/src/main/java/org/apache/giraph/utils/InternalVertexRunner.java 09dd46d giraph-core/src/main/java/org/apache/giraph/utils/io/ExtendedDataInputOutput.java af45426 giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java 8dcf19a giraph-core/src/main/java/org/apache/giraph/worker/WorkerAggregatorHandler.java 9bfd7b5 giraph-core/src/main/java/org/apache/giraph/worker/WorkerContext.java 17347db giraph-core/src/main/java/org/apache/giraph/worker/WorkerThreadAggregatorUsage.java 194127e giraph-core/src/main/java/org/apache/giraph/worker/WritableWorkerAggregatorUsage.java PRE-CREATION giraph-core/src/test/java/org/apache/giraph/partition/SimpleRangePartitionFactoryTest.java 96bd5d7 giraph-examples/src/test/java/org/apache/giraph/aggregators/TestAggregatorsHandling.java e2b611b Diff: https://reviews.apache.org/r/23140/diff/ Testing --- I tested it running multiple different jobs. I ran PageRank on 2*10^9 vertices on 200 workers and it seems to work just fine. It only takes 2 minutes to save a checkpoint. Thanks, Sergey Edunov