> On Jan. 28, 2014, 6:50 a.m., Avery Ching wrote:
> > giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java, lines 1009-1011
> > <https://reviews.apache.org/r/17336/diff/1/?file=450670#file450670line1009>
> >
> > Should we make it true by default if there isn't much overhead?

Sure, changed.


> On Jan. 28, 2014, 6:50 a.m., Avery Ching wrote:
> > giraph-core/src/main/java/org/apache/giraph/worker/WorkerProgressWriter.java, line 32
> > <https://reviews.apache.org/r/17336/diff/1/?file=450683#file450683line32>
> >
> > Every 5 seconds seems quite often, no? ZooKeeper is limited in its
> > write throughput. Maybe every 10 or 15 seconds is a bit better?

Modified. I made it a random value between 10 and 20 seconds so that not all workers try to write at the same time.


> On Jan. 28, 2014, 6:50 a.m., Avery Ching wrote:
> > giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java, line 1012
> > <https://reviews.apache.org/r/17336/diff/1/?file=450679#file450679line1012>
> >
> > Maybe a better way would be time based, since the rest of the update
> > logic is time-based? I.e. if the interval / 2 seconds have passed...update.

The time-based logic is for ZooKeeper writes; this just updates local counters. The other places (input, compute) also update per number of vertices.


- Maja


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/17336/#review32958
-----------------------------------------------------------


On Jan. 28, 2014, 5:36 p.m., Maja Kabiljo wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/17336/
> -----------------------------------------------------------
>
> (Updated Jan. 28, 2014, 5:36 p.m.)
>
>
> Review request for giraph.
>
>
> Bugs: GIRAPH-792
>     https://issues.apache.org/jira/browse/GIRAPH-792
>
>
> Repository: giraph-git
>
>
> Description
> -------
>
> Currently we print nothing about job progress to the command line. We should
> track which stage we are in and how far along in it we are.
>
>
> Diffs
> -----
>
>   giraph-core/src/main/java/org/apache/giraph/bsp/BspService.java 86823ed
>   giraph-core/src/main/java/org/apache/giraph/conf/GiraphConfiguration.java c8b7d36
>   giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java 63f38df
>   giraph-core/src/main/java/org/apache/giraph/graph/ComputeCallable.java 1fe1d10
>   giraph-core/src/main/java/org/apache/giraph/graph/GraphTaskManager.java f31d99e
>   giraph-core/src/main/java/org/apache/giraph/job/CombinedWorkerProgress.java PRE-CREATION
>   giraph-core/src/main/java/org/apache/giraph/job/GiraphJob.java 40670bb
>   giraph-core/src/main/java/org/apache/giraph/job/HaltApplicationUtils.java 28b5781
>   giraph-core/src/main/java/org/apache/giraph/job/JobProgressTracker.java PRE-CREATION
>   giraph-core/src/main/java/org/apache/giraph/master/BspServiceMaster.java 78487ef
>   giraph-core/src/main/java/org/apache/giraph/utils/CounterUtils.java PRE-CREATION
>   giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java bc29b03
>   giraph-core/src/main/java/org/apache/giraph/worker/EdgeInputSplitsCallable.java 8ec0453
>   giraph-core/src/main/java/org/apache/giraph/worker/VertexInputSplitsCallable.java 01a6fc5
>   giraph-core/src/main/java/org/apache/giraph/worker/WorkerProgress.java PRE-CREATION
>   giraph-core/src/main/java/org/apache/giraph/worker/WorkerProgressWriter.java PRE-CREATION
>
> Diff: https://reviews.apache.org/r/17336/diff/
>
>
> Testing
> -------
>
> mvn clean verify
>
> run on a cluster, checked there is no overhead (with 50 workers); sample output:
> 14/01/24 14:42:33 INFO job.JobProgressTracker: Data from 50 workers - Loading data: 315250000 vertices loaded, 0 vertex input splits loaded; 0 edges loaded, 0 edge input splits loaded
> 14/01/24 14:42:42 INFO job.JobProgressTracker: Data from 50 workers - Loading data: 441000000 vertices loaded, 64 vertex input splits loaded; 0 edges loaded, 0 edge input splits loaded
> 14/01/24 14:42:51 INFO job.JobProgressTracker: Data from 50 workers - Loading data: 494250000 vertices loaded, 234 vertex input splits loaded; 0 edges loaded, 0 edge input splits loaded
> 14/01/24 14:43:00 INFO job.JobProgressTracker: Data from 50 workers - Loading data: 498750000 vertices loaded, 247 vertex input splits loaded; 0 edges loaded, 0 edge input splits loaded
> 14/01/24 14:43:09 INFO job.JobProgressTracker: Data from 50 workers - Loading data: 499500000 vertices loaded, 249 vertex input splits loaded; 0 edges loaded, 0 edge input splits loaded
> 14/01/24 14:43:18 INFO job.JobProgressTracker: Data from 47 workers - Compute superstep 0: 6800000 out of 470000000 vertices computed; 0 out of 2350 partitions computed
> 14/01/24 14:43:27 INFO job.JobProgressTracker: Data from 50 workers - Compute superstep 0: 133200000 out of 500000000 vertices computed; 332 out of 2500 partitions computed
> 14/01/24 14:43:36 INFO job.JobProgressTracker: Data from 50 workers - Compute superstep 0: 304500000 out of 500000000 vertices computed; 1080 out of 2500 partitions computed
> 14/01/24 14:43:46 INFO job.JobProgressTracker: Data from 50 workers - Compute superstep 0: 467300000 out of 500000000 vertices computed; 2203 out of 2500 partitions computed
> 14/01/24 14:43:54 INFO job.JobProgressTracker: Data from 50 workers - Compute superstep 0: 500000000 out of 500000000 vertices computed; 2500 out of 2500 partitions computed
> 14/01/24 14:44:04 INFO job.JobProgressTracker: Data from 50 workers - Compute superstep 0: 500000000 out of 500000000 vertices computed; 2500 out of 2500 partitions computed
> 14/01/24 14:44:13 INFO job.JobProgressTracker: Data from 50 workers - Compute superstep 0: 500000000 out of 500000000 vertices computed; 2500 out of 2500 partitions computed
> 14/01/24 14:44:13 INFO mapred.ExpireTasks: Starting launching task sweep
> 14/01/24 14:44:21 INFO job.JobProgressTracker: Data from 50 workers - Compute superstep 0: 500000000 out of 500000000 vertices computed; 2500 out of 2500 partitions computed
> 14/01/24 14:44:30 INFO job.JobProgressTracker: Data from 50 workers - Compute superstep 0: 500000000 out of 500000000 vertices computed; 2500 out of 2500 partitions computed
> 14/01/24 14:44:39 INFO job.JobProgressTracker: Data from 50 workers - Compute superstep 0: 500000000 out of 500000000 vertices computed; 2500 out of 2500 partitions computed
> 14/01/24 14:44:48 INFO job.JobProgressTracker: Data from 13 workers - Compute superstep 1: 0 out of 130000000 vertices computed; 0 out of 650 partitions computed
> 14/01/24 14:44:57 INFO job.JobProgressTracker: Data from 50 workers - Compute superstep 1: 112600000 out of 500000000 vertices computed; 159 out of 2500 partitions computed
> 14/01/24 14:45:06 INFO job.JobProgressTracker: Data from 50 workers - Compute superstep 1: 268600000 out of 500000000 vertices computed; 1003 out of 2500 partitions computed
> 14/01/24 14:45:15 INFO job.JobProgressTracker: Data from 50 workers - Compute superstep 1: 418600000 out of 500000000 vertices computed; 2002 out of 2500 partitions computed
> 14/01/24 14:45:24 INFO job.JobProgressTracker: Data from 50 workers - Compute superstep 1: 499400000 out of 500000000 vertices computed; 2494 out of 2500 partitions computed
>
>
> Thanks,
>
> Maja Kabiljo
>
>
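
For readers following the WorkerProgressWriter discussion, the idea of a randomized 10-20 second write interval could be sketched roughly as below. This is only an illustration and is not taken from the patch: the class name JitteredInterval, the method nextDelayMillis, and the choice to treat the upper bound as exclusive are all assumptions, and the sketch does not say whether the real code picks the value once per worker or before every write.

```java
import java.util.Random;

/** Hypothetical helper for illustration only; not part of the Giraph patch. */
final class JitteredInterval {
  /** Lower bound mentioned in the reply: 10 seconds. */
  static final long MIN_DELAY_MS = 10_000L;
  /** Upper bound mentioned in the reply: 20 seconds (exclusive in this sketch). */
  static final long MAX_DELAY_MS = 20_000L;

  private JitteredInterval() { }

  /**
   * Returns a delay in [MIN_DELAY_MS, MAX_DELAY_MS) so that workers
   * started at the same moment do not all write at the same moment.
   */
  static long nextDelayMillis(Random random) {
    int range = (int) (MAX_DELAY_MS - MIN_DELAY_MS);
    return MIN_DELAY_MS + random.nextInt(range);
  }

  public static void main(String[] args) {
    Random random = new Random();
    // Simulate a few workers each choosing their own write delay.
    for (int worker = 0; worker < 5; worker++) {
      System.out.println(
          "worker " + worker + " waits " + nextDelayMillis(random) + " ms");
    }
  }
}
```

With 50 workers and a uniform spread over a 10 second window, writes would land on average about 200 ms apart instead of arriving in a single burst, which is the effect the reply is after.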
