> On Jan. 28, 2014, 6:50 a.m., Avery Ching wrote:
> > giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java, 
> > lines 1009-1011
> > <https://reviews.apache.org/r/17336/diff/1/?file=450670#file450670line1009>
> >
> >     Should we make it true by default if there isn't much overhead?

Sure, changed


> On Jan. 28, 2014, 6:50 a.m., Avery Ching wrote:
> > giraph-core/src/main/java/org/apache/giraph/worker/WorkerProgressWriter.java,
> >  line 32
> > <https://reviews.apache.org/r/17336/diff/1/?file=450683#file450683line32>
> >
> >     Every 5 seconds seems quite often, no? ZooKeeper is limited in its 
> > write throughput. Maybe every 10 or 15 seconds would be better?

Modified. I made it a random value between 10 and 20 seconds, so that not all 
workers try to write at the same time.
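
For illustration, a randomized write interval of this kind could be picked 
roughly as follows. This is a minimal sketch, not the code from the patch: the 
class name, the constant names, and the use of ThreadLocalRandom are all 
assumptions made for the example.

```java
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical helper; not part of Giraph. */
class WriteIntervalSketch {
  /** Lower bound of the write interval in milliseconds (inclusive). */
  static final long MIN_INTERVAL_MS = 10_000L;
  /** Upper bound of the write interval in milliseconds (exclusive). */
  static final long MAX_INTERVAL_MS = 20_000L;

  /**
   * Pick a per-worker interval uniformly from [10s, 20s), so that workers
   * started at the same moment do not all write in the same instant.
   */
  static long pickWriteIntervalMs() {
    return ThreadLocalRandom.current()
        .nextLong(MIN_INTERVAL_MS, MAX_INTERVAL_MS);
  }

  public static void main(String[] args) {
    // Each simulated worker draws its own interval.
    for (int worker = 0; worker < 5; worker++) {
      System.out.println(
          "worker " + worker + " writes every " + pickWriteIntervalMs() + " ms");
    }
  }
}
```

With intervals spread over a 10-second window, writes from many workers are 
less likely to arrive at ZooKeeper in a single burst.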


> On Jan. 28, 2014, 6:50 a.m., Avery Ching wrote:
> > giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java, 
> > line 1012
> > <https://reviews.apache.org/r/17336/diff/1/?file=450679#file450679line1012>
> >
> >     Maybe a better way would be time-based, since the rest of the update 
> > logic is time-based? I.e., if interval / 2 seconds have passed, update.

The time-based logic is for ZooKeeper writes; this just updates local counters. 
The other places (input, compute) also update per number of vertices.
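
To illustrate what updating per number of vertices means here, below is a 
minimal sketch under stated assumptions. The class name, the batch size of 
1000, and the AtomicLong standing in for the local progress counter are 
hypothetical and are not taken from the patch.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical count-based progress updater; not part of Giraph. */
class VertexProgressSketch {
  /** Assumed batch size: publish after this many vertices. */
  static final int VERTICES_PER_UPDATE = 1000;

  /** Shared local counter that a separate writer thread could read. */
  private final AtomicLong verticesComputed;
  /** Vertices processed by this thread since the last publish. */
  private int pending;

  VertexProgressSketch(AtomicLong verticesComputed) {
    this.verticesComputed = verticesComputed;
  }

  /** Call once per computed vertex; publishes only every N vertices. */
  void vertexComputed() {
    pending++;
    if (pending == VERTICES_PER_UPDATE) {
      flush();
    }
  }

  /** Publish any remainder, e.g. when a partition is finished. */
  void flush() {
    if (pending > 0) {
      verticesComputed.addAndGet(pending);
      pending = 0;
    }
  }
}
```

The per-vertex cost is a local increment and a comparison. The shared counter 
is touched once per batch, and nothing in this path talks to ZooKeeper.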


- Maja


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/17336/#review32958
-----------------------------------------------------------


On Jan. 28, 2014, 5:36 p.m., Maja Kabiljo wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/17336/
> -----------------------------------------------------------
> 
> (Updated Jan. 28, 2014, 5:36 p.m.)
> 
> 
> Review request for giraph.
> 
> 
> Bugs: GIRAPH-792
>     https://issues.apache.org/jira/browse/GIRAPH-792
> 
> 
> Repository: giraph-git
> 
> 
> Description
> -------
> 
> Currently we print nothing about job progress to the command line. We should 
> track which stage we are in and how far along in it we are.
> 
> 
> Diffs
> -----
> 
>   giraph-core/src/main/java/org/apache/giraph/bsp/BspService.java 86823ed
>   giraph-core/src/main/java/org/apache/giraph/conf/GiraphConfiguration.java c8b7d36
>   giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java 63f38df
>   giraph-core/src/main/java/org/apache/giraph/graph/ComputeCallable.java 1fe1d10
>   giraph-core/src/main/java/org/apache/giraph/graph/GraphTaskManager.java f31d99e
>   giraph-core/src/main/java/org/apache/giraph/job/CombinedWorkerProgress.java PRE-CREATION
>   giraph-core/src/main/java/org/apache/giraph/job/GiraphJob.java 40670bb
>   giraph-core/src/main/java/org/apache/giraph/job/HaltApplicationUtils.java 28b5781
>   giraph-core/src/main/java/org/apache/giraph/job/JobProgressTracker.java PRE-CREATION
>   giraph-core/src/main/java/org/apache/giraph/master/BspServiceMaster.java 78487ef
>   giraph-core/src/main/java/org/apache/giraph/utils/CounterUtils.java PRE-CREATION
>   giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java bc29b03
>   giraph-core/src/main/java/org/apache/giraph/worker/EdgeInputSplitsCallable.java 8ec0453
>   giraph-core/src/main/java/org/apache/giraph/worker/VertexInputSplitsCallable.java 01a6fc5
>   giraph-core/src/main/java/org/apache/giraph/worker/WorkerProgress.java PRE-CREATION
>   giraph-core/src/main/java/org/apache/giraph/worker/WorkerProgressWriter.java PRE-CREATION
> 
> Diff: https://reviews.apache.org/r/17336/diff/
> 
> 
> Testing
> -------
> 
> mvn clean verify
> 
> Ran on a cluster with 50 workers and checked that there is no overhead; 
> sample output:
> 14/01/24 14:42:33 INFO job.JobProgressTracker: Data from 50 workers - Loading 
> data: 315250000 vertices loaded, 0 vertex input splits loaded; 0 edges 
> loaded, 0 edge input splits loaded
> 14/01/24 14:42:42 INFO job.JobProgressTracker: Data from 50 workers - Loading 
> data: 441000000 vertices loaded, 64 vertex input splits loaded; 0 edges 
> loaded, 0 edge input splits loaded
> 14/01/24 14:42:51 INFO job.JobProgressTracker: Data from 50 workers - Loading 
> data: 494250000 vertices loaded, 234 vertex input splits loaded; 0 edges 
> loaded, 0 edge input splits loaded
> 14/01/24 14:43:00 INFO job.JobProgressTracker: Data from 50 workers - Loading 
> data: 498750000 vertices loaded, 247 vertex input splits loaded; 0 edges 
> loaded, 0 edge input splits loaded
> 14/01/24 14:43:09 INFO job.JobProgressTracker: Data from 50 workers - Loading 
> data: 499500000 vertices loaded, 249 vertex input splits loaded; 0 edges 
> loaded, 0 edge input splits loaded
> 14/01/24 14:43:18 INFO job.JobProgressTracker: Data from 47 workers - Compute 
> superstep 0: 6800000 out of 470000000 vertices computed; 0 out of 2350 
> partitions computed
> 14/01/24 14:43:27 INFO job.JobProgressTracker: Data from 50 workers - Compute 
> superstep 0: 133200000 out of 500000000 vertices computed; 332 out of 2500 
> partitions computed
> 14/01/24 14:43:36 INFO job.JobProgressTracker: Data from 50 workers - Compute 
> superstep 0: 304500000 out of 500000000 vertices computed; 1080 out of 2500 
> partitions computed
> 14/01/24 14:43:46 INFO job.JobProgressTracker: Data from 50 workers - Compute 
> superstep 0: 467300000 out of 500000000 vertices computed; 2203 out of 2500 
> partitions computed
> 14/01/24 14:43:54 INFO job.JobProgressTracker: Data from 50 workers - Compute 
> superstep 0: 500000000 out of 500000000 vertices computed; 2500 out of 2500 
> partitions computed
> 14/01/24 14:44:04 INFO job.JobProgressTracker: Data from 50 workers - Compute 
> superstep 0: 500000000 out of 500000000 vertices computed; 2500 out of 2500 
> partitions computed
> 14/01/24 14:44:13 INFO job.JobProgressTracker: Data from 50 workers - Compute 
> superstep 0: 500000000 out of 500000000 vertices computed; 2500 out of 2500 
> partitions computed
> 14/01/24 14:44:13 INFO mapred.ExpireTasks: Starting launching task sweep
> 14/01/24 14:44:21 INFO job.JobProgressTracker: Data from 50 workers - Compute 
> superstep 0: 500000000 out of 500000000 vertices computed; 2500 out of 2500 
> partitions computed
> 14/01/24 14:44:30 INFO job.JobProgressTracker: Data from 50 workers - Compute 
> superstep 0: 500000000 out of 500000000 vertices computed; 2500 out of 2500 
> partitions computed
> 14/01/24 14:44:39 INFO job.JobProgressTracker: Data from 50 workers - Compute 
> superstep 0: 500000000 out of 500000000 vertices computed; 2500 out of 2500 
> partitions computed
> 14/01/24 14:44:48 INFO job.JobProgressTracker: Data from 13 workers - Compute 
> superstep 1: 0 out of 130000000 vertices computed; 0 out of 650 partitions 
> computed
> 14/01/24 14:44:57 INFO job.JobProgressTracker: Data from 50 workers - Compute 
> superstep 1: 112600000 out of 500000000 vertices computed; 159 out of 2500 
> partitions computed
> 14/01/24 14:45:06 INFO job.JobProgressTracker: Data from 50 workers - Compute 
> superstep 1: 268600000 out of 500000000 vertices computed; 1003 out of 2500 
> partitions computed
> 14/01/24 14:45:15 INFO job.JobProgressTracker: Data from 50 workers - Compute 
> superstep 1: 418600000 out of 500000000 vertices computed; 2002 out of 2500 
> partitions computed
> 14/01/24 14:45:24 INFO job.JobProgressTracker: Data from 50 workers - Compute 
> superstep 1: 499400000 out of 500000000 vertices computed; 2494 out of 2500 
> partitions computed
> 
> 
> Thanks,
> 
> Maja Kabiljo
> 
>
