Rob Vesse created GIRAPH-808:
--------------------------------

             Summary: Giraph should report progress more accurately when running on Map/Reduce
                 Key: GIRAPH-808
                 URL: https://issues.apache.org/jira/browse/GIRAPH-808
             Project: Giraph
          Issue Type: Improvement
    Affects Versions: 1.0.0, 1.1.0
            Reporter: Rob Vesse


The current way that Giraph reports progress when running on Map/Reduce seems 
rather flawed.  When running a Giraph program, the map tasks are launched and, 
after initialisation, their progress almost immediately goes to 100% and stays 
there throughout.  So the only way to monitor progress is to 

I appreciate that there is no way for Giraph to report accurate progress, since 
it does not know in advance how many supersteps there will be, but it could 
report progress in a more useful way.

For example:
- First N percent of progress is the input phase.  This part could likely be 
calculated accurately by using the standard Hadoop input APIs which Giraph 
input is built on
- Next N percent of progress is an estimate that trends towards the final 
value but does not reach it until the computation has halted, i.e. (Superstep + 
1) / (N - 1), so this will naturally trend towards N.  Once the computation 
halts, the value becomes N
- Last N percent of progress is the output phase.  Again, this part could 
likely be calculated accurately and easily, since Giraph knows how many items 
it has to output
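
A rough sketch of the three-phase idea above, in plain Java.  This is not 
existing Giraph or Hadoop code: the class name PhasedProgress, the 20/60/20 
split between phases and the compute-phase estimate 1 - 1/(superstep + 2) are 
all assumptions made for illustration.  The estimate is a stand-in for the one 
in the list, chosen only because it rises with every superstep and never 
reaches 1 until the computation is marked as halted.

```java
/** Hypothetical helper for illustration; not part of Giraph or Hadoop. */
final class PhasedProgress {
    // Assumed share of the overall progress given to each phase (sums to 1).
    static final float INPUT_SHARE = 0.2f;
    static final float COMPUTE_SHARE = 0.6f;
    static final float OUTPUT_SHARE = 0.2f;

    private PhasedProgress() { }

    /** Overall progress during input; fractionRead is the part of the input consumed, in [0, 1]. */
    static float duringInput(float fractionRead) {
        return INPUT_SHARE * clamp(fractionRead);
    }

    /** Overall progress during computation; superstep is assumed to be zero-based. */
    static float duringCompute(long superstep, boolean halted) {
        if (halted) {
            // Only a halted computation fills the whole compute share.
            return INPUT_SHARE + COMPUTE_SHARE;
        }
        long s = Math.max(0L, superstep);
        // Assumed estimate: 1/2, 2/3, 3/4, ... approaches 1 but never reaches it.
        float fraction = 1.0f - 1.0f / (s + 2);
        return INPUT_SHARE + COMPUTE_SHARE * fraction;
    }

    /** Overall progress during output, given items written so far and the total to write. */
    static float duringOutput(long written, long total) {
        float fraction = total <= 0 ? 1.0f : clamp((float) written / total);
        return INPUT_SHARE + COMPUTE_SHARE + OUTPUT_SHARE * fraction;
    }

    private static float clamp(float v) {
        return Math.max(0.0f, Math.min(1.0f, v));
    }
}
```

With these assumed shares, a task halfway through its input would report 10%, 
a task in superstep 0 would report 50%, and the value would keep creeping 
towards 80% with each further superstep, only reaching 80% once the 
computation halts and the output phase begins.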

What does anybody else think?



