Rob Vesse created GIRAPH-808:
--------------------------------
Summary: Giraph should report progress more accurately when
running on Map/Reduce
Key: GIRAPH-808
URL: https://issues.apache.org/jira/browse/GIRAPH-808
Project: Giraph
Issue Type: Improvement
Affects Versions: 1.0.0, 1.1.0
Reporter: Rob Vesse
The current way that Giraph reports progress when running on Map/Reduce seems
rather flawed. When running a Giraph program, the map tasks are launched and,
after initialisation, their progress almost immediately goes to 100% and stays
there throughout. So the only way to monitor progress is to
I appreciate that there is no way for Giraph to report accurate progress, since
it does not know in advance how many supersteps there will be, but it could
report progress in a more useful way.
For example:
- First N percent of progress is the input phase. This part could likely be
accurately calculated by using the standard Hadoop input APIs on which Giraph
input is built.
- Next N percent of progress is an estimate that trends towards the final
value but does not reach it until the computation has halted, i.e. (Superstep +
1) / (N - 1), so this will naturally trend towards N. Once the computation
halts, the value becomes N.
- Last N percent of progress is the output phase. Again, this part could
likely be calculated accurately and easily, since Giraph knows how many items
it has to output.
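To make the three phases above concrete, here is a minimal sketch of how the
overall progress value might be computed. It is not Giraph code: the class
PhasedProgress, its method names and the phase shares are all hypothetical.
For the compute phase it assumes the estimate (superstep + 1) / (superstep + 2)
in place of the formula quoted above, purely because that value rises with
every superstep yet stays below 1 until the computation halts. How the result
would be handed to Hadoop is deliberately left out.

```java
/**
 * Hypothetical helper (not part of Giraph) that maps the input, compute and
 * output phases onto a single progress value in the range [0, 1].
 */
final class PhasedProgress {
    private final double inputShare;   // fraction of the bar given to input
    private final double computeShare; // fraction given to the supersteps
    private final double outputShare;  // whatever remains, given to output

    PhasedProgress(double inputShare, double computeShare) {
        if (inputShare < 0 || computeShare < 0 || inputShare + computeShare > 1) {
            throw new IllegalArgumentException("shares must be >= 0 and sum to <= 1");
        }
        this.inputShare = inputShare;
        this.computeShare = computeShare;
        this.outputShare = 1.0 - inputShare - computeShare;
    }

    /** Progress while loading input; fractionRead is in [0, 1]. */
    double duringInput(double fractionRead) {
        return inputShare * clamp(fractionRead);
    }

    /**
     * Progress while computing. The total number of supersteps is unknown,
     * so this uses an assumed estimate that approaches but never reaches 1
     * until the computation has halted.
     */
    double duringCompute(long superstep, boolean halted) {
        double estimate = halted ? 1.0 : (superstep + 1.0) / (superstep + 2.0);
        return inputShare + computeShare * estimate;
    }

    /** Progress while writing output; fractionWritten is in [0, 1]. */
    double duringOutput(double fractionWritten) {
        return inputShare + computeShare + outputShare * clamp(fractionWritten);
    }

    private static double clamp(double value) {
        return Math.max(0.0, Math.min(1.0, value));
    }
}
```

With shares of 0.25 for input and 0.5 for compute, superstep 0 would report
0.5 overall and superstep 2 would report 0.625; the value only reaches 0.75
once the computation halts, leaving the last quarter for output.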
What does anybody else think?