[
https://issues.apache.org/jira/browse/GIRAPH-808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837594#comment-13837594
]
Rob Vesse commented on GIRAPH-808:
----------------------------------
The middle segment of progress could be better approximated using the number of
vertices to be processed, i.e.:
CurrentSuperstepProgress = VerticesProcessed / TotalVertices
SuperstepsProgress = (CurrentSuperstep + 1 / SuperstepsSoFar)
ComputationProgress = CurrentSuperstepProgress * (SuperstepsProgress / (N - 1))
This would mean that progress is no longer a linear trend: it would rise
towards N during a super step and then drop back down at the start of the next
super step. Provided Hadoop allows progress to change in this way, it would be
a much better way of reporting the progression of a Giraph computation.
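
As a rough illustration of the first formula only, the Python sketch below
computes the fraction of vertices processed in the current super step. The
function name and the clamping to [0, 1] are assumptions made for the example;
they are not Giraph or Hadoop APIs.

```python
def current_superstep_progress(vertices_processed, total_vertices):
    """Fraction of the current super step that is done, clamped to [0, 1]."""
    if total_vertices <= 0:
        # Assumption: an empty graph has nothing to report yet.
        return 0.0
    fraction = vertices_processed / total_vertices
    return max(0.0, min(1.0, fraction))


# The processed-vertex counter resets when a new super step starts, so the
# value rises to 1.0 and then falls back to 0.0, which is the non-linear
# trend described above.
samples = [current_superstep_progress(done, 200) for done in (0, 100, 200, 0, 50)]
```

Whether Hadoop tolerates a progress value that decreases is left open here,
just as it is in the comment.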
> Giraph should report progress more accurately when running on Map/Reduce
> ------------------------------------------------------------------------
>
> Key: GIRAPH-808
> URL: https://issues.apache.org/jira/browse/GIRAPH-808
> Project: Giraph
> Issue Type: Improvement
> Affects Versions: 1.0.0, 1.1.0
> Reporter: Rob Vesse
>
> The current way that Giraph reports progress when running on Map/Reduce seems
> rather flawed. When running a Giraph program the map tasks are launched and
> after initialisation their progress almost immediately goes to 100% and stays
> there throughout. So the only way to monitor progress is to
> I appreciate that there is no way for Giraph to report accurate progress,
> since it does not know in advance how many super steps there will be, but it
> could report progress in a more useful way.
> For example:
> - First N percent of progress is the input phase; this part could likely be
> accurately calculated by using the standard Hadoop input APIs on which Giraph
> input is built
> - Next N percent of progress is an estimate that trends towards the final
> value but does not reach it until the computation has halted, i.e. (Superstep
> + 1) / (N - 1), so this will naturally trend towards N. Once the computation
> halts, the value becomes N
> - Last N percent of progress is the output phase; again, this part could
> likely be calculated accurately and easily, since Giraph knows how many items
> it has to output
> What does anybody else think?
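
The three-segment idea in the description can be sketched in Python as follows.
This is only one possible reading: the 25/50/25 split, the function names, and
the zero-based superstep index are assumptions for illustration, and the middle
estimate uses a stand-in formula, (superstep + 1) / (superstep + 2), rather
than the expression given in the description. Nothing here is a Giraph or
Hadoop API.

```python
def compute_estimate(superstep, halted):
    """Stand-in estimate for the middle segment (an assumption, see lead-in).

    It grows with every super step but only reaches 1.0 once halted.
    """
    if halted:
        return 1.0
    return (superstep + 1) / (superstep + 2)


def overall_progress(input_fraction, superstep, halted, output_fraction,
                     input_share=0.25, output_share=0.25):
    """Map the input, computation and output phases onto one 0..1 value."""
    compute_share = 1.0 - input_share - output_share
    if input_fraction < 1.0:
        # Input phase: scale the measured input fraction into its segment.
        return input_share * input_fraction
    if not halted:
        # Computation phase: estimate only, since the number of super steps
        # is not known in advance.
        return input_share + compute_share * compute_estimate(superstep, False)
    # Output phase: the middle segment is complete, add the output fraction.
    return input_share + compute_share + output_share * output_fraction
```

For example, with these assumed shares, `overall_progress(1.0, 0, False, 0.0)`
returns 0.5 during the first super step, and the value reaches 1.0 only after
the computation has halted and all output has been written.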