> The disadvantage may be that the client program needs to explicitly specify the
> order of supersteps.
If a user wants to call sync() repeatedly in a loop, while or until a
condition is true, how would that be programmed?
bsp() {
  while (condition is true) {
    doLocalComputation();
    communicationWith(others);
    sync();
  }
}
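
To make the concern concrete, here is a minimal sketch in plain Java. It does not use Hama's API: the Step interface, runWithLoop() and runComposed() are hypothetical names invented for illustration, and a counter stands in for the sync() barrier. It only shows that a loop performs a data-dependent number of supersteps, while a list composed up front runs each step exactly once.

```java
import java.util.ArrayList;
import java.util.List;

class LoopVsCompose {
    /** Hypothetical stand-in for one composed superstep; not Hama's API. */
    interface Step {
        void superstep(double[] state);
    }

    /** Loop style: the number of sync() calls depends on the data. */
    static int runWithLoop(double[] state, double tolerance) {
        int syncs = 0;
        while (state[0] > tolerance) { // condition is only known at run time
            state[0] /= 2.0;           // stands in for doLocalComputation()
            syncs++;                   // sync() would be the barrier here
        }
        return syncs;
    }

    /** Composed style: every step in the fixed list runs exactly once. */
    static int runComposed(List<Step> steps, double[] state) {
        int syncs = 0;
        for (Step step : steps) {
            step.superstep(state);
            syncs++;                   // one barrier per composed step
        }
        return syncs;
    }

    public static void main(String[] args) {
        double[] looped = {8.0};
        System.out.println("loop syncs: " + runWithLoop(looped, 1.0));

        List<Step> steps = new ArrayList<>();
        steps.add(s -> s[0] /= 2.0);
        steps.add(s -> s[0] /= 2.0);
        double[] composed = {8.0};
        System.out.println("composed syncs: " + runComposed(steps, composed));
        // The composed run stops after its two steps even though
        // composed[0] is still above the tolerance of 1.0.
        System.out.println("composed state: " + composed[0]);
    }
}
```

With a fixed list, the client has to know the number of supersteps before the job starts, which is exactly what an iterative algorithm does not know.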
I think the current BSP programming interface is very good. If the change
is only for the sake of recovery, we should find another way.
2011/9/19 ChiaHung Lin:
> Currently we have bsp(), where users can write code to perform their tasks. For
> instance,
>
> ... bsp() ...{
> ... // some computation
> sync();
> ... // some other computation
> sync();
> ...
> }
>
> However, this is difficult for recovery because, first, it requires checkpointed
> messages to be restored so that the computation can resume from where it
> failed; second, the recovery procedure needs to know from which superstep to
> restart. With the current bsp(), a common choice seems to be preprocessing,
> but this may not be good because when something goes wrong internally, it
> is not easy to find the problem.
>
> I have come up with an alternative method, but it would change our
> current procedure, so I think it would be good to discuss it first. It is
> proposed as below:
>
> 1. We divide bsp() into smaller computation units called e.g. step() or
> superstep(), within which users still write their own logic.
>
> 2. In main, the user composes the order of supersteps.
>
> ... class Superstep1 extends BSPSuperstep {
> ... superstep() ... {...}
> }
> ... class Superstep2 extends BSPSuperstep {
> ... superstep() ... {...}
> }
>
> BSPJob bsp = new BSP(...);
> bsp.compose(Superstep1.class).compose(Superstep2.class)...;
>
> Therefore, for recovery, in BSPTask run() we can have
>
> List<BSPSuperstep> steps = BSPJob.supersteps();
>
> for (BSPSuperstep step : steps) {
>   if (checkpointed) {
>     // restore checkpointed messages, e.g. add checkpointed msgs (in HDFS)
>     // back to the queues
>   }
>   step.superstep(...);
>   step.sync();
> }
>
> The advantage is that the recovery procedure becomes easier.
> The disadvantage may be that the client program needs to explicitly specify the
> order of supersteps.
>
> Any thoughts?
>
> --
> ChiaHung Lin
> Department of Information Management
> National University of Kaohsiung
> Taiwan
>
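
For reference, the recovery loop quoted above could look roughly like the sketch below. This is plain Java under stated assumptions, not Hama code: Superstep, Checkpoint and run() are hypothetical names, messages are plain strings, and swapping the outbox into the inbox stands in for sync(). The checkpoint is assumed to record both the index of the next superstep and its pending messages, which are the two things the proposal says recovery needs.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

class RecoverySketch {
    /** Hypothetical superstep unit; stands in for the proposed BSPSuperstep. */
    interface Superstep {
        void superstep(Deque<String> inbox, Deque<String> outbox);
    }

    /** Hypothetical checkpoint: next superstep index plus its pending messages. */
    static final class Checkpoint {
        final int nextStep;
        final List<String> messages;

        Checkpoint(int nextStep, List<String> messages) {
            this.nextStep = nextStep;
            this.messages = new ArrayList<>(messages);
        }
    }

    /**
     * Runs the composed steps in order and returns the messages left after
     * the last one. With a non-null checkpoint, the run resumes at the
     * recorded superstep with the saved messages restored to the inbox.
     */
    static List<String> run(List<Superstep> steps, Checkpoint checkpoint) {
        Deque<String> inbox = new ArrayDeque<>();
        int start = 0;
        if (checkpoint != null) {
            inbox.addAll(checkpoint.messages); // restore checkpointed messages
            start = checkpoint.nextStep;       // skip already completed steps
        }
        for (int i = start; i < steps.size(); i++) {
            Deque<String> outbox = new ArrayDeque<>();
            steps.get(i).superstep(inbox, outbox);
            inbox = outbox; // stand-in for sync(): sent messages get delivered
        }
        return new ArrayList<>(inbox);
    }

    public static void main(String[] args) {
        Superstep first = (in, out) -> out.add("a");
        Superstep second = (in, out) -> {
            for (String m : in) {
                out.add(m + "b");
            }
        };
        List<Superstep> steps = Arrays.asList(first, second);

        // A full run, and a run resumed after the first superstep.
        System.out.println(run(steps, null));
        System.out.println(run(steps, new Checkpoint(1, Arrays.asList("a"))));
    }
}
```

Resuming by index only works here because the step list is fixed and ordered, which is the same property that makes loops hard to express.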
--
Best Regards, Edward J. Yoon
@eddieyoon