Re: [DISCUSS] Fault tolerant BSP job

2014-04-09 Thread Chia-Hung Lin
Sorry, I don't catch the point.

What's the difference between pure BSP and FT BSP? Any concrete example?


On 9 April 2014 08:29, Edward J. Yoon edwardy...@apache.org wrote:
 In my eyes, SuperstepPiEstimator [1] looks like a totally new programming
 model, very similar to Pregel.

 I personally would like to suggest that we provide both the pure BSP and
 the fault-tolerant BSP model, instead of replacing one with the other.

 1. 
 http://svn.apache.org/repos/asf/hama/trunk/examples/src/main/java/org/apache/hama/examples/SuperstepPiEstimator.java

 --
 Edward J. Yoon (@eddieyoon)
 Chief Executive Officer
 DataSayer, Inc.


Re: [DISCUSS] Fault tolerant BSP job

2014-04-09 Thread Edward J. Yoon
As you can see here [1], the sync() method is never called, and the classes
of all supersteps need to be declared in the job configuration.
Therefore, I thought it's similar to the Pregel style on the BSP model. It's
quite different from the legacy model in my eyes.

According to HAMA-505, the superstep API seems to be used for FT job
processing (I haven't read it closely yet). Right? Here I have some
questions. What happens if I call the sync() method within the compute()
method? In that case, does the framework still guarantee
checkpoint/recovery? And how can I implement
http://wiki.apache.org/hama/SerializePrinting using the superstep API?

 What's the difference between pure BSP and FT BSP? Any concrete example?

I meant the traditional BSP programming model.

1. 
http://svn.apache.org/repos/asf/hama/trunk/examples/src/main/java/org/apache/hama/examples/SuperstepPiEstimator.java


-- 
Edward J. Yoon (@eddieyoon)
CEO at DataSayer Co., Ltd.


Re: [DISCUSS] Fault tolerant BSP job

2014-04-09 Thread Chia-Hung Lin
Not very sure if we are on the same page. And sorry, I am not very
familiar with the Superstep implementation.

I assume that the traditional BSP model means the original bsp interface,
where there is a bsp function and the user can freely call peer.sync()
and other methods:

bsp(BSPPeer ... peer) {
  // whatever computation
  peer.sync();
}

And the superstep style uses the Superstep abstract class.

If this is the case, SuperstepBSP.java already calls sync, as shown
below, outside each Superstep.compute(). So even though
SuperstepPiEstimator doesn't call the sync() method, a barrier sync will
be executed, because each Superstep is viewed as a superstep in the
original BSP definition.

  @Override
  public void bsp(BSPPeer<K1, V1, K2, V2, M> peer) throws IOException,
      SyncException, InterruptedException {
    for (int index = startSuperstep; index < supersteps.length; index++) {
      Superstep<K1, V1, K2, V2, M> superstep = supersteps[index];
      superstep.compute(peer);
      if (superstep.haltComputation(peer)) {
        break;
      }
      peer.sync();
      startSuperstep = 0;
    }
  }

If sync is called again within Superstep.compute(), I would think
that another barrier sync will be executed.

SuperstepBSP.java

for (...) {
  superstep.compute() -> { // in compute method
    ...
    peer.sync()
  }
  ...
  peer.sync()
}
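
To make the double-barrier point concrete, here is a minimal, self-contained
sketch. It does not use the Hama classes: CountingPeer and Step are
hypothetical stand-ins for BSPPeer and Superstep, sync() only increments a
counter, and the haltComputation() check is left out for brevity. It only
illustrates that a sync() inside compute() adds a barrier on top of the one
the driver loop issues.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the driver loop quoted above; not the Hama API. */
final class DoubleSyncSketch {

  /** Hypothetical stand-in for BSPPeer: it only counts barrier syncs. */
  static final class CountingPeer {
    int barriers = 0;

    void sync() {
      barriers++;
    }
  }

  /** Hypothetical stand-in for the Superstep abstract class. */
  interface Step {
    void compute(CountingPeer peer);
  }

  /** Runs every step followed by a driver-issued sync; returns the barrier count. */
  static int run(List<Step> steps) {
    CountingPeer peer = new CountingPeer();
    for (Step step : steps) {
      step.compute(peer);
      peer.sync(); // barrier issued by the driver, outside compute()
    }
    return peer.barriers;
  }

  public static void main(String[] args) {
    List<Step> plain = new ArrayList<>();
    plain.add(peer -> { /* computation only */ });
    plain.add(peer -> { /* computation only */ });

    List<Step> nested = new ArrayList<>();
    nested.add(peer -> peer.sync()); // user code syncs inside compute()
    nested.add(peer -> { /* computation only */ });

    System.out.println("plain:  " + run(plain));  // one barrier per step
    System.out.println("nested: " + run(nested)); // one extra barrier
  }
}
```

With two steps, the plain list yields two barriers and the nested list yields
three, which matches the expectation that each sync() call is its own barrier.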

IIRC each call to sync may trigger the checkpoint (no recovery) method,
which serializes messages to HDFS.

For SerializePrinting, the following code snippet may move

for (String otherPeer : bspPeer.getAllPeerNames()) {
  bspPeer.send(otherPeer, new IntegerMessage(bspPeer.getPeerName(), i));
}

to Superstep.compute().

And the outer for loop is what is programmed in SuperstepBSP.java:

for (int i = 0; i < NUM_SUPERSTEPS; i++) {
  // code that should be moved to Superstep.compute()
}
bspPeer.sync();
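
The split described above can be sketched roughly as follows. This is a
single-threaded toy, not Hama code: the compute() and run() methods, the
outbox/inbox maps, and the "name:i" message format are all assumptions made
for illustration. The per-superstep send loop lives in compute(), while the
outer loop and the barrier live in the driver.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy restructuring of SerializePrinting into a per-superstep body plus a driver. */
final class SerializePrintingSketch {
  static final int NUM_SUPERSTEPS = 3;

  /** Hypothetical body of Superstep.compute(): send one message to every peer. */
  static void compute(String self, int i, List<String> peers,
                      Map<String, List<String>> outbox) {
    for (String otherPeer : peers) {
      outbox.get(otherPeer).add(self + ":" + i);
    }
  }

  /** Hypothetical driver: owns the outer loop and the barrier between supersteps. */
  static Map<String, List<String>> run(List<String> peers) {
    Map<String, List<String>> inbox = new LinkedHashMap<>();
    for (String p : peers) {
      inbox.put(p, new ArrayList<>());
    }
    for (int i = 0; i < NUM_SUPERSTEPS; i++) {
      Map<String, List<String>> outbox = new LinkedHashMap<>();
      for (String p : peers) {
        outbox.put(p, new ArrayList<>());
      }
      for (String p : peers) {
        compute(p, i, peers, outbox);
      }
      // Simulated barrier: messages are delivered only after all peers computed.
      for (String p : peers) {
        inbox.get(p).addAll(outbox.get(p));
      }
    }
    return inbox;
  }

  public static void main(String[] args) {
    Map<String, List<String>> inbox = run(Arrays.asList("peer0", "peer1"));
    System.out.println(inbox.get("peer0"));
  }
}
```

Each peer ends up with one message per peer per superstep, so the user code
never has to call sync() itself.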



Re: [DISCUSS] Fault tolerant BSP job

2014-04-09 Thread Edward J. Yoon
I just found this: https://issues.apache.org/jira/browse/HAMA-503 and HAMA-639.

Do you still think the superstep API is essential for checkpoint/recovery?
If not, we can drop it. I don't think it's a good idea.

-- 
Edward J. Yoon (@eddieyoon)
Chief Executive Officer
DataSayer Co., Ltd.


Re: [DISCUSS] Fault tolerant BSP job

2014-04-09 Thread Suraj Menon
I don't like my patch in HAMA-639 myself, even though I believe it satisfies
all the mentioned requirements. The usage of the superstep chaining API
implementation in the patch is too complicated. A superstep here is like a
transformation function you define on an RDD in Spark. If you look into the
FT design of Spark, on failure they rerun the operations on the RDD to get
to the current state. This is similar to what we have in mind using
checkpointing. The challenge is in getting the same messages replayed to a
newly spawned task on checkpointed data. If you don't use the Superstep (or
any other abstraction representing a function), you cannot start processing
from the line of code where the failure occurred. (Java does not support
goto line number.)

-Suraj
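
The argument that a function-like unit is needed for resumption can be
illustrated with a small sketch. Everything here is hypothetical and
unrelated to the HAMA-639 patch: supersteps are modelled as plain functions
on an int state, a Checkpoint records only the index of the next step and
the state, and a crash is simulated by a failBefore parameter. Message
replay, which the mail names as the real challenge, is not modelled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntUnaryOperator;

/** Toy model: resuming at a superstep boundary instead of at a line of code. */
final class ResumeSketch {

  /** Hypothetical checkpoint: which step runs next, and the state at that point. */
  static final class Checkpoint {
    final int nextStep;
    final int state;

    Checkpoint(int nextStep, int state) {
      this.nextStep = nextStep;
      this.state = state;
    }
  }

  /**
   * Runs steps starting at the checkpoint. If failBefore matches a step index,
   * the run "crashes" there and only the latest checkpoint survives.
   */
  static Checkpoint runFrom(Checkpoint start, List<IntUnaryOperator> steps,
                            int failBefore) {
    Checkpoint latest = start;
    for (int i = start.nextStep; i < steps.size(); i++) {
      if (i == failBefore) {
        return latest; // simulated task failure
      }
      int next = steps.get(i).applyAsInt(latest.state);
      latest = new Checkpoint(i + 1, next); // checkpoint taken at the barrier
    }
    return latest;
  }

  static List<IntUnaryOperator> exampleSteps() {
    List<IntUnaryOperator> steps = new ArrayList<>();
    steps.add(x -> x + 1);
    steps.add(x -> x * 2);
    steps.add(x -> x - 3);
    return steps;
  }

  public static void main(String[] args) {
    List<IntUnaryOperator> steps = exampleSteps();
    Checkpoint crashed = runFrom(new Checkpoint(0, 5), steps, 1);
    Checkpoint resumed = runFrom(crashed, steps, -1);
    Checkpoint direct = runFrom(new Checkpoint(0, 5), steps, -1);
    System.out.println(resumed.state + " == " + direct.state);
  }
}
```

Because each step is an addressable function, the restarted run can begin at
index 1 and still reach the same final state as an uninterrupted run.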


Re: [DISCUSS] Fault tolerant BSP job

2014-04-09 Thread Chia-Hung Lin
That's why I proposed to use the Superstep API instead, though I prefer
the plain bsp function. Unless we want to instrument the source code,
which I believe is not what we, including users, want.

With the Superstep API we can resume from the latest checkpointed message
(the newly refactored code should be based on this as well), under some
preconditions.

Alternatively, we can implement our own code (not in Java, or probably in
Java 8) to perform checkpointing, but that would take a very long time to
accomplish. I would put that issue on the future roadmap, because
personally I prefer the plain bsp function to Superstep.
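
A rough sketch of resuming from the latest checkpointed messages might look
like the following. The CheckpointStore class, its methods, and the in-memory
TreeMap standing in for HDFS are all assumptions for illustration; the actual
preconditions mentioned above are not spelled out in this thread and are not
modelled here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeMap;

/** Toy model of looking up the latest checkpointed messages after a restart. */
final class MessageReplaySketch {

  /** Hypothetical checkpoint store keyed by superstep number (stands in for HDFS). */
  static final class CheckpointStore {
    private final TreeMap<Integer, List<String>> bySuperstep = new TreeMap<>();

    /** Called at a barrier: keep a copy of the messages exchanged in that superstep. */
    void save(int superstep, List<String> messages) {
      bySuperstep.put(superstep, new ArrayList<>(messages));
    }

    /** Latest checkpointed superstep, or -1 if nothing has been saved. */
    int latestSuperstep() {
      return bySuperstep.isEmpty() ? -1 : bySuperstep.lastKey();
    }

    /** Messages recorded for the given superstep (a defensive copy). */
    List<String> messagesFor(int superstep) {
      return new ArrayList<>(bySuperstep.get(superstep));
    }
  }

  /** A restarted task asks where to resume and which messages to replay. */
  static String describeRecovery(CheckpointStore store) {
    int step = store.latestSuperstep();
    if (step < 0) {
      return "restart from superstep 0 with no messages";
    }
    return "resume after superstep " + step + " with " + store.messagesFor(step);
  }

  public static void main(String[] args) {
    CheckpointStore store = new CheckpointStore();
    store.save(0, Arrays.asList("a"));
    store.save(1, Arrays.asList("b", "c"));
    System.out.println(describeRecovery(store));
  }
}
```

The design choice shown is only that recovery is driven by the superstep
number, which is what the Superstep API makes available and a free-form bsp
function does not.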


[DISCUSS] Fault tolerant BSP job

2014-04-08 Thread Edward J. Yoon
In my eyes, SuperstepPiEstimator [1] looks like a totally new programming
model, very similar to Pregel.

I personally would like to suggest that we provide both the pure BSP and
the fault-tolerant BSP model, instead of replacing one with the other.

1. 
http://svn.apache.org/repos/asf/hama/trunk/examples/src/main/java/org/apache/hama/examples/SuperstepPiEstimator.java

-- 
Edward J. Yoon (@eddieyoon)
Chief Executive Officer
DataSayer, Inc.