I know MapReduce/Hadoop and am trying to get my head around
BSP (Hama/Giraph) by comparing MR and BSP.
- The Map phase in MR is similar to the computation phase in BSP. BSP allows
processes to exchange data in the communication phase, but there is no
communication between the mappers in the Map phase, although data does flow
from Map tasks to Reduce tasks. Please correct me if I am wrong. Any other
- After going through the documentation for Hama and Giraph, I noticed that
they both use Hadoop as the underlying framework. In both Hama and Giraph
an MR Job is submitted. Does each superstep in BSP correspond to a Job in
MR? Where are the incoming and outgoing messages and the vertex state
stored: HDFS, HBase, locally, or is it pluggable?
- If a vertex is deactivated and then reactivated after receiving a message,
does it run on the same node or on a different node in the cluster?
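
To make the questions above concrete, this is roughly the mental model I have of a BSP run, written as a toy single-process Python sketch. It is not Hama or Giraph code: `run_bsp`, the `inbox`/`outbox` dictionaries and the max-propagation example are names and choices I made up for illustration, and a real framework would spread the vertices over many workers. It only shows the three things I am asking about: a computation phase, a communication phase whose messages become visible after the barrier, and a halted vertex being reactivated by an incoming message.

```python
def run_bsp(values, edges, max_supersteps=50):
    """Toy BSP loop: propagate the maximum value to every vertex.

    values: dict mapping vertex id -> initial value (hypothetical input format)
    edges:  dict mapping vertex id -> list of neighbor ids
    Returns the final values and the number of supersteps executed.
    """
    values = dict(values)                   # do not mutate the caller's dict
    active = set(values)                    # every vertex starts active
    inbox = {v: [] for v in values}         # messages delivered at the last barrier
    supersteps = 0

    while supersteps < max_supersteps and (active or any(inbox.values())):
        outbox = {v: [] for v in values}

        # A vertex that voted to halt is reactivated when a message arrives.
        active |= {v for v, msgs in inbox.items() if msgs}

        for v in sorted(active):
            # Computation phase: only local state and received messages are used.
            new_value = max([values[v]] + inbox[v])
            changed = new_value != values[v]
            values[v] = new_value

            # Communication phase: messages are queued for the NEXT superstep.
            if changed or supersteps == 0:
                for neighbor in edges.get(v, []):
                    outbox[neighbor].append(new_value)

        active = set()                      # every vertex votes to halt
        inbox = outbox                      # barrier: queued messages become visible
        supersteps += 1

    return values, supersteps


if __name__ == "__main__":
    graph = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
    final_values, steps = run_bsp({"a": 3, "b": 1, "c": 2}, graph)
    print(final_values, steps)
```

In this sketch the whole loop lives in one process and `inbox`/`outbox` are plain in-memory dictionaries, which is exactly the part I am unsure about for Hama and Giraph: whether each pass of the `while` loop maps to a separate MR job, and where the equivalent of `inbox` and `values` is kept between supersteps.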