Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hama Wiki" for change 
notification.

The "DynamicGraphs" page has been changed by AnastasisAndronidis:
https://wiki.apache.org/hama/DynamicGraphs

New page:
== Dynamic Addition and Removal of Vertexes in Graph API ==

=== Introduction ===
Nowadays more and more people turn to distributed environments to store and 
analyze Big Data. For that reason there is a growing need in both achieving 
efficiency and new features. Hama is an upcoming project that gives to 
researchers and analysts a way to handle big amounts of data, through the BSP 
computation model.
 
A very interesting feature of the Hama project is the Graph API for graph 
analysis. A lot of scientists and companies represent and manage their data 
with the use of one or more of the many different kinds of graphs (e.g. 
incremental graphs).
 
This article will present some new features on the Graph API of Hama, on how to 
create a dynamic graphs on run time. It also will serve as a technical document 
of the internals of those features.

=== Description ===
At the time being, we will discuss only two different operations (addition and 
deletion) from both points of the end-user and internal architecture.

Some points that we need to keep in mind, are:
 1. Deletion/Addition is happening after a super step. ''We are providing 
methods inside a vertex instance that create/delete a vertex''
 2. New vertexes needs to be distributed through partitioner to be placed on 
right peers.
 3. Keep in mind that a new created vertex and old vertexes, will have the same 
super step counter. ''Various algorithms change their behavior though time, an 
example with an internal counter to serve as a state is needed''

=== Implementation Details ===
The `GraphJobRunner` class, contains the major steps for the graph computation. 
It starts with an initial setup, where the vertexes are loaded to memory, then 
there is the main computation loop, where the supersteps occur, and in the end 
there is the cleanup, where of the graph and the results are written to HDFS.

Our main target (concerning the operations +/-) is to enable the functionality 
of `GraphJobRunner::loadVertices()` function during a superstep so the user 
will be able to call a function during a superstep (e.g. `addVertex(value, 
List<> edges)`) to add a vertex. The user will also be able to call a delete 
method (e.g. `deleteVertex(vertexID)`) to remove vertexes.

An important note here is that adding or removing vertexes is happening in 
every sync of the peers. This means that during a superstep the additions or 
deletions will be locally stored and after the end of the superstep will be 
committed, through messages, to the peers through the partitioner. 

Reply via email to