Just my personal viewpoint: for a small amount of global information,
storing the state in ZooKeeper might be a reasonable solution.
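
To illustrate the idea, here is a rough sketch of the usual pattern for a
shared counter in a store like ZooKeeper, where a write carries the version
the writer last read and is rejected if the version has moved on. The class
below is an in-memory stand-in I made up for illustration
(VersionedCounterSketch, read, writeIfVersion and addAndGet are hypothetical
names); it is not the ZooKeeper client API and not Hama code.

```java
import java.util.concurrent.atomic.AtomicReference;

/** In-memory stand-in for one versioned node; NOT the ZooKeeper client API. */
class VersionedCounterSketch {
    /** Immutable snapshot of the stored value and its version. */
    static final class Snapshot {
        final long value;
        final int version;
        Snapshot(long value, int version) {
            this.value = value;
            this.version = version;
        }
    }

    private final AtomicReference<Snapshot> node;

    VersionedCounterSketch(long initial) {
        node = new AtomicReference<>(new Snapshot(initial, 0));
    }

    /** Returns the current value together with its version. */
    Snapshot read() {
        return node.get();
    }

    /** Writes only if the caller saw the latest version; returns false otherwise. */
    boolean writeIfVersion(long newValue, int expectedVersion) {
        Snapshot current = node.get();
        if (current.version != expectedVersion) {
            return false;
        }
        return node.compareAndSet(current, new Snapshot(newValue, expectedVersion + 1));
    }

    /** Read-modify-write that retries whenever another writer got in first. */
    long addAndGet(long delta) {
        while (true) {
            Snapshot seen = read();
            long updated = seen.value + delta;
            if (writeIfVersion(updated, seen.version)) {
                return updated;
            }
        }
    }
}
```

With a real ZooKeeper ensemble every update is a network round trip, so
this probably only makes sense when the counter changes rarely.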

On 13 July 2013 21:28, andronat_asf <[email protected]> wrote:
> Hello everyone,
>
> I'm working on HAMA-767 and I have some concerns about counters and scalability.
> Currently, every peer holds a set of vertices and a variable that keeps
> the total number of vertices across all peers. In my case, I'm trying to add
> and remove vertices while a job is running, which means that I have to
> update all of those variables.
>
> My problem is that this is not efficient: for every operation (adding or
> removing a vertex) I need to update all peers, so I have to send lots of
> messages to make those updates (see the GraphJobRunner#countGlobalVertexCount
> method), and I believe this is neither correct nor scalable. Another problem is
> that, even if I update all those variables (at the cost of sending lots of
> messages to every peer), they will only be updated in the next superstep.
>
> e.g.:
>
> Peer 1:               Peer 2:
>   Vert_1                Vert_2
>   (Total_V = 2)         (Total_V = 2)
>   addVertex()
>   (Total_V = 3)
>                         getNumberOfV() => 2
>
> ------------------- Sync -------------------
>
>                         getNumberOfV() => 3
>
>
> Is there something like global counters or shared memory that can address
> this issue?
>
> P.S. I have a feeling that we don't need to track the total number of
> vertices, because vertex-centric algorithms rarely need totals; they
> only depend on neighbors (I might be wrong though).
>
> Thanks,
> Anastasis
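
For the message cost described in the quoted mail, one possible alternative
is for each peer to accumulate a local delta (adds minus removes) and
exchange only that delta at the barrier, so the traffic is per superstep
rather than per operation. The simulation below is only a sketch of that
idea: PeerSketch, DeltaCounterSketch, drainDelta and applyGlobalDelta are
hypothetical names, and the sync method merely stands in for a barrier
rather than calling any Hama API.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a peer; not a Hama class. */
class PeerSketch {
    private long totalVertices; // total visible to this peer in the current superstep
    private long pendingDelta;  // local adds minus removes since the last barrier

    PeerSketch(long initialTotal) {
        this.totalVertices = initialTotal;
    }

    void addVertex() {
        pendingDelta++;
    }

    void removeVertex() {
        pendingDelta--;
    }

    long getNumberOfVertices() {
        return totalVertices;
    }

    /** Returns the accumulated local delta and resets it. */
    long drainDelta() {
        long d = pendingDelta;
        pendingDelta = 0;
        return d;
    }

    void applyGlobalDelta(long delta) {
        totalVertices += delta;
    }
}

class DeltaCounterSketch {
    /** Simulates the barrier: sum the per-peer deltas, then give the sum to every peer. */
    static void sync(List<PeerSketch> peers) {
        long global = 0;
        for (PeerSketch p : peers) {
            global += p.drainDelta();
        }
        for (PeerSketch p : peers) {
            p.applyGlobalDelta(global);
        }
    }

    public static void main(String[] args) {
        List<PeerSketch> peers = new ArrayList<>();
        peers.add(new PeerSketch(2));
        peers.add(new PeerSketch(2));

        peers.get(0).addVertex();
        // Before the barrier, peer 2 still sees the old total.
        System.out.println(peers.get(1).getNumberOfVertices());

        sync(peers);
        // After the barrier, every peer sees the add.
        System.out.println(peers.get(1).getNumberOfVertices());
    }
}
```

Unlike the example in the quoted mail, the peer that adds the vertex also
keeps seeing the old total until the barrier, so all peers agree within a
superstep. The one-superstep delay itself remains.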
