Hello everyone,
I'm working on HAMA-767 and I have some concerns about counters and scalability.
Currently, every peer holds a set of vertices and a variable that tracks the
total number of vertices across all peers. In my case, I'm trying to add and
remove vertices while a job is running, which means that I have to update
that variable on every peer.
My problem is that this is inefficient: for every operation (adding or
removing a vertex) I need to update all peers, so I have to send lots of
messages to make those updates (see the GraphJobRunner#countGlobalVertexCount
method), and I believe this is neither correct nor scalable. Another problem is
that, even if I update all those variables (at the cost of sending lots of
messages to every peer), the new values will only be visible in the next
superstep.
e.g.:
Peer 1:                     Peer 2:
  Vert_1                      Vert_2
  (Total_V = 2)               (Total_V = 2)
  addVertex()
  (Total_V = 3)
                              getNumberOfV() => 2
------------------------ Sync ------------------------
                              getNumberOfV() => 3
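
To make the scenario concrete, here is a minimal, self-contained Java sketch of
the behavior I mean. It is not Hama code: SimPeer, StaleCountDemo and sync() are
hypothetical names, and I am assuming that Peer 1 adds the vertex while Peer 2
reads the count. Each peer keeps its own copy of the total, an update is sent
as one message to every other peer, and those messages are applied only at the
barrier.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a BSP peer; this is NOT Hama's API.
class SimPeer {
    private long totalVertices;                          // local copy of the global count
    private final List<Long> inbox = new ArrayList<>();  // deltas waiting for the next sync

    SimPeer(long initialTotal) {
        this.totalVertices = initialTotal;
    }

    long getNumberOfV() {
        return totalVertices;
    }

    // Updates the local copy at once and queues a +1 message for every other peer,
    // so one operation costs (number of peers - 1) messages.
    void addVertex(List<SimPeer> allPeers) {
        totalVertices++;
        for (SimPeer p : allPeers) {
            if (p != this) {
                p.inbox.add(1L);
            }
        }
    }

    // Applies the queued deltas; called only at the barrier.
    void applyInbox() {
        for (long delta : inbox) {
            totalVertices += delta;
        }
        inbox.clear();
    }
}

class StaleCountDemo {
    // Simulated barrier: every peer applies the messages it received.
    static void sync(List<SimPeer> peers) {
        for (SimPeer p : peers) {
            p.applyInbox();
        }
    }

    // Returns what Peer 2 reads before and after the barrier.
    static long[] run() {
        SimPeer peer1 = new SimPeer(2);
        SimPeer peer2 = new SimPeer(2);
        List<SimPeer> peers = Arrays.asList(peer1, peer2);

        peer1.addVertex(peers);                  // Peer 1 now sees 3
        long beforeSync = peer2.getNumberOfV();  // Peer 2 still sees the old value
        sync(peers);
        long afterSync = peer2.getNumberOfV();   // Peer 2 sees the update only now
        return new long[] { beforeSync, afterSync };
    }

    public static void main(String[] args) {
        long[] r = run();
        System.out.println("Peer 2 before sync: " + r[0]);
        System.out.println("Peer 2 after sync:  " + r[1]);
    }
}
```

The sketch only models the message-passing delay; it says nothing about how
GraphJobRunner actually implements the count.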
Is there something like global counters or shared memory that can address
this issue?
P.S. I have a feeling that we don't need to track the total number of
vertices at all, because vertex-centric algorithms rarely need global totals;
they only depend on neighbors (I might be wrong though).
Thanks,
Anastasis