You guys seem to have totally misunderstood what I am saying. Would every BSP processor access ZK's counter concurrently? Do you think it is possible to determine the current total number of vertices in every step without barrier synchronization?

As I mentioned before, there are already additional barrier synchronization steps for aggregating and broadcasting the global count of updated vertices. You can reuse those steps with *no additional barrier synchronization*.
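
To make that concrete, here is a rough sketch of the pattern with plain Hama BSP calls. This is not the actual GraphJobRunner code: the class and field names (VertexCountBSP, localVertexDelta, globalVertexCount) and the bare LongWritable messages are made up for illustration; in GraphJobRunner the same two sync() calls in doAggregationUpdates() would also carry the aggregator values.

import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hama.bsp.BSP;
import org.apache.hama.bsp.BSPPeer;
import org.apache.hama.bsp.sync.SyncException;

public class VertexCountBSP extends
    BSP<NullWritable, NullWritable, NullWritable, NullWritable, LongWritable> {

  private long globalVertexCount = 0; // last known global total
  private long localVertexDelta = 0;  // vertices added minus removed locally

  @Override
  public void bsp(
      BSPPeer<NullWritable, NullWritable, NullWritable, NullWritable, LongWritable> peer)
      throws IOException, SyncException, InterruptedException {

    // Superstep A: every peer reports its local delta to the master task.
    // In the graph runner, this send and the sync() below already happen
    // for the aggregators, so the delta just rides along.
    String master = peer.getPeerName(0);
    peer.send(master, new LongWritable(localVertexDelta));
    peer.sync(); // barrier #1, already needed for aggregation

    // The master sums the deltas and broadcasts the new global total.
    if (peer.getPeerName().equals(master)) {
      long delta = 0;
      LongWritable msg;
      while ((msg = peer.getCurrentMessage()) != null) {
        delta += msg.get();
      }
      for (String name : peer.getAllPeerNames()) {
        peer.send(name, new LongWritable(globalVertexCount + delta));
      }
    }
    peer.sync(); // barrier #2, already needed to deliver aggregated values

    // Every peer now reads a consistent total for the next superstep.
    globalVertexCount = peer.getCurrentMessage().get();
    localVertexDelta = 0;
  }
}

The point is that the updated count only becomes visible after a barrier anyway, whether it travels through messages or through a ZK counter.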

On Wed, Jul 17, 2013 at 5:01 AM, andronat_asf <[email protected]> wrote:
> Thank you everyone,
>
> +1 for Tommaso, I will see what I can do about that :)
>
> I also believe that ZK is very similar to the sync() mechanism that
> Edward is describing, but if we need to sync more info we might need ZK.
>
> Thanks again,
> Anastasis
>
> On 15 Jul 2013, at 5:55 PM, Edward J. Yoon <[email protected]> wrote:
>
>> andronat_asf,
>>
>> To aggregate and broadcast the global count of updated vertices, we
>> call sync() twice. See the doAggregationUpdates() method in
>> GraphJobRunner. You can solve your problem the same way, and there
>> will be no additional cost.
>>
>> Use of ZooKeeper is not a bad idea, but IMO it's not much different
>> from the sync() mechanism.
>>
>> On Mon, Jul 15, 2013 at 10:05 PM, Chia-Hung Lin <[email protected]> wrote:
>>> +1 for Tommaso's solution.
>>>
>>> If not every algorithm needs a counter service, having an interface
>>> with different implementations (in-memory, ZK, etc.) should reduce
>>> the side effects.
>>>
>>> On 15 July 2013 15:51, Tommaso Teofili <[email protected]> wrote:
>>>> What about introducing a proper API for counting vertices, something
>>>> like an interface VertexCounter with 2-3 implementations:
>>>> InMemoryVertexCounter (basically the current one), a
>>>> DistributedVertexCounter to implement the scenario where we use a
>>>> separate BSP superstep to count them, and a ZKVertexCounter which
>>>> handles vertex counts as per Chia-Hung's suggestion.
>>>>
>>>> Also we may introduce something like a configuration variable to
>>>> define whether all the vertices are needed or just the neighbors
>>>> (and/or some other strategy).
>>>>
>>>> My 2 cents,
>>>> Tommaso
>>>>
>>>> 2013/7/14 Chia-Hung Lin <[email protected]>
>>>>
>>>>> Just my personal viewpoint: for a small amount of global
>>>>> information, storing the state in ZooKeeper might be a reasonable
>>>>> solution.
>>>>>
>>>>> On 13 July 2013 21:28, andronat_asf <[email protected]> wrote:
>>>>>> Hello everyone,
>>>>>>
>>>>>> I'm working on HAMA-767 and I have some concerns on counters and
>>>>>> scalability. Currently, every peer has a set of vertices and a
>>>>>> variable that keeps the total number of vertices across all peers.
>>>>>> In my case, I'm trying to add and remove vertices during the
>>>>>> runtime of a job, which means that I have to update all those
>>>>>> variables.
>>>>>>
>>>>>> My problem is that this is not efficient, because on every
>>>>>> operation (add or remove a vertex) I need to update all peers, so
>>>>>> I need to send lots of messages to make those updates (see the
>>>>>> GraphJobRunner#countGlobalVertexCount method), and I believe this
>>>>>> is neither correct nor scalable. Another problem is that, even if
>>>>>> I update all those variables (at the cost of sending lots of
>>>>>> messages to every peer), they will only be updated on the next
>>>>>> superstep.
>>>>>>
>>>>>> e.g.:
>>>>>>
>>>>>> Peer 1:               Peer 2:
>>>>>> Vert_1                Vert_2
>>>>>> (Total_V = 2)         (Total_V = 2)
>>>>>> addVertex()
>>>>>> (Total_V = 3)
>>>>>>                       getNumberOfV() => 2
>>>>>>
>>>>>> ------------------------ Sync ------------------------
>>>>>>
>>>>>>                       getNumberOfV() => 3
>>>>>>
>>>>>> Is there something like global counters or shared memory that can
>>>>>> address this issue?
>>>>>>
>>>>>> P.S. I have a small feeling that we don't need to track the total
>>>>>> amount of vertices, because vertex-centered algorithms rarely need
>>>>>> total numbers; they only depend on neighbors (I might be wrong
>>>>>> though).
>>>>>>
>>>>>> Thanks,
>>>>>> Anastasis
>>
>> --
>> Best Regards, Edward J. Yoon
>> @eddieyoon

--
Best Regards, Edward J. Yoon
@eddieyoon
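
P.S. For reference, the counter abstraction Tommaso proposes above could be as small as the sketch below. Only the type names (VertexCounter, InMemoryVertexCounter, DistributedVertexCounter, ZKVertexCounter) come from his mail; the method names are assumptions made up here to frame the discussion, not an existing API.

/*
 * Sketch of the proposed counter abstraction. Only the type names come
 * from the thread; the methods (add, getGlobalCount, refresh) are
 * illustrative assumptions.
 */
public interface VertexCounter {

  /** Record vertices added (positive) or removed (negative) on this peer. */
  void add(long delta);

  /** Global total as of the last synchronization point. */
  long getGlobalCount();
}

/** Roughly the current behaviour: a field refreshed from the aggregation supersteps. */
class InMemoryVertexCounter implements VertexCounter {

  private long localDelta = 0;
  private long globalCount = 0;

  @Override
  public void add(long delta) {
    localDelta += delta;
  }

  @Override
  public long getGlobalCount() {
    return globalCount;
  }

  /** Would be called by the framework after the broadcast superstep. */
  void refresh(long newGlobalCount) {
    globalCount = newGlobalCount;
    localDelta = 0;
  }

  long getLocalDelta() {
    return localDelta;
  }
}

// A DistributedVertexCounter would piggy-back localDelta on the existing
// aggregation supersteps (as in the earlier sketch), and a ZKVertexCounter
// would write the delta to a znode and read the new total after the barrier.

Whatever the backing store, getGlobalCount() can only be refreshed at a barrier, which is the point I am making above.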
