[ 
https://issues.apache.org/jira/browse/HAMA-515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13221310#comment-13221310
 ] 

Thomas Jungblut commented on HAMA-515:
--------------------------------------

After going through how it currently works with Apurv, it seems that we 
have to rely on Zookeeper and should drop the whole counters-over-RPC approach. 

Two reasons: RPC is not guaranteed to have propagated to the bspmaster and 
back after sync(), because of heartbeat intervals and latency. It also does 
not feel right because of the massive overhead; you can really DDoS your 
bspmaster with it.

Currently counters are only incremented on the BSPMaster side when the task 
finishes, but they are still transferred all the way down. This uses bandwidth 
which could be used otherwise and can delay heartbeats (depending on how many 
counters there are, etc.).

I guess Zookeeper would be a better fit for this task; however, there are 
still problems:
Who coordinates the increment? 
Maybe we should add a znode for each task: before entering the barrier we set 
the counters, then after the enter barrier we loop through all task nodes and 
increment our counter object. That has its own problem, because our object 
already contains counters. So either we create a new counter object holding 
the newly summed-up counters, or we add a method which subtracts the foreign 
counters and just adds the difference.
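
A minimal sketch of the first variant (a znode per task, then a fresh counter 
object built after the barrier). Everything here is an assumption made for 
illustration: ZnodeStore is an in-memory stand-in and not the ZooKeeper client 
API, and the "/counters/<taskId>" paths and counter names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: none of these names exist in Hama or ZooKeeper.
class CounterSyncSketch {

    /** In-memory stand-in for the per-task znodes: path -> counter snapshot. */
    static final class ZnodeStore {
        private final Map<String, Map<String, Long>> nodes = new TreeMap<>();

        synchronized void setData(String path, Map<String, Long> counters) {
            nodes.put(path, new HashMap<>(counters)); // defensive copy
        }

        synchronized Map<String, Map<String, Long>> children() {
            Map<String, Map<String, Long>> copy = new TreeMap<>();
            nodes.forEach((path, c) -> copy.put(path, new HashMap<>(c)));
            return copy;
        }
    }

    /** Before entering the barrier: publish this task's local counters. */
    static void publish(ZnodeStore store, String taskId, Map<String, Long> local) {
        store.setData("/counters/" + taskId, local);
    }

    /** After the barrier: build a NEW counter object summed over all task nodes. */
    static Map<String, Long> aggregate(ZnodeStore store) {
        Map<String, Long> global = new TreeMap<>();
        for (Map<String, Long> taskCounters : store.children().values()) {
            taskCounters.forEach((name, value) -> global.merge(name, value, Long::sum));
        }
        return global;
    }

    public static void main(String[] args) {
        ZnodeStore store = new ZnodeStore();

        Map<String, Long> task0 = new HashMap<>();
        task0.put("UPDATED_VERTICES", 3L);
        Map<String, Long> task1 = new HashMap<>();
        task1.put("UPDATED_VERTICES", 5L);
        task1.put("MESSAGES", 2L);

        publish(store, "task_0", task0);
        publish(store, "task_1", task1);

        // prints {MESSAGES=2, UPDATED_VERTICES=8}
        System.out.println(aggregate(store));
    }
}
```

Because every task runs the same aggregate() over the same set of nodes, no 
single task has to coordinate the increment, and building a new object avoids 
double counting the values the local object already holds.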

Anyway, we should add a method which lets you reset a specific counter; here 
too the question is who coordinates it. 
In any case we should favor a collective approach over a coordinated one.
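
One way a collective reset could look, as a rough sketch only: every task 
resets its own copy of the counter in the same superstep, so nobody has to 
coordinate. TaskCounters, reset() and drainForSuperstep() are hypothetical 
names made up for this example, not existing Hama API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: none of these names exist in Hama.
class CollectiveResetSketch {

    /** Per-task counters, owned and mutated only by that task. */
    static final class TaskCounters {
        private final Map<String, Long> values = new HashMap<>();

        void increment(String name, long delta) {
            values.merge(name, delta, Long::sum);
        }

        long get(String name) {
            return values.getOrDefault(name, 0L);
        }

        /** Resets one counter locally and returns the value it held. */
        long reset(String name) {
            Long old = values.remove(name);
            return old == null ? 0L : old;
        }
    }

    /** Every task calls this in the same superstep; no coordinator is involved. */
    static long drainForSuperstep(TaskCounters counters, String name) {
        return counters.reset(name);
    }

    public static void main(String[] args) {
        TaskCounters t0 = new TaskCounters();
        TaskCounters t1 = new TaskCounters();
        t0.increment("UPDATED_VERTICES", 3);
        t1.increment("UPDATED_VERTICES", 5);

        long global = drainForSuperstep(t0, "UPDATED_VERTICES")
                + drainForSuperstep(t1, "UPDATED_VERTICES");

        System.out.println(global);                     // prints 8
        System.out.println(t0.get("UPDATED_VERTICES")); // prints 0
    }
}
```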

                
> Counter should be synchronized by sync()
> ----------------------------------------
>
>                 Key: HAMA-515
>                 URL: https://issues.apache.org/jira/browse/HAMA-515
>             Project: Hama
>          Issue Type: New Feature
>    Affects Versions: 0.4.0
>            Reporter: Thomas Jungblut
>            Priority: Minor
>             Fix For: 0.5.0
>
>
> We should synchronize the counters in all tasks after a sync() call. 
> Then someone can use them for flow control, e.g. in graph-related algorithms, 
> to get the number of globally updated vertices.
> Two options to solve this:
>  - Sync the whole stuff over RPC from the BSPMaster
>  - Use Zookeeper for counter handling

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
