Hi,

I like the idea of reducing the write bandwidth used by topology. I'd summarize it as three possible levels, though:
1) Keep the (topology-connector) announcement's lastHeartbeat as a separate property and only update that (on receiving a connector heartbeat) instead of updating the entire announcement JSON, as is done now.
2) We might even be able to avoid storing the announcement's lastHeartbeat altogether if the logic is changed such that the announcement is valid as long as the recipient of the announcement (i.e. the owner) is alive. This would lengthen the reaction time to a crash of a remote instance, though.
3) Avoid repository (i.e. cluster-local) heartbeats entirely for the single-node case (in which case keeping the announcement in memory is feasible).

I see level 1 as something we should do, and level 2 as something to be analyzed further (we need to verify the implications, but I think it's possible). But I have my reservations about level 3, as it would complicate the 'cluster first' goal: discovery.impl would have to detect situations where a single node is 'suddenly' joined by another node to form a cluster. And I fear that this might in the end again result in some sort of heartbeat (maybe only for a limited time after startup, though).

The question is whether it's a "problem" to have cluster heartbeats stored every, say, 30 seconds, and whether that justifies complicating the algorithm for this case.

Cheers,
Stefan

On 2/7/14 2:44 PM, "Jörg Hoh" <[email protected]> wrote:

>Hi,
>
>I am wondering whether we could reduce the amount of data persisted in the
>repository with every topology heartbeat.
>
>For example, we could just update the timestamp of the announcement
>heartbeat if the topology hasn't changed at all (instead of writing the
>complete announcement).
>
>A more radical approach would be to avoid persisting topology
>information to the repo completely if this node isn't part of a cluster at
>all. All the state could be kept in memory, and in case of a crash/restart
>the topology would need to be gathered again. Of course this would require
>some more logic in case a single node is promoted to a member of a
>cluster, as then the current behaviour should be used.
>
>WDYT?
>
>Jörg
>
>
>--
>
>http://cqdump.wordpress.com
>Twitter: @joerghoh
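
To make level 1 a bit more concrete, here is a rough sketch of the idea, not the actual discovery.impl code. Everything in it is an assumption made for illustration: the class and method names, the property names "topologyAnnouncement" and "lastHeartbeat", the in-memory map standing in for the repository node, and the character count used as a crude proxy for write bandwidth. The "full" variant is only a model of the current behaviour (the whole announcement rewritten on every heartbeat).

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration only; none of these names exist in Sling discovery.impl. */
final class HeartbeatSketch {

    /** In-memory stand-in for the repository node that holds one announcement. */
    static final class AnnouncementNode {
        private final Map<String, String> properties = new HashMap<>();
        private long charsWritten = 0; // crude proxy for write bandwidth

        private void write(String name, String value) {
            properties.put(name, value);
            charsWritten += value.length();
        }

        /** Model of the current behaviour: everything is rewritten on each heartbeat. */
        void storeFull(String announcementJson, long now) {
            write("topologyAnnouncement", announcementJson);
            write("lastHeartbeat", Long.toString(now));
        }

        /** Level 1: rewrite the JSON only when it changed, always refresh the timestamp. */
        void storeLevel1(String announcementJson, long now) {
            if (!announcementJson.equals(properties.get("topologyAnnouncement"))) {
                write("topologyAnnouncement", announcementJson);
            }
            write("lastHeartbeat", Long.toString(now));
        }

        /** The announcement counts as valid while the last heartbeat is recent enough. */
        boolean isValid(long now, long timeoutMillis) {
            String last = properties.get("lastHeartbeat");
            return last != null && now - Long.parseLong(last) <= timeoutMillis;
        }

        long charsWritten() {
            return charsWritten;
        }
    }

    public static void main(String[] args) {
        String json = "{\"owner\":\"instance-a\",\"instances\":[\"instance-a\",\"instance-b\"]}";
        AnnouncementNode full = new AnnouncementNode();
        AnnouncementNode level1 = new AnnouncementNode();

        // Ten heartbeats, 30 seconds apart, with an unchanged topology.
        for (int i = 0; i < 10; i++) {
            long now = i * 30_000L;
            full.storeFull(json, now);
            level1.storeLevel1(json, now);
        }

        System.out.println("full rewrite, chars written: " + full.charsWritten());
        System.out.println("level 1,      chars written: " + level1.charsWritten());
    }
}
```

With an unchanged topology, the level-1 variant writes the JSON once and afterwards only the timestamp, while the validity check stays the same in both variants because it only looks at lastHeartbeat.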
