I pushed a wip-osd-hb branch that vastly simplifies the OSD heartbeats.  
The problem was that ages ago I went for a model with asymmetric 
heartbeats (at the time, replicas -> primaries) because it was elegant and 
seemed more efficient.  The reality was that the asynchrony between osdmap 
versions on different nodes made this a huge pain to make robust, 
particularly when it came to managing the persistence of sessions between 
nodes that are going up/down.

The new branch throws that all out and uses a simple ping/reply model.  
The retry behavior is simple and robust, and all the failure issues go away.  
The downside is that there are more messages moving around, but they 
are tiny, so who cares.
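
For anyone who wants a picture of what a ping/reply model looks like, here 
is a rough Python sketch. It is not the code in wip-osd-hb; the names 
(Heartbeat, PeerState, make_reply, grace) and the timeout values are made up 
for illustration. Every node pings its peers on each tick, the peer echoes 
the timestamp back, and a peer counts as failed once no reply has arrived 
within the grace period.

```python
from dataclasses import dataclass


@dataclass
class PeerState:
    last_rx: float  # send time of the newest ping this peer has answered


def make_reply(ping):
    """Answer a ping by echoing its timestamp back to the sender."""
    kind, stamp = ping
    assert kind == "ping"
    return ("reply", stamp)


class Heartbeat:
    """Hypothetical symmetric heartbeat tracker for one node."""

    def __init__(self, peers, grace=5.0, now=0.0):
        self.grace = grace  # seconds without a reply before a peer is failed
        self.peers = {name: PeerState(last_rx=now) for name in peers}

    def tick(self, now):
        """Return one (peer, ping) message per peer, stamped with 'now'.

        There is no retransmit queue: a lost ping or reply is simply
        covered by the ping sent on the next tick.
        """
        return [(name, ("ping", now)) for name in self.peers]

    def handle_reply(self, name, reply):
        """Record a reply; late, duplicate, or unknown ones are harmless."""
        kind, stamp = reply
        peer = self.peers.get(name)
        if kind == "reply" and peer is not None:
            peer.last_rx = max(peer.last_rx, stamp)

    def failed(self, now):
        """Peers whose newest answered ping is older than the grace period."""
        return sorted(
            name
            for name, peer in self.peers.items()
            if now - peer.last_rx > self.grace
        )
```

Note that the only per-peer state is one timestamp, and nothing depends on 
which side is primary or replica, so there is no session to keep consistent 
when a peer goes up or down.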

This should address the problems Wido was seeing in #2116.

sage