> #3 is, I think, the right answer. It makes our system simpler and it
> makes the behavior in failure conditions more predictable and safe.
Any thoughts on time-to-self-heal? My impression from browsing the code, which seems to be confirmed by some wiki material, is that anti-entropy is triggered only during full compactions. While hinted handoff is never a guarantee, doing without it completely probably increases the urgency of anti-entropy.

In general, what are people's thoughts on the appropriate mechanism to gain confidence that the cluster as a whole is reasonably consistent? In particular in relation to performing maintenance that may require popping nodes in and out in some kind of rolling fashion.

Are full compactions expected to be something you would want to trigger semi-regularly on production clusters by hand?

--
/ Peter Schuller aka scode