Good catch, Sravya. These do sound like significant problems. Great job thinking through the design; I'll need to think it through a bit more, but at a high level it makes sense. It might be good to put together a small design doc on this so we can review it and all understand the protocol and failure points. A more specific point on the HMS leader election implementation: I think you need to use LeaderSelector rather than LeaderLatch so the HMS plugin can get notified (via callback) when the leader changes. I don't think that's possible with LeaderLatch. Are there any JIRAs I can follow to track/review this work?
Thanks,
Lenni

On Fri, Aug 28, 2015 at 8:15 PM, Sravya Tirukkovalur <[email protected]> wrote:
> Hi fellow developer,
>
> Looks like there are some problems with the current design when Sentry is
> in HA / HMS is in HA. Here are some problems I have identified, and I would
> like to propose some solutions. Please let me know what you think.
>
> Problem 1: Zookeeper might blow up if HMS metadata is too big
> See https://cwiki.apache.org/confluence/display/CURATOR/TN4
>
> Problem 2: Both HMSs send full updates to Sentry
> There is a chance that these two full updates might actually be different.
> This is true if there are some metadata operations while the full update
> is being built on one server.
>
> Proposed design:
> For HMS HA:
> We will pick a leader using Curator's LeaderLatch, and only this HMS would
> be responsible for sending the path updates to Sentry.
> For the propagation of path updates from the follower, we will use the
> PathChildrenCacheListener recipe of Curator, where the follower can post
> the updates it sees to a ZK path. The leader listens on this path,
> processes these updates, and sends them to Sentry.
>
> For Sentry HA:
> - Leader sends the path updates to both the Sentry servers.
> - And for permission updates, Sentry servers use Zookeeper, similar to HMS,
> to propagate updates to each other.
>
> Regards,
> --
> Sravya Tirukkovalur
>
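To make the HMS HA flow in the quoted proposal a bit more concrete, here is a rough in-memory sketch. It does not use ZooKeeper or Curator at all: `Coordinator`, `ClusterListener`, `HmsNode`, and every method name below are hypothetical stand-ins I made up for illustration (the queue plays the role of the shared ZK path, and the `sentry` list plays the role of Sentry). It only shows the intended behavior: every HMS posts path updates, only the current leader forwards them, and a leader-change callback lets a new leader pick up whatever is still pending.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// In-memory illustration only; all names here are hypothetical, not Curator or Sentry APIs.
class HmsHaSketch {

    // Hypothetical callbacks: leadership changed, or a path update was posted.
    interface ClusterListener {
        void onLeaderChanged(String newLeaderId);
        void onUpdatePosted();
    }

    // Stand-in for leader election plus the shared ZK path that holds updates.
    static class Coordinator {
        private final Queue<String> pending = new ArrayDeque<>();
        private final List<ClusterListener> listeners = new ArrayList<>();

        void addListener(ClusterListener listener) {
            listeners.add(listener);
        }

        // Announce a new leader (null means "no leader right now").
        void electLeader(String leaderId) {
            for (ClusterListener l : listeners) {
                l.onLeaderChanged(leaderId);
            }
        }

        // Any HMS posts an update; every node is notified, only the leader acts.
        void post(String pathUpdate) {
            pending.add(pathUpdate);
            for (ClusterListener l : listeners) {
                l.onUpdatePosted();
            }
        }

        // Hand over all pending updates, in order, and clear the queue.
        List<String> drain() {
            List<String> out = new ArrayList<>(pending);
            pending.clear();
            return out;
        }
    }

    static class HmsNode implements ClusterListener {
        private final String id;
        private final Coordinator coordinator;
        private final List<String> sentry; // stand-in for the Sentry service
        private boolean leader;

        HmsNode(String id, Coordinator coordinator, List<String> sentry) {
            this.id = id;
            this.coordinator = coordinator;
            this.sentry = sentry;
            coordinator.addListener(this);
        }

        @Override
        public void onLeaderChanged(String newLeaderId) {
            leader = id.equals(newLeaderId);
            forwardIfLeader(); // a new leader picks up anything left pending
        }

        @Override
        public void onUpdatePosted() {
            forwardIfLeader();
        }

        void recordPathChange(String path) {
            coordinator.post(path);
        }

        private void forwardIfLeader() {
            if (leader) {
                sentry.addAll(coordinator.drain());
            }
        }
    }

    static List<String> runDemo() {
        List<String> sentry = new ArrayList<>();
        Coordinator coordinator = new Coordinator();
        HmsNode hms1 = new HmsNode("hms1", coordinator, sentry);
        HmsNode hms2 = new HmsNode("hms2", coordinator, sentry);

        coordinator.electLeader("hms1");
        hms1.recordPathChange("/db/a"); // leader's own update
        hms2.recordPathChange("/db/b"); // follower's update, forwarded by hms1
        coordinator.electLeader(null);  // leader goes away
        hms2.recordPathChange("/db/c"); // stays pending, nobody forwards it
        coordinator.electLeader("hms2"); // callback fires, hms2 drains the backlog
        return sentry;
    }

    public static void main(String[] args) {
        System.out.println(runDemo()); // prints [/db/a, /db/b, /db/c]
    }
}
```

The part that matters for the LeaderSelector vs LeaderLatch question is the last step: without some notification on leader change, the new leader has no trigger to go and process the update that arrived while there was no leader.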
