EdColeman commented on PR #2569: URL: https://github.com/apache/accumulo/pull/2569#issuecomment-1109876093
The metrics can be removed, but they were not necessarily intended to be user-facing. The multiple caching layers make reasoning about performance difficult. It could be that using a single ZooKeeper node, rather than a node for each property, reduces the original need for the caching, because it is no longer necessary to walk the entire directory with multiple ZooKeeper calls to re-hydrate the configuration. Without metrics or some other way to measure in context, it is not possible to predict which caching layers can be simplified, or what the impact would be.

Is a single node sufficient, such that the entire PropStore, as a replacement for ZooCache, is underutilized and unnecessary? Is the secondary caching with the derivers still providing benefits? How about the ServerConfigurationFactory? I suspect that much of this could be simplified (esp. removal of ServerConfigurationFactory), and having measurements in place to gauge the impact would help make an informed choice. It also seems that most of the performance impact would occur in things like long-running scans, so having cache metrics alongside other scan metrics seems like it could be informative for making such comparisons.

The overall goal of this PR was to provide an equivalent replacement for ZooCache property storage while maintaining the same caching schemes until it can be determined where other simplifications can be made without impacting performance. I agree there is a lot of complexity for managing changes to properties that, if they change at all, change only rarely.

I see this PR as establishing a baseline for future optimizations. It attempts to maintain the existing structures until they can be evaluated and simplified. Metrics should form a base for future improvements.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
