ctubbsii commented on issue #3178: URL: https://github.com/apache/accumulo/issues/3178#issuecomment-1421300022
@keith-turner wrote:
> In the kubernetes case this is not so easy when you have an auto-scaled group of tservers that is dynamically allocated ip addresses. When running scan servers and compactors in kubernetes it was so easy to start them with a group specified on the command line and have the rest of the system respond automatically as those different groups scaled up and down.

It seems like it would be easy to make a balancer that could detect some tag/label associated with a particular kubernetes group. I'm not that familiar with running stuff in kubernetes, but I've seen the concept of "genders" or "flavors" or "groups" in scalable architectures before, so I'd be surprised if kubernetes didn't support something like that that the user could leverage easily on their own.

@dlmarion wrote:
> While the HostRegexTableLoadBalancer might work in an environment like kubernetes, I don't think it's user friendly.

I agree. I don't necessarily think that balancer should be used. I think a different balancer that could make use of different labels would be warranted. However, given this is a niche use case, I still think this would be better suited to a custom balancer for that situation (which is why we have pluggable balancers) rather than architecting a custom solution inside Accumulo.

I don't think we need to be responsible for all user conveniences. Making pluggable endpoints that empower users is often sufficient; the more we try to take on every possible user deployment use case, the more our code complexity and maintenance burden grow. If it's going to increase our code complexity, I'm going to lean towards empowering users to customize things using our pluggable endpoints, rather than adding more APIs and config to Accumulo. I think it's worth considering keeping this a pluggable solution rather than a built-in one.

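
To make the label-aware balancer idea above a little more concrete, here is a minimal, language-neutral sketch in Python of the grouping logic only. Everything in it is an assumption made for illustration: the label key `accumulo-group`, the function names, the server addresses, and the table-to-group mapping are all hypothetical. It does not use any Accumulo or Kubernetes API; a real balancer would be a Java class plugged into Accumulo, and it would have to obtain the labels from the cluster by some means not shown here.

```python
from collections import defaultdict


def group_servers_by_label(server_labels, label_key="accumulo-group",
                           default_group="default"):
    """Map each group name to the sorted list of servers carrying that label.

    server_labels: dict of server address -> dict of label key/value pairs.
    Servers without the (hypothetical) label fall into default_group.
    """
    groups = defaultdict(list)
    for server, labels in server_labels.items():
        groups[labels.get(label_key, default_group)].append(server)
    return {group: sorted(servers) for group, servers in groups.items()}


def assign_tablets(tablets, table_to_group, groups, default_group="default"):
    """Round-robin each table's tablets over the servers in its group.

    tablets: iterable of (table, tablet_id) pairs.
    table_to_group: dict of table name -> group name (hypothetical config).
    Tables with no mapping, or whose group currently has no servers,
    fall back to default_group. Tablets with no candidate server are skipped.
    """
    assignments = {}
    next_index = defaultdict(int)
    for table, tablet in tablets:
        group = table_to_group.get(table, default_group)
        if not groups.get(group):
            group = default_group  # the table's group has no live servers
        servers = groups.get(group, [])
        if not servers:
            continue  # nothing to assign to; leave the tablet unassigned
        assignments[(table, tablet)] = servers[next_index[group] % len(servers)]
        next_index[group] += 1
    return assignments


# Hypothetical inputs: labels as they might be reported for each tserver.
server_labels = {
    "10.0.0.1:9997": {"accumulo-group": "ingest"},
    "10.0.0.2:9997": {"accumulo-group": "ingest"},
    "10.0.0.3:9997": {"accumulo-group": "query"},
    "10.0.0.4:9997": {},  # unlabeled, so it lands in the default group
}
table_to_group = {"events": "ingest", "lookup": "query"}
tablets = [("events", "t1"), ("events", "t2"), ("events", "t3"),
           ("lookup", "t1"), ("other", "t1")]

groups = group_servers_by_label(server_labels)
assignments = assign_tablets(tablets, table_to_group, groups)
```

Because the groups are recomputed from whatever labels are currently visible, servers that are added or removed as a group scales would be picked up on the next pass without any per-host configuration, which is the property the regex-on-hostname approach lacks when addresses are assigned dynamically.
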
> If we have the ability to assign group labels to TabletServers (like we do for ScanServers, and "queues" for Compactors), then for a kubernetes deployment the user has to:
>
> 1. Create different yaml descriptor files for the Accumulo processes to assign different groups
> 2. Create a load balancer that balances tables using the groups (a variation of the HostRegexTableLoadBalancer)
> 3. Configure this new balancer, assigning groups to tables.

Right, but couldn't these groups just be kubernetes metadata? Isn't that already something that kubernetes supports today? Do we need to add baked-in metadata for tservers to define tserver groups or can users already get that today with some kind of kubernetes labels? If they already label tservers today in kubernetes, then growing the complexity of Accumulo to achieve this is much less appealing to me.

But, I would be in favor of writing a blog post to show off a balancer implementation that would be aware of an existing kubernetes label feature. I think that would be very useful to empower users by showing them what our pluggable endpoints are capable of.

> I have been working on a solution for this, I should be able to put it up today for comments.

I will take a look at #3189.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
