capistrant commented on pull request #10284: URL: https://github.com/apache/druid/pull/10284#issuecomment-674232612
> Would it make sense to perform the segment limiting for each `ServerHolder` individually rather than short circuiting the entire selection flow? Say if set `maxSegmentsToConsiderPerMove` to 50, it will consider only 50 segments per `ServerHolder` after which it will run the selection process for the next `ServerHolder`. That way, we can ensure that ReservoirSegmentSampler considers all the given servers before making the selection. If the `ServerHolder` objects were passed in an indeterminate order, I would agree. But since they are not, I think that doing this could start to work against what I am trying to accomplish here. As an admin, I have determined that we are wasting our time bothering to consider segments once we have considered a certain amount of them. Because, after that point, the probability of choosing one of these segments is so low that I have deemed it to be not worth the resources used to consider it. My implementation says, if under normal balancing conditions (no limit), that all the segments beyond some point (probably) wouldn't be chosen anyways, let's just bail out early. Your suggestion here somewhat changes the fundamentals of the balancing strategy by now considering segments that would under normal balancing conditions, have had near zero chance of being chosen to now having a chance of being chosen, that I as an admin, have determined is likely enough to spend the resources to fi nd out. And worse yet, those segments that may be chosen from `ServerHolders` at the end of the list are from Servers that I would probably prefer to be receiving segments rather than giving them up (in terms of consumption, not locality/performance), unless my cluster is very well balanced by consumption already. Perhaps what we really want here is a way to configure the balancer to use one of many(?) implementations of "pick a segment from a list of `ServerHolder` objects". Instead of always using `ReservoirSegmentSampler#getRandomBalancerSegmentHolder` there could be multiple implementations of this action and an option for the admin to choose one with the current one being the default. I did slightly consider this in my head when I was thinking up the solution, but decided it may be too much of a heavy lift if the problem I am addressing is unique to certain large deployments who have tenants stressing over the execution time of a coordination cycle. Instead of combing up with a way to configure a selector and then add my slightly different selector with static coordinator configs instead of the dynamic config, I kind of wedged my implementation desire into the existing implementation. > Also it might be more convenient for the user to express this parameter as a percentage of segments to select from rather than a number, but i guess that might take away some of the performance gain that we're trying to achieve here. I had not considered this initially, but it is a good idea. I'd be okay with going this route if it is trivial to calculate the number of segments in the cluster including replication. I'll have to look at that. ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
