kfaraz commented on issue #6329:
URL: https://github.com/apache/druid/issues/6329#issuecomment-1208955556
This is now fixed: the implementation takes the reservoir size and returns an
iterator over that many segments, all picked in a single sampling pass.
```java
default Iterator<BalancerSegmentHolder> pickSegmentsToMove(
    List<ServerHolder> serverHolders,
    Set<String> broadcastDatasources,
    int reservoirSize
);
```
```java
static List<BalancerSegmentHolder> getRandomBalancerSegmentHolders(
    final List<ServerHolder> serverHolders,
    Set<String> broadcastDatasources,
    int k
)
{
  List<BalancerSegmentHolder> holders = new ArrayList<>(k);
  int numSoFar = 0;
  for (ServerHolder server : serverHolders) {
    if (!server.getServer().getType().isSegmentReplicationTarget()) {
      // if the server only handles broadcast segments (which don't need
      // to be rebalanced), we have nothing to do
      continue;
    }
    for (DataSegment segment : server.getServer().iterateAllSegments()) {
      if (broadcastDatasources.contains(segment.getDataSource())) {
        // we don't need to rebalance segments that were assigned via
        // broadcast rules
        continue;
      }
      if (numSoFar < k) {
        // fill the reservoir with the first k eligible segments
        holders.add(new BalancerSegmentHolder(server.getServer(), segment));
        numSoFar++;
        continue;
      }
      // replace a random reservoir slot with probability k / (numSoFar + 1)
      int randNum = ThreadLocalRandom.current().nextInt(numSoFar + 1);
      if (randNum < k) {
        holders.set(randNum, new BalancerSegmentHolder(server.getServer(), segment));
      }
      numSoFar++;
    }
  }
  return holders;
}
```
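For anyone who wants to try the sampling step in isolation, here is a minimal
sketch of the same reservoir technique over a plain `Iterable`. The class name
`ReservoirSketch` and the method `sample` are hypothetical and not part of
Druid; the sketch only assumes the standard JDK.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical stand-alone illustration; not Druid code.
public class ReservoirSketch
{
  // Picks up to k items from a single pass over `items`, each with equal probability.
  static <T> List<T> sample(Iterable<T> items, int k)
  {
    List<T> reservoir = new ArrayList<>(Math.max(k, 0));
    if (k <= 0) {
      return reservoir;
    }
    int seen = 0;
    for (T item : items) {
      if (seen < k) {
        // the first k items fill the reservoir directly
        reservoir.add(item);
      } else {
        // a later item replaces a random slot with probability k / (seen + 1)
        int slot = ThreadLocalRandom.current().nextInt(seen + 1);
        if (slot < k) {
          reservoir.set(slot, item);
        }
      }
      seen++;
    }
    return reservoir;
  }

  public static void main(String[] args)
  {
    List<Integer> input = new ArrayList<>();
    for (int i = 0; i < 1000; i++) {
      input.add(i);
    }
    // the picked values differ from run to run
    System.out.println(sample(input, 5));
  }
}
```

Because the reservoir is filled in one pass, the cost is one iteration over all
items regardless of `k`, and the input never has to be materialized or counted
up front.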