AmatyaAvadhanula commented on issue #10940:
URL: https://github.com/apache/druid/issues/10940#issuecomment-1229514991
@mounikanakkala @liubo-it
I think this occurs because you had a datasource with a coarser segment
granularity (e.g. monthly) and a new datasource with a finer granularity
(hourly) was loaded later.
When new historicals were added, their placement cost was much lower than that
of the historicals already holding the coarse-granularity segments, because of
the nature of the cost function. This is why the older historicals were skipped
and data was assigned to the new ones during the loading phase.
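To make the effect concrete, here is a toy sketch of cost-based placement. The real `CostBalancerStrategy` integrates an exponential decay over the two segments' intervals (with roughly a day-scale half-life); this simplified stand-in just decays with the gap between segment midpoints, which is enough to show why an empty historical always looks cheapest. The class and method names here are illustrative, not Druid's.

```java
import java.util.List;

public class CostSketch {
    // Toy stand-in for Druid's segment-to-segment cost: exponential decay
    // with the distance between segment midpoints, in hours. The real
    // strategy integrates exp(-lambda * |x - y|) over both intervals;
    // this keeps the same shape with less machinery.
    static double pairCost(double midA, double midB) {
        double lambda = Math.log(2) / 24.0; // ~24h half-life (an assumption)
        return Math.exp(-lambda * Math.abs(midA - midB));
    }

    // Cost of placing a new segment (given by its midpoint hour) on a
    // server already holding segments at the given midpoints.
    static double placementCost(double newMid, List<Double> serverMids) {
        double total = 0;
        for (double m : serverMids) {
            total += pairCost(newMid, m);
        }
        return total;
    }

    public static void main(String[] args) {
        // Old historical: hourly segments clustered around hours 0..5.
        List<Double> loaded = List.of(0.5, 1.5, 2.5, 3.5, 4.5);
        // Newly added historical: holds nothing yet.
        List<Double> empty = List.of();

        double newSegMid = 5.5; // the next hourly segment to assign
        System.out.printf("loaded server cost: %.3f%n",
                placementCost(newSegMid, loaded));
        System.out.printf("empty server cost:  %.3f%n",
                placementCost(newSegMid, empty));
        // The empty server's cost is 0, so it wins every assignment
        // until balancing catches up.
    }
}
```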
This is mitigated to some extent by the following extra normalization in
cachingCost:
```java
return cost * (server.getMaxSize() / server.getAvailableSize());
```
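A quick sketch of what that normalization does, under the assumption that the division is carried out in floating point (the method name below is hypothetical): a near-empty server has `availableSize ≈ maxSize`, so the factor is ~1 and the raw cost passes through, but as the new historical fills up its available size shrinks, the factor grows, and its effective cost inflates, pushing later segments back toward the other servers.

```java
public class CachingCostNormalization {
    // Sketch of cachingCost's disk-usage normalization:
    //   normalizedCost = cost * (maxSize / availableSize)
    // The explicit double cast is an assumption here, to avoid
    // integer division in this standalone sketch.
    static double normalize(double cost, long maxSize, long availableSize) {
        return cost * ((double) maxSize / availableSize);
    }

    public static void main(String[] args) {
        long maxSize = 1000L;
        // Empty server: factor 1, raw cost passes through unchanged.
        System.out.println(normalize(10.0, maxSize, 1000L)); // 10.0
        // 90% full server: factor 10, cost inflated tenfold.
        System.out.println(normalize(10.0, maxSize, 100L));  // 100.0
    }
}
```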
I think the cluster should eventually balance itself in both cases, but there
may be a temporary period of uneven distribution.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]