himanshug opened a new issue #8846: VersionedIntervalTimeline performance corner case with high number of overlapping segments
URL: https://github.com/apache/incubator-druid/issues/8846

Recently all of the historical nodes restarted on a cluster that was serving about 50000 segments of the following nature: each segment's interval is 24 hours, and each successive segment overlaps the previous one by 1439 minutes (24 hours is 1440 minutes). For example, segment intervals and versions might look like:

2019-01-01T00:00:00.000Z - 2019-01-01T23:59:00.000Z , v1
2019-01-01T00:01:00.000Z - 2019-01-02T00:00:00.000Z , v2
2019-01-01T00:02:00.000Z - 2019-01-02T00:01:00.000Z , v3
2019-01-01T00:03:00.000Z - 2019-01-02T00:02:00.000Z , v4
...
...

The restart triggered a sequence of `VersionedIntervalTimeline.remove(..)` calls, one per segment. The broker and coordinator never recovered and needed a forced restart, because `VersionedIntervalTimeline.remove(..)` becomes very expensive in the above scenario and the sequence of calls never finished.

I did a quick prototype that batches multiple `VersionedIntervalTimeline.remove(..)` calls into a single `VersionedIntervalTimeline.removeAll(..)` call, which could be used when data servers go down and which makes a few optimizations possible. The batched call first removes all the entries from `allTimelineEntries`, then removes them from `complete/incompletePartitionTimeline`, and then adjusts those timelines based on the state of `allTimelineEntries`. In the batched version, `allTimelineEntries` has significantly fewer entries by the time the adjustment runs, and none of the unnecessary corrections to `complete/incompletePartitionTimeline` that happen with non-batched removals are made.

Creating this issue to discuss other proposed solutions.
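To make the batching idea concrete for discussion, here is a rough toy sketch in Java. It is not Druid's `VersionedIntervalTimeline` and not the prototype mentioned above: `ToyTimeline`, `allEntries`, `visible` and `rebuilds` are hypothetical names, intervals are plain minute offsets, versions are non-negative integers, and the derived view is recomputed naively. It only illustrates why correcting the derived view once per batch does less work than correcting it after every single removal.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

// Toy model only (hypothetical): NOT Druid's VersionedIntervalTimeline.
class ToyTimeline {
    // version -> {startMinute, endMinute}; stands in for allTimelineEntries.
    private final Map<Integer, long[]> allEntries = new TreeMap<>();
    // Derived view of {start, end, winningVersion} pieces; stands in for
    // the complete/incomplete partition timelines.
    private List<List<Long>> visible = new ArrayList<>();
    // Counts how many times the derived view was recomputed.
    int rebuilds = 0;

    void add(int version, long startMinute, long endMinute) {
        allEntries.put(version, new long[] {startMinute, endMinute});
        rebuild();
    }

    // Non-batched removal: the derived view is corrected after every call.
    void remove(int version) {
        allEntries.remove(version);
        rebuild();
    }

    // Batched removal: drop all entries first, then correct the view once.
    void removeAll(Collection<Integer> versions) {
        for (int v : versions) {
            allEntries.remove(v);
        }
        rebuild();
    }

    List<List<Long>> visible() {
        return visible;
    }

    // Naive recomputation: for each elementary interval between consecutive
    // boundaries, the highest version that covers it wins.
    private void rebuild() {
        rebuilds++;
        TreeSet<Long> cuts = new TreeSet<>();
        for (long[] iv : allEntries.values()) {
            cuts.add(iv[0]);
            cuts.add(iv[1]);
        }
        List<List<Long>> result = new ArrayList<>();
        boolean first = true;
        long prev = 0;
        for (long cut : cuts) {
            if (!first) {
                int best = -1; // versions are assumed to be non-negative
                for (Map.Entry<Integer, long[]> e : allEntries.entrySet()) {
                    long[] iv = e.getValue();
                    if (iv[0] <= prev && cut <= iv[1]) {
                        best = Math.max(best, e.getKey());
                    }
                }
                if (best >= 0) {
                    List<Long> last = result.isEmpty() ? null : result.get(result.size() - 1);
                    if (last != null && last.get(2) == best && last.get(1) == prev) {
                        // Extend the previous piece when the same version continues.
                        result.set(result.size() - 1, Arrays.asList(last.get(0), cut, (long) best));
                    } else {
                        result.add(Arrays.asList(prev, cut, (long) best));
                    }
                }
            }
            prev = cut;
            first = false;
        }
        visible = result;
    }

    public static void main(String[] args) {
        ToyTimeline oneByOne = new ToyTimeline();
        ToyTimeline batched = new ToyTimeline();
        // 100 segments of 1440 minutes each, every one shifted by one minute.
        for (int i = 0; i < 100; i++) {
            oneByOne.add(i + 1, i, i + 1440);
            batched.add(i + 1, i, i + 1440);
        }
        int before = oneByOne.rebuilds;
        List<Integer> gone = new ArrayList<>();
        for (int v = 1; v <= 50; v++) {
            gone.add(v);
        }
        for (int v : gone) {
            oneByOne.remove(v);
        }
        batched.removeAll(gone);
        System.out.println("one-by-one rebuilds: " + (oneByOne.rebuilds - before));
        System.out.println("batched rebuilds:    " + (batched.rebuilds - before));
        System.out.println("same view: " + oneByOne.visible().equals(batched.visible()));
    }
}
```

Both paths end with the same derived view, but the one-by-one path recomputes it once per removed segment, each time against a map that is still nearly full, while the batched path recomputes it once against the already-reduced map.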