clintropolis opened a new pull request #9312: Logging large segment list handling URL: https://github.com/apache/druid/pull/9312 ### Description I went through a ton of logs to try and spot places where we could be potentially logging a large number of segments or other unnecessarily large things to try and ensure that logging doesn't ever destroy us, in response to such a thing happening from log containing [`SegmentTransactionalInsertAction.toString`](https://github.com/apache/druid/blob/master/indexing-service/src/main/java/org/apache/druid/indexing/common/actions/SegmentTransactionalInsertAction.java#L308) with a very large number of segments (exaggerated by the fact that it was using the whole `DataSegment` rather than the `SegmentId`). To avoid this happening even with `SegmentId`, I added a utility method to `SegmentUtils` that allows logging chunks of 64 `SegmentIds` at a time to ensure we will never create a gigantic string to log containing all of the segments no matter how many segments are involved because each message should be sized in the tens of kilobytes. There remain two areas that exceptions thrown in `SegmentLockHelper`, one in `tryLockSegments` the other in `verifySegmentGranularity` which could I guess create giant strings if you have absurd numbers of segments you are trying to lock, but I was unsure how likely this is to happen and if worth adding a logger here. I will admit that it is possible this PR is over-correcting, and just ensuring that we always use `commaSeparatedIdentifiers` would be sufficient for all but absurdly large numbers of segments... but I guess does take logs as the culprit out of the equation if we do decide it necessary support such scenarios. <hr> This PR has: - [ ] been self-reviewed. - [x] added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links. - [x] added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader. - [x] added unit tests or modified existing tests to cover new code paths.
---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected] With regards, Apache Git Services --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
