gianm opened a new issue #9027: GenericIndexedWriter can fail when writing large values into large columns
URL: https://github.com/apache/incubator-druid/issues/9027

In GenericIndexedWriter, at the point of potentially transitioning from single -> multiple files:

1. A value is written to `valuesOut`.
2. The size of `valuesOut` is written to the header as an int (with a checked cast).
3. `getSerializedSize()`, which includes `valuesOut` and some other stuff, is called to see if it is larger than the `fileSizeLimit`, which is very close to the max int value.

This works great if `valuesOut` grows slowly enough that "other stuff" in `getSerializedSize()`, plus the slack space between `fileSizeLimit` and `Integer.MAX_VALUE`, is larger than any one particular addition to `valuesOut`. Otherwise, the checked cast in (2) will fail.
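
The ordering described above can be sketched in isolation. This is only a simulation of the arithmetic, not Druid's code: the class name `HeaderOverflowSketch`, the method `writeThenCheck`, and the constants `FILE_SIZE_LIMIT` and `OTHER_STUFF` are hypothetical, their values are chosen for illustration, and `Math.toIntExact` stands in for whatever checked cast the writer actually uses.

```java
public class HeaderOverflowSketch {
    public enum Outcome { OK, SWITCH_TO_MULTI_FILE, CAST_FAILED }

    // Hypothetical values: a limit "very close to the max int value" and a
    // small fixed amount for the non-valuesOut part of the serialized size.
    public static final long FILE_SIZE_LIMIT = Integer.MAX_VALUE - 1024L;
    public static final long OTHER_STUFF = 64L;

    // Simulates one write in the order from the issue:
    // (1) grow valuesOut, (2) record its size as an int, (3) check the limit.
    public static Outcome writeThenCheck(long valuesSizeBefore, long valueLength) {
        // Step 1: the value is appended, so valuesOut grows first.
        long valuesSize = valuesSizeBefore + valueLength;

        // Step 2: the new size goes into the header through a checked cast.
        try {
            Math.toIntExact(valuesSize);
        } catch (ArithmeticException e) {
            // The size no longer fits in an int, and step 3 never runs.
            return Outcome.CAST_FAILED;
        }

        // Step 3: only now is the total compared with the file size limit.
        long serializedSize = valuesSize + OTHER_STUFF;
        return serializedSize > FILE_SIZE_LIMIT ? Outcome.SWITCH_TO_MULTI_FILE : Outcome.OK;
    }

    public static void main(String[] args) {
        // Just under the limit, so the previous write did not trigger a switch.
        long nearLimit = FILE_SIZE_LIMIT - OTHER_STUFF - 10;

        System.out.println(writeThenCheck(0L, 100L));          // far from the limit
        System.out.println(writeThenCheck(nearLimit, 100L));   // small value near the limit
        System.out.println(writeThenCheck(nearLimit, 4096L));  // large value near the limit
    }
}
```

With these assumed constants, a single addition is safe only if it is at most `Integer.MAX_VALUE - FILE_SIZE_LIMIT + OTHER_STUFF` = 1088 bytes. The 100-byte write near the limit reaches step 3 and reports the switch to multiple files, while the 4096-byte write fails at the cast in step 2.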