jasonk000 opened a new pull request, #13545: URL: https://github.com/apache/druid/pull/13545
### Description

Use a smaller default buffer in the `TextReader` during parsing.

By default, the [LineIterator creates a BufferedReader if one is not provided](https://github.com/apache/commons-io/blob/master/src/main/java/org/apache/commons/io/LineIterator.java#L84-L88); the default [BufferedReader creates an 8192-character buffer](https://github.com/AdoptOpenJDK/openjdk-jdk8u/blob/9a751dc19fae78ce58fb0eb176522070c992fb6f/jdk/src/share/classes/java/io/BufferedReader.java#L88).

Improve this by creating a `FastLineIterator`, which minimises the repeated creation of buffers.

Before vs after:

```
Benchmark                         Mode  Cnt     Score     Error  Units
JsonLineReaderBenchmark.baseline  avgt   15  3022.053 ±  51.286  us/op
JsonLineReaderBenchmark.baseline  avgt   15  3459.871 ± 106.175  us/op
```

Existing tests cover the change (`JsonLineReader`, `CSVReader`).

Replacement for #12302.

### Changes

- Introduces a new benchmark, `JsonLineReaderBenchmark`.
- Adds a `FastLineIterator` and switches the `TextReader` to use it.

This PR has:

- [x] been self-reviewed.
- [x] added unit tests or modified existing tests to cover new code paths, ensuring the threshold for [code coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md) is met.
- [x] been tested in a test Druid cluster.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at: [email protected]
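To illustrate the buffer-reuse idea from the Description, here is a minimal sketch of a line iterator that allocates one read buffer and one `StringBuilder` up front and reuses both for every line. It is not the `FastLineIterator` added by this PR: the class name `ReusableLineIterator`, the `bufferSize` parameter, and the choice to work on a `Reader` are all assumptions made for illustration. It also assumes lines end in `\n` or `\r\n` only; unlike `BufferedReader.readLine()`, a lone `\r` is not treated as a line terminator.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical sketch, NOT the FastLineIterator from this PR.
final class ReusableLineIterator implements Iterator<String>, AutoCloseable {
    private final Reader reader;
    private final char[] buffer;                             // allocated once, refilled in place
    private final StringBuilder line = new StringBuilder();  // reused for every line
    private int pos = 0;      // next unread index in buffer
    private int limit = 0;    // number of valid chars currently in buffer
    private boolean eof = false;
    private String next;      // look-ahead line; null if not computed yet

    ReusableLineIterator(Reader reader, int bufferSize) {
        if (bufferSize <= 0) {
            throw new IllegalArgumentException("bufferSize must be positive");
        }
        this.reader = reader;
        this.buffer = new char[bufferSize];
    }

    @Override
    public boolean hasNext() {
        if (next == null) {
            next = readLine();
        }
        return next != null;
    }

    @Override
    public String next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        String result = next;
        next = null;
        return result;
    }

    // Returns the next line without its terminator, or null at end of input.
    private String readLine() {
        line.setLength(0);
        boolean sawInput = false;
        while (true) {
            if (pos == limit && !fill()) {
                // End of input: emit a final unterminated line if one was started.
                return sawInput ? finishLine() : null;
            }
            sawInput = true;
            int start = pos;
            while (pos < limit && buffer[pos] != '\n') {
                pos++;
            }
            line.append(buffer, start, pos - start);
            if (pos < limit) {
                pos++; // consume the '\n'
                return finishLine();
            }
        }
    }

    // Drops one trailing '\r' so that "\r\n" behaves like "\n".
    private String finishLine() {
        int len = line.length();
        if (len > 0 && line.charAt(len - 1) == '\r') {
            len--;
        }
        return line.substring(0, len);
    }

    // Refills the single shared buffer; returns false once the input is exhausted.
    private boolean fill() {
        if (eof) {
            return false;
        }
        try {
            int n = reader.read(buffer, 0, buffer.length);
            if (n < 0) {
                eof = true;
                return false;
            }
            pos = 0;
            limit = n;
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override
    public void close() throws IOException {
        reader.close();
    }

    public static void main(String[] args) throws IOException {
        String input = "{\"a\":1}\n{\"a\":2}\r\n{\"a\":3}";
        // A deliberately tiny buffer to show that lines may span several refills.
        try (ReusableLineIterator it = new ReusableLineIterator(new StringReader(input), 4)) {
            while (it.hasNext()) {
                System.out.println(it.next());
            }
        }
    }
}
```

In this sketch the only per-line allocation is the returned `String`. The buffer size is chosen by the caller, so a reader that handles many small inputs does not have to pay for a default-sized buffer on each one.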
