jasonk000 opened a new pull request, #13545:
URL: https://github.com/apache/druid/pull/13545

   ### Description
   
   Use a smaller default buffer in the `TextReader` during parsing. By default, 
the [`LineIterator` wraps its input in a `BufferedReader` if one is not 
provided](https://github.com/apache/commons-io/blob/master/src/main/java/org/apache/commons/io/LineIterator.java#L84-L88);
 a default [`BufferedReader` allocates an 8192-character 
buffer](https://github.com/AdoptOpenJDK/openjdk-jdk8u/blob/9a751dc19fae78ce58fb0eb176522070c992fb6f/jdk/src/share/classes/java/io/BufferedReader.java#L88).
   
   Improve this by creating a `FastLineIterator` which minimises the repeated 
creation of buffers.
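
To illustrate the general idea, here is a minimal sketch of a line iterator that allocates its read buffer and line accumulator once and reuses them for every line. It is not the `FastLineIterator` from this PR: the class name `ReusedBufferLineIterator`, the `chunkSize` parameter, and the `readAll` helper are hypothetical, and the sketch only recognises `\n` and `\r\n` terminators (a lone `\r` is not treated as a line break, unlike `BufferedReader`).

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical illustration only; not the FastLineIterator added by this PR.
final class ReusedBufferLineIterator implements Iterator<String>
{
  private final Reader reader;
  private final char[] chunk;                              // read buffer, allocated once
  private final StringBuilder line = new StringBuilder();  // accumulator, reused per line
  private int pos = 0;      // next unread index in chunk
  private int limit = 0;    // number of valid chars in chunk
  private boolean eof = false;
  private String next = null;

  ReusedBufferLineIterator(Reader reader, int chunkSize)
  {
    if (chunkSize <= 0) {
      throw new IllegalArgumentException("chunkSize must be positive");
    }
    this.reader = reader;
    this.chunk = new char[chunkSize];
  }

  @Override
  public boolean hasNext()
  {
    if (next == null && !eof) {
      next = readLine();
    }
    return next != null;
  }

  @Override
  public String next()
  {
    if (!hasNext()) {
      throw new NoSuchElementException();
    }
    String result = next;
    next = null;
    return result;
  }

  // Returns the next line without its terminator, or null at end of input.
  private String readLine()
  {
    line.setLength(0);
    try {
      while (true) {
        if (pos == limit) {
          int n = reader.read(chunk, 0, chunk.length);
          if (n < 0) {
            eof = true;
            return line.length() > 0 ? finish() : null;
          }
          pos = 0;
          limit = n;
        }
        int start = pos;
        while (pos < limit && chunk[pos] != '\n') {
          pos++;
        }
        line.append(chunk, start, pos - start);
        if (pos < limit) {
          pos++; // consume the '\n'
          return finish();
        }
      }
    }
    catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Drops a trailing '\r' so that "\r\n" terminators are handled.
  private String finish()
  {
    int len = line.length();
    if (len > 0 && line.charAt(len - 1) == '\r') {
      line.setLength(len - 1);
    }
    return line.toString();
  }

  // Convenience helper for trying the iterator on an in-memory string.
  static List<String> readAll(String input, int chunkSize)
  {
    List<String> lines = new ArrayList<>();
    Iterator<String> it = new ReusedBufferLineIterator(new StringReader(input), chunkSize);
    while (it.hasNext()) {
      lines.add(it.next());
    }
    return lines;
  }
}
```

Calling `ReusedBufferLineIterator.readAll(text, 4)` with a deliberately small chunk size exercises lines that span several reads. The only per-line allocation left in this sketch is the returned `String`.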
   
   Before vs after
   ```
   Benchmark                         Mode  Cnt     Score     Error  Units
   JsonLineReaderBenchmark.baseline  avgt   15  3022.053 ±  51.286  us/op
   JsonLineReaderBenchmark.baseline  avgt   15  3459.871 ± 106.175  us/op
   ```
   Existing tests cover the change (JsonLineReader, CSVReader). Replacement for 
#12302.
   
   ### Changes
   
   - Introduces a new benchmark `JsonLineReaderBenchmark`
   - Adds a `FastLineIterator` and switches the `TextReader` to use it.
   
   This PR has:
   
   - [x] been self-reviewed.
   - [x] added unit tests or modified existing tests to cover new code paths, 
ensuring the threshold for [code 
coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md)
 is met.
   - [x] been tested in a test Druid cluster.


