Hi all,

While debugging a perf issue I realized we still have that old string
parsing, which basically does:

if (we need more char and buffer is full) growBuffer()

Kind of ArrayList behavior.
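
To make the pattern concrete, here is a minimal sketch of that
grow-on-demand behavior. The class name GrowingBuffer, the append() method
and the doubling policy are assumptions for illustration only, not the
parser's actual code:

```java
import java.util.Arrays;

// Hypothetical sketch of the grow-on-demand pattern (not the real parser).
final class GrowingBuffer {
    private char[] buffer;
    private int length; // number of chars stored so far

    GrowingBuffer(final int initialSize) {
        // Guard against a zero-sized buffer, which could never grow by doubling.
        buffer = new char[Math.max(1, initialSize)];
    }

    void append(final char c) {
        if (length == buffer.length) { // need one more char and buffer is full
            growBuffer();
        }
        buffer[length++] = c;
    }

    // Assumed policy: double the capacity and copy everything read so far.
    private void growBuffer() {
        buffer = Arrays.copyOf(buffer, buffer.length * 2);
    }

    String getString() {
        return new String(buffer, 0, length);
    }
}
```

Every growth copies the whole content read so far, which is where the cost
comes from on long strings.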

However, it is really inefficient: with a reasonably sized buffer (8k for
example) we are 10x slower than some other parsers.

Indeed, we can work around it locally by using a buffer big enough to avoid
the growBuffer() call, but that hurts scalability because the required
allocation becomes overkill (you can end up needing 2G of memory just for
the parser).

My idea, without digging further, would be to rework the readString() method
so it no longer relies on growBuffer() but tracks each *char[]* segment in a
list ("pendingStrings"), and to replace getString() with something like
*pendingStrings.stream().map(String::new).collect(joining());*.
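
A rough sketch of that segment-tracking idea could look like the following.
Only the "pendingStrings" name comes from the proposal; the class name
SegmentedString, the append() method and the fixed segment size are
assumptions. It also joins with a StringBuilder rather than the stream
one-liner, to avoid one intermediate String per segment and to handle the
last, partially filled segment:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: keep full segments in a list instead of growing one array.
final class SegmentedString {
    private final int segmentSize;
    private final List<char[]> pendingStrings = new ArrayList<>();
    private char[] current;
    private int used; // chars written into the current segment

    SegmentedString(final int segmentSize) {
        this.segmentSize = Math.max(1, segmentSize);
        this.current = new char[this.segmentSize];
    }

    void append(final char c) {
        if (used == current.length) {
            pendingStrings.add(current);      // keep the full segment, no copy
            current = new char[segmentSize];  // start a fresh small segment
            used = 0;
        }
        current[used++] = c;
    }

    String getString() {
        final StringBuilder out =
                new StringBuilder(pendingStrings.size() * segmentSize + used);
        for (final char[] segment : pendingStrings) {
            out.append(segment);              // full segments
        }
        return out.append(current, 0, used).toString(); // partial last segment
    }
}
```

The content is copied once, when the final String is built, instead of at
every growth.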

This way:

* we can optimize the memory usage by keeping "small" buffers
* we can share buffers between threads, avoiding the 2G allocation;
ultimately we can cap the number of buffers we release back into the pool,
though I am not sure that is useful
* we can aim at being 6-7 times faster
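
For the shared-buffer point, a bounded pool could be sketched as below. This
is only an illustration under assumptions: CharBufferPool, acquire(),
release() and the maxRetained parameter are hypothetical names, and nothing
here is existing parser API:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical sketch: a thread-safe pool retaining at most maxRetained buffers.
final class CharBufferPool {
    private final BlockingQueue<char[]> pool;
    private final int bufferSize;

    CharBufferPool(final int maxRetained, final int bufferSize) {
        // ArrayBlockingQueue requires a capacity of at least 1.
        this.pool = new ArrayBlockingQueue<>(Math.max(1, maxRetained));
        this.bufferSize = bufferSize;
    }

    char[] acquire() {
        final char[] pooled = pool.poll();   // non-blocking, null when empty
        return pooled != null ? pooled : new char[bufferSize];
    }

    void release(final char[] buffer) {
        if (buffer.length == bufferSize) {
            pool.offer(buffer);              // silently dropped when the pool is full
        }
    }

    int retained() {
        return pool.size();
    }
}
```

Buffers beyond the cap are simply left to the garbage collector, so the pool
never holds more than maxRetained * bufferSize chars.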

I compared with Jackson: we are 10 times slower with an 8k buffer and twice
as fast with a big enough buffer. Note that Jackson starts with its default
buffer but auto-adjusts it to 64k after 2 segments.
I guess we can use such a strategy too.
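
If we went for such an auto-adjusting strategy, the sizing rule could be as
small as the sketch below. It only mirrors the behavior described above
(default size for the first 2 segments, then 64k); it is not Jackson's actual
code, and SegmentSizing and nextSegmentSize() are hypothetical names:

```java
// Hypothetical sizing rule: default size first, then jump to 64k segments.
final class SegmentSizing {
    static final int MAX_SEGMENT_SIZE = 64 * 1024;

    // segmentsAllocated is the number of segments already allocated for this string.
    static int nextSegmentSize(final int defaultSize, final int segmentsAllocated) {
        if (segmentsAllocated < 2) {
            return Math.min(defaultSize, MAX_SEGMENT_SIZE);
        }
        return MAX_SEGMENT_SIZE;
    }
}
```

Short strings keep paying only for the small default buffer, while long ones
quickly move to bigger segments.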

FYI, the test I run uses this generator:

private static void doGenerate() throws IOException { // 5.1M
    try (final var out = Files.newBufferedWriter(Path.of("/tmp/test/input.json"))) {
        out.write("{\"data\":\"");
        IntStream.range(0, 5 * 1024 * 1024 + 1 /*don't be /1024*/).forEach(i -> {
            try {
                out.write('a');
            } catch (final IOException e) {
                throw new IllegalStateException(e);
            }
        });
        out.write("\"}");
    }
}


Wdyt? Does it make sense to you to update the parser? Do you have a better
idea?
Happy if anyone wants to have a look too and play with it.

Best,
Romain Manni-Bucau
@rmannibucau <https://twitter.com/rmannibucau> |  Blog
<https://rmannibucau.metawerx.net/> | Old Blog
<http://rmannibucau.wordpress.com> | Github <https://github.com/rmannibucau> |
LinkedIn <https://www.linkedin.com/in/rmannibucau> | Book
<https://www.packtpub.com/application-development/java-ee-8-high-performance>
