ASF GitHub Bot commented on KAFKA-4293:

GitHub user radai-rosenblatt opened a pull request:


    KAFKA-4293 - improve ByteBufferMessageSet.deepIterator() performance by
    relying on underlying stream's available() implementation

    provided better available() for ByteBufferInputStream
    provided better available() for KafkaLZ4BlockInputStream
    added KafkaGZIPInputStream with a better available()
    fixed KafkaLZ4BlockOutputStream.close() to properly flush
    Signed-off-by: radai-rosenblatt <radai.rosenbl...@gmail.com>
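
For readers unfamiliar with the idea, here is a minimal sketch of what a "better available()" could look like for a stream backed by a ByteBuffer. The class name BufferBackedInputStream is hypothetical and this is not the code from the pull request; it only assumes that the number of unread bytes is exactly the buffer's remaining() count.

```java
import java.io.InputStream;
import java.nio.ByteBuffer;

// Hypothetical stand-in for a ByteBuffer-backed stream; not the pull request's code.
class BufferBackedInputStream extends InputStream {
    private final ByteBuffer buffer;

    BufferBackedInputStream(ByteBuffer buffer) {
        this.buffer = buffer;
    }

    @Override
    public int read() {
        // Return the next byte as an unsigned value, or -1 at end of buffer.
        return buffer.hasRemaining() ? (buffer.get() & 0xFF) : -1;
    }

    @Override
    public int read(byte[] dst, int off, int len) {
        if (len == 0) return 0;
        if (!buffer.hasRemaining()) return -1;
        int n = Math.min(len, buffer.remaining());
        buffer.get(dst, off, n);
        return n;
    }

    @Override
    public int available() {
        // Exact count of unread bytes, so 0 reliably means "end of data".
        return buffer.remaining();
    }
}
```

The default InputStream.available() always returns 0, which is why a caller cannot use it as an end-of-data signal unless the concrete stream overrides it.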

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/radai-rosenblatt/kafka suchwow

Alternatively you can review and apply these changes as the patch at:


To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2025


> ByteBufferMessageSet.deepIterator burns CPU catching EOFExceptions
> ------------------------------------------------------------------
>                 Key: KAFKA-4293
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4293
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions:
>            Reporter: radai rosenblatt
>            Assignee: radai rosenblatt
> around line 110:
> {noformat}
> try {
>     while (true)
>         innerMessageAndOffsets.add(readMessageFromStream(compressed))
> } catch {
>     case eofe: EOFException =>
>     // we don't do anything at all here, because the finally
>     // will close the compressed input stream, and we simply
>     // want to return the innerMessageAndOffsets
> {noformat}
> the only indication the code has that the end of the iteration was reached is 
> the EOFException it catches (which will be thrown inside 
> readMessageFromStream()).
> profiling runs performed at LinkedIn show 10% of the total broker CPU time 
> taken up by Throwable.fillInStackTrace() because of this behaviour.
> unfortunately InputStream.available() cannot be relied upon (concrete example 
> - GZIPInputStream will not correctly return 0) so the fix would probably be a 
> wire format change to also encode the number of messages.
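
As an illustration of the alternative to exception-driven termination, the sketch below reads length-prefixed records until available() reports 0. It is an assumption-laden example, not Kafka code: the class AvailableDrivenReader, the [int length][payload] record layout, and the use of ByteArrayInputStream are all hypothetical, and the loop is only correct when the underlying stream's available() is exact, which is precisely what GZIPInputStream does not guarantee.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical reader; assumes the stream's available() is exact.
class AvailableDrivenReader {
    // Reads [int length][payload] records until no bytes remain,
    // so normal termination never constructs an EOFException.
    static List<byte[]> readAll(InputStream in) {
        DataInputStream data = new DataInputStream(in);
        List<byte[]> records = new ArrayList<>();
        try {
            while (data.available() > 0) {
                int size = data.readInt();
                byte[] payload = new byte[size];
                data.readFully(payload);
                records.add(payload);
            }
        } catch (IOException e) {
            // Only reached for genuinely malformed or truncated input.
            throw new UncheckedIOException(e);
        }
        return records;
    }

    public static void main(String[] args) {
        // Two records: {10, 20} and {30}.
        byte[] wire = {0, 0, 0, 2, 10, 20, 0, 0, 0, 1, 30};
        System.out.println(readAll(new ByteArrayInputStream(wire)).size());
    }
}
```

With this shape an exception is created only on truncated input, so the stack-trace cost described above is not paid on every normal end of stream.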

This message was sent by Atlassian JIRA
