EOFException is thrown during normal operation
----------------------------------------------

                 Key: AVRO-813
                 URL: https://issues.apache.org/jira/browse/AVRO-813
             Project: Avro
          Issue Type: Bug
          Components: java
    Affects Versions: 1.5.0
            Reporter: Bruno Dumon
         Attachments: avro-813-patch.txt

In an application that uses Avro as its RPC mechanism (with the NettyTransceiver, 
though that is irrelevant here), I noticed in JProfiler that during normal 
operation a noticeable amount of time was spent creating EOFExceptions:

{noformat}
  5.4% - 2,004 ms org.apache.avro.ipc.generic.GenericResponder.readRequest
  5.0% - 1,871 ms org.apache.avro.generic.GenericDatumReader.read
  4.9% - 1,832 ms org.apache.avro.generic.GenericDatumReader.read
  4.9% - 1,832 ms org.apache.avro.generic.GenericDatumReader.readRecord
  4.5% - 1,670 ms org.apache.avro.generic.GenericDatumReader.read
  4.5% - 1,670 ms org.apache.avro.generic.GenericDatumReader.readRecord
  4.3% - 1,596 ms org.apache.avro.generic.GenericDatumReader.read
  2.8% - 1,048 ms org.apache.avro.generic.GenericDatumReader.readArray
  1.3% - 477 ms org.apache.avro.io.ValidatingDecoder.arrayNext
  1.3% - 471 ms org.apache.avro.io.BinaryDecoder.arrayNext
  1.3% - 466 ms org.apache.avro.io.BinaryDecoder.doReadItemCount
  1.3% - 466 ms org.apache.avro.io.BinaryDecoder.readLong
  1.3% - 466 ms org.apache.avro.io.BinaryDecoder.ensureBounds
  1.3% - 466 ms org.apache.avro.io.BinaryDecoder$ByteSource.compactAndFill
  1.3% - 466 ms org.apache.avro.io.BinaryDecoder$InputStreamByteSource.tryReadRaw
  1.3% - 466 ms org.apache.avro.util.ByteBufferInputStream.read
  1.3% - 466 ms org.apache.avro.util.ByteBufferInputStream.getBuffer
  1.3% - 466 ms java.io.EOFException.<init>
  1.3% - 466 ms java.io.IOException.<init>
  1.2% - 460 ms java.lang.Exception.<init>
  1.2% - 460 ms java.lang.Throwable.<init>
  1.2% - 460 ms java.lang.Throwable.fillInStackTrace
{noformat}


These exceptions are thrown by ByteBufferInputStream (which departs from 
InputStream's contract of returning -1 at EOF), but are caught higher up by the 
tryReadRaw method.
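
To make the difference between the two end-of-stream conventions concrete, here is a minimal sketch. ThrowingStream and EofContractDemo are hypothetical names made up for this illustration; ThrowingStream is only a simplified stand-in for the behaviour described above, not Avro's actual ByteBufferInputStream.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

final class EofContractDemo {
  /** Hypothetical stream that signals end of data by throwing instead of returning -1. */
  static final class ThrowingStream extends InputStream {
    private final byte[] data;
    private int pos;

    ThrowingStream(byte[] data) {
      this.data = data;
    }

    @Override
    public int read() throws IOException {
      if (pos >= data.length) {
        // Every end-of-data event allocates an exception and fills in its stack trace.
        throw new EOFException();
      }
      return data[pos++] & 0xff;
    }
  }

  /** Reads the stream to its end and reports how the end was signalled. */
  static String drain(InputStream in) throws IOException {
    try {
      while (in.read() != -1) {
        // discard the byte
      }
      return "returned -1";
    } catch (EOFException e) {
      return "threw EOFException";
    }
  }

  public static void main(String[] args) throws IOException {
    byte[] payload = {1, 2, 3};
    System.out.println(drain(new ByteArrayInputStream(payload))); // returned -1
    System.out.println(drain(new ThrowingStream(payload)));       // threw EOFException
  }
}
```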

What happens is this:

The message in question ends with an (empty) array, so the reader tries to read 
the size of this array in BinaryDecoder.readLong. This calls ensureBounds(10), 
whose contract is to make 10 bytes available if the source has them and 
otherwise to return quietly. ensureBounds calls tryReadRaw via compactAndFill, 
and it is tryReadRaw that catches the EOFException, because it only 'tries' to 
read that many bytes.
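
The "try to read" behaviour described above can be sketched roughly as follows. This is not the BinaryDecoder source: TryReadSketch and tryRead are hypothetical names, and the loop is only an assumption about the general shape of such a method.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

final class TryReadSketch {
  /**
   * Reads up to len bytes into buf and returns how many were actually read.
   * Running out of input is not an error here: both a -1 return value and an
   * EOFException simply end the loop.
   */
  static int tryRead(InputStream in, byte[] buf, int off, int len) throws IOException {
    int total = 0;
    try {
      while (total < len) {
        int n = in.read(buf, off + total, len - total);
        if (n < 0) {
          break; // standard InputStream end-of-stream signal
        }
        total += n;
      }
    } catch (EOFException eof) {
      // A stream that throws at end of data ends up here. By this point the
      // exception has already been constructed, which is the cost seen in the profile.
    }
    return total;
  }

  public static void main(String[] args) throws IOException {
    // One remaining byte, e.g. the zero item count of a trailing empty array.
    InputStream in = new ByteArrayInputStream(new byte[] {0});
    byte[] buf = new byte[10];
    // Asks for 10 bytes (the maximum length of a varint-encoded long) but only 1 is left.
    System.out.println(tryRead(in, buf, 0, 10)); // prints 1
  }
}
```

With a stream that returns -1, the short read costs nothing extra; with a throwing stream, the same short read pays for an exception each time.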

Note that InputStreamByteSource.readRaw (without the 'try' part) does itself 
check if read < 0 in order to throw EOFException, making the throwing of 
EOFException in ByteBufferInputStream unnecessary (for this particular usage).
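
For comparison, a strict read that performs its own end-of-stream check might look like the sketch below. ReadRawSketch and readFully are hypothetical names, and the code only assumes the "check read < 0, then throw" pattern mentioned above; it is not copied from InputStreamByteSource.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

final class ReadRawSketch {
  /** Reads exactly len bytes, or throws EOFException if the stream ends first. */
  static void readFully(InputStream in, byte[] buf, int off, int len) throws IOException {
    while (len > 0) {
      int n = in.read(buf, off, len);
      if (n < 0) {
        // The caller decides that a short read is an error; the underlying
        // stream only needs to return -1.
        throw new EOFException("stream ended with " + len + " byte(s) still expected");
      }
      off += n;
      len -= n;
    }
  }

  public static void main(String[] args) throws IOException {
    byte[] buf = new byte[4];
    readFully(new ByteArrayInputStream(new byte[] {7, 8}), buf, 0, 2); // succeeds
    try {
      readFully(new ByteArrayInputStream(new byte[] {7, 8}), buf, 0, 4);
    } catch (EOFException expected) {
      System.out.println("short read reported by the caller-side check");
    }
  }
}
```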

EOFException was also discussed in AVRO-392, though this particular common 
case does not seem to have been mentioned there. When using Avro RPC, or more 
generally when using Avro to read small messages rather than large files, it 
seems very easy to run into this EOFException situation, which hurts 
performance.

I'll attach a patch which simply removes the throwing of EOFException in 
ByteBufferInputStream, but this will likely break other cases which rely on the 
EOFException being thrown (I haven't investigated this fully).
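
As a rough idea of the direction (not the contents of the attached patch), a stream over a list of ByteBuffers that follows the standard -1 convention could be sketched like this. BufferListStream and nextBuffer are hypothetical names used only for this illustration.

```java
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.List;

final class BufferListStream extends InputStream {
  private final List<ByteBuffer> buffers;
  private int current;

  BufferListStream(List<ByteBuffer> buffers) {
    this.buffers = buffers;
  }

  /** Returns the next buffer that still has bytes, or null once all are drained. */
  private ByteBuffer nextBuffer() {
    while (current < buffers.size()) {
      ByteBuffer b = buffers.get(current);
      if (b.hasRemaining()) {
        return b;
      }
      current++;
    }
    return null;
  }

  @Override
  public int read() {
    ByteBuffer b = nextBuffer();
    return b == null ? -1 : (b.get() & 0xff); // -1 at end of data, no exception
  }

  @Override
  public int read(byte[] dst, int off, int len) {
    if (len == 0) {
      return 0;
    }
    ByteBuffer b = nextBuffer();
    if (b == null) {
      return -1; // end of data
    }
    int n = Math.min(len, b.remaining());
    b.get(dst, off, n);
    return n;
  }

  public static void main(String[] args) {
    BufferListStream in = new BufferListStream(Arrays.asList(
        ByteBuffer.wrap(new byte[] {1, 2}), ByteBuffer.wrap(new byte[] {3})));
    byte[] dst = new byte[10];
    System.out.println(in.read(dst, 0, 10)); // 2: the first buffer
    System.out.println(in.read(dst, 2, 8));  // 1: the second buffer
    System.out.println(in.read(dst, 3, 7));  // -1: nothing left
  }
}
```

Any caller that currently relies on catching EOFException from the stream would need its own check for -1 under this convention, which is the compatibility risk noted above.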

