[ https://issues.apache.org/jira/browse/AVRO-753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Scott Carey updated AVRO-753:
-----------------------------
Attachment: AVRO-753.v3.patch
Updated patch, v3:
Cleaner design. This breaks the Encoder API with respect to initialization and
configuration of encoders.
No Encoder has a public constructor; all are obtained through EncoderFactory.
BinaryEncoder is an abstract type, with three subtypes:
DirectBinaryEncoder, BufferedBinaryEncoder, and BlockingBinaryEncoder.
Encoder.init(OutputStream) is removed; all construction and configuration flow
through EncoderFactory. Encoder's API is strictly about writing Avro
primitives.
Much JavaDoc.
Intended CHANGES.txt message included. I think this is ready.
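
To illustrate the shape of the design described above, here is a minimal,
self-contained sketch. It does not reproduce the patch or the real Avro
signatures: every name containing "Sketch" is hypothetical, the default buffer
size is an arbitrary assumption, and only a boolean write is shown. The point is
that constructors are private, configuration lives on the factory, and the
encoder type itself only writes.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

public class EncoderFactorySketch {

    /** Hypothetical encoder type: write operations and flush, nothing else. */
    abstract static class SketchEncoder {
        abstract void writeBoolean(boolean b) throws IOException;
        abstract void flush() throws IOException;
    }

    /** Unbuffered variant: every value goes straight to the stream. */
    static final class DirectSketchEncoder extends SketchEncoder {
        private final OutputStream out;
        private DirectSketchEncoder(OutputStream out) { this.out = out; }
        @Override void writeBoolean(boolean b) throws IOException { out.write(b ? 1 : 0); }
        @Override void flush() throws IOException { out.flush(); }
    }

    /** Buffered variant: values collect in a byte array until flushed. */
    static final class BufferedSketchEncoder extends SketchEncoder {
        private final OutputStream out;
        private final byte[] buf;
        private int pos;
        private BufferedSketchEncoder(OutputStream out, int bufferSize) {
            this.out = out;
            this.buf = new byte[bufferSize];
        }
        @Override void writeBoolean(boolean b) throws IOException {
            if (pos == buf.length) flush();   // make room before writing
            buf[pos++] = (byte) (b ? 1 : 0);
        }
        @Override void flush() throws IOException {
            out.write(buf, 0, pos);
            pos = 0;
            out.flush();
        }
    }

    /** All construction and configuration goes through the factory. */
    static final class SketchEncoderFactory {
        private int bufferSize = 2048;        // assumed default, for illustration only
        SketchEncoderFactory configureBufferSize(int size) {
            if (size < 1) throw new IllegalArgumentException("buffer size must be positive");
            this.bufferSize = size;
            return this;
        }
        SketchEncoder directEncoder(OutputStream out) {
            return new DirectSketchEncoder(out);
        }
        SketchEncoder bufferedEncoder(OutputStream out) {
            return new BufferedSketchEncoder(out, bufferSize);
        }
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        SketchEncoder enc = new SketchEncoderFactory()
                .configureBufferSize(64)
                .bufferedEncoder(sink);
        enc.writeBoolean(true);
        System.out.println(sink.size());      // 0: the byte is still in the buffer
        enc.flush();
        System.out.println(sink.size());      // 1: flushed to the stream
    }
}
```

Callers choose a variant by calling a factory method rather than a constructor,
so buffering policy can change without touching code that only writes values.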
> Java: Improve BinaryEncoder Performance
> ----------------------------------------
>
> Key: AVRO-753
> URL: https://issues.apache.org/jira/browse/AVRO-753
> Project: Avro
> Issue Type: Improvement
> Components: java
> Reporter: Scott Carey
> Assignee: Scott Carey
> Fix For: 1.5.0
>
> Attachments: AVRO-753.v1.patch, AVRO-753.v2.patch, AVRO-753.v3.patch
>
>
> BinaryEncoder has not had a performance improvement pass like BinaryDecoder
> did. It still mostly writes directly to the underlying OutputStream, which is
> not optimal for performance. I like to use a rule that if you are writing to
> an OutputStream or reading from an InputStream in chunks smaller than 128
> bytes, you have a performance problem.
> Measurements indicate that optimizing BinaryEncoder yields a 2.5x to 6x
> performance improvement. The process is significantly simpler than
> BinaryDecoder because 'pushing' is easier than 'pulling' -- and also because
> we do not need a 'direct' variant because BinaryEncoder already buffers
> sometimes.
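
The small-write rule in the description can be made concrete with a short
sketch. It does not measure throughput and does not reproduce the 2.5x to 6x
figure; it only counts how many write calls reach the underlying stream when
zig-zag varint ints are written one at a time versus collected in a buffer
first. The class, the method names, and the 2048-byte buffer are assumptions
made for illustration, not code from the patch.

```java
import java.io.IOException;
import java.io.OutputStream;

public class SmallWriteDemo {

    /** Discards data but counts the write calls and bytes that reach it. */
    static final class CountingSink extends OutputStream {
        long calls;
        long bytes;
        @Override public void write(int b) { calls++; bytes++; }
        @Override public void write(byte[] b, int off, int len) { calls++; bytes += len; }
    }

    /** Zig-zag varint encoding of n into buf at pos; returns the new position. */
    static int encodeInt(int n, byte[] buf, int pos) {
        int z = (n << 1) ^ (n >> 31);
        while ((z & ~0x7F) != 0) {
            buf[pos++] = (byte) ((z & 0x7F) | 0x80);
            z >>>= 7;
        }
        buf[pos++] = (byte) z;
        return pos;
    }

    /** Unbuffered: each encoded int (1 to 5 bytes) is its own stream write. */
    static void writeDirect(OutputStream out, int count) throws IOException {
        byte[] scratch = new byte[5];
        for (int i = 0; i < count; i++) {
            int len = encodeInt(i, scratch, 0);
            out.write(scratch, 0, len);
        }
    }

    /** Buffered: ints accumulate locally and reach the stream in large chunks. */
    static void writeBuffered(OutputStream out, int count, int bufferSize) throws IOException {
        if (bufferSize < 5) throw new IllegalArgumentException("buffer must hold one int");
        byte[] buf = new byte[bufferSize];
        int pos = 0;
        for (int i = 0; i < count; i++) {
            if (buf.length - pos < 5) {       // not enough room for a worst-case int
                out.write(buf, 0, pos);
                pos = 0;
            }
            pos = encodeInt(i, buf, pos);
        }
        if (pos > 0) out.write(buf, 0, pos);  // final partial chunk
    }

    public static void main(String[] args) throws IOException {
        CountingSink direct = new CountingSink();
        writeDirect(direct, 10000);
        CountingSink buffered = new CountingSink();
        writeBuffered(buffered, 10000, 2048);
        System.out.println("direct:   " + direct.calls + " calls, " + direct.bytes + " bytes");
        System.out.println("buffered: " + buffered.calls + " calls, " + buffered.bytes + " bytes");
    }
}
```

Both paths emit the same bytes; only the number of calls into the stream
differs, which is where the per-call overhead discussed above comes from.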