[ 
https://issues.apache.org/jira/browse/AVRO-2052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16092067#comment-16092067
 ] 

Doug Cutting commented on AVRO-2052:
------------------------------------

There may be significant performance differences between a 
BufferedBinaryEncoder and a DirectBinaryEncoder writing to a buffered output 
stream.  Ints, longs, doubles and floats are all buffered internally in 
DirectBinaryEncoder, so removing the BufferedBinaryEncoder's buffering doesn't 
in fact reduce the number of bytes copied for these types but rather increases 
the number of invocations of the byte copier.  This was deemed significant in 
the past, but is perhaps worth re-benchmarking.  Perf.java (in ipc/.../io) 
could be used for this.  The difference likely doesn't matter for vout, but 
may be significant for bufOut.  The patch shouldn't be committed without such 
benchmarking.
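
To make the call-count argument concrete, here is a minimal, stdlib-only 
sketch.  It does not use Avro's classes: EncoderCallSketch, CountingSink, 
writeDirect and writeBuffered are hypothetical names, and the two loops only 
imitate the two strategies under the assumption that a "direct" encoder stages 
each value in a tiny scratch array before handing it to the stream, while a 
"buffered" encoder hands over one large array when it fills.  The sink stands 
in for the buffered output stream and counts how often its write method is 
invoked.

```java
import java.io.OutputStream;

// Hypothetical illustration; these are NOT Avro's encoder implementations.
class EncoderCallSketch {

  /** Stand-in for the buffered destination: discards bytes, counts calls. */
  static final class CountingSink extends OutputStream {
    long calls;
    long bytes;

    @Override
    public void write(int b) {
      calls++;
      bytes++;
    }

    @Override
    public void write(byte[] b, int off, int len) {
      calls++;
      bytes += len;
    }
  }

  /** Zig-zag varint encoding of n into buf at pos; returns bytes used (1-5). */
  static int encodeInt(int n, byte[] buf, int pos) {
    int v = (n << 1) ^ (n >> 31);
    int start = pos;
    while ((v & ~0x7F) != 0) {
      buf[pos++] = (byte) ((v & 0x7F) | 0x80);
      v >>>= 7;
    }
    buf[pos++] = (byte) v;
    return pos - start;
  }

  /** "Direct" style: one stream call per value. Returns {calls, bytes}. */
  static long[] writeDirect(int count) {
    CountingSink out = new CountingSink();
    byte[] scratch = new byte[5];
    for (int i = 0; i < count; i++) {
      int len = encodeInt(i, scratch, 0);
      out.write(scratch, 0, len);
    }
    return new long[] {out.calls, out.bytes};
  }

  /** "Buffered" style: one stream call per full buffer. Needs bufferSize >= 5. */
  static long[] writeBuffered(int count, int bufferSize) {
    CountingSink out = new CountingSink();
    byte[] buf = new byte[bufferSize];
    int pos = 0;
    for (int i = 0; i < count; i++) {
      if (buf.length - pos < 5) { // not enough room for a worst-case int
        out.write(buf, 0, pos);
        pos = 0;
      }
      pos += encodeInt(i, buf, pos);
    }
    if (pos > 0) {
      out.write(buf, 0, pos); // final flush
    }
    return new long[] {out.calls, out.bytes};
  }
}
```

With 1000 ints and a 2048-byte buffer, both variants hand the same number of 
bytes to the sink, but writeDirect makes 1000 calls and writeBuffered makes 
one.  Whether that difference is measurable in DataFileWriter is exactly what 
the benchmark would have to show.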

> Remove org.apache.avro.file.DataFileWriter Double Buffering
> -----------------------------------------------------------
>
>                 Key: AVRO-2052
>                 URL: https://issues.apache.org/jira/browse/AVRO-2052
>             Project: Avro
>          Issue Type: Improvement
>          Components: java
>    Affects Versions: 1.7.7, 1.8.2
>            Reporter: BELUGA BEHR
>            Assignee: BELUGA BEHR
>            Priority: Trivial
>         Attachments: AVRO-2052.1.patch
>
>
> {code:title=org.apache.avro.file.DataFileWriter}
>   private void init(OutputStream outs) throws IOException {
>     this.underlyingStream = outs;
>     this.out = new BufferedFileOutputStream(outs);
>     EncoderFactory efactory = new EncoderFactory();
>     this.vout = efactory.binaryEncoder(out, null);
>     dout.setSchema(schema);
>     buffer = new NonCopyingByteArrayOutputStream(
>         Math.min((int)(syncInterval * 1.25), Integer.MAX_VALUE/2 -1));
>     this.bufOut = efactory.binaryEncoder(buffer, null);
>     if (this.codec == null) {
>       this.codec = CodecFactory.nullCodec().createInstance();
>     }
>     this.isOpen = true;
>   }
> {code}
> It's clear here that both encoders write to a buffered destination,
> {{BufferedFileOutputStream}} and {{ByteArrayOutputStream}}. There is
> therefore no need for a buffered encoder; they can instead write directly
> to the buffered streams with {{directBinaryEncoder}}.



