[ https://issues.apache.org/jira/browse/AVRO-3183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ryan Skraba updated AVRO-3183:
------------------------------
    Fix Version/s: 1.11.0

> Do Not Double Buffer Data in DataFileWriter
> -------------------------------------------
>
>                 Key: AVRO-3183
>                 URL: https://issues.apache.org/jira/browse/AVRO-3183
>             Project: Apache Avro
>          Issue Type: Improvement
>          Components: java
>    Affects Versions: 1.10.0
>            Reporter: David Mollitor
>            Assignee: David Mollitor
>            Priority: Minor
>             Fix For: 1.11.0
>
>
> {code:java|title=DataFileWriter.java}
>   private void init(OutputStream outs) throws IOException {
>     this.underlyingStream = outs;
>     this.out = new BufferedFileOutputStream(outs);
>     EncoderFactory efactory = new EncoderFactory();
>     // binaryEncoder returns a buffered Encoder and is wrapping a BufferedFileOutputStream
>     this.vout = efactory.binaryEncoder(out, null);
>     dout.setSchema(schema);
>     buffer = new NonCopyingByteArrayOutputStream(Math.min((int) (syncInterval * 1.25), Integer.MAX_VALUE / 2 - 1));
>     // binaryEncoder returns a buffered Encoder and is wrapping a NonCopyingByteArrayOutputStream
>     this.bufOut = efactory.binaryEncoder(buffer, null);
>     if (this.codec == null) {
>       this.codec = CodecFactory.nullCodec().createInstance();
>     }
>     this.isOpen = true;
>   }
> {code}
> The {{DataFileWriter}} double-buffers its output: the encoder returned by
> {{binaryEncoder}} keeps its own buffer and writes into a
> {{BufferedFileOutputStream}}, which buffers the same bytes again. This adds
> redundant overhead, and the encoder's buffering is simpler and less
> effective than the buffering in {{BufferedFileOutputStream}}.
> Remove the double buffering by using a 'direct' {{binaryEncoder}}.
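
The cost described above can be illustrated with a small, self-contained sketch. It is not Avro code: CountingBuffer, run, and the 1024/4096 buffer sizes are hypothetical stand-ins chosen for illustration, with one CountingBuffer playing the role of the encoder-level buffer and another the role of BufferedFileOutputStream. The sketch only counts how many bytes get staged in a buffer on the way to the sink. In Avro itself, an unbuffered encoder is available through EncoderFactory.directBinaryEncoder(out, null); whether the actual patch uses exactly that call is an assumption here.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.Arrays;

public class DoubleBufferSketch {

  /** Minimal buffering stream (hypothetical) that counts every byte staged in its buffer. */
  static final class CountingBuffer extends OutputStream {
    private final OutputStream target;
    private final byte[] buf;
    private int count;
    long bytesCopied; // total bytes copied into this layer's buffer

    CountingBuffer(OutputStream target, int size) {
      this.target = target;
      this.buf = new byte[size];
    }

    @Override
    public void write(int b) throws IOException {
      if (count == buf.length) {
        drain();
      }
      buf[count++] = (byte) b;
      bytesCopied++;
    }

    @Override
    public void write(byte[] src, int off, int len) throws IOException {
      while (len > 0) {
        if (count == buf.length) {
          drain();
        }
        int n = Math.min(len, buf.length - count);
        System.arraycopy(src, off, buf, count, n);
        count += n;
        off += n;
        len -= n;
        bytesCopied += n;
      }
    }

    /** Pushes any staged bytes to the next layer. */
    private void drain() throws IOException {
      if (count > 0) {
        target.write(buf, 0, count);
        count = 0;
      }
    }

    @Override
    public void flush() throws IOException {
      drain();
      target.flush();
    }
  }

  /**
   * Writes n single bytes to sink, optionally through an extra encoder-level
   * buffer, and returns the total number of bytes copied into buffers.
   */
  static long run(int n, boolean encoderBuffered, ByteArrayOutputStream sink) throws IOException {
    // Stand-in for BufferedFileOutputStream (size is an arbitrary assumption).
    CountingBuffer fileBuffer = new CountingBuffer(sink, 4096);
    // Stand-in for the buffer inside a buffered encoder (size is an arbitrary assumption).
    CountingBuffer encoderBuffer = encoderBuffered ? new CountingBuffer(fileBuffer, 1024) : null;
    OutputStream out = encoderBuffered ? encoderBuffer : fileBuffer;
    for (int i = 0; i < n; i++) {
      out.write(i);
    }
    out.flush();
    return fileBuffer.bytesCopied + (encoderBuffered ? encoderBuffer.bytesCopied : 0);
  }

  public static void main(String[] args) throws IOException {
    ByteArrayOutputStream doubled = new ByteArrayOutputStream();
    ByteArrayOutputStream direct = new ByteArrayOutputStream();
    long doubledCopies = run(10_000, true, doubled);
    long directCopies = run(10_000, false, direct);
    System.out.println("double-buffered copies: " + doubledCopies);
    System.out.println("direct copies:          " + directCopies);
    System.out.println("same output: " + Arrays.equals(doubled.toByteArray(), direct.toByteArray()));
  }
}
```

In this sketch both paths deliver identical bytes to the sink, but the double-buffered path stages every byte twice (20000 buffer copies for 10000 bytes written, versus 10000 for the direct path). That extra copy is the redundant work the issue proposes to remove.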



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
