[ 
https://issues.apache.org/jira/browse/AVRO-1041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dave irving updated AVRO-1041:
------------------------------

    Fix Version/s: 1.6.3
           Status: Patch Available  (was: Open)

The patch addresses only the resize bug.
                
> Utf8 allocates new byte array unnecessarily
> -------------------------------------------
>
>                 Key: AVRO-1041
>                 URL: https://issues.apache.org/jira/browse/AVRO-1041
>             Project: Avro
>          Issue Type: Bug
>          Components: java
>    Affects Versions: 1.6.2
>            Reporter: dave irving
>            Priority: Minor
>             Fix For: 1.6.3
>
>         Attachments: AVRO-1041.patch
>
>
> When a {{Utf8}} instance is about to receive new data (e.g. in 
> {{BinaryDecoder}}), {{Utf8::setByteLength}} is invoked to ensure that the 
> backing byte array has enough capacity.
> However, the required size is compared against the logical length of the 
> current instance rather than the length of the existing byte array.
> This causes needless allocations of a new backing byte array: if you read a 
> 10-byte string followed by an 8-byte string followed by a 9-byte string, the 
> third read allocates a new backing array even though the instance 
> already has a 10-byte array at its disposal.
> At a minimum we should replace:
> {code}
>   public Utf8 setByteLength(int newLength) {
>     if (this.length < newLength) {
>       byte[] newBytes = new byte[newLength];
>       System.arraycopy(bytes, 0, newBytes, 0, this.length);
>       this.bytes = newBytes;
>     }
>     ...
>   }
> {code}
> with:
> {code}
>   public Utf8 setByteLength(int newLength) {
>     if (this.bytes.length < newLength) {
>       byte[] newBytes = new byte[newLength];
>       System.arraycopy(bytes, 0, newBytes, 0, this.length);
>       this.bytes = newBytes;
>     }
>     ...
>   }
> {code}
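
To make the difference between the two checks concrete, here is a minimal sketch that counts backing-array allocations for the 10/8/9 byte sequence described above. {{ResizeSketch}}, its {{compareCapacity}} flag and {{allocationsFor}} are hypothetical names used only for illustration; this is not the Avro {{Utf8}} class. It also assumes that the elided part of {{setByteLength}} stores the new logical length, which the snippets above do not show.

```java
/** Hypothetical stand-in for Utf8 that counts backing-array allocations. */
class ResizeSketch {
    private byte[] bytes = new byte[0];
    private int length;                    // logical length of the current value
    private final boolean compareCapacity; // true: check bytes.length, false: check length
    int allocations;                       // number of new backing arrays created

    ResizeSketch(boolean compareCapacity) {
        this.compareCapacity = compareCapacity;
    }

    ResizeSketch setByteLength(int newLength) {
        // The two variants differ only in which size is compared.
        int current = compareCapacity ? bytes.length : length;
        if (current < newLength) {
            byte[] newBytes = new byte[newLength];
            System.arraycopy(bytes, 0, newBytes, 0, length);
            bytes = newBytes;
            allocations++;
        }
        // Assumption: the elided part of the real method records the new length.
        length = newLength;
        return this;
    }

    /** Runs a sequence of resize requests and returns the allocation count. */
    static int allocationsFor(boolean compareCapacity, int... sizes) {
        ResizeSketch sketch = new ResizeSketch(compareCapacity);
        for (int size : sizes) {
            sketch.setByteLength(size);
        }
        return sketch.allocations;
    }

    public static void main(String[] args) {
        System.out.println(allocationsFor(false, 10, 8, 9)); // prints 2 (logical length check)
        System.out.println(allocationsFor(true, 10, 8, 9));  // prints 1 (capacity check)
    }
}
```

With the logical-length check, the 9-byte request allocates again because the previous value was only 8 bytes long; with the capacity check, the 10-byte array is reused.
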
> We may also wish to consider setting a maximum retained size for the {{Utf8}} 
> instance: if the backing array has grown beyond this limit, we drop it the next 
> time a resize requests a length below the limit (so we aren't forced to 
> hold on to an array sized for the largest string ever encountered).
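
One way the size-limit idea might look, as a rough sketch only: {{BoundedBuffer}}, {{maxRetained}} and {{capacity()}} are hypothetical names, the limit value is arbitrary, and the attached patch does not implement any of this. The sketch also assumes that the existing bytes should be preserved on shrink in the same way they are on growth.

```java
/** Hypothetical buffer that releases an oversized backing array when it can. */
class BoundedBuffer {
    private final int maxRetained; // largest array we are willing to keep around
    private byte[] bytes = new byte[0];
    private int length;

    BoundedBuffer(int maxRetained) {
        this.maxRetained = maxRetained;
    }

    BoundedBuffer setByteLength(int newLength) {
        if (bytes.length < newLength) {
            // Not enough capacity: grow, keeping the current contents.
            byte[] newBytes = new byte[newLength];
            System.arraycopy(bytes, 0, newBytes, 0, length);
            bytes = newBytes;
        } else if (bytes.length > maxRetained && newLength <= maxRetained) {
            // The array is over the limit and the request fits under it:
            // drop the large array so it can be garbage collected.
            byte[] newBytes = new byte[newLength];
            System.arraycopy(bytes, 0, newBytes, 0, Math.min(length, newLength));
            bytes = newBytes;
        }
        length = newLength;
        return this;
    }

    int capacity() {
        return bytes.length;
    }

    public static void main(String[] args) {
        BoundedBuffer buffer = new BoundedBuffer(16);
        System.out.println(buffer.setByteLength(100).capacity()); // prints 100
        System.out.println(buffer.setByteLength(8).capacity());   // prints 8
    }
}
```

The trade-off is one extra allocation after each unusually large value, in exchange for not pinning a large array for the lifetime of a reused instance.
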

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
