[jira] [Created] (AVRO-1041) Utf8 allocates new byte array unnecessarily

dave irving (Created) (JIRA) Thu, 01 Mar 2012 13:34:24 -0800

Utf8 allocates new byte array unnecessarily
-------------------------------------------


                 Key: AVRO-1041
                 URL: https://issues.apache.org/jira/browse/AVRO-1041
             Project: Avro
          Issue Type: Bug
          Components: java
    Affects Versions: 1.6.2
            Reporter: dave irving
            Priority: Minor


When a {{Utf8}} instance is about to receive new data (e.g. in 
{{BinaryDecoder}}), {{Utf8::setByteLength}} is invoked to ensure that the 
backing byte array has enough capacity.
However, the required size is compared against the logical length of the 
current instance rather than the length of the existing byte array.
This causes needless allocations of a new backing byte array: if you read a 
10-byte string, then an 8-byte string, then a 9-byte string, the third read 
allocates a new backing array even though the instance already has a 10-byte 
array at its disposal.
At a minimum we should replace:
{code}
  public Utf8 setByteLength(int newLength) {
    if (this.length < newLength) {
      byte[] newBytes = new byte[newLength];
      System.arraycopy(bytes, 0, newBytes, 0, this.length);
      this.bytes = newBytes;
    }
    ...
  }
{code}

with:
{code}
  public Utf8 setByteLength(int newLength) {
    if (this.bytes.length < newLength) {
      byte[] newBytes = new byte[newLength];
      System.arraycopy(bytes, 0, newBytes, 0, this.length);
      this.bytes = newBytes;
    }
    ...
  }
{code}
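
To make the difference concrete, here is a minimal, self-contained sketch that 
counts allocations for the 10 / 8 / 9 byte sequence described above. 
{{GrowableBuffer}} and {{AllocationDemo}} are hypothetical stand-ins, not the 
real {{Utf8}} class, and the sketch assumes that the elided part of 
{{setByteLength}} simply records the new logical length.

```java
/** Hypothetical, simplified stand-in for Utf8; it only tracks allocations. */
class GrowableBuffer {
  private byte[] bytes = new byte[0];
  private int length;                       // logical length of the current value
  private final boolean compareCapacity;    // true = proposed check, false = current check
  int allocations;                          // backing arrays allocated by setByteLength

  GrowableBuffer(boolean compareCapacity) {
    this.compareCapacity = compareCapacity;
  }

  GrowableBuffer setByteLength(int newLength) {
    // Current code compares the logical length; the proposal compares the array size.
    int available = compareCapacity ? bytes.length : length;
    if (available < newLength) {
      byte[] newBytes = new byte[newLength];
      System.arraycopy(bytes, 0, newBytes, 0, length);
      bytes = newBytes;
      allocations++;
    }
    length = newLength;                     // assumption: what the elided code does
    return this;
  }
}

class AllocationDemo {
  /** Resizes a fresh buffer to each size in turn and returns the allocation count. */
  static int allocationsFor(boolean compareCapacity, int... sizes) {
    GrowableBuffer buf = new GrowableBuffer(compareCapacity);
    for (int size : sizes) {
      buf.setByteLength(size);
    }
    return buf.allocations;
  }

  public static void main(String[] args) {
    System.out.println(allocationsFor(false, 10, 8, 9)); // prints 2
    System.out.println(allocationsFor(true, 10, 8, 9));  // prints 1
  }
}
```

With the logical-length check, the 9-byte read reallocates because the logical 
length had dropped to 8; with the capacity check, the original 10-byte array is 
reused.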

We may also wish to consider a maximum size limit for the backing array of a 
{{Utf8}} instance: if an allocation exceeds the limit, we drop that array the 
next time a resize requests a length below the limit, so that we are not forced 
to retain memory sized for the largest string encountered.
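
One possible reading of this size-limit idea, sketched under assumptions: 
{{CappedBuffer}} and {{maxRetained}} are hypothetical names, the limit is 
treated as inclusive, and the elided tail of {{setByteLength}} is again assumed 
to just record the new logical length. This is an illustration of the policy, 
not a proposed patch.

```java
/** Hypothetical buffer that refuses to keep an oversized backing array around. */
class CappedBuffer {
  private final int maxRetained;   // hypothetical limit on retained capacity
  private byte[] bytes = new byte[0];
  private int length;              // logical length of the current value

  CappedBuffer(int maxRetained) {
    this.maxRetained = maxRetained;
  }

  CappedBuffer setByteLength(int newLength) {
    if (bytes.length > maxRetained && newLength <= maxRetained) {
      // The array grew past the limit earlier and the new value fits under it:
      // replace the oversized array with one of exactly the requested size.
      byte[] newBytes = new byte[newLength];
      System.arraycopy(bytes, 0, newBytes, 0, Math.min(length, newLength));
      bytes = newBytes;
    } else if (bytes.length < newLength) {
      // Normal growth path, using the capacity check proposed above.
      byte[] newBytes = new byte[newLength];
      System.arraycopy(bytes, 0, newBytes, 0, length);
      bytes = newBytes;
    }
    length = newLength;            // assumption: what the elided code does
    return this;
  }

  int capacity() {
    return bytes.length;
  }
}
```

For example, with a limit of 64, resizing to 1000 and then to 10 leaves a 
10-byte array, and a following resize to 8 reuses it. The trade-off is one 
extra allocation after each oversized value in exchange for not pinning the 
large array.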


