[jira] Commented: (HBASE-3199) large response handling: some fixups and cleanups

Kannan Muthukkaruppan (JIRA) Fri, 05 Nov 2010 11:39:05 -0700

    [ 
https://issues.apache.org/jira/browse/HBASE-3199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12928741#action_12928741
 ]


Kannan Muthukkaruppan commented on HBASE-3199:
----------------------------------------------

For (i), the thought was to subclass ByteArrayOutputStream and override the 
write method with something like:

{code}
  public synchronized void write(byte b[], int off, int len) {
    if ((off < 0) || (off > b.length) || (len < 0) ||
        ((off + len) > b.length) || ((off + len) < 0)) {
      throw new IndexOutOfBoundsException();
    } else if (len == 0) {
      return;
    }

    int newcount = count + len;
    if (newcount > buf.length) {
      // Proposed change: double in long arithmetic and clamp to Integer.MAX_VALUE.
      int newSize = (int)Math.min((((long)buf.length) << 1),
                                  (long)(Integer.MAX_VALUE));
      buf = Arrays.copyOf(buf, Math.max(newSize, newcount));
    }
    System.arraycopy(b, off, buf, count, len);
    count = newcount;
  }
{code}
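
To make the effect of the proposed change concrete, here is a minimal, self-contained sketch of such a subclass. The class name ClampedByteArrayOutputStream and the helper grownCapacity are hypothetical names chosen for illustration, and the explicit check for "count + len" overflowing is an addition that is not part of the snippet above. Note also that many JVMs cannot actually allocate an array of exactly Integer.MAX_VALUE bytes, so a real fix may need a slightly lower cap.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

// Hypothetical class name; the comment above only describes the overridden write method.
class ClampedByteArrayOutputStream extends ByteArrayOutputStream {

  // Doubles the current capacity in long arithmetic, clamps the result to
  // Integer.MAX_VALUE, and never returns less than the capacity required.
  static int grownCapacity(int current, int required) {
    int doubled = (int) Math.min(((long) current) << 1, (long) Integer.MAX_VALUE);
    return Math.max(doubled, required);
  }

  @Override
  public synchronized void write(byte[] b, int off, int len) {
    if ((off < 0) || (off > b.length) || (len < 0)
        || ((off + len) > b.length) || ((off + len) < 0)) {
      throw new IndexOutOfBoundsException();
    } else if (len == 0) {
      return;
    }

    int newcount = count + len;
    if (newcount < 0) {
      // Assumption, not in the original snippet: fail fast when count + len overflows int.
      throw new OutOfMemoryError("buffer would exceed Integer.MAX_VALUE bytes");
    }
    if (newcount > buf.length) {
      buf = Arrays.copyOf(buf, grownCapacity(buf.length, newcount));
    }
    System.arraycopy(b, off, buf, count, len);
    count = newcount;
  }

  public static void main(String[] args) {
    int oneGiB = 1 << 30;
    // JDK 1.6 expression: the shift overflows to a negative value, so max() picks newcount.
    System.out.println(Math.max(oneGiB << 1, oneGiB + 1));  // prints 1073741825
    // Clamped doubling: the buffer jumps to the int limit instead of growing by one byte.
    System.out.println(grownCapacity(oneGiB, oneGiB + 1));  // prints 2147483647
  }
}
```

With a 1G buffer and a one-byte write, the JDK 1.6 expression asks for 1G + 1 bytes, while the clamped version asks for the int maximum once and then stops reallocating on every write.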

> large response handling: some fixups and cleanups
> -------------------------------------------------
>
>                 Key: HBASE-3199
>                 URL: https://issues.apache.org/jira/browse/HBASE-3199
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Kannan Muthukkaruppan
>            Assignee: Kannan Muthukkaruppan
>
> This may not be common for many use cases, but it might be good to put in a 
> couple of safety nets as well as logging to protect against large responses.
> (i) Aravind and I were trying to track down why JVM memory usage was 
> oscillating so much when dealing with very large buffers rather than OOM'ing 
> or hitting some Index out of bound type exception, and this is what we found.
> java.io.ByteArrayOutputStream graduates its internal buffers by doubling 
> them. Also, it is supposed to be able to handle "int" sized buffers (2G). The 
> code which handles "write" (in jdk 1.6) is along the lines of:
> {code}
>    public synchronized void write(byte b[], int off, int len) {
>       if ((off < 0) || (off > b.length) || (len < 0) ||
>             ((off + len) > b.length) || ((off + len) < 0)) {
>           throw new IndexOutOfBoundsException();
>       } else if (len == 0) {
>           return;
>       }
>         int newcount = count + len;
>         if (newcount > buf.length) {
>             buf = Arrays.copyOf(buf, Math.max(buf.length << 1, newcount));
>         }
>         System.arraycopy(b, off, buf, count, len);
>         count = newcount;
>     }
> {code}
> The "buf.length << 1" will start producing negative values when buf.length 
> reaches 1G, and "newcount" will instead dictate the size of the buffer 
> allocated. From this point on, the buffer grows linearly: each write resizes 
> it by only the required amount. Effectively, each write will allocate a new 
> buffer of 1G + the required size, copy the contents, and so on. This will put 
> the process in heavy GC mode (with the JVM heap oscillating by several GBs 
> rapidly), and render it practically unusable.
> (ii) When serializing a Result, the writeArray method doesn't assert that the 
> resultant size does not overflow an "int".
> {code}
>     int bufLen = 0;
>     for(Result result : results) {
>       bufLen += Bytes.SIZEOF_INT;
>       if(result == null || result.isEmpty()) {
>         continue;
>       }
>       for(KeyValue key : result.raw()) {
>         bufLen += key.getLength() + Bytes.SIZEOF_INT;
>       }
>     }
> {code}
> We should do the math in "long" and assert on bufLen values > 
> Integer.MAX_VALUE.
> (iii) In HBaseServer.java on RPC responses, we could add some logging on 
> responses above a certain threshold.
> (iv) Increase the buffer size threshold for buffers that are reused by RPC 
> handlers, and make this configurable. Currently, any response buffer above 
> 16k is not reused on the next response (HBaseServer.java).
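
For item (ii) in the description above, the following is a rough sketch of doing the length arithmetic in "long" and rejecting totals that do not fit in an "int". It does not use the real HBase classes: the class ResultSizeSketch, the method bufferLength, and the use of an int[] of KeyValue lengths per result are stand-ins invented for illustration, and throwing IllegalStateException is just one possible way to "assert".

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the size computation in writeArray; not HBase code.
class ResultSizeSketch {
  // Same value as Bytes.SIZEOF_INT (4 bytes).
  static final int SIZEOF_INT = Integer.SIZE / Byte.SIZE;

  // Each int[] holds the KeyValue lengths of one result; null or empty means
  // an empty result, which still contributes its own length prefix.
  static int bufferLength(List<int[]> keyValueLengths) {
    long bufLen = 0;  // accumulate in long so the sum cannot wrap around
    for (int[] lengths : keyValueLengths) {
      bufLen += SIZEOF_INT;
      if (lengths == null || lengths.length == 0) {
        continue;
      }
      for (int length : lengths) {
        bufLen += (long) length + SIZEOF_INT;
      }
    }
    if (bufLen > Integer.MAX_VALUE) {
      throw new IllegalStateException(
          "serialized size " + bufLen + " exceeds Integer.MAX_VALUE");
    }
    return (int) bufLen;
  }

  public static void main(String[] args) {
    List<int[]> results = new ArrayList<int[]>();
    results.add(new int[] {10, 20});  // one result with two KeyValues
    results.add(null);                // one empty result
    System.out.println(bufferLength(results));  // prints 46
  }
}
```

The same loop with an "int" accumulator would silently wrap to a negative or small positive value once the total passes 2G, which is the case the check is meant to catch.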

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
