large response handling: some fixups and cleanups
-------------------------------------------------

                 Key: HBASE-3199
                 URL: https://issues.apache.org/jira/browse/HBASE-3199
             Project: HBase
          Issue Type: Bug
            Reporter: Kannan Muthukkaruppan
            Assignee: Kannan Muthukkaruppan


This may not be common for many use cases, but it might be good to put a couple 
of safety nets as well as logging to protect against large responses.

(i) Aravind and I were trying to track down why JVM memory usage was 
oscillating so much when dealing with very large buffers rather than OOM'ing or 
hitting some Index out of bound type exception, and this is what we found.

java.io.ByteArrayOutputStream graduates its internal buffers by doubling them. 
Also, it is supposed to be able to handle "int" sized buffers (2G). The code 
which handles "write" (in jdk 1.6) is along the lines of:

{code}
   public synchronized void write(byte b[], int off, int len) {
        if ((off < 0) || (off > b.length) || (len < 0) ||
            ((off + len) > b.length) || ((off + len) < 0)) {
            throw new IndexOutOfBoundsException();
        } else if (len == 0) {
            return;
        }
        int newcount = count + len;
        if (newcount > buf.length) {
            buf = Arrays.copyOf(buf, Math.max(buf.length << 1, newcount));
        }
        System.arraycopy(b, off, buf, count, len);
        count = newcount;
    }
{code}

The "buf.length << 1" will start producing -ve values when buf.length reaches 
1G, and "newcount" will instead dictate the size of the buffer allocated. At 
this point, all attempts to write to the buffer will grow linearly, and the 
buffer will be resized by only the required amount on each write. Effectively, 
each write will allocate a new 1G buffer + reqd size buffer, copy the contents, 
and so on. This will put the process in heavy GC mode (with jvm heap 
oscillating by several GBs rapidly), and render it practically unusable.

(ii) When serializing a Result, the writeArray method doesn't assert that the 
resultant size does not overflow an "int".

{code}
    int bufLen = 0;
    for(Result result : results) {
      bufLen += Bytes.SIZEOF_INT;
      if(result == null || result.isEmpty()) {
        continue;
      }
      for(KeyValue key : result.raw()) {
        bufLen += key.getLength() + Bytes.SIZEOF_INT;
      }
    }
{code}

We should do the math in "long" and assert on bufLen values > Integer.MAX_VALUE.

(iii) In HBaseServer.java on RPC responses, we could add some logging on 
responses above a certain thresholds.

(iv) Increase buffer size threshold for buffers that are reused by RPC 
handlers. And make this configurable. Currently, any response buffer about 16k 
is not reused on next response. (HBaseServer.java).





-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply via email to