[ 
https://issues.apache.org/jira/browse/HBASE-3199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ryan rawson updated HBASE-3199:
-------------------------------

    Attachment: HBASE-3199.txt

Here is my base patch; I'll merge in the other one in a moment.

> large response handling: some fixups and cleanups
> -------------------------------------------------
>
>                 Key: HBASE-3199
>                 URL: https://issues.apache.org/jira/browse/HBASE-3199
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Kannan Muthukkaruppan
>            Assignee: Kannan Muthukkaruppan
>         Attachments: HBASE-3199.txt, HBASE-3199_prelim.txt
>
>
> This may not be common for many use cases, but it might be good to put in a 
> couple of safety nets, as well as logging, to protect against large responses.
> (i) Aravind and I were trying to track down why JVM memory usage was 
> oscillating so much when dealing with very large buffers, rather than OOM'ing 
> or throwing some kind of index-out-of-bounds exception. This is what we found.
> java.io.ByteArrayOutputStream grows its internal buffer by doubling it. 
> It is also supposed to be able to handle "int" sized buffers (2G). The 
> code which handles "write" (in jdk 1.6) is along the lines of:
> {code}
>     public synchronized void write(byte b[], int off, int len) {
>         if ((off < 0) || (off > b.length) || (len < 0) ||
>             ((off + len) > b.length) || ((off + len) < 0)) {
>             throw new IndexOutOfBoundsException();
>         } else if (len == 0) {
>             return;
>         }
>         int newcount = count + len;
>         if (newcount > buf.length) {
>             buf = Arrays.copyOf(buf, Math.max(buf.length << 1, newcount));
>         }
>         System.arraycopy(b, off, buf, count, len);
>         count = newcount;
>     }
> {code}
> The "buf.length << 1" expression overflows to a negative value once 
> buf.length reaches 1G, so "newcount" alone dictates the size of the buffer 
> allocated. From this point on, the buffer grows linearly: each write resizes 
> it by only the amount required. Effectively, each write allocates a new 
> buffer of 1G plus the required size, copies the contents over, and so on. 
> This puts the process in heavy GC mode (with the JVM heap oscillating 
> rapidly by several GBs) and renders it practically unusable.
> (ii) When serializing a Result, the writeArray method doesn't assert that the 
> resultant size does not overflow an "int".
> {code}
>     int bufLen = 0;
>     for(Result result : results) {
>       bufLen += Bytes.SIZEOF_INT;
>       if(result == null || result.isEmpty()) {
>         continue;
>       }
>       for(KeyValue key : result.raw()) {
>         bufLen += key.getLength() + Bytes.SIZEOF_INT;
>       }
>     }
> {code}
> We should do the math in "long" and assert on bufLen values > 
> Integer.MAX_VALUE.
> (iii) In HBaseServer.java, we could add some logging for RPC responses 
> above a certain size threshold.
> (iv) Increase the size threshold for buffers that are reused by RPC 
> handlers, and make it configurable. Currently, any response buffer above 
> 16k is not reused for the next response (HBaseServer.java).
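
To make point (i) above concrete, here is a minimal sketch of the arithmetic. naiveCapacity mirrors the expression quoted from the JDK 1.6 code; safeCapacity is a hypothetical overflow-aware variant (the class and method names are made up for illustration and are not taken from the attached patches). Note that real JVMs cap array sizes slightly below Integer.MAX_VALUE, so the clamp here is only illustrative.

```java
public class BufferGrowthSketch {
    // Mirrors the quoted JDK 1.6 expression: the shift wraps negative at 1G,
    // so Math.max falls back to the exact required size.
    static int naiveCapacity(int currentLength, int required) {
        return Math.max(currentLength << 1, required);
    }

    // Hypothetical alternative: double in long arithmetic, then clamp to int range.
    static int safeCapacity(int currentLength, long required) {
        if (required > Integer.MAX_VALUE) {
            throw new IllegalArgumentException(
                "required size exceeds int range: " + required);
        }
        long doubled = (long) currentLength << 1;
        return (int) Math.min(Integer.MAX_VALUE, Math.max(doubled, required));
    }

    public static void main(String[] args) {
        int oneGig = 1 << 30;
        // The doubled length is negative once the buffer reaches 1G.
        System.out.println(oneGig << 1);
        // Naive growth: only the required size, so every write reallocates.
        System.out.println(naiveCapacity(oneGig, oneGig + 100));
        // Clamped growth: jumps to the top of the int range instead.
        System.out.println(safeCapacity(oneGig, oneGig + 100L));
    }
}
```

With the naive expression, a 1G buffer that needs 100 more bytes grows by exactly 100 bytes, which is the linear-growth behavior described in the issue.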

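For point (ii), the suggestion of doing the math in "long" could look roughly like the sketch below. To keep it self-contained it does not use the HBase classes: each Result is modeled as an int[] of KeyValue lengths, and SIZEOF_INT stands in for Bytes.SIZEOF_INT. The class name, the method name, and the choice of exception are assumptions for illustration only.

```java
public class ResultSizeSketch {
    // Stand-in for Bytes.SIZEOF_INT (4 bytes).
    static final int SIZEOF_INT = Integer.SIZE / Byte.SIZE;

    // kvLengths[i] holds the KeyValue lengths of result i; a null or empty
    // array models a null or empty Result.
    static int serializedSize(int[][] kvLengths) {
        long bufLen = 0; // accumulate in long so the sum cannot wrap silently
        for (int[] result : kvLengths) {
            bufLen += SIZEOF_INT;
            if (result == null || result.length == 0) {
                continue;
            }
            for (int kvLen : result) {
                bufLen += (long) kvLen + SIZEOF_INT;
            }
        }
        if (bufLen > Integer.MAX_VALUE) {
            throw new IllegalArgumentException(
                "serialized size " + bufLen + " does not fit in an int");
        }
        return (int) bufLen;
    }

    public static void main(String[] args) {
        // Two KeyValues in the first result, then a null and an empty result.
        System.out.println(serializedSize(new int[][] {{10, 20}, null, {}}));
        try {
            serializedSize(new int[][] {{Integer.MAX_VALUE}, {Integer.MAX_VALUE}});
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

With an "int" accumulator, as in the snippet quoted in the issue, the second call would wrap around instead of being rejected.
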
-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
