[ 
https://issues.apache.org/jira/browse/HBASE-15950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15315190#comment-15315190
 ] 

Enis Soztutar commented on HBASE-15950:
---------------------------------------

Found a very nice library from OpenJDK called Java Object Layout (JOL). I 
think we should switch to using it to estimate object sizes at runtime: 
http://openjdk.java.net/projects/code-tools/jol/  

{code}
objc[65069]: Class JavaLaunchHelper is implemented in both /Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/bin/java and /Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/jre/lib/libinstrument.dylib. One of the two will be used. Which one is undefined.
# WARNING: Unable to attach Serviceability Agent. You can try again with escalated privileges. Two options: a) use -Djol.tryWithSudo=true to try with sudo; b) echo 0 | sudo tee /proc/sys/kernel/yama/ptrace_scope
# Running 64-bit HotSpot VM.
# Using compressed oop with 3-bit shift.
# Using compressed klass with 3-bit shift.
# WARNING | Compressed references base/shifts are guessed by the experiment!
# WARNING | Therefore, computed addresses are just guesses, and ARE NOT RELIABLE.
# WARNING | Make sure to attach Serviceability Agent to get the reliable addresses.
# Objects are 8 bytes aligned.
# Field sizes by type: 4, 1, 1, 2, 2, 4, 4, 8, 8 [bytes]
# Array element sizes: 4, 1, 1, 2, 2, 4, 4, 8, 8 [bytes]

org.apache.hadoop.hbase.KeyValue object internals:
 OFFSET  SIZE   TYPE DESCRIPTION                    VALUE
      0    12        (object header)                N/A
     12     4    int KeyValue.offset                N/A
     16     8   long KeyValue.seqId                 N/A
     24     4    int KeyValue.length                N/A
     28     4 byte[] KeyValue.bytes                 N/A
Instance size: 32 bytes
Space losses: 0 bytes internal + 0 bytes external = 0 bytes total

java.util.concurrent.ConcurrentSkipListMap$Node object internals:
 OFFSET  SIZE   TYPE DESCRIPTION                    VALUE
      0    12        (object header)                N/A
     12     4 Object Node.key                       N/A
     16     4 Object Node.value                     N/A
     20     4   Node Node.next                      N/A
Instance size: 24 bytes
Space losses: 0 bytes internal + 0 bytes external = 0 bytes total

java.util.concurrent.ConcurrentSkipListMap$Index object internals:
 OFFSET  SIZE  TYPE DESCRIPTION                    VALUE
      0    12       (object header)                N/A
     12     4  Node Index.node                     N/A
     16     4 Index Index.down                     N/A
     20     4 Index Index.right                    N/A
Instance size: 24 bytes
Space losses: 0 bytes internal + 0 bytes external = 0 bytes total
{code}
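
As a rough cross-check, the instance sizes reported above can be summed into a per-cell footprint. This is only a sketch: the 16-byte byte[] header (12-byte object header plus 4-byte length under compressed oops) is an assumption that the JOL output above does not show, and EntryFootprintSketch, perEntryBytes, and the indexPerNode parameter are hypothetical names for illustration, not HBase code.

```java
// Sketch only: sums the JOL-reported sizes above into a per-cell estimate.
final class EntryFootprintSketch {
    static final int ALIGNMENT = 8;          // "Objects are 8 bytes aligned"
    static final int KEYVALUE = 32;          // KeyValue instance size from JOL
    static final int CSLM_NODE = 24;         // ConcurrentSkipListMap$Node from JOL
    static final int CSLM_INDEX = 24;        // ConcurrentSkipListMap$Index from JOL
    static final int BYTE_ARRAY_HEADER = 16; // assumption: 12-byte header + 4-byte length

    // Round n up to the next multiple of ALIGNMENT.
    static long align(long n) {
        return (n + ALIGNMENT - 1) & ~(long) (ALIGNMENT - 1);
    }

    /**
     * @param cellLength   bytes held in KeyValue.bytes for one cell (no MSLAB)
     * @param indexPerNode assumed average number of Index objects per Node
     */
    static long perEntryBytes(int cellLength, double indexPerNode) {
        long array = align(BYTE_ARRAY_HEADER + cellLength);
        return KEYVALUE + array + CSLM_NODE + Math.round(CSLM_INDEX * indexPerNode);
    }

    public static void main(String[] args) {
        // Hypothetical 40-byte cell, ignoring Index objects: 32 + 56 + 24 = 112.
        System.out.println(perEntryBytes(40, 0.0));
    }
}
```

The cell length and the Index share are inputs rather than measured values, so the result is only as good as those two guesses.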

Our KV.heapSize() is: 
{code}
  public long heapSize() {
    int sum = 0;
    sum += ClassSize.OBJECT;// the KeyValue object itself
    sum += ClassSize.REFERENCE;// pointer to "bytes"
    sum += ClassSize.align(ClassSize.ARRAY);// "bytes"
    sum += ClassSize.align(length);// number of bytes of data in the "bytes" array
    sum += 2 * Bytes.SIZEOF_INT;// offset, length
    sum += Bytes.SIZEOF_LONG;// memstoreTS
    return ClassSize.align(sum);
  }
{code}
 
Without MSLAB, this:
{code}
    sum += ClassSize.align(ClassSize.ARRAY);// "bytes"
    sum += ClassSize.align(length);// number of bytes of data in the "bytes" array
{code}

should be something like: 
{code}
    sum += ClassSize.align(ClassSize.ARRAY + length); 
{code}
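
To see what the change does, the two formulas can be compared side by side. This is a minimal sketch: the local align rounds up to a multiple of 8, which is what ClassSize.align is assumed to do here, and AlignCompareSketch and the header sizes passed in are hypothetical, not values read from ClassSize.

```java
// Sketch only: compares separate vs. combined alignment of an array's size.
final class AlignCompareSketch {
    // Round n up to the next multiple of 8 (assumed ClassSize.align behavior).
    static long align(long n) {
        return ((n + 7) >> 3) << 3;
    }

    // Current form: header and payload are rounded up separately.
    static long separate(long arrayHeader, long length) {
        return align(arrayHeader) + align(length);
    }

    // Proposed form: the array is one object, so round up once.
    static long combined(long arrayHeader, long length) {
        return align(arrayHeader + length);
    }

    public static void main(String[] args) {
        long[][] cases = { {16, 36}, {20, 4}, {20, 36} }; // {header, length}, hypothetical
        for (long[] c : cases) {
            System.out.printf("header=%d length=%d separate=%d combined=%d%n",
                c[0], c[1], separate(c[0], c[1]), combined(c[0], c[1]));
        }
    }
}
```

When the header size is already a multiple of 8 the two forms give the same result. When it is not, the combined form is either equal or 8 bytes smaller per array, because the payload can occupy what the separate form counts as header padding.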



> We are grossly overestimating the memstore size
> -----------------------------------------------
>
>                 Key: HBASE-15950
>                 URL: https://issues.apache.org/jira/browse/HBASE-15950
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Enis Soztutar
>            Assignee: Enis Soztutar
>             Fix For: 2.0.0
>
>         Attachments: Screen Shot 2016-06-02 at 8.48.27 PM.png
>
>
> While testing something else, I was loading a region with a lot of data. 
> Writing 30M cells in 1M rows, with 1 byte values. 
> The memstore size turned out to be estimated as 4.5GB, while with the JFR 
> profiling I can see that we are using 2.8GB for all the objects in the 
> memstore (KV + KV byte[] + CSLM.Node + CSLM.Index). 
> This obviously means that there is room in the write cache that we are not 
> effectively using. 



