[ 
https://issues.apache.org/jira/browse/HBASE-10656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15586749#comment-15586749
 ] 

stack commented on HBASE-10656:
-------------------------------

[~ikeda] Here is an interesting observation by a coworker 
[~mi...@cloudera.com]. I can open a new issue to discuss it, but I am posting 
it here for the moment:

{quote}
To induce high load on MONITORING TOOL in my small 8-machine cluster, V 
suggested to create 10 hbase tables with 1K regions each - in this way, 
MONITORING TOOL gets 10K new entities to monitor. I've done that and it worked 
for MONITORING TOOL as expected. However, one thing that we noticed is that 
HBase Region Servers in my cluster are now constantly running GC....

I decided to take a quick look, took a heap dump from one of the region servers 
and analyzed it with the same tool (http://www.jxray.com) that I use in the 
MONITORING TOOL work. The output is attached.

One finding is that 41% of memory is occupied by instances of 
org.apache.hadoop.hbase.util.Counter$Cell class, and they seem to be actively 
"churned" by GC all the time. I looked at the code of this class, and one thing 
that immediately caught my eye is this:

  private static class Cell {
    // Pads are added around the value to avoid cache-line contention with
    // another cell's value. The cache-line size is expected to be equal to or
    // less than about 128 Bytes (= 64 Bits * 16).

    @SuppressWarnings("unused")
    volatile long p0, p1, p2, p3, p4, p5, p6;
    volatile long value;
    @SuppressWarnings("unused")
    volatile long q0, q1, q2, q3, q4, q5, q6;

So, as far as I understand, the only meaningful data field in this class, 
'value', is deliberately "padded" with unused fields just to make an instance 
of this class big enough to fill an entire 128-byte cache line.

This looks like a very extreme optimization that would work if there were very 
few objects in memory, or at least very few of Counter$Cell instances, so that 
they were kept in the cache all the time. But clearly in our case making these 
objects artificially large greatly increases the GC pressure and ultimately 
makes everything much slower.

Can somebody shed some light on this? In particular:

- Why are so many Counter instances created and destroyed all the time 
despite the fact that there is no HBase activity going on?
- I don't think the setup with 10K regions is very unconventional. If so many 
Cell objects need to be maintained, then probably it's worth providing e.g. 
another implementation that's simply optimized for size rather than for memory 
cache performance?
{quote}

On Question #1, it is probably our metrics accounting that is going on. On #2, 
you might have input.
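
For a rough sense of the size concern raised in the quote, the per-instance 
footprint of a padded cell can be estimated with simple arithmetic. The sketch 
below is not HBase code: the class name CellFootprint is hypothetical, and the 
12-byte object header and 8-byte alignment are assumptions for a typical 64-bit 
HotSpot JVM with compressed oops. Real layouts vary by JVM and flags.

```java
public class CellFootprint {
    // Assumed values for a 64-bit HotSpot JVM with compressed oops;
    // other JVMs or flags may give different numbers.
    static final int ASSUMED_HEADER_BYTES = 12;
    static final int ASSUMED_ALIGNMENT = 8;
    static final int LONG_BYTES = 8;

    /** Estimated instance size for an object that holds only long fields. */
    public static long estimateBytes(int longFields) {
        // long fields are 8-byte aligned, so the first one starts after the
        // header rounded up to a multiple of 8.
        long fieldsStart = roundUp(ASSUMED_HEADER_BYTES, LONG_BYTES);
        long raw = fieldsStart + (long) longFields * LONG_BYTES;
        return roundUp(raw, ASSUMED_ALIGNMENT);
    }

    static long roundUp(long n, long multiple) {
        return ((n + multiple - 1) / multiple) * multiple;
    }

    public static void main(String[] args) {
        long padded = estimateBytes(15);  // p0..p6, value, q0..q6
        long unpadded = estimateBytes(1); // value only
        System.out.println("padded   ~" + padded + " bytes");
        System.out.println("unpadded ~" + unpadded + " bytes");
    }
}
```

Under these assumptions the padded cell comes to about 136 bytes against about 
24 bytes for a cell holding only 'value', so each padded instance costs 
several times the heap of an unpadded one.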

>  high-scale-lib's Counter depends on Oracle (Sun) JRE, and also has some bug
> ----------------------------------------------------------------------------
>
>                 Key: HBASE-10656
>                 URL: https://issues.apache.org/jira/browse/HBASE-10656
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Hiroshi Ikeda
>            Assignee: Hiroshi Ikeda
>            Priority: Minor
>             Fix For: 0.96.2, 0.98.1, 0.99.0
>
>         Attachments: 10656-098.v2.txt, 10656-trunk.v2.patch, 10656.096v2.txt, 
> HBASE-10656-0.96.patch, HBASE-10656-addition.patch, HBASE-10656-trunk.patch, 
> MyCounter.java, MyCounter2.java, MyCounter3.java, MyCounterTest.java, 
> MyCounterTest.java, PerformanceTestApp.java, PerformanceTestApp2.java, 
> output.pdf, output.txt, output2.pdf, output2.txt
>
>
> Cliff's high-scale-lib's Counter is used in important classes (for example, 
> HRegion) in HBase, but Counter uses sun.misc.Unsafe, which is an 
> implementation detail of the Java standard library and belongs to Oracle 
> (Sun). That consequently makes HBase depend on a specific JRE implementation.
> To make matters worse, Counter has a bug, and you may get a wrong result if 
> you mix a reading method into logic that calls writing methods.
> In more detail, I think the bug is caused by reading an internal array field 
> without resolving memory caching (which is intentional, according to the 
> comment) but then storing the read result into a volatile field. That field 
> may not be updated after the true values of the array field become visible, 
> and it also may not be updated after the "next" CAT instance's values are 
> updated, in some race condition when extending the CAT instance chain.
> Anyway, it is possible to create a new alternative class that depends only 
> on the standard library. I know Java 8 provides an alternative, but 
> HBase should support Java 6 and Java 7 for some time.
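
To make the idea of a class that depends only on the standard library concrete, 
here is a minimal sketch of a striped counter built on 
java.util.concurrent.atomic.AtomicLongArray. It is not the attached MyCounter 
code or the patch itself; the class name CompactStripedCounter and the 
power-of-two stripe count are assumptions made for illustration. It trades 
cache-line padding for a small footprint, so neighboring stripes may share a 
cache line.

```java
import java.util.concurrent.atomic.AtomicLongArray;

public class CompactStripedCounter {
    // One unpadded long per stripe; adjacent stripes may share a cache line.
    private final AtomicLongArray stripes;
    private final int mask;

    public CompactStripedCounter(int stripeCount) {
        if (stripeCount <= 0 || (stripeCount & (stripeCount - 1)) != 0) {
            throw new IllegalArgumentException("stripeCount must be a power of two");
        }
        this.stripes = new AtomicLongArray(stripeCount);
        this.mask = stripeCount - 1;
    }

    /** Adds delta to the stripe selected by the calling thread's id. */
    public void add(long delta) {
        int h = (int) Thread.currentThread().getId();
        h ^= (h >>> 16); // mix high bits into the low bits used by the mask
        stripes.addAndGet(h & mask, delta);
    }

    /**
     * Sums all stripes. This is not an atomic snapshot: concurrent add()
     * calls may or may not be reflected in the returned value.
     */
    public long get() {
        long sum = 0;
        for (int i = 0; i < stripes.length(); i++) {
            sum += stripes.get(i);
        }
        return sum;
    }
}
```

A caller would create it with, for example, new CompactStripedCounter(4), call 
add(1) from worker threads, and read get() for reporting. Whether the reduced 
footprint outweighs the extra contention would need to be measured.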



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
