[ https://issues.apache.org/jira/browse/HBASE-16747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Duo Zhang reopened HBASE-16747:
-------------------------------
This change seems to break TestIOFencing.
https://builds.apache.org/job/PreCommit-HBASE-Build/4240/testReport/org.apache.hadoop.hbase/TestIOFencing/testFencingAroundCompaction/
{noformat}
java.lang.AssertionError: Timed out waiting for the region to flush
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.assertTrue(Assert.java:41)
at org.apache.hadoop.hbase.TestIOFencing.doTest(TestIOFencing.java:291)
at org.apache.hadoop.hbase.TestIOFencing.testFencingAroundCompaction(TestIOFencing.java:225)
{noformat}
> Track memstore data size and heap overhead separately
> ------------------------------------------------------
>
> Key: HBASE-16747
> URL: https://issues.apache.org/jira/browse/HBASE-16747
> Project: HBase
> Issue Type: Sub-task
> Components: regionserver
> Reporter: Anoop Sam John
> Assignee: Anoop Sam John
> Fix For: 2.0.0
>
> Attachments: HBASE-16747.patch, HBASE-16747.patch,
> HBASE-16747_V2.patch, HBASE-16747_V2.patch, HBASE-16747_V3.patch,
> HBASE-16747_V3.patch, HBASE-16747_V3.patch, HBASE-16747_V4.patch,
> HBASE-16747_WIP.patch
>
>
> We track the memstore size in 3 places.
> 1. Globally at the RS level, in RegionServerAccounting. This tracks the size
> of all memstores and is used to decide whether forced flushes are needed
> because of global heap pressure.
> 2. At the region level, in HRegion. This is the sum of the sizes of all
> memstores within the region. It is used to decide whether the region has
> reached its flush size (128 MB).
> 3. At the Segment level. This drives the in-memory flush/compaction decisions.
> All of these use the Cell's heap size, which includes the number of data bytes
> as well as the Cell object's heap overhead. We also include the overhead of
> adding Cells to the Segment's data structures (such as the CSLM).
> Once we have an off-heap memstore, we will keep the cell data bytes in the
> off-heap area, so we can no longer track data size and heap overhead as one
> entity. We need to track them separately.
> The proposal here is to track cell data size and heap overhead separately at
> the global accounting layer. As of now we have only the on-heap memstore, so
> the global memstore boundary checks will consider both (add them up and check
> the sum against the global max memstore size).
> At the region level, track cell data size alone (this can be on heap or off
> heap). Region flushes use cell data size alone for the flush decision. A user
> who configures 128 MB as the flush size normally expects each flush to write
> about 128 MB of data. But because we were also counting the heap overhead,
> the actual data size flushed was well below 128 MB. With this change we
> behave more like what a user expects.
> The Segment-level in-memory flush/compaction also considers cell data size
> alone, but we still need to track the heap overhead. (Once an in-memory flush
> or a normal flush happens, we have to adjust both the cell data size and the
> heap overhead.)
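
The accounting split described in the issue can be sketched roughly as follows. This is a minimal illustration only, not the code in the attached patches: the class name MemstoreSizeSketch and all of its methods are hypothetical, and the sketch assumes the on-heap-only case, where the global check adds data size and heap overhead together while the region flush check looks at data size alone.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical illustration of tracking cell data size and heap overhead
// as two separate counters. Names are assumptions, not HBase APIs.
final class MemstoreSizeSketch {
    private final AtomicLong dataSize = new AtomicLong();      // cell data bytes (on or off heap)
    private final AtomicLong heapOverhead = new AtomicLong();  // Cell object + data structure overhead

    // Called when cells are added to a memstore.
    void add(long dataDelta, long overheadDelta) {
        dataSize.addAndGet(dataDelta);
        heapOverhead.addAndGet(overheadDelta);
    }

    // Called after a flush (normal or in-memory): both counters are adjusted.
    void release(long dataDelta, long overheadDelta) {
        dataSize.addAndGet(-dataDelta);
        heapOverhead.addAndGet(-overheadDelta);
    }

    long getDataSize() { return dataSize.get(); }

    long getHeapOverhead() { return heapOverhead.get(); }

    // Region-level decision: compare data size alone with the flush size.
    boolean reachedFlushSize(long flushSizeBytes) {
        return dataSize.get() >= flushSizeBytes;
    }

    // Global decision while only an on-heap memstore exists:
    // data size plus heap overhead is checked against the global limit.
    boolean aboveGlobalLimit(long globalMaxBytes) {
        return dataSize.get() + heapOverhead.get() >= globalMaxBytes;
    }
}
```

With a flush size of 128 MB, 100 MB of cell data plus 60 MB of overhead would not trigger a region flush in this sketch, although the combined 160 MB would still count against a global limit.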