[ 
https://issues.apache.org/jira/browse/HBASE-16747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15602733#comment-15602733
 ] 

Hadoop QA commented on HBASE-16747:
-----------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 5s {color} | {color:red} HBASE-16747 does not apply to master. Rebase required? Wrong Branch? See https://yetus.apache.org/documentation/0.3.0/precommit-patchnames for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12834981/HBASE-16747.patch |
| JIRA Issue | HBASE-16747 |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/4165/console |
| Powered by | Apache Yetus 0.3.0   http://yetus.apache.org |


This message was automatically generated.



> Track memstore data size and heap overhead separately 
> ------------------------------------------------------
>
>                 Key: HBASE-16747
>                 URL: https://issues.apache.org/jira/browse/HBASE-16747
>             Project: HBase
>          Issue Type: Sub-task
>          Components: regionserver
>            Reporter: Anoop Sam John
>            Assignee: Anoop Sam John
>             Fix For: 2.0.0
>
>         Attachments: HBASE-16747.patch, HBASE-16747_WIP.patch
>
>
> We track the memstore size in 3 places:
> 1. Globally at RS level, in RegionServerAccounting. This tracks the size 
> of all memstores and is used to decide whether forced flushes are needed 
> because of global heap pressure.
> 2. At region level, in HRegion. This is the sum of the sizes of all 
> memstores within the region and is used to decide whether the region has 
> reached the flush size (128 MB by default).
> 3. At segment level. This drives the in-memory flush/compaction decisions.
> All of these use the Cell's heap size, which includes the data bytes as 
> well as the Cell object's heap overhead. We also include the overhead 
> caused by adding Cells to the Segment's data structures (like the CSLM).
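> For illustration, here is a rough sketch of the two quantities for a 
> single Cell added to a CSLM-backed segment (the class, constants and 
> values below are hypothetical stand-ins, not the actual ClassSize 
> figures):
>
>   // Hypothetical per-Cell accounting; real overheads come from ClassSize.
>   final class CellAccounting {
>     static final long CELL_OBJECT_OVERHEAD = 64; // assumed Cell/KeyValue object overhead
>     static final long CSLM_ENTRY_OVERHEAD = 80;  // assumed CSLM map entry overhead
>
>     // Data size is just the key/value bytes themselves.
>     static long dataSize(long keyLength, long valueLength) {
>       return keyLength + valueLength;
>     }
>
>     // Heap overhead is everything else the add costs us on heap.
>     static long heapOverhead() {
>       return CELL_OBJECT_OVERHEAD + CSLM_ENTRY_OVERHEAD;
>     }
>   }
>
> Today all three levels account dataSize + heapOverhead as one number.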
> Once we have an off-heap memstore, the cell data bytes will be kept in 
> the off-heap area, so we can no longer track data size and heap overhead 
> as one entity. We need to separate them and track each on its own.
> The proposal here is to track cell data size and heap overhead separately 
> at the global accounting layer. As of now we have only an on-heap 
> memstore, so the global memstore boundary check will consider both (add 
> them up and check the sum against the global max memstore size).
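> A minimal sketch of what the separated global accounting could look like 
> (class and method names are hypothetical, not necessarily the patch):
>
>   import java.util.concurrent.atomic.AtomicLong;
>
>   // Hypothetical RS-level accounting with the two quantities kept apart.
>   class GlobalMemStoreAccounting {
>     private final AtomicLong dataSize = new AtomicLong();
>     private final AtomicLong heapOverhead = new AtomicLong();
>
>     void incMemStoreSize(long dataSizeDelta, long heapOverheadDelta) {
>       dataSize.addAndGet(dataSizeDelta);
>       heapOverhead.addAndGet(heapOverheadDelta);
>     }
>
>     // On-heap memstore: the boundary check adds both parts back up and
>     // compares the sum against the global max memstore size.
>     boolean aboveGlobalLimit(long globalMemStoreLimitBytes) {
>       return dataSize.get() + heapOverhead.get() > globalMemStoreLimitBytes;
>     }
>   }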
> At region level, track cell data size alone (this can be on heap or off 
> heap); region flushes use cell data size alone for the flush decision. A 
> user who configures 128 MB as the flush size normally expects roughly 
> 128 MB of data per flush, but because we were also including the heap 
> overhead, the actual data size flushed was well below 128 MB. With this 
> change we will behave closer to what the user expects.
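> In the same spirit, the region flush check would look only at the data 
> size (again a hypothetical sketch):
>
>   // Hypothetical region-level flush check: comparing cell data size
>   // alone against the configured flush size means a 128 MB setting
>   // really flushes about 128 MB of data.
>   static boolean shouldFlush(long regionDataSize, long flushSizeBytes) {
>     return regionDataSize >= flushSizeBytes;
>   }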
> The segment-level in-memory flush/compaction decision also considers cell 
> data size alone, but we still need to track the heap overhead there. 
> (Once an in-memory flush or a normal flush happens, we will have to 
> adjust both the cell data size and the heap overhead, as sketched below.)
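> Roughly, reusing the hypothetical GlobalMemStoreAccounting from above:
>
>   // Hypothetical rollback after a segment is flushed: subtract the
>   // flushed segment's data size and heap overhead from both the region
>   // accounting and the global (RS-level) accounting.
>   static void onSegmentFlushed(GlobalMemStoreAccounting rsAccounting,
>       GlobalMemStoreAccounting regionAccounting,
>       long flushedDataSize, long flushedHeapOverhead) {
>     rsAccounting.incMemStoreSize(-flushedDataSize, -flushedHeapOverhead);
>     regionAccounting.incMemStoreSize(-flushedDataSize, -flushedHeapOverhead);
>   }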



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
