[
https://issues.apache.org/jira/browse/HBASE-18294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16068179#comment-16068179
]
Anoop Sam John commented on HBASE-18294:
----------------------------------------
This is the per-region flush decision, and yes, it is based ONLY on data size.
Heap overhead is not considered. This change was intentional. Heap overhead
plays a role when flushes are triggered by the global memstore size, for which
we have an upper bound. When a user says to flush a region once it reaches
128 MB, the simple expectation is 128 MB of data (key + value). But what we
saw in the past was that the actual data size was sometimes only half of that.
We have often seen questions on the mailing list asking why.
The compacting memstore, with its flattening and merge (index alone), helps
reduce the global heap overhead. The data merge mode (useful when there are
duplicated cells) helps reduce the number of region flushes.
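
To make the distinction concrete, here is a minimal Java sketch of the two checks described above. It is illustrative only, not the HBase implementation: the class, method, and field names are hypothetical, and the exact comparison operators are an assumption.

```java
import java.util.List;

// Illustrative sketch only. These names are hypothetical, not HBase classes.
final class FlushDecisionSketch {

    /** Tracks data size (key + value bytes) separately from heap overhead (index, metadata). */
    static final class MemStoreSize {
        final long dataSize;
        final long heapOverhead;

        MemStoreSize(long dataSize, long heapOverhead) {
            this.dataSize = dataSize;
            this.heapOverhead = heapOverhead;
        }

        /** Total heap occupied: data plus overhead. */
        long heapSize() {
            return dataSize + heapOverhead;
        }
    }

    /** Per-region decision: compares DATA size only against the region flush size. */
    static boolean shouldFlushRegion(MemStoreSize region, long flushSizeBytes) {
        return region.dataSize >= flushSizeBytes;
    }

    /** Global decision: compares total HEAP size (data + overhead) against the global upper bound. */
    static boolean aboveGlobalLimit(List<MemStoreSize> regions, long globalLimitBytes) {
        long totalHeap = 0;
        for (MemStoreSize r : regions) {
            totalHeap += r.heapSize();
        }
        return totalHeap >= globalLimitBytes;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        // A region holding 64 MB of data plus 64 MB of overhead occupies 128 MB of heap.
        MemStoreSize region = new MemStoreSize(64 * mb, 64 * mb);
        System.out.println("per-region flush (data size only): "
                + shouldFlushRegion(region, 128 * mb));
        System.out.println("heap size in MB: " + region.heapSize() / mb);
    }
}
```

In this sketch, a region whose heap footprint has reached 128 MB but which holds only 64 MB of data is not flushed by the per-region check. That is the case where, under a heap-size-based check, a flush would have written only half the data the user expected.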
> Flush policy checks data size instead of heap size
> --------------------------------------------------
>
> Key: HBASE-18294
> URL: https://issues.apache.org/jira/browse/HBASE-18294
> Project: HBase
> Issue Type: Bug
> Reporter: Eshcar Hillel
> Assignee: Eshcar Hillel
>
> A flush policy decides whether to flush a store by comparing the size of the
> store to a threshold (that can be configured with
> hbase.hregion.percolumnfamilyflush.size.lower.bound).
> Currently the implementation compares the data size (key-value only) to the
> threshold where it should compare the heap size (which includes index size,
> and metadata).