[
https://issues.apache.org/jira/browse/HBASE-14651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14965980#comment-14965980
]
Vladimir Rodionov commented on HBASE-14651:
-------------------------------------------
[~anoop.hbase]
{quote}
So the flushed files will have sizes of this memstore flush size.
{quote}
Nope. Memstore size in the Java heap != size of the compressed (serialized) store file.
The serialized representation is usually 3-4x smaller, and with compression on top
we can talk about an 8-10x reduction.
Example (7 flushes):
Memstore flush size = 128MB
Store file size = 15MB
Minimum compaction size = 128MB
15,
15,15
15,15,15
45
15,45
15,15,45
75
15,75
15,15,75
105
Read = 225MB (compaction input: 45 + 75 + 105)
Write = 330MB (7 flushes x 15MB = 105MB, plus 225MB of compaction output)
Memstore flush size = 128MB
Store file size = 15MB
Minimum compaction size = 64MB
15,
15,15
15,15,15
45
15,45
15,15,45
15,15,15,45 (will be selected because of file ratio = 1.2)
90
15,90
Read = 135MB (compaction input: 45 + 90)
Write = 240MB (7 flushes x 15MB = 105MB, plus 135MB of compaction output)
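To make the two traces above easy to re-check, here is a rough Python sketch of the selection rule as described in this issue. It is a simplified model, not HBase's actual compaction policy code. The function names (in_ratio, select, simulate) are made up for illustration, and the minimum of 3 files per compaction and the "prefer more files, then smaller total size" tie-break are assumptions chosen to match the traces. Sizes are in MB and the newest file is listed first.

```python
def in_ratio(files, ratio):
    """True if no file is larger than ratio * (sum of the other files)."""
    total = sum(files)
    return all(size <= (total - size) * ratio for size in files)


def select(files, min_compact_size, ratio=1.2, min_files=3):
    """Return (start, end) of the chosen contiguous selection, or None."""
    best = None
    for start in range(len(files)):
        for end in range(start + min_files, len(files) + 1):
            window = files[start:end]
            total = sum(window)
            # Selections below the minimum size skip the ratio check.
            if total >= min_compact_size and not in_ratio(window, ratio):
                continue
            # Assumed tie-break: more files first, then smaller total size.
            key = (len(window), -total)
            if best is None or key > best[0]:
                best = (key, (start, end))
    return best[1] if best else None


def simulate(min_compact_size, flushes=7, store_file_size=15):
    """Return (files, read_mb, write_mb) after the given number of flushes."""
    files, read_mb, write_mb = [], 0, 0
    for _ in range(flushes):
        files.insert(0, store_file_size)  # newest file first, as in the traces
        write_mb += store_file_size       # the flush itself is a write
        picked = select(files, min_compact_size)
        if picked:
            start, end = picked
            total = sum(files[start:end])
            files[start:end] = [total]    # compaction output replaces its inputs
            read_mb += total
            write_mb += total
    return files, read_mb, write_mb


if __name__ == "__main__":
    for min_size in (128, 64):
        print(min_size, simulate(min_size))
```

Under these assumptions, with a 64MB minimum the selection 15,15,45 (75MB) has to pass the ratio check and fails it, because 45 > 1.2 * (15 + 15), so the compaction waits for one more flush. With a 128MB minimum the same selection skips the check and is compacted right away.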
> Default minimum compaction size is too high
> -------------------------------------------
>
> Key: HBASE-14651
> URL: https://issues.apache.org/jira/browse/HBASE-14651
> Project: HBase
> Issue Type: New Feature
> Reporter: Vladimir Rodionov
> Assignee: Vladimir Rodionov
> Attachments: HBASE-14651-v1.patch
>
>
> *hbase.hstore.compaction.min.size* defines a minimum selection size;
> selections below it are always eligible for minor compaction (no compaction
> ratio check is performed on such file selections). The default is equal to
> the memstore flush size (128MB). First of all, even this value is too high
> for some (many) deployments, especially write-intensive ones, because
> memstore flushes produce small files. Second, if users increase the memstore
> flush size (they usually set it to at least 256MB), they have no idea how it
> will affect overall compaction efficiency. With a 256MB minimum size to
> compact, the compactor skips the necessary file ratio checks most of the
> time, which results in increased read/write IO during compactions because of
> unbalanced selections in which relatively large files are mixed with newly
> created small store files. I think we should set this default minimum to
> 64MB and not link it to the memstore flush size at all.