[ https://issues.apache.org/jira/browse/HBASE-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
zhangduo updated HBASE-10201:
-----------------------------
Attachment: HBASE-10201-0.98_2.patch
Ran a benchmark with TestPerColumnFamilyFlush.
Setup: 3 CFs with 16 B values for CF1, 256 B values for CF2 and 4 KB values
for CF3; 1M rows; 128 MB memstore flush size; 16 MB CF flush size.
Result without per-CF flush:
NumStoreFiles: 7, StoreFileSize: 4336644762, NumCompactionsCompleted: 46,
NumFilesCompacted: 146, NumBytesCompacted: 11132103132
Write amplification: 2.57
Result with per-CF flush:
NumStoreFiles: 10, StoreFileSize: 4482510274, NumCompactionsCompleted: 27,
NumFilesCompacted: 89, NumBytesCompacted: 10353603767
Write amplification: 2.31
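The write amplification figures above are consistent with dividing
NumBytesCompacted by StoreFileSize. That formula is inferred from the numbers,
not stated in the comment, and the class and method names in this short sketch
are made up for illustration:

```java
import java.util.Locale;

// Hypothetical helper, not part of HBase or of the attached patch.
final class WriteAmplification {
    private WriteAmplification() {}

    /**
     * Assumed definition: bytes rewritten by compactions divided by the
     * bytes held in store files at the end of the run.
     */
    public static double ratio(long numBytesCompacted, long storeFileSize) {
        if (storeFileSize <= 0) {
            throw new IllegalArgumentException("storeFileSize must be positive");
        }
        return (double) numBytesCompacted / storeFileSize;
    }

    public static void main(String[] args) {
        // Counter values copied from the two results above.
        double without = ratio(11132103132L, 4336644762L);
        double with = ratio(10353603767L, 4482510274L);
        System.out.println(String.format(Locale.ROOT, "without per-CF flush: %.2f", without));
        System.out.println(String.format(Locale.ROOT, "with per-CF flush:    %.2f", with));
    }
}
```

Under that assumption the two ratios round to 2.57 and 2.31, matching the
reported values.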
Next I will run this benchmark on a real cluster instead of a minicluster.
> Port 'Make flush decisions per column family' to trunk
> ------------------------------------------------------
>
> Key: HBASE-10201
> URL: https://issues.apache.org/jira/browse/HBASE-10201
> Project: HBase
> Issue Type: Improvement
> Reporter: Ted Yu
> Attachments: 3149-trunk-v1.txt, HBASE-10201-0.98.patch,
> HBASE-10201-0.98_1.patch, HBASE-10201-0.98_2.patch
>
>
> Currently the flush decision is made using the aggregate size of all column
> families. When large and small column families co-exist, this causes many
> small flushes of the smaller CF. We need to make per-CF flush decisions.
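
One way to picture a per-CF flush decision is the sketch below. It is not the
code in the attached patches: the class, method and parameter names are
hypothetical, and the rule itself (flush only the families above a per-family
threshold, falling back to all families if none qualifies) is an assumption
made for illustration. The sizes in main are invented as well.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a per-column-family flush decision; not the HBase API.
final class PerFamilyFlushSketch {
    private PerFamilyFlushSketch() {}

    /**
     * Returns the families to flush, given each family's memstore size in bytes.
     * Nothing is selected until the aggregate size reaches regionFlushSize.
     */
    public static List<String> selectFamiliesToFlush(
            Map<String, Long> memstoreBytesByFamily,
            long regionFlushSize,
            long familyFlushSize) {
        long total = 0;
        for (long size : memstoreBytesByFamily.values()) {
            total += size;
        }
        List<String> selected = new ArrayList<>();
        if (total < regionFlushSize) {
            return selected; // aggregate threshold not reached yet
        }
        for (Map.Entry<String, Long> e : memstoreBytesByFamily.entrySet()) {
            if (e.getValue() >= familyFlushSize) {
                selected.add(e.getKey()); // only the large families are flushed
            }
        }
        if (selected.isEmpty()) {
            // Assumed fallback: no family is large on its own, so flush them all.
            selected.addAll(memstoreBytesByFamily.keySet());
        }
        return selected;
    }

    public static void main(String[] args) {
        final long mb = 1024L * 1024L;
        Map<String, Long> sizes = new LinkedHashMap<>();
        sizes.put("CF1", 1 * mb);   // invented sizes, for illustration only
        sizes.put("CF2", 12 * mb);
        sizes.put("CF3", 120 * mb);
        // 128 MB aggregate threshold and 16 MB per-family threshold.
        System.out.println(selectFamiliesToFlush(sizes, 128 * mb, 16 * mb));
    }
}
```

With these invented sizes only CF3 is selected, so the small families are not
flushed into many tiny store files, which is the behaviour the issue asks for.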