[ 
https://issues.apache.org/jira/browse/HBASE-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhangduo updated HBASE-10201:
-----------------------------
    Attachment: HBASE-10201-0.98_2.patch

Benchmark results from running TestPerColumnFamilyFlush on a minicluster.

Setup: 3 CFs with value sizes of 16B for CF1, 256B for CF2 and 4K for CF3; 
1M rows; 128M memstore flush size; 16M CF flush size.
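
For readers unfamiliar with the two thresholds above, here is a minimal sketch of one plausible way they could interact. It is not taken from the patch: the function name, the fallback rule and the example sizes are all assumptions made for illustration.

```python
# Hypothetical illustration only; not the logic from HBASE-10201.
MEMSTORE_FLUSH_SIZE = 128 * 1024 * 1024  # region-level threshold (128M)
CF_FLUSH_SIZE = 16 * 1024 * 1024         # per-CF threshold (16M)


def select_families_to_flush(memstore_sizes):
    """Return the CFs to flush, given a dict of CF name -> memstore bytes."""
    total = sum(memstore_sizes.values())
    if total < MEMSTORE_FLUSH_SIZE:
        # The region as a whole is still below the flush size.
        return []
    large = [cf for cf, size in memstore_sizes.items() if size >= CF_FLUSH_SIZE]
    # Assumed fallback: if no single CF is large enough, flush everything.
    return large if large else list(memstore_sizes)


MB = 1024 * 1024
sizes = {"CF1": 1 * MB, "CF2": 8 * MB, "CF3": 120 * MB}
print(select_families_to_flush(sizes))
```

With sizes skewed like the benchmark's value sizes, only the large CF would be selected, so the small CFs would keep accumulating data instead of being flushed as many small files.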

Result without per CF flush:
NumStoreFiles: 7, StoreFileSize: 4336644762, NumCompactionsCompleted: 46, 
NumFilesCompacted: 146, NumBytesCompacted: 11132103132
Write amplification: 2.57

Result with per CF flush:
NumStoreFiles: 10, StoreFileSize: 4482510274, NumCompactionsCompleted: 27, 
NumFilesCompacted: 89, NumBytesCompacted: 10353603767
Write amplification: 2.31
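
The write amplification figures above are consistent with NumBytesCompacted divided by StoreFileSize. The message does not state the formula, so treat that as an assumption. A short sketch that reproduces both numbers under it:

```python
# Assumption: write amplification = NumBytesCompacted / StoreFileSize.
# The formula is inferred from the reported numbers, not stated in the message.
def write_amplification(num_bytes_compacted, store_file_size):
    return num_bytes_compacted / store_file_size


without_per_cf = write_amplification(11132103132, 4336644762)
with_per_cf = write_amplification(10353603767, 4482510274)

print(f"without per CF flush: {without_per_cf:.2f}")
print(f"with per CF flush:    {with_per_cf:.2f}")
```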

Next, I will run this benchmark on a real cluster instead of a minicluster.



> Port 'Make flush decisions per column family' to trunk
> ------------------------------------------------------
>
>                 Key: HBASE-10201
>                 URL: https://issues.apache.org/jira/browse/HBASE-10201
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Ted Yu
>         Attachments: 3149-trunk-v1.txt, HBASE-10201-0.98.patch, 
> HBASE-10201-0.98_1.patch, HBASE-10201-0.98_2.patch
>
>
> Currently the flush decision is made using the aggregate size of all column 
> families. When large and small column families co-exist, this causes many 
> small flushes of the smaller CF. We need to make per-CF flush decisions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
