[ 
https://issues.apache.org/jira/browse/HBASE-3149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983312#action_12983312
 ] 

Nicolas Spiegelberg commented on HBASE-3149:
--------------------------------------------

Some interesting stats. We did some rough calculations internally to see what
effect an uneven distribution of data across column families was having on our
network IO. Our data distribution across 3 column families was 1:1:20. When we
looked at the flush:minor-compaction ratio for each of the store files, the
large column family had a 1:2 ratio, but both small CFs had a 1:20 ratio! We
are looking at a roughly 10% decrease in network IO if we can bring those
other 2 CFs down to a 1:2 ratio as well.
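
To illustrate why the small CFs end up with so many tiny store files when the
flush decision uses the aggregate size, here is a minimal sketch in Python. It
is a toy model, not HBase code: the 64 MB threshold, the CF names, the
simulate() helper, and the "flush only the CFs over the threshold" policy are
all assumptions made for illustration. Only the 1:1:20 write mix comes from the
numbers above.

```python
# Toy model of memstore flushing. Sizes are in MB.
# ASSUMPTIONS: the threshold, the CF names and the per-CF policy are
# hypothetical; only the 1:1:20 write mix is taken from the comment.
FLUSH_THRESHOLD_MB = 64
WRITE_MIX = {"cf_small_a": 1, "cf_small_b": 1, "cf_large": 20}


def simulate(rounds, per_cf):
    """Write `rounds` batches of 1 + 1 + 20 MB and record every flush.

    Returns {cf_name: [size of each flushed file, in MB]}.
    """
    memstore = {cf: 0 for cf in WRITE_MIX}
    flushed = {cf: [] for cf in WRITE_MIX}
    for _ in range(rounds):
        for cf, mb in WRITE_MIX.items():
            memstore[cf] += mb
        if per_cf:
            # Hypothetical per-CF decision: flush only CFs over the threshold.
            to_flush = [cf for cf in memstore
                        if memstore[cf] >= FLUSH_THRESHOLD_MB]
        elif sum(memstore.values()) >= FLUSH_THRESHOLD_MB:
            # Aggregate decision: every CF is flushed together.
            to_flush = list(memstore)
        else:
            to_flush = []
        for cf in to_flush:
            flushed[cf].append(memstore[cf])
            memstore[cf] = 0
    return flushed


if __name__ == "__main__":
    for per_cf in (False, True):
        label = "per-CF" if per_cf else "aggregate"
        for cf, files in simulate(960, per_cf).items():
            print(label, cf, "flushes:", len(files), "file size MB:", files[0])
```

Under these assumed numbers, the aggregate decision flushes each small CF 320
times at 3 MB per file, while the per-CF decision flushes it 15 times at 64 MB
per file. Fewer, larger files for the small CFs should mean fewer minor
compactions for them. Note that the per-CF variant in this sketch holds more
data in memory before flushing, a trade-off a real implementation would have to
bound.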

> Make flush decisions per column family
> --------------------------------------
>
>                 Key: HBASE-3149
>                 URL: https://issues.apache.org/jira/browse/HBASE-3149
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Karthik Ranganathan
>
> Today, the flush decision is made using the aggregate size of all column 
> families. When large and small column families co-exist, this causes many 
> small flushes of the smaller CF. We need to make per-CF flush decisions.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
