[
https://issues.apache.org/jira/browse/HBASE-3656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13018439#comment-13018439
]
Jean-Daniel Cryans commented on HBASE-3656:
-------------------------------------------
I was talking on IRC with aph__ (yeah, that's his handle); he was running
imports on a small 3-node cluster with heap set at 1.5GB and saw abysmal
performance. While we told him to get some real hardware because tuning his
current setup would be a massive waste of time, I was thinking that HBase could
still do a much better job out of the box. One thing that would make it better
is somehow automatically controlling region size/number so you don't run into
the situation where you are basically filling up 1000 regions at the same rate
and flushing tiny files (which then triggers compactions _ad vitam aeternam_).
The other thing that would help is this jira since you wouldn't be compacting
so much in the first place.
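
To put rough numbers on the tiny-flush problem, here is a back-of-the-envelope
sketch. It assumes that 40% of the heap is available to memstores (that
fraction is an assumption for illustration, check your own configuration), that
all 1000 regions sit on one region server, and that writes are spread evenly.
The class and method names are made up for this example.

```java
public class FlushSizeEstimate {
    // Estimated size of each flushed file, in MB, when the global memstore
    // budget is shared evenly by all regions that are being written to.
    static double avgFlushMb(double heapMb, double memstoreFraction, int activeRegions) {
        return heapMb * memstoreFraction / activeRegions;
    }

    public static void main(String[] args) {
        double heapMb = 1.5 * 1024;     // 1.5GB heap, from the report above
        double memstoreFraction = 0.4;  // ASSUMPTION: share of heap usable by memstores
        int activeRegions = 1000;       // regions filling up at the same rate
        System.out.println(avgFlushMb(heapMb, memstoreFraction, activeRegions) + " MB per flush");
    }
}
```

Under those assumptions each flush writes well under 1MB. If the 1000 regions
are spread over the 3 nodes instead, each flush is about three times larger,
which is still tiny.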
> Merging flush; merge a flush with one of the existing store files (the
> smallest?) so we skip creating a new store file on each flush
> ------------------------------------------------------------------------------------------------------------------------------------
>
> Key: HBASE-3656
> URL: https://issues.apache.org/jira/browse/HBASE-3656
> Project: HBase
> Issue Type: Task
> Reporter: stack
>
> This behavior is described in the BT paper. Years ago I had a go at it but
> at the time it slowed flushing significantly -- and IIRC we had no barriers
> on writes when the memory pressure was high -- so it brought on OOMEs... so
> I punted on it. It's time to consider this feature again.
> Would we always do it? Maybe not if it's a close? On a close we want stuff
> to run quickly so we should skip the merge. But any other time, we should do
> it?
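
As a rough illustration of the policy described in the issue, here is a toy
sketch, not HBase code. Store files are modeled as in-memory sorted maps, file
size is approximated by entry count, and versions and timestamps are ignored.
All names (MergingFlushSketch, flush, storeFiles) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Toy model only: these are hypothetical names, not HBase classes.
public class MergingFlushSketch {
    // Each "store file" is modeled as a sorted map of key -> value.
    final List<TreeMap<String, String>> storeFiles = new ArrayList<>();

    // Flush a memstore snapshot. On a close (or when there is nothing to
    // merge with) write a plain new file so the flush stays fast; otherwise
    // fold the snapshot into the smallest existing file.
    void flush(TreeMap<String, String> snapshot, boolean closing) {
        if (closing || storeFiles.isEmpty()) {
            storeFiles.add(new TreeMap<>(snapshot));
            return;
        }
        // Find the file with the fewest entries (stand-in for smallest on disk).
        int smallest = 0;
        for (int i = 1; i < storeFiles.size(); i++) {
            if (storeFiles.get(i).size() < storeFiles.get(smallest).size()) {
                smallest = i;
            }
        }
        // Build the merged file; entries from the newer snapshot win.
        TreeMap<String, String> merged = new TreeMap<>(storeFiles.get(smallest));
        merged.putAll(snapshot);
        // Swap the merged file in, so the file count does not grow.
        storeFiles.set(smallest, merged);
    }

    public static void main(String[] args) {
        MergingFlushSketch store = new MergingFlushSketch();
        TreeMap<String, String> first = new TreeMap<>();
        first.put("row1", "a");
        TreeMap<String, String> second = new TreeMap<>();
        second.put("row2", "b");
        store.flush(first, false);
        store.flush(second, false);
        System.out.println("store files: " + store.storeFiles.size());
    }
}
```

The trade-off is the one the description mentions: the merge makes each flush
do more work (it rewrites an existing file), in exchange for not adding a new
store file that a later compaction would have to pick up.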