Sergey Shelukhin created HBASE-8765:
---------------------------------------
Summary: split should be based on store size, not HFile size
Key: HBASE-8765
URL: https://issues.apache.org/jira/browse/HBASE-8765
Project: HBase
Issue Type: Improvement
Affects Versions: 0.95.1
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
I noticed that the current split behavior is rather suboptimal with regard to
compactions. On large regions, the HFile size limit triggers a split. The split
is followed by a major compaction to get rid of the partial reference files.
However, most of the time the HFile size limit is surpassed as the result of a
compaction. So, first we rewrite a lot of data into a new file. Then we say
"Oh look! A large file!", split the region and rewrite everything again.
Perhaps region split should be based on region size, or on incoming compaction
size: a large enough compaction should be converted into a split.
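
To make the difference concrete, here is a rough sketch of the two checks. It
is not the actual HBase code: the class, the method names, the 10 GiB limit and
the file sizes are all made up for illustration, and it assumes compaction
keeps the total size unchanged (in practice it may shrink).

```java
import java.util.Arrays;
import java.util.List;

// Illustrative only: hypothetical names, not HBase classes or APIs.
public class SplitCheckSketch {

    // Behavior described above: split once any single file exceeds the limit.
    static boolean shouldSplitByLargestFile(List<Long> fileSizes, long maxBytes) {
        long largest = 0;
        for (long size : fileSizes) {
            largest = Math.max(largest, size);
        }
        return largest > maxBytes;
    }

    // Proposed alternative: split once the files together exceed the limit.
    static boolean shouldSplitByTotalSize(List<Long> fileSizes, long maxBytes) {
        long total = 0;
        for (long size : fileSizes) {
            total += size;
        }
        return total > maxBytes;
    }

    public static void main(String[] args) {
        long gib = 1024L * 1024 * 1024;
        long limit = 10 * gib; // assumed limit, for illustration only

        // Three 4 GiB files: no single file is over the limit yet.
        List<Long> beforeCompaction = Arrays.asList(4 * gib, 4 * gib, 4 * gib);
        // The same data after being compacted into one 12 GiB file.
        List<Long> afterCompaction = Arrays.asList(12 * gib);

        System.out.println("before, by largest file: "
                + shouldSplitByLargestFile(beforeCompaction, limit));
        System.out.println("before, by total size:   "
                + shouldSplitByTotalSize(beforeCompaction, limit));
        System.out.println("after, by largest file:  "
                + shouldSplitByLargestFile(afterCompaction, limit));
    }
}
```

With the per-file check, the split only becomes visible after 12 GiB has
already been rewritten into one file, and the post-split compaction then
rewrites it again. With the total-size check, the split would be requested
before that first rewrite.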
Thoughts? I think basing it off region size is a simple fix, and I will code it
up soon if there are no objections.