[ https://issues.apache.org/jira/browse/HBASE-9553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13770389#comment-13770389 ]
Matt Corgan commented on HBASE-9553:
------------------------------------
I don't know the code-level implementation details of any of the garbage
collectors, but I imagine they already do this to some extent by dividing the
heap into regions with different slot sizes and placing each block into a slot
slightly bigger than it needs, effectively padding the block by leaving empty
space after it. Maybe not for tiny objects, but possibly for bigger ones.
I also worry it would be hard to pick a single size to round all the blocks to,
because HBase allows a configurable block size and encoding per table. And even
if all tables use the default block size and encoding, the encoding will result
in different block sizes depending on the nature of the data in each table.
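To make the concern concrete, here is a minimal sketch of what rounding could
look like. It assumes blocks are rounded up to a multiple of some granularity
rather than to one fixed size. The class name, the paddedSize helper, the
1024-byte granularity and the sample block sizes are all made up for
illustration; none of this is existing HBase code.

```java
final class BlockPadding {
    private BlockPadding() {}

    /**
     * Rounds blockSize up to the next multiple of granularity.
     * Hypothetical helper for illustration, not an HBase API.
     */
    static int paddedSize(int blockSize, int granularity) {
        if (blockSize < 0 || granularity <= 0) {
            throw new IllegalArgumentException(
                "blockSize must be >= 0 and granularity must be > 0");
        }
        // Do the arithmetic in long so the intermediate sum cannot overflow.
        long rounded = ((long) blockSize + granularity - 1) / granularity * granularity;
        return Math.toIntExact(rounded);
    }

    public static void main(String[] args) {
        int granularity = 1024; // assumed rounding unit
        // Assumed sizes: blocks slightly over 64 KB, 16 KB and 128 KB targets.
        int[] blockSizes = {65_536 + 40, 65_536 + 100, 16_384 + 700, 131_072 + 2_000};
        for (int size : blockSizes) {
            int padded = paddedSize(size, granularity);
            System.out.printf("block=%d padded=%d wasted=%d%n", size, padded, padded - size);
        }
    }
}
```

With these assumed inputs the two blocks near 64 KB land on the same padded
size, but the blocks from tables with other block sizes land on different
ones, so the cache would still hold several distinct sizes rather than one.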
It would be a good question for the Mechanical Sympathy mailing list.
> Pad HFile blocks to a fixed size before placing them into the blockcache
> ------------------------------------------------------------------------
>
> Key: HBASE-9553
> URL: https://issues.apache.org/jira/browse/HBASE-9553
> Project: HBase
> Issue Type: Bug
> Reporter: Lars Hofhansl
>
> In order to make it easy on the garbage collector and to avoid full
> compaction phases we should make sure that all (or at least a large
> percentage) of the HFile blocks as cached in the block cache are exactly the
> same size.
> Currently an HFile block is typically slightly larger than the declared block
> size, as the block will accommodate the last KV in the block. The padding
> would be a ColumnFamily option. In many cases 100 bytes would probably be a
> good value to make all blocks exactly the same size (but of course it depends
> on the max size of the KVs).
> This does not have to be perfect. The more of the blocks evicted from and
> replaced in the block cache that are exactly the same size, the easier it
> should be on the GC.
> Thoughts?