[
https://issues.apache.org/jira/browse/HBASE-27232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Wellington Chevreuil resolved HBASE-27232.
------------------------------------------
Resolution: Fixed
> Fix checking for encoded block size when deciding if block should be closed
> ---------------------------------------------------------------------------
>
> Key: HBASE-27232
> URL: https://issues.apache.org/jira/browse/HBASE-27232
> Project: HBase
> Issue Type: Improvement
> Affects Versions: 3.0.0-alpha-3, 2.4.13
> Reporter: Wellington Chevreuil
> Assignee: Wellington Chevreuil
> Priority: Major
> Fix For: 2.6.0, 3.0.0-alpha-4
>
>
> In HFileWriterImpl.checkBlockBoundary, we used to consider the unencoded and
> uncompressed data size when deciding whether to close a block and start a new
> one. That could lead to varying "on-disk" block sizes, depending on the
> encoding efficiency for the cells in each block.
> HBASE-17757 introduced the hbase.writer.unified.encoded.blocksize.ratio
> property, a ratio of the originally configured block size, to be compared
> against the encoded size. This was an attempt to ensure homogeneous block
> sizes. However, the check introduced by HBASE-17757 also considers the
> unencoded size, so in cases where the encoding efficiency exceeds the
> configured ratio, it could still lead to varying block sizes.
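> For illustration, the HBASE-17757-era boundary check behaved roughly like the
> sketch below. The helper names (encodedBlockSizeWritten(),
> blockSizeWritten(), encodedBlockSizeLimit) are assumed for illustration and
> may not match the exact code:
> {code:java}
> // encodedBlockSizeLimit = blocksize * hbase.writer.unified.encoded.blocksize.ratio
> if (blockWriter.encodedBlockSizeWritten() >= encodedBlockSizeLimit
>     || blockWriter.blockSizeWritten() >= hFileContext.getBlocksize()) {
>   // When encoding is very efficient, the unencoded-size clause fires first,
>   // closing blocks whose encoded size is well below the limit, so on-disk
>   // block sizes vary from block to block.
>   finishBlock();
> }
> {code}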
> This patch changes that logic to consider only the encoded size when the
> hbase.writer.unified.encoded.blocksize.ratio property is set; otherwise, it
> considers the unencoded size. This gives finer control over the on-disk block
> sizes and the overall number of blocks when encoding is in use.
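> A minimal sketch of the patched logic, under the same assumed helper names:
> {code:java}
> boolean shouldFinishBlock;
> if (encodedBlockSizeLimit > 0) {
>   // Ratio property is set: the encoded size alone drives block boundaries,
>   // keeping on-disk block sizes homogeneous regardless of encoding efficiency.
>   shouldFinishBlock = blockWriter.encodedBlockSizeWritten() >= encodedBlockSizeLimit;
> } else {
>   // Ratio property not set: fall back to the unencoded size, as before.
>   shouldFinishBlock = blockWriter.blockSizeWritten() >= hFileContext.getBlocksize();
> }
> if (shouldFinishBlock) {
>   finishBlock();
> }
> {code}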
--
This message was sent by Atlassian Jira
(v8.20.10#820010)