[ https://issues.apache.org/jira/browse/HBASE-4577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13134408#comment-13134408 ]

Matt Corgan commented on HBASE-4577:
------------------------------------
I think the intention of adding it was to see how big the data would be in
memory as opposed to on disk, which is a valuable metric. However, we're
already jumping ahead to doing delta encoding and prefix compression, so there
will soon be a need for a third metric to track encoded size. Maybe these 3
names would be better:

storefileSize: size as reported by the filesystem (lzo/gzip compressed)
encodedDataSize: size in the block cache (with delta encoding or prefix
    compression, but no gzip)
rawDataSize (instead of uncompressedBytes): size when stored in the current
    concatenated KeyValue format (the biggest of the 3)

The last 2 would only count data blocks of KeyValues. I'm not sure where
bloom filters and index blocks should be counted in these. Possibly as
separate metrics?
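
To make the three proposed metrics concrete, here is a minimal Java sketch of how they might be tracked per store file and summed per region. Only the three metric names come from the proposal above; the classes StoreFileSizes and RegionSizeMetrics and the sample byte counts are hypothetical and not part of HBase.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical holder for the three proposed sizes of one store file, in bytes.
final class StoreFileSizes {
    final long storefileSize;   // as reported by the filesystem (lzo/gzip compressed)
    final long encodedDataSize; // as held in the block cache (delta/prefix encoded, no gzip)
    final long rawDataSize;     // as concatenated KeyValues (expected to be the largest)

    StoreFileSizes(long storefileSize, long encodedDataSize, long rawDataSize) {
        this.storefileSize = storefileSize;
        this.encodedDataSize = encodedDataSize;
        this.rawDataSize = rawDataSize;
    }
}

// Hypothetical aggregator, not an HBase class.
final class RegionSizeMetrics {
    // Sums each size over all store files.
    // Returns {storefileSize, encodedDataSize, rawDataSize} in bytes.
    static long[] totals(List<StoreFileSizes> files) {
        long[] sum = new long[3];
        for (StoreFileSizes f : files) {
            sum[0] += f.storefileSize;
            sum[1] += f.encodedDataSize;
            sum[2] += f.rawDataSize;
        }
        return sum;
    }

    public static void main(String[] args) {
        // Made-up sizes for two store files.
        List<StoreFileSizes> files = Arrays.asList(
            new StoreFileSizes(300L, 600L, 1000L),
            new StoreFileSizes(200L, 400L, 500L));
        long[] t = totals(files);
        // Prints: storefileSize=500, encodedDataSize=1000, rawDataSize=1500
        System.out.println("storefileSize=" + t[0]
            + ", encodedDataSize=" + t[1]
            + ", rawDataSize=" + t[2]);
    }
}
```

The sketch keeps everything in bytes and leaves any conversion to MB to the reporting layer. Whether bloom filters and index blocks belong in these totals is the open question above, so they are left out here.
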
> Region server reports storefileSizeMB bigger than storefileUncompressedSizeMB
> -----------------------------------------------------------------------------
>
> Key: HBASE-4577
> URL: https://issues.apache.org/jira/browse/HBASE-4577
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.92.0
> Reporter: Jean-Daniel Cryans
> Assignee: gaojinchao
> Priority: Minor
> Fix For: 0.92.0
>
>
> Minor issue while looking at the RS metrics:
>   numberOfStorefiles=8, storefileUncompressedSizeMB=2418,
>   storefileSizeMB=2420, compressionRatio=1.0008
> I guess there's a truncation somewhere when it's adding the numbers up.
> FWIW there's no compression on that table.
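
As an illustration of the kind of truncation guessed at in the description, the Java sketch below shows how two totals over the same files can disagree when one is truncated to whole MB per file and the other only once after summing. This is an assumption about one possible cause, not a confirmed diagnosis of this issue; the class MbTruncation and the file sizes are hypothetical.

```java
import java.util.Arrays;

// Hypothetical helper, not an HBase class.
final class MbTruncation {
    static final long MB = 1024L * 1024L;

    // Truncates each file to whole MB before summing: can lose up to 1 MB per file.
    static long sumOfTruncatedMb(long[] bytesPerFile) {
        long total = 0;
        for (long b : bytesPerFile) {
            total += b / MB; // integer division drops the fractional MB
        }
        return total;
    }

    // Sums the byte counts first and truncates once: loses less than 1 MB overall.
    static long truncatedSumMb(long[] bytesPerFile) {
        long total = 0;
        for (long b : bytesPerFile) {
            total += b;
        }
        return total / MB;
    }

    public static void main(String[] args) {
        // Eight made-up files of 302.5 MB each.
        long[] sizes = new long[8];
        Arrays.fill(sizes, 302L * MB + MB / 2);
        // Prints: 2416 vs 2420
        System.out.println(sumOfTruncatedMb(sizes) + " vs " + truncatedSumMb(sizes));
    }
}
```

With 8 store files, per-file truncation can under-report by up to 8 MB, so a 2 MB gap between two totals computed along different paths would be within that range.
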