[ https://issues.apache.org/jira/browse/HADOOP-2521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12555778#action_12555778 ]
Jim Kellerman commented on HADOOP-2521:
---------------------------------------
Remember that an HStoreKey contains both the family name and the member name.
You could have entries for 'contents:' (just the family name),
'contents:member1', 'contents:member2', etc., and they all get stored in the
same HStoreFile.
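
To make the naming point concrete, here is a minimal sketch that treats column
names as plain "family:member" strings and groups them by family, the way one
store per family would. The class and method names (ColumnFamilyDemo, familyOf,
memberOf, groupByFamily) and the 'anchor:home' column are hypothetical and are
not part of the HBase code base; the real HStoreKey is not used here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper for illustration only; not an HBase class. */
final class ColumnFamilyDemo {
    /** Returns the family part of "family:member", including the trailing colon. */
    static String familyOf(String column) {
        int colon = column.indexOf(':');
        if (colon < 0) {
            throw new IllegalArgumentException("no family delimiter in: " + column);
        }
        return column.substring(0, colon + 1);
    }

    /** Returns the member part; it is empty for a bare family such as "contents:". */
    static String memberOf(String column) {
        return column.substring(familyOf(column).length());
    }

    /** Groups column names by family, mimicking one store per column family. */
    static Map<String, List<String>> groupByFamily(List<String> columns) {
        Map<String, List<String>> stores = new TreeMap<>();
        for (String column : columns) {
            stores.computeIfAbsent(familyOf(column), k -> new ArrayList<>()).add(column);
        }
        return stores;
    }

    public static void main(String[] args) {
        // "anchor:home" is an assumed second family, added only for contrast.
        List<String> columns = Arrays.asList(
            "contents:", "contents:member1", "contents:member2", "anchor:home");
        System.out.println(groupByFamily(columns));
    }
}
```

All three 'contents:' names land in the same group, including the bare family
name with an empty member.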
So unless you want to create a new object type to be the key, and then add the
necessary logic to transform to/from HStoreKeys on every read and write, I'd
say that spending a little space to save that time is a benefit, not a fault.
-1 on this proposal.
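
The conversion in question can be sketched roughly as follows, again on plain
strings rather than real HStoreKeys. StrippedKeyDemo and its methods are
hypothetical names, and the byte count assumes the family name is stored as
UTF-8 with no other framing, which may not match the actual serialized form.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch of the strip/restore round trip; not an HBase class. */
final class StrippedKeyDemo {
    /** Drops the family prefix before an entry would be written to the file. */
    static String strip(String family, String column) {
        if (!column.startsWith(family)) {
            throw new IllegalArgumentException(column + " is not in family " + family);
        }
        return column.substring(family.length());
    }

    /** Rebuilds the full column name when an entry is read back. */
    static String restore(String family, String member) {
        return family + member;
    }

    /** Bytes saved per entry under the UTF-8 assumption stated above. */
    static int bytesSavedPerEntry(String family) {
        return family.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String family = "contents:";
        String stored = strip(family, "contents:member1");
        System.out.println("stored form:   " + stored);
        System.out.println("restored form: " + restore(family, stored));
        System.out.println("bytes saved:   " + bytesSavedPerEntry(family));
    }
}
```

The saving is one family name per entry, while strip and restore run once per
entry written or read, which is the space-for-time trade described above.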
> [hbase] HStoreFiles needlessly store the column family name in every entry
> --------------------------------------------------------------------------
>
> Key: HADOOP-2521
> URL: https://issues.apache.org/jira/browse/HADOOP-2521
> Project: Hadoop
> Issue Type: Improvement
> Components: contrib/hbase
> Reporter: Bryan Duxbury
> Priority: Minor
>
> Today, HStoreFiles keep the entire serialized HStoreKey object for every
> cell in the HStore. Since HStores are 1-1 with column families, this is
> unnecessary: you can always infer the column family from the HStore the
> file belongs to. (This information would presumably come from the file name
> or a header section.) This means that we could remove the column family
> part of the HStoreKeys we put into the HStoreFile, reducing the size of
> data stored. This would be a space-saving benefit, removing redundant data,
> and could be a speed benefit, as there is less data to scan in memory and
> less data to transfer over the network.