[hbase] HStoreFiles needlessly store the column family name in every entry
--------------------------------------------------------------------------
Key: HADOOP-2521
URL: https://issues.apache.org/jira/browse/HADOOP-2521
Project: Hadoop
Issue Type: Improvement
Components: contrib/hbase
Reporter: Bryan Duxbury
Priority: Minor

Today, HStoreFiles store the entire serialized HStoreKey object for every cell in the HStore. Since HStores are 1-1 with column families, this is unnecessary: the column family can always be determined from the HStore the file belongs to. (This information would presumably come from the file name or a header section.)

This means that we could remove the column family part of the HStoreKeys we put into the HStoreFile, reducing the size of the data stored. Removing this redundant data would save space, and could also be a speed benefit, since there is less data to scan over in memory and less to transfer over the network.
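To make the proposal concrete, here is a rough sketch of the idea in Python. It is not HBase code: the key layout (length-prefixed row and column followed by an 8-byte timestamp), the function names, and the sample family "contents" are all assumptions made for illustration. It only shows that the family prefix can be stripped on write and restored on read when the family is known from the store itself.

```python
import struct

# Hypothetical key layout, NOT the real HStoreKey serialization:
#   2-byte row length, row bytes, 2-byte column length, column bytes, 8-byte timestamp.

def encode_key(row: bytes, column: bytes, timestamp: int) -> bytes:
    """Serialize one key using the assumed length-prefixed layout."""
    return (struct.pack(">H", len(row)) + row
            + struct.pack(">H", len(column)) + column
            + struct.pack(">q", timestamp))

def decode_key(buf: bytes):
    """Inverse of encode_key; returns (row, column, timestamp)."""
    (row_len,) = struct.unpack_from(">H", buf, 0)
    offset = 2
    row = buf[offset:offset + row_len]
    offset += row_len
    (col_len,) = struct.unpack_from(">H", buf, offset)
    offset += 2
    column = buf[offset:offset + col_len]
    offset += col_len
    (timestamp,) = struct.unpack_from(">q", buf, offset)
    return row, column, timestamp

def strip_family(column: bytes, family: bytes) -> bytes:
    """Drop the 'family:' prefix before writing; the store already knows it."""
    prefix = family + b":"
    if not column.startswith(prefix):
        raise ValueError("column does not belong to this family")
    return column[len(prefix):]

def restore_family(qualifier: bytes, family: bytes) -> bytes:
    """Rebuild the full column name on read from the store's family."""
    return family + b":" + qualifier

# Assumed sample data: 1000 cells in a single column family.
FAMILY = b"contents"
cells = [(b"row-%04d" % i, FAMILY + b":html", 1199145600000 + i)
         for i in range(1000)]

full_size = sum(len(encode_key(r, c, t)) for r, c, t in cells)
compact_size = sum(len(encode_key(r, strip_family(c, FAMILY), t))
                   for r, c, t in cells)

print("bytes with family in every key:", full_size)
print("bytes with family stored once: ", compact_size)
```

In this sketch the saving per key is the length of the family name plus the separator, so the benefit grows with longer family names and with more cells per file. The family itself would still need to be recorded once per file, for example in the file name or a header, as the description suggests.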