[
https://issues.apache.org/jira/browse/PHOENIX-1598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14742291#comment-14742291
]
Lars Hofhansl commented on PHOENIX-1598:
----------------------------------------
So two things, right?
# Phoenix can map longer names to short actual names. The actual names should
imply their ordinal position (numbers are nice).
# HBase, being told about specific columns, could do a better job compressing
the HFile or block data.
I doubt that #2 would win much over FAST_DIFF/PREFIX and/or SNAPPY. Need to
try. Maybe we can separate key storage from value storage, index the location
of all values, and hence allow for much faster filtering or aggregation.
#1 is nice at many levels: less storage, free renames, can avoid the binary
search (a HashMap of Integers is much cheaper, as an Integer's hash is its
value and compare is cheap), can even drop columns this way (map them to null
or to a tombstone).
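To make #1 concrete, here is a rough sketch of what such a mapping could look like. Everything in it is an assumption for illustration: the class name ColumnNameMap and its methods are hypothetical and are not existing Phoenix or HBase APIs. It uses a list indexed by ordinal for the reverse lookup instead of a HashMap of Integers, which is the same idea in an even simpler form.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: maps user-visible column names to small integer ordinals. */
class ColumnNameMap {
    // User-visible name -> ordinal that would be stored as the short actual name.
    private final Map<String, Integer> ordinalByName = new HashMap<>();
    // Ordinal -> current user-visible name; null marks a dropped column (tombstone).
    private final List<String> nameByOrdinal = new ArrayList<>();

    /** Registers a new column and returns its ordinal (0, 1, 2, ...). */
    int add(String name) {
        if (ordinalByName.containsKey(name)) {
            throw new IllegalArgumentException("duplicate column: " + name);
        }
        int ordinal = nameByOrdinal.size();
        nameByOrdinal.add(name);
        ordinalByName.put(name, ordinal);
        return ordinal;
    }

    /** Renames a column; stored data is untouched because the ordinal stays the same. */
    void rename(String oldName, String newName) {
        if (ordinalByName.containsKey(newName)) {
            throw new IllegalArgumentException("duplicate column: " + newName);
        }
        Integer ordinal = ordinalByName.remove(oldName);
        if (ordinal == null) {
            throw new IllegalArgumentException("unknown column: " + oldName);
        }
        ordinalByName.put(newName, ordinal);
        nameByOrdinal.set(ordinal, newName);
    }

    /** Drops a column by tombstoning its ordinal; the ordinal is never reused. */
    void drop(String name) {
        Integer ordinal = ordinalByName.remove(name);
        if (ordinal == null) {
            throw new IllegalArgumentException("unknown column: " + name);
        }
        nameByOrdinal.set(ordinal, null);
    }

    /** Returns the ordinal for a name, or -1 if the name is unknown or dropped. */
    int ordinalOf(String name) {
        Integer ordinal = ordinalByName.get(name);
        return ordinal == null ? -1 : ordinal;
    }

    /** Returns the visible name for an ordinal, or null if dropped or out of range. */
    String nameOf(int ordinal) {
        boolean inRange = ordinal >= 0 && ordinal < nameByOrdinal.size();
        return inRange ? nameByOrdinal.get(ordinal) : null;
    }
}
```

On the read path a stored ordinal resolves with a single index lookup and no string comparison. A rename only touches the mapping, and a dropped column is just an ordinal that maps to null.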
Not trying to avoid work in HBase. Happy to help with #1.
bq. would it be possible to put an interface in place in HBase that would
control the population of the List<Cell> given a Cell
I'm not sure what exactly this would do. If that means that HBase needs to
search for the position of the column we haven't gained anything. HBase cannot
know ahead of time how many columns will be in a row. Should discuss.
> encode column names to save space
> ----------------------------------
>
> Key: PHOENIX-1598
> URL: https://issues.apache.org/jira/browse/PHOENIX-1598
> Project: Phoenix
> Issue Type: Improvement
> Reporter: noam bulvik
>
> When creating a table using Phoenix DDL, replace the column names that the user
> gives with shorter names to save space. The user will still use the full names in
> his select statements and will get them in the result set, but under the hood
> the infra will translate the names to their shorter versions.
> Example:
> when creating a table with my_column_1, my_column_2 ... the table will be
> created with a as the first column, b as the second one, etc.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)