Mujtaba Chohan created PHOENIX-3582:
---------------------------------------

             Summary: No significant space saving with immutable encoded column 
with large number of dense columns
                 Key: PHOENIX-3582
                 URL: https://issues.apache.org/jira/browse/PHOENIX-3582
             Project: Phoenix
          Issue Type: Sub-task
            Reporter: Mujtaba Chohan


Tested with 2 schemas, both with 5K varchar columns. In test #1 the columns were 
named column_1 ... column_5000, whereas in test #2 the column names were 10-byte 
random alphanumeric strings. Each column is filled with 15 random bytes, and all 
columns have values.
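
As a rough illustration of the two schemas (this is not the data generation 
class attached to PHOENIX-3560), the setup could be sketched as below. The 
table name, the pk column, the fixed seed, and every helper name are 
hypothetical. Storage and encoding table options are left out because this 
message does not list them. Starting each random name with a letter is an 
assumption, made so the name is a valid unquoted identifier, and the 15 random 
bytes are approximated with 15 ASCII characters.

```python
import random
import string

NUM_COLUMNS = 5000  # "5K varchar columns" per schema, from the description


def sequential_names(n=NUM_COLUMNS):
    """Test #1 style names: column_1 ... column_n."""
    return [f"column_{i}" for i in range(1, n + 1)]


def random_names(n=NUM_COLUMNS, length=10, seed=0):
    """Test #2 style names: unique random alphanumeric strings.

    Assumption: the first character is a letter so that each name is a
    valid unquoted SQL identifier.
    """
    rng = random.Random(seed)
    alphabet = string.ascii_uppercase + string.digits
    names = set()
    while len(names) < n:
        first = rng.choice(string.ascii_uppercase)
        rest = "".join(rng.choice(alphabet) for _ in range(length - 1))
        names.add(first + rest)
    return sorted(names)


def create_table_ddl(table, names):
    """Build a CREATE TABLE statement with one VARCHAR column per name."""
    cols = ",\n  ".join(f"{name} VARCHAR" for name in names)
    return f"CREATE TABLE {table} (\n  pk VARCHAR PRIMARY KEY,\n  {cols}\n)"


def random_value(rng, size=15):
    """A value of `size` random ASCII characters (one byte each in UTF-8)."""
    alphabet = string.ascii_letters + string.digits
    return "".join(rng.choice(alphabet) for _ in range(size))


if __name__ == "__main__":
    ddl = create_table_ddl("TEST1", sequential_names())
    print(ddl[:80], "...")
    print(random_value(random.Random(1)))
```

Every column of every row would then get its own random_value(), so the rows 
are fully dense as in the tests described here.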

For test #1, the immutable encoded column uses ~4X *more* space than the 
non-encoded column. Fast Diff encoding really shines when column names are 
highly compressible (column_1 ... column_5000).

For test #2, the worst case where column names are not compressible because 
they are random 10-byte alphanumeric strings, the immutable encoded column 
uses 25% less space. 
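
One way to see why the naming scheme matters: prefix-based block encodings 
such as Fast Diff avoid rewriting the part of a key that it shares with the 
previous key. The sketch below is only a rough proxy for that effect, not the 
encoder itself. It measures how many leading characters each column name 
shares with its predecessor in sorted order. The helper names and the fixed 
seed are hypothetical.

```python
import random
import string


def shared_prefix_len(a, b):
    """Number of leading characters that a and b have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def mean_shared_prefix(names):
    """Average prefix length shared by each name with the one sorted before it."""
    ordered = sorted(names)
    total = sum(shared_prefix_len(p, c) for p, c in zip(ordered, ordered[1:]))
    return total / (len(ordered) - 1)


def sequential_names(n=5000):
    """Test #1 style names: column_1 ... column_n."""
    return [f"column_{i}" for i in range(1, n + 1)]


def random_names(n=5000, length=10, seed=0):
    """Test #2 style names: unique random alphanumeric strings."""
    rng = random.Random(seed)
    alphabet = string.ascii_uppercase + string.digits
    names = set()
    while len(names) < n:
        names.add("".join(rng.choice(alphabet) for _ in range(length)))
    return list(names)


if __name__ == "__main__":
    print("sequential:", mean_shared_prefix(sequential_names()))
    print("random:    ", mean_shared_prefix(random_names()))
```

Sequential names always share at least the "column_" prefix with their 
neighbour, while random names share very little, which is in line with the 
difference between the two tests.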
 

Data generation class is attached to 
https://issues.apache.org/jira/browse/PHOENIX-3560. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
