Mujtaba Chohan created PHOENIX-3582:
---------------------------------------
Summary: No significant space saving with immutable encoded column
with large number of dense columns
Key: PHOENIX-3582
URL: https://issues.apache.org/jira/browse/PHOENIX-3582
Project: Phoenix
Issue Type: Sub-task
Reporter: Mujtaba Chohan
Tested with 2 schemas, both with 5K VARCHAR columns. In test #1 columns were
named column_1 ... column_5000, whereas in test #2 column names were 10-byte
random alphanumeric strings. Each column is filled with 15 random bytes, and
all columns have values.
For test #1, immutable encoded column uses ~4X *more* space than non-encoded
column. Fast Diff encoding really shines on the non-encoded table when column
names are highly compressible (column_1 ... column_5000).
For test #2, the worst case, where column names are not compressible since they
are random 10-byte alphanumeric strings, immutable encoded column uses 25% less
space.
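As a rough illustration of why the two naming schemes behave so differently,
the sketch below measures how much of each column name is shared with its
neighbor in sorted order. It is only a proxy for prefix-based encodings such as
Fast Diff, not HBase's actual implementation, and it is not the attached data
generation class; the function names and the fixed seed are assumptions made
for this example.

```python
import random
import string

NUM_COLUMNS = 5000  # matches the 5K columns used in both tests


def sequential_names(n=NUM_COLUMNS):
    """Test #1 style names: column_1 ... column_n."""
    return [f"column_{i}" for i in range(1, n + 1)]


def random_names(n=NUM_COLUMNS, length=10, seed=0):
    """Test #2 style names: unique random alphanumeric strings.

    The seed is an assumption, used only to make the sketch repeatable.
    """
    rng = random.Random(seed)
    alphabet = string.ascii_letters + string.digits
    names = set()
    while len(names) < n:
        names.add("".join(rng.choice(alphabet) for _ in range(length)))
    return list(names)


def common_prefix_len(a, b):
    """Number of leading characters that a and b have in common."""
    count = 0
    for x, y in zip(a, b):
        if x != y:
            break
        count += 1
    return count


def shared_prefix_fraction(names):
    """Fraction of name characters repeated from the previous sorted name."""
    ordered = sorted(names)
    shared = sum(common_prefix_len(a, b) for a, b in zip(ordered, ordered[1:]))
    total = sum(len(b) for b in ordered[1:])
    return shared / total


if __name__ == "__main__":
    print("sequential:", round(shared_prefix_fraction(sequential_names()), 3))
    print("random:    ", round(shared_prefix_fraction(random_names()), 3))
```

A high fraction for the sequential names means most of each name repeats the
previous one, which is the situation where a prefix-based encoding can drop
most of the name bytes. The random names leave little to share.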
Data generation class is attached to
https://issues.apache.org/jira/browse/PHOENIX-3560.