[ 
https://issues.apache.org/jira/browse/PHOENIX-4550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16339711#comment-16339711
 ] 

Thomas D'Silva commented on PHOENIX-4550:
-----------------------------------------

The map idea won't work because it doesn't handle a column being added to the 
base table: that column would end up at different positions within the single 
cell array.

Currently we store the column qualifier counter at the physical table level. If 
we stored it at the view level, the array would be dense. We didn't do this 
initially because we wouldn't be able to handle adding a new column to the base 
table (unless we preallocated a fixed number of columns and started the view 
counters at this number).

We could solve this issue if we stored the column values for each level of the 
hierarchy in its own cell. The base table column values would be stored in a 
single cell. Views would store their columns in a separate cell, grandchild 
views in their own cell, and so on.

Alternatively, we could do this only once a column is added to the base table 
(or a view), storing that column's value (and the values of any columns added 
later) in its own cell. The column qualifier counter at each level would start 
at the number of columns in the parent + 1.
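
To make the density argument concrete, here is a minimal Python sketch. It is 
not Phoenix code: the function names are hypothetical, qualifiers are assumed 
to start at 1, and every column is assumed to hold a value. It compares a 
single counter on the physical table with a counter that restarts at each 
level at (number of columns in the parent + 1).

```python
def shared_counter_starts(base_cols, view_cols):
    """One counter on the physical table: each view continues where the
    previous view stopped. Returns the first qualifier of each view."""
    next_q = base_cols + 1
    starts = []
    for n in view_cols:
        starts.append(next_q)
        next_q += n
    return starts


def per_level_starts(base_cols, view_cols):
    """Counter restarts at each level: every direct child view begins at
    (number of columns in the parent + 1)."""
    return [base_cols + 1 for _ in view_cols]


def null_slots(base_cols, start, n):
    """Padding slots in one row of a view when a single array spans
    qualifiers 1..(start + n - 1) but only base_cols + n values exist."""
    last_qualifier = start + n - 1
    return last_qualifier - (base_cols + n)


# One base column and three views with two non-PK columns each.
shared = shared_counter_starts(1, [2, 2, 2])
dense = per_level_starts(1, [2, 2, 2])
print(shared, [null_slots(1, s, 2) for s in shared])
print(dense, [null_slots(1, s, 2) for s in dense])
```

Under these assumptions the padding for the shared counter grows with every 
view created, while the per-level counter needs none.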


> Allow declaration of max columns on base physical table
> -------------------------------------------------------
>
>                 Key: PHOENIX-4550
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4550
>             Project: Phoenix
>          Issue Type: Improvement
>            Reporter: James Taylor
>            Priority: Major
>
> By declaring the max number of columns on a base table, we can optimize the 
> storage for SINGLE_CELL_ARRAY_WITH_OFFSETS by not storing null values for the 
> columns preceding the initial column of a view. This will make a huge 
> difference in storage when you have a base table with many views. For example:
> {code}
> -- Declare that the base table will have no more than 10 columns
> CREATE IMMUTABLE TABLE base (k1 VARCHAR, prefix CHAR(3), v1 DATE,
>     CONSTRAINT pk PRIMARY KEY (k1, prefix))
>     MULTI_TENANT = true,
>     MAX_COLUMNS = 10;
> CREATE VIEW v1(k2 VARCHAR PRIMARY KEY, v2 VARCHAR, v3 VARCHAR)
>     AS SELECT * FROM base WHERE prefix = 'A00';
> CREATE VIEW v2(k2 VARCHAR PRIMARY KEY, v2 VARCHAR, v3 VARCHAR)
>     AS SELECT * FROM base WHERE prefix = 'A10';
> ...
> {code}
> As the number of views grows, the gap between the base table column 
> encoding (column #1) and the starting column number of a view will 
> increase, since the starting offset is determined by an incrementing value 
> on the base table. This bloats the storage, as we need to store null values 
> for the column encodings between the base table column and the starting 
> column of the view.
> Instead, we'll pass the MAX_COLUMNS value through to queries. Any column 
> encoding less than this value we know will be at the start. For anything 
> greater, we'll start the search from <column encoding> - <minimum view 
> column encoding>.
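
The lookup rule in the description above can be sketched in Python as follows. 
This is only an illustration, not Phoenix code: `locate` and its parameter 
names are hypothetical, encodings are assumed to start at 1, and treating an 
encoding equal to MAX_COLUMNS as a base column is an assumption (the 
description says "less than").

```python
def locate(encoding, max_columns, min_view_encoding):
    """Return (region, offset) for a column encoding.

    Encodings up to max_columns are base table columns and sit at the
    start of the array. Larger encodings belong to the view, and the
    search starts at encoding - min_view_encoding within the view's
    values.
    """
    if encoding <= max_columns:  # inclusive bound is an assumption
        return ("base", encoding - 1)
    return ("view", encoding - min_view_encoding)


# MAX_COLUMNS = 10; suppose a view's first column was assigned encoding 31.
print(locate(1, 10, 31))
print(locate(32, 10, 31))
```

With this rule, nothing needs to be stored for the encodings between the last 
base column and the view's first column.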



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
