[ 
https://issues.apache.org/jira/browse/PHOENIX-4550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16339850#comment-16339850
 ] 

Thomas D'Silva commented on PHOENIX-4550:
-----------------------------------------

The scope of SINGLE_CELL_PER_LEVEL is much larger, the MAX_COLUMNS approach is 
easier to implement.
For the MAX_COLUMNS  approach, when you add a new base table column, instead of 
getting the col qualifier from the physical table counter you will need to set 
it to the max of the current base table col qualifiers +1 (and also ensure you 
don't add more than max_columns). 
Also when you write the column values for the view cols in the single cell 
array, you will have to change the array index to be MAX_COLUMNS + (col 
qualifier - min view col qualifier). SingleCellColumnExpression would also need 
to be changed to read the correct index. 


> Ensure storage is dense when base table has many views
> ------------------------------------------------------
>
>                 Key: PHOENIX-4550
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4550
>             Project: Phoenix
>          Issue Type: Improvement
>            Reporter: James Taylor
>            Priority: Major
>
> By declaring the max number of columns on a base table, we can optimize the 
> storage for SINGLE_CELL_ARRAY_WITH_OFFSETS by not storing null values for the 
> columns preceding the initial column of a view. This will make a huge 
> difference in storage when you have a base table with many views. For example:
> {code}
> -- Declare that the base table will have no more than 10 columns
> CREATE IMMUTABLE TABLE base (k1 VARCHAR, prefix CHAR(3) v1 DATE,
>     CONSTRAINT pk PRIMARY KEY (k1, prefix))
>     MULTI_TENANT = true,
>     MAX_COLUMNS = 10;
> CREATE VIEW v1(k2 VARCHAR PRIMARY KEY, v2 VARCHAR, v3 VARCHAR)
>     AS SELECT * FROM base WHERE prefix = 'A00';
> CREATE VIEW v2(k2 VARCHAR PRIMARY KEY, v2 VARCHAR, v3 VARCHAR);
>     AS SELECT * FROM base WHERE prefix = 'A10';
> ...
> {code}
> As the number of views grow, the difference between the base table column 
> encoding (column #1) and the starting column number of the view (since the 
> starting offset is determined by an incrementing value on the base table) 
> will increase. This bloats the storage as we need to store null values for 
> column encodings between the base table column and the starting column of the 
> view.
> Instead, we'll pass through the MAX_COLUMNS value for queries and anything 
> column encoding less than this we know it'll be at the start. Anything 
> greater and we'll start the search from <column encoding> - <minimum view 
> column encoding>.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to