[jira] [Commented] (PHOENIX-4550) Allow declaration of max columns on base physical table

Thomas D'Silva (JIRA) Tue, 23 Jan 2018 11:25:23 -0800

    [ 
https://issues.apache.org/jira/browse/PHOENIX-4550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16336280#comment-16336280
 ]


Thomas D'Silva commented on PHOENIX-4550:
-----------------------------------------

[~jamestaylor]

Another way to deal with the sparsness of view data is to use a mapping from 
column qualifier to array index. We could build this map when we create the 
PTable of the view. This would allow us to have a compact array. For example if 
the col qualifiers of the view cols are:

base table cols : 1-2
view cols : 100-102, 500

We could have a map from col qualifier to array index
1->1, 2->2, 100->3,101->4,102->5, 500->6

We would have to have to handle columns that are deleted by flagging them (we 
already do this by using the LinkType.EXCLUDED_COLUMN for splittable system 
catalog in PHOENIX-3534), so that we build the map in the PTable correctly. We 
can actually drop the view column metadata when we rewrite the single cell 
without the deleted column value during compaction. 



> Allow declaration of max columns on base physical table
> -------------------------------------------------------
>
>                 Key: PHOENIX-4550
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4550
>             Project: Phoenix
>          Issue Type: Improvement
>            Reporter: James Taylor
>            Priority: Major
>
> By declaring the max number of columns on a base table, we can optimize the 
> storage for SINGLE_CELL_ARRAY_WITH_OFFSETS by not storing null values for the 
> columns preceding the initial column of a view. This will make a huge 
> difference in storage when you have a base table with many views. For example:
> {code}
> -- Declare that the base table will have no more than 10 columns
> CREATE IMMUTABLE TABLE base (k1 VARCHAR, prefix CHAR(3) v1 DATE,
>     CONSTRAINT pk PRIMARY KEY (k1, prefix))
>     MULTI_TENANT = true,
>     MAX_COLUMNS = 10;
> CREATE VIEW v1(k2 VARCHAR PRIMARY KEY, v2 VARCHAR, v3 VARCHAR)
>     AS SELECT * FROM base WHERE prefix = 'A00';
> CREATE VIEW v2(k2 VARCHAR PRIMARY KEY, v2 VARCHAR, v3 VARCHAR);
>     AS SELECT * FROM base WHERE prefix = 'A10';
> ...
> {code}
> As the number of views grow, the difference between the base table column 
> encoding (column #1) and the starting column number of the view (since the 
> starting offset is determined by an incrementing value on the base table) 
> will increase. This bloats the storage as we need to store null values for 
> column encodings between the base table column and the starting column of the 
> view.
> Instead, we'll pass through the MAX_COLUMNS value for queries and anything 
> column encoding less than this we know it'll be at the start. Anything 
> greater and we'll start the search from <column encoding> - <minimum view 
> column encoding>.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (PHOENIX-4550) Allow declaration of max columns on base physical table

Reply via email to