James Taylor created PHOENIX-4550:
-------------------------------------
Summary: Allow declaration of max columns on base physical table
Key: PHOENIX-4550
URL: https://issues.apache.org/jira/browse/PHOENIX-4550
Project: Phoenix
Issue Type: Bug
Reporter: James Taylor
By declaring the max number of columns on a base table, we can optimize the
storage for SINGLE_CELL_ARRAY_WITH_OFFSETS by not storing null values for the
columns preceding the initial column of a view. This will make a huge
difference in storage when you have a base table with many views. For example:
{code}
-- Declare that the base table will have no more than 10 columns
CREATE IMMUTABLE TABLE base (k1 VARCHAR, prefix CHAR(3) v1 DATE,
CONSTRAINT pk PRIMARY KEY (k1, prefix))
MULTI_TENANT = true,
MAX_COLUMNS = 10;
CREATE VIEW v1(k2 VARCHAR PRIMARY KEY, v2 VARCHAR, v3 VARCHAR)
AS SELECT * FROM base WHERE prefix = 'A00';
CREATE VIEW v2(k2 VARCHAR PRIMARY KEY, v2 VARCHAR, v3 VARCHAR);
AS SELECT * FROM base WHERE prefix = 'A10';
...
{code}
As the number of views grow, the difference between the base table column
encoding (column #1) and the starting column number of the view (since the
starting offset is determined by an incrementing value on the base table) will
increase. This bloats the storage as we need to store null values for column
encodings between the base table column and the starting column of the view.
Instead, we'll pass through the MAX_COLUMNS value for queries and anything
column encoding less than this we know it'll be at the start. Anything greater
and we'll start the search from <column encoding> - <minimum view column
encoding>.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)