[
https://issues.apache.org/jira/browse/PHOENIX-21?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13900703#comment-13900703
]
James Taylor commented on PHOENIX-21:
-------------------------------------
I'm considering requiring the user to create a "shared" index from the physical
table to "enable" indexing on views. What do you think, [~elilevine]? It would
be something like this:
{code}
CREATE TABLE t (k VARCHAR PRIMARY KEY, c VARCHAR);
// Enables indexes to be created on views - it's possible we
// could require an additional keyword like SHARED or VIEW
// before INDEX, or even use a different syntax altogether.
// The advantage is that you could pass properties through
// and/or split points, as typically indexes are smaller than
// tables and you usually want to split them differently.
// However, it's difficult to know what to use, as the "real"
// index would be created later - we don't know how
// many columns it'll have yet.
CREATE INDEX i ON t MAX_FILE_SIZE=200000;
CREATE VIEW v(c2 VARCHAR) AS SELECT * FROM t WHERE c = 'foo';
// This would use the physical index table i
CREATE INDEX iv ON v(c2);
{code}
The alternative is to auto-magically create a physical index table by tacking
"_IDX" onto the table name. We could do this for multi-tenant tables when
they're created. We wouldn't have a facility for passing through properties
specific to the index, but I suppose we could guess at the MAX_FILE_SIZE and
just have it be some percentage smaller than the table's (50%?). I suppose this
could be tuned after the fact directly through HBase too.
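To make the auto-magic alternative concrete, here is a minimal sketch of the two derivations it implies: the index table name and a guessed MAX_FILE_SIZE. The class and method names are hypothetical (nothing like this exists in Phoenix), and both the "_IDX" suffix and the 50% figure are only the tentative values floated above.

```java
/** Hypothetical helper, not part of Phoenix: derives defaults for a shared index table. */
final class SharedIndexDefaults {
    /** Suffix proposed in the comment above; an assumption, not a settled convention. */
    static final String SUFFIX = "_IDX";

    /** Name of the physical index table that would be created alongside the data table. */
    static String sharedIndexName(String tableName) {
        return tableName + SUFFIX;
    }

    /**
     * Guesses the index's max file size as a percentage of the data table's.
     * Throws ArithmeticException if the intermediate product overflows a long.
     */
    static long guessMaxFileSize(long tableMaxFileSize, int percentOfTable) {
        if (percentOfTable <= 0 || percentOfTable > 100) {
            throw new IllegalArgumentException("percentOfTable must be in 1..100");
        }
        return Math.multiplyExact(tableMaxFileSize, (long) percentOfTable) / 100;
    }
}
```

For example, a data table named T with a (made-up) max file size of 400000 would get an index table named T_IDX with a guessed max file size of 200000 at 50%.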
For views that are not tenant-specific we could just create separate physical
index tables when an index is added to them. Or we could auto-magically create
a shared physical index table as above when the first view is created off of a
table.
Thoughts?
> Support indexes on multi-tenant views
> -------------------------------------
>
> Key: PHOENIX-21
> URL: https://issues.apache.org/jira/browse/PHOENIX-21
> Project: Phoenix
> Issue Type: Bug
> Reporter: James Taylor
> Assignee: James Taylor
>
> Current plan is to create a sibling index table based on the base
> multi-tenant table, suffixed with _INDEX when the base table is created. If
> you're interested in this feature, it's best to review the following documents first:
> http://phoenix.incubator.apache.org/views.html
> http://phoenix.incubator.apache.org/multi-tenancy.html
> In addition, we need a multi-tenant sequence, again based on the multi-tenant
> table name suffixed with _SEQ. The base columns in this table would be: the
> first column from the base table (i.e. the tenant ID column) followed by
> index id SMALLINT. This index ID is required to ensure that data from
> different indexes on the same multi-tenant table don't intermix with each
> other (as the rest of the PK is specific to the index being added). The index
> ID would be based on the next value in the _SEQ (a multi-tenant sequence).
> For simplicity, make it two bytes, starting with Short.MIN_VALUE. We'll never
> reuse an ID, so this gives us 64K create/drop indexes per multi-tenant base
> table. The rest of the PK columns would depend on the index being created,
> and would include in this order:
> <any constant columns (think key-prefix here) from your updatable view>
> <the indexed columns from the create index statement>
> <the rest of the PK columns from the data table - standard logic here>
> The changes necessary to support this include:
> 1. Tracking the constant value referenced on a column for an updatable view
> in PColumn. Might as well track if a column is referenced by a view as well.
> Two new columns: IS_REFERENCED_BY_VIEW, CONSTANT_VALUE_IN_VIEW. For the IS
> NULL case, we can use an empty byte array to differentiate from a null. These
> values would be passed through the CreateTableCompiler and upserted with the
> rest of the data.
> 2. Disallow the removal of any column in a view that is referenced in the
> view statement. An alternative would be to invalidate the view.
> 3. Add a new INDEX_ID column to PTable specifically for an INDEX on a VIEW.
> This will get populated based on the next value in the sequence.
> 4. Push the CONSTANT_VALUE_IN_VIEW into the IndexMaintainer along with the
> tenantId and the index ID. These values will be used to form the Put(s) and
> Delete(s) that get formed when an index is maintained.
> 5. Modify the code that automatically prepends tenant_id to also prepend the
> index ID in this case. This will be in WhereOptimizer, UpsertCompiler, and
> DeleteCompiler.
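The row key layout from the description (tenant ID, two-byte index ID, view constants, indexed columns, remaining data table PK columns) can be sketched roughly as follows. This is only an illustration under stated assumptions: the class, the in-memory allocator standing in for the _SEQ sequence, and the byte encoding (UTF-8 values with a zero-byte separator, big-endian index ID) are all hypothetical simplifications and not Phoenix's actual serialization.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of the shared index row key layout; not Phoenix code. */
final class SharedIndexRowKeySketch {

    /** Stand-in for the _SEQ sequence: IDs start at Short.MIN_VALUE and are never reused. */
    static final class IndexIdAllocator {
        private int next = Short.MIN_VALUE;

        short nextId() {
            if (next > Short.MAX_VALUE) {
                throw new IllegalStateException("all 65536 index IDs have been used");
            }
            return (short) next++;
        }
    }

    /** Concatenates the key parts in the order listed in the issue description. */
    static byte[] rowKey(String tenantId, short indexId, List<String> viewConstants,
                         List<String> indexedValues, List<String> remainingPkValues) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeValue(out, tenantId);            // first column of the base table
        out.write((indexId >> 8) & 0xFF);     // index ID, high byte
        out.write(indexId & 0xFF);            // index ID, low byte
        for (String v : viewConstants) {      // constant columns from the updatable view
            writeValue(out, v);
        }
        for (String v : indexedValues) {      // columns named in CREATE INDEX
            writeValue(out, v);
        }
        for (String v : remainingPkValues) {  // rest of the data table PK
            writeValue(out, v);
        }
        return out.toByteArray();
    }

    /** Assumed encoding: UTF-8 bytes followed by a zero separator byte. */
    private static void writeValue(ByteArrayOutputStream out, String value) {
        byte[] bytes = value.getBytes(StandardCharsets.UTF_8);
        out.write(bytes, 0, bytes.length);
        out.write(0);
    }

    public static void main(String[] args) {
        IndexIdAllocator ids = new IndexIdAllocator();
        byte[] key = rowKey("t1", ids.nextId(), Arrays.asList("foo"),
                Arrays.asList("bar"), Arrays.asList("k1"));
        System.out.println("row key length: " + key.length);
    }
}
```

Because every key starts with the tenant ID followed by the index ID, rows belonging to different indexes on the same multi-tenant table land in disjoint key ranges, which is the property the index ID is meant to guarantee.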