[
https://issues.apache.org/jira/browse/TRAFODION-1419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14642845#comment-14642845
]
Anoop Sharma commented on TRAFODION-1419:
-----------------------------------------
I will create a document with the syntax and semantics details that have been
discussed in the comments of this JIRA. I will also add examples.
Here are responses to questions and comments.
Syntax: the <colFam>.<colName> syntax is the same syntax that Phoenix uses
for creating a new table as well as for mapping an existing HBase table to
relational syntax.
We will start with that but can always extend it with additional
syntax in the future.
A table created with ‘create table like’ will have the same column families
as the source table.
An option could be specified to not use the source families.
Columns specified in an update stats statement are treated the same as in a
DML statement: one cannot specify a column family for them.
To move a column from one family to another, one needs to drop it and add
it again with a different column family.
Key columns will need to be part of the same column family.
An alter statement could be used to assign different HBase options to
different families by using the NAME clause.
Columns of an index may belong to different families in the base table,
but they will be created in one column family in the index. This column
family will be the same as the default column family of the base table.
Columns of a unique or RI constraint can be in different families. If
an index needs to be created for a unique or RI constraint, then the same
rules as for creating an index apply: columns created in the index will
be in one column family, but they could be part of different families in
the base table.
No changes to privileges.
While accessing columns in a DML statement or storing them in HBase,
the column family and column names are passed to HBase, which will return
or update the correct location and value.
HBase stores columns of one family together and may internally optimize
get/put of values from multiple families, but the caller is unaware of that.
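As a rough illustration of the key-column and index rules above, here is a small Python model. It is only a sketch of the rules as stated in this comment, not Trafodion code: the names Column, check_key_families and build_index_columns are made up for the example, and raising an error for key columns in different families is an assumption about how the restriction might be enforced.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    family: str  # column family of the column in its table
    name: str


def check_key_families(columns, key_names):
    """Reject a key whose columns live in more than one column family."""
    families = {c.family for c in columns if c.name in key_names}
    if len(families) > 1:
        raise ValueError(f"key columns span multiple families: {sorted(families)}")


def build_index_columns(columns, indexed_names, default_family):
    """Place every index column in the base table's default column family."""
    by_name = {c.name: c for c in columns}
    return [Column(default_family, by_name[n].name) for n in indexed_names]


# Hypothetical base table: ID and NAME in CF1 (the default), NOTES in CF2.
base = [Column("CF1", "ID"), Column("CF1", "NAME"), Column("CF2", "NOTES")]
check_key_families(base, {"ID"})
index_cols = build_index_columns(base, ["NAME", "NOTES"], default_family="CF1")
```

In this model the index on NAME and NOTES ends up with both columns in CF1, even though NOTES is in CF2 in the base table.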
> Add support for multiple column families in a trafodion table
> -------------------------------------------------------------
>
> Key: TRAFODION-1419
> URL: https://issues.apache.org/jira/browse/TRAFODION-1419
> Project: Apache Trafodion
> Issue Type: New Feature
> Reporter: Anoop Sharma
> Assignee: Anoop Sharma
>
> This proposal is to add support for multiple column families in Trafodion
> tables. With this feature, one can store columns in multiple column
> families. One use for this would be to store frequently used columns in one
> column family and infrequently used columns in a different column family.
> That will improve performance when those columns are retrieved from HBase.
> There could be other uses as well.
> Syntax:
> create table <tablename> ( <colFam1>.<colName1> <datatype>,
> <colFam2>.<colName2> <datatype> ….)
> attributes default column family <colFam>;
> alter table <tablename> add column <colFam>.<colName> <datatype>;
> <colFam> : name of column family for that column
> Semantics:
> The <colFam> name follows identifier rules. If not double quoted, then it
> will be upper cased. If double quoted, then case will be maintained.
> A user-specified column family name can be of arbitrary length. To optimize
> the space used by the column family stored in a cell, a 2-byte encoding is
> generated. The mapping of user-specified column family to encoded column
> family is stored in metadata.
> If no column family is specified for a column during create table, then
> the family specified in the ‘attributes default column family’ clause is
> used. If no ‘attributes default column family’ clause is specified, then
> the system default column family is used.
> Column family specification is supported for regular and volatile
> tables.
> All unique column families specified during create or alter are added
> to the table.
> The maximum number of column families supported in one table is 32. However,
> HBase recommends not creating too many column families.
> An alter statement can be used to assign specific HBase options to
> specific column families using the NAME clause. If no NAME clause is
> specified, then the alter HBase options are applied to all column families.
> Invoke and showddl statements will show the original user-specified
> column families and not the encoded column families.
> Currently, multiple column families are not supported for columns of a
> user-created or an implicitly created index.
> The default column family of the corresponding base table is used for all
> index columns.
> A column family cannot be specified in a DML query.
> A column family cannot be specified for columns of an aligned row format
> table, since all columns are stored as one cell.
> Column names must be unique for each table. The same column name cannot
> be used as part of multiple column families.
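The semantics quoted above (identifier casing, the 2-byte encoded family, default family resolution, the 32-family limit and unique column names) can be modelled in a few lines of Python. This is a sketch under assumptions, not the actual implementation: TableModel and normalize are hypothetical names, SYSDEFAULT is a placeholder for the system default family, the sequential big-endian counter is just one possible 2-byte encoding, and applying the identifier rules to column names as well as family names is also an assumption.

```python
MAX_FAMILIES = 32                     # limit stated in the issue description
SYSTEM_DEFAULT_FAMILY = "SYSDEFAULT"  # hypothetical placeholder name


def normalize(identifier):
    """Identifier rules: keep case if double quoted, otherwise upper-case."""
    if len(identifier) >= 2 and identifier[0] == '"' and identifier[-1] == '"':
        return identifier[1:-1]
    return identifier.upper()


class TableModel:
    def __init__(self, default_family=None):
        # Family from 'attributes default column family', else system default.
        self.default_family = (
            normalize(default_family) if default_family else SYSTEM_DEFAULT_FAMILY
        )
        self.encoded = {}  # user family name -> 2-byte code ("metadata" mapping)
        self.columns = {}  # column name -> user family name

    def add_column(self, name, family=None):
        """Add a column and return the 2-byte code of its column family."""
        col = normalize(name)
        if col in self.columns:
            raise ValueError(f"column {col} already exists in this table")
        fam = normalize(family) if family else self.default_family
        if fam not in self.encoded:
            if len(self.encoded) >= MAX_FAMILIES:
                raise ValueError(f"more than {MAX_FAMILIES} column families")
            # Assumed encoding: sequential counter stored in 2 bytes.
            self.encoded[fam] = len(self.encoded).to_bytes(2, "big")
        self.columns[col] = fam
        return self.encoded[fam]


t = TableModel(default_family="cf_hot")
code_a = t.add_column("a")            # no family given: uses the table default
code_b = t.add_column("b", '"cold"')  # double quoted: case is preserved
```

Here column A lands in family CF_HOT and column B in family cold, and each family name is stored once in the mapping while cells would carry only the 2-byte code.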