[
https://issues.apache.org/jira/browse/TRAFODION-50?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14652447#comment-14652447
]
Hans Zeller commented on TRAFODION-50:
--------------------------------------
At the moment, we are not allowing PARTITION BY in the syntax (if we do, that's
a bug and the PARTITION BY is probably being ignored). This JIRA proposes the
SPLIT BY syntax to reserve PARTITION BY for a future feature that does what you
suggested in the comment above, support for collocation of multiple tables.
||Syntax||Computed column in key||Initial HBase region splits||HBase split
policy||Comments||
|SALT|{noformat}"_SALT_"{noformat}|One region per salt value or per group of
salt values (see TRAFODION-781)|ConstantSizeRegionSplitPolicy|Easiest way to
pre-split a table. We use the constant size region split policy to discourage
agressive region splitting by HBase.|
|SPLIT BY|None|As specified in SPLIT BY|TBD, probably
ConstantSizeRegionSplitPolicy as well.|New syntax introduced by this JIRA|
> LP Blueprint: cmp-presplit-unsalted-tables - Add SQL syntax to pre-split
> unsalted tables into regions
> -----------------------------------------------------------------------------------------------------
>
> Key: TRAFODION-50
> URL: https://issues.apache.org/jira/browse/TRAFODION-50
> Project: Apache Trafodion
> Issue Type: New Feature
> Components: sql-cmp
> Reporter: Hans Zeller
> Assignee: Hans Zeller
> Priority: Critical
> Fix For: 2.0-incubating
>
>
> Currently, Trafodion creates tables as a single-region HBase table, unless
> the table is salted. Salted tables have one region per salt bucket. We would
> like to add SQL syntax to allow pre-splitting of unsalted tables as well. We
> would like to offer two new table options, SPLIT BY and PARTITION BY. Both
> allow specification of split keys. SPLIT BY will simply pre-split the table,
> with no special split policy. PARTITION BY will add a prefix split policy
> that ensures that all rows with common PARTITION BY column values will remain
> in the same table.
> Example:
> create table lineitem(
> l_orderkey int not null,
> l_linenum int not null,
> ...
> primary key (l_orderkey, l_linenum))
> PARTITION BY (l_orderkey)
> (add first key (10000),
> add first key (20000)
> );
> This will create three regions containing for values [<min>, 10000), [10000,
> 20000) and [20000, <max>) it will add a prefix split policy for a 4 byte
> prefix, the length of a non-nullable INT column. HBase will still be able to
> split the regions further, but we will have a guarantee that all rows for a
> given l_orderkey are located in the same region.
> If SPLIT BY is used instead of PARTITION BY, the same regions will be
> created, but no custom split policy will be added. Note that the PARTITION BY
> / SPLIT BY column(s) have to be a prefix of the clustering key of the table.
> PARTITION BY and SPLIT BY are not allowed for salted tables. We may or may
> not support pre-splitting of divisioned tables in the first release.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)