[ 
https://issues.apache.org/jira/browse/TRAFODION-50?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14651905#comment-14651905
 ] 

Haifeng Li commented on TRAFODION-50:
-------------------------------------

I like this proposal. It would be very nice that we have some controls  on the 
region placement that allows the regions of multiple related tables are 
collocated on the same region server.

> LP Blueprint: cmp-presplit-unsalted-tables - Add SQL syntax to pre-split 
> unsalted tables into regions
> -----------------------------------------------------------------------------------------------------
>
>                 Key: TRAFODION-50
>                 URL: https://issues.apache.org/jira/browse/TRAFODION-50
>             Project: Apache Trafodion
>          Issue Type: New Feature
>          Components: sql-cmp
>            Reporter: Hans Zeller
>            Assignee: Hans Zeller
>            Priority: Critical
>             Fix For: 2.0-incubating
>
>
> Currently, Trafodion creates tables as a single-region HBase table, unless 
> the table is salted. Salted tables have one region per salt bucket. We would 
> like to add SQL syntax to allow pre-splitting of unsalted tables as well. We 
> would like to offer two new table options, SPLIT BY and PARTITION BY. Both 
> allow specification of split keys. SPLIT BY will simply pre-split the table, 
> with no special split policy. PARTITION BY will add a prefix split policy 
> that ensures that all rows with common PARTITION BY column values will remain 
> in the same table.
> Example:
> create table lineitem(
>    l_orderkey int not null,
>    l_linenum int not null,
>    ...
>    primary key (l_orderkey, l_linenum))
> PARTITION BY (l_orderkey)
> (add first key (10000),
>  add first key (20000)
> );
> This will create three regions containing for values [<min>, 10000), [10000, 
> 20000) and [20000, <max>) it will add a prefix split policy for a 4 byte 
> prefix, the length of a non-nullable INT column. HBase will still be able to 
> split the regions further, but we will have a guarantee that all rows for a 
> given l_orderkey are located in the same region.
> If SPLIT BY is used instead of PARTITION BY, the same regions will be 
> created, but no custom split policy will be added. Note that the PARTITION BY 
> / SPLIT BY column(s) have to be a prefix of the clustering key of the table. 
> PARTITION BY and SPLIT BY are not allowed for salted tables. We may or may 
> not support pre-splitting of divisioned tables in the first release.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to