[ 
https://issues.apache.org/jira/browse/CARBONDATA-1438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-1438.
----------------------------------
       Resolution: Fixed
    Fix Version/s: 1.2.0

> Unify the sort column and sort scope in create table command
> ------------------------------------------------------------
>
>                 Key: CARBONDATA-1438
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-1438
>             Project: CarbonData
>          Issue Type: Improvement
>            Reporter: chenerlu
>             Fix For: 1.2.0
>
>          Time Spent: 14h 40m
>  Remaining Estimate: 0h
>
> 1     Requirement
> Currently, users can specify sort columns in the table properties when 
> creating a table, and can specify the sort scope in the load options when 
> loading data. To make this easier to use, all sort-related parameters 
> should be specified in the create table command.
> Once the sort scope is specified in the create table command, it is used when 
> loading data, even if users have also specified one in the load options.
> 2     Detailed design
> 2.1   Task-01
> Requirement: Create table supports specifying the sort scope.
> Implementation: Use the table properties (Map<String, String>). The sort 
> scope is specified as a key/value pair in the table properties, and the 
> existing interface is called to write this pair into the metastore.
> Global Sort, Local Sort, and No Sort are supported. The sort scope can be 
> specified in a SQL command:
> CREATE TABLE tableWithGlobalSort (
> shortField SHORT,
> intField INT,
> bigintField LONG,
> doubleField DOUBLE,
> stringField STRING,
> timestampField TIMESTAMP,
> decimalField DECIMAL(18,2),
> dateField DATE,
> charField CHAR(5)
> )
> STORED BY 'carbondata'
> TBLPROPERTIES('SORT_COLUMNS'='stringField', 'SORT_SCOPE'='GLOBAL_SORT')
>  
> Tips: If the sort scope is Global Sort, users should specify 
> GLOBAL_SORT_PARTITIONS. If users do not specify it, the number of map tasks 
> is used. GLOBAL_SORT_PARTITIONS must be an integer in the range 
> [1, Integer.MaxValue]; it is only used when the sort scope is Global Sort.
> Global Sort   Uses the orderby operator in Spark; data is ordered at the 
>               segment level.
> Local Sort    Node ordered; a carbondata file is ordered if it is written 
>               by one task.
> No Sort       No sort
> Tips: Keys and values are case-insensitive.
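The property rules above (case-insensitive keys and values, and the [1, Integer.MaxValue] range for GLOBAL_SORT_PARTITIONS) could be checked roughly as follows. This is a minimal sketch, not CarbonData's actual code: the class `SortProperties`, its methods, and the enum constants `LOCAL_SORT` and `NO_SORT` are hypothetical names chosen by analogy with the `GLOBAL_SORT` value shown in the SQL example.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Optional;
import java.util.OptionalInt;
import java.util.TreeMap;

// Hypothetical helper for illustration; not part of CarbonData's API.
class SortProperties {
    // GLOBAL_SORT appears in the SQL example; the other two names are assumed.
    enum SortScope { GLOBAL_SORT, LOCAL_SORT, NO_SORT }

    // Copy into a case-insensitive map so 'sort_scope' and 'SORT_SCOPE' match.
    static Map<String, String> normalize(Map<String, String> props) {
        Map<String, String> m = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        m.putAll(props);
        return m;
    }

    // Empty when SORT_SCOPE is not specified; values are matched ignoring case.
    static Optional<SortScope> parseSortScope(Map<String, String> props) {
        String raw = normalize(props).get("SORT_SCOPE");
        if (raw == null) {
            return Optional.empty();
        }
        try {
            return Optional.of(SortScope.valueOf(raw.trim().toUpperCase(Locale.ROOT)));
        } catch (IllegalArgumentException e) {
            throw new IllegalArgumentException("Invalid SORT_SCOPE: " + raw);
        }
    }

    // Empty when GLOBAL_SORT_PARTITIONS is not specified; the caller would then
    // fall back to the number of map tasks, as described in the tips.
    static OptionalInt parseGlobalSortPartitions(Map<String, String> props) {
        String raw = normalize(props).get("GLOBAL_SORT_PARTITIONS");
        if (raw == null) {
            return OptionalInt.empty();
        }
        int n;
        try {
            // parseInt already rejects values above Integer.MAX_VALUE.
            n = Integer.parseInt(raw.trim());
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("GLOBAL_SORT_PARTITIONS must be an integer: " + raw);
        }
        if (n < 1) {
            throw new IllegalArgumentException("GLOBAL_SORT_PARTITIONS must be >= 1: " + raw);
        }
        return OptionalInt.of(n);
    }
}
```

Returning an empty value for a missing key, rather than a default, leaves the "use the number of map tasks" fallback to the caller, which is the only place that number is known.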
> 2.2   Task-02
> Requirement:
> Load data supports local sort, no sort, and global sort.
> Ignore the sort scope specified in load data and use the one specified in 
> create table.
> Currently, users can specify the sort scope and global sort partitions in the 
> load options. After the modification, the sort scope specified in the load 
> options is ignored and the sort scope is read from the table properties.
> Current logic: the sort scope comes from the load options
> Number  Prerequisite                                      Sort scope
> 1       isSortTable is true && Sort Scope is Global Sort  Global Sort (first check)
> 2       isSortTable is false                              No Sort
> 3       isSortTable is true                               Local Sort
> Tips: isSortTable is true means the table contains a sort column or 
> contains dimensions (except complex types), such as string columns.
> For example:
> Create table xxx1 (col1 string, col2 int) stored by 'carbondata' --- sort table
> Create table xx1 (col1 int, col2 int) stored by 'carbondata' --- not sort 
> table
> Create table xx (col1 int, col2 string) stored by 'carbondata' tblproperties 
> ('sort_columns'='col1') --- sort table
> New logic: the sort scope comes from create table
> Number  Prerequisite                                      Code branch
> 1       isSortTable is true && Sort Scope is Global Sort  Global Sort (first check)
> 2       isSortTable is false || Sort Scope is No Sort     No Sort
> 3       isSortTable is true && Sort Scope is Local Sort   Local Sort
> 4       isSortTable is true, Sort Scope not specified     Local Sort (keep 
>                                                           current logic)
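The new decision table can be read as a small function of two inputs. The sketch below is one possible encoding under stated assumptions: `SortScopeResolver`, `resolve`, and the enum are hypothetical names, and a `null` table scope stands for "Sort Scope not specified in create table".

```java
// Hypothetical illustration of the "new logic" table; not CarbonData's code.
class SortScopeResolver {
    enum SortScope { GLOBAL_SORT, LOCAL_SORT, NO_SORT }

    // tableScope is the value from the table properties, or null if unspecified.
    static SortScope resolve(boolean isSortTable, SortScope tableScope) {
        // Row 1: checked first.
        if (isSortTable && tableScope == SortScope.GLOBAL_SORT) {
            return SortScope.GLOBAL_SORT;
        }
        // Row 2: a table that is not a sort table is never sorted,
        // whatever scope was requested.
        if (!isSortTable || tableScope == SortScope.NO_SORT) {
            return SortScope.NO_SORT;
        }
        // Rows 3 and 4: explicit Local Sort, or no scope given on a sort table.
        return SortScope.LOCAL_SORT;
    }
}
```

Note that under this reading a table with no sort columns and no dimensions resolves to No Sort even when Global Sort is requested, because row 1 requires isSortTable to be true.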
> 3     Acceptance standard
> Number        Acceptance standard
> 1     Users can specify the sort scope (global, local, no sort) in SQL when 
> creating a carbon table.
> 2     Load data ignores the sort scope specified in the load options and 
> uses the one specified in the create table command. If the user still 
> specifies a sort scope in the load options, a warning is given informing the 
> user that the sort scope specified in create table will be used.
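Acceptance item 2 could be exercised with a check along these lines. It is only a sketch: `LoadSortScopeCheck` and `warningsFor` are hypothetical, the load option key is assumed to be named `SORT_SCOPE` like the table property, and the warning text is illustrative.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; not CarbonData's code.
class LoadSortScopeCheck {
    // Returns the warnings to show the user. The list is empty when the load
    // options do not mention a sort scope. The key name "SORT_SCOPE" for the
    // load option is an assumption.
    static List<String> warningsFor(String tableSortScope, Map<String, String> loadOptions) {
        List<String> warnings = new ArrayList<>();
        for (Map.Entry<String, String> option : loadOptions.entrySet()) {
            // Keys are compared ignoring case, as for table properties.
            if ("SORT_SCOPE".equalsIgnoreCase(option.getKey())) {
                warnings.add("SORT_SCOPE='" + option.getValue()
                        + "' in load options is ignored; using '" + tableSortScope
                        + "' from create table");
            }
        }
        return warnings;
    }
}
```

The load itself proceeds with the table's sort scope in either case; the check only decides whether the user is told that the load option had no effect.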
> 4     Feature restrictions
> NA
> 5     Dependencies
> NA
> 6     Technical risk
> NA



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
