Comments inline.

> On April 1, 2017, at 5:06 PM, a <ww...@163.com> wrote:
> 
> Additional suggestions:
> 1. Support at least two-level partitioning

I think we can let the user specify the partition columns; multiple columns 
can be combined to form a partition key. Is this what you mean by two-level 
partitioning? Generally speaking, partitioning on multiple columns tends to 
produce many small files, which we may want to avoid.
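
To illustrate the small-file concern, here is a rough sketch in plain Python 
(not CarbonData code; the column names, distinct-value counts, and the even 
data spread are all made-up assumptions). It shows how the number of distinct 
partition keys grows as the product of the distinct values in each partition 
column, so each partition holds fewer rows:

```python
from math import prod

def max_partitions(cardinalities):
    """Upper bound on distinct partition keys for a multi-column key."""
    return prod(cardinalities.values())

def avg_rows_per_partition(total_rows, cardinalities):
    """Average rows per partition, assuming data is spread evenly."""
    return total_rows / max_partitions(cardinalities)

# Hypothetical columns and distinct-value counts, for illustration only.
single = {"country": 200}
composite = {"country": 200, "day": 365}

rows = 10_000_000
print(max_partitions(single), avg_rows_per_partition(rows, single))
print(max_partitions(composite), avg_rows_per_partition(rows, composite))
```

With the composite key, the same load is split across far more partitions, 
and each one ends up with only a small number of rows.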

> 2. Building the B+Tree by partition column should split the segment into 
> smaller pieces, which may speed up data loading in CarbonData

Partitioning will slow down the loading process because it requires a 
shuffle. The benefit is that queries with a filter on the partition key will 
be faster.
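
A minimal sketch of this trade-off, in plain Python rather than CarbonData or 
Spark code (the function names, the CRC32 hash, and the partition count are 
assumptions chosen for illustration). Loading pays for routing every row to a 
partition, while an equality filter on the partition key only has to scan one 
partition:

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 4  # hypothetical partition count

def partition_id(key, num_partitions=NUM_PARTITIONS):
    """Deterministic hash partitioning on the partition key."""
    return zlib.crc32(str(key).encode("utf-8")) % num_partitions

def load(rows, key_col):
    """Stand-in for the shuffle: route every row to its partition at load time."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[partition_id(row[key_col])].append(row)
    return partitions

def query_eq(partitions, key_col, value):
    """Equality filter on the partition key: scan only the matching partition."""
    candidate = partitions.get(partition_id(value), [])
    return [row for row in candidate if row[key_col] == value]

rows = [{"city": c, "amount": i}
        for i, c in enumerate(["a", "b", "c", "a", "d", "b"])]
parts = load(rows, "city")
print(query_eq(parts, "city", "a"))
```

A filter on a non-partition column gets no such shortcut and would still have 
to look at every partition.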

> 3. Delete data by partition column
> 

This could be a future item on our roadmap once the partition feature is 
supported.

> 
> 
> best regards
> fish
> 
> At 2017-03-31 23:42:07, "QiangCai" <qiang...@qq.com> wrote:
>> Hi all, 
>> 
>> Let's start the discussion regarding the partition table.
>> 
>> What should we do to support partition tables?
>> 
>> 1. Create tables with partitions, supporting Range Partitioning, Hash
>> Partitioning, List Partitioning and Composite Partitioning, and write the
>> partition info to the schema. 
>> 
>> 2. During data loading, re-partition the input data, start one task per
>> partition, and write partition information to the footer and index file.
>> 
>> 3. During data query, prune the B+Tree by partition if the filter contains
>> the partition column, or prune data blocks by partition when there is only
>> a partition column predicate.
>> 
>> 4. Optimize the join performance of two partition tables if the partition
>> column is the join column.
>> 
>>  Any thoughts, comments, or questions?
>> 
>>  Thanks!
>> 
>> Best Regards
>> David
>> 
>> 
>> 
>> --
>> View this message in context: 
>> http://apache-carbondata-mailing-list-archive.1130556.n5.nabble.com/DISCUSSION-support-new-feature-Partition-Table-tp9935.html
>> Sent from the Apache CarbonData Mailing List archive mailing list archive at 
>> Nabble.com.
