Re: questions about carbondata

weijie tong Fri, 21 Oct 2016 21:30:35 -0700

Thanks for the reply. For 3, I still want to know whether all the blocklets
of all the blocks are stored in sequence according to the sorted MDK key. If
so, the globally ordered MDK key of the carbon table would behave like an
HBase rowkey. Or is the ordering local to each block, with the block index
file managing the block-level index?


On Fri, Oct 21, 2016 at 5:48 PM, 杰 <[email protected]> wrote:

> Hi,
> 1. Correct.
>    One carbon file is the same as one block. One block has many blocklets
> as well as one file footer, which holds the metadata (B-tree index) of the
> blocklets.
>    One load makes one segment, and one segment has many blocks.
> 2. Carbon sorts the dim column data within one blocklet, so the original
> row sequence is lost. Carbon therefore stores the dim column data together
> with the row ids.
>    The dim column data is sorted and the row id sequence is reordered
> correspondingly, so the matchup (like Array: index => data) is kept.
>    At query time, carbon first gets the expected dim column data (based
> on the filter), then uses the matchup to get the row ids.
>    Then, based on the row ids, we can get the measure data.
>    So the column data is called an inverted index, which means data =>
> index, not index => data.
> 3. Yes.
>
>
>
>
> ------------------ Original Message ------------------
> From: "weijie tong";<[email protected]>;
> Sent: Friday, October 21, 2016, 4:01 PM
> To: "dev"<[email protected]>;
>
> Subject: questions about carbondata
>
>
>
> 1. What is the relationship between these terms:
> carbondata file, block, blocklet, carbondata file footer? When a
> batch job loads data into a carbondata table, does that mean the table
> is composed of different blocks, where each block is a carbondata
> file composed of many blocklets and one FileFooter, according to
> the carbondata file format?
>
> 2. How is the column data stored as an inverted index?
> What is the dim column data inverted to? How does the inverted index
> affect a query?
>
> 3. Are all the blocklets stored in sequence according to the sorted MDK key?
>
> Hope someone can give a detailed answer.
>
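
To make point 2 of the quoted reply concrete, here is a minimal sketch of the idea as described there: sort one dim column inside a blocklet, keep the original row ids next to the sorted values, filter on the sorted values, and use the row ids to pick up the measure data. This is not CarbonData code. The function names, the sample columns (`city`, `sales`), and the use of plain Python lists are all assumptions made for illustration only.

```python
from bisect import bisect_left, bisect_right


def build_inverted_index(dim_values):
    """Sort one dim column (hypothetical blocklet) and keep the row ids.

    Returns (sorted_values, row_ids) where row_ids[k] is the original row
    position of sorted_values[k], i.e. the data => index matchup.
    """
    # sorted() is stable, so equal values keep ascending row id order.
    row_ids = sorted(range(len(dim_values)), key=lambda i: dim_values[i])
    sorted_values = [dim_values[i] for i in row_ids]
    return sorted_values, row_ids


def filter_equals(sorted_values, row_ids, target):
    """Binary-search the sorted dim data, then map the hits back to row ids."""
    lo = bisect_left(sorted_values, target)
    hi = bisect_right(sorted_values, target)
    return sorted(row_ids[lo:hi])


def fetch_measures(measure_values, matching_row_ids):
    """Measure data stays in row order, so row ids index it directly."""
    return [measure_values[r] for r in matching_row_ids]


# Illustrative data only: one dim column and one measure column.
city = ["paris", "berlin", "paris", "austin", "berlin"]
sales = [10, 20, 30, 40, 50]

sorted_city, city_row_ids = build_inverted_index(city)
paris_rows = filter_equals(sorted_city, city_row_ids, "paris")
paris_sales = fetch_measures(sales, paris_rows)
```

Because the dim values are sorted, an equality or range filter only touches a contiguous slice of the column, and the stored row ids are what let that slice be mapped back to the unsorted measure data.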
