[jira] [Created] (CARBONDATA-734) Can't create parquet/orc table with carbonsession

2017-02-28 Thread Yadong Qi (JIRA)
Yadong Qi created CARBONDATA-734: Summary: Can't create parquet/orc table with carbonsession Key: CARBONDATA-734 URL: https://issues.apache.org/jira/browse/CARBONDATA-734 Project: CarbonData

Re: [DISCUSS] For the dimension default should be no dictionary

2017-02-28 Thread Ravindra Pesala
Hi Likun, It would be the same case if we use all no-dictionary columns by default: it will increase the store size and decrease performance, so it also will not encourage more users if performance is poor. If we need to make no-dictionary columns the default, then we should first focus on

Re: [DISCUSS] For the dimension default should be no dictionary

2017-02-28 Thread QiangCai
+1 It is not easy for users to understand the previous options. The logic of these two options, SORT_COLUMNS and TABLE_DICTIONARY, is very clear. I am coding the SORT_COLUMNS option this way. Best Regards David Caiqiang

Re: [DISCUSS] For the dimension default should be no dictionary

2017-02-28 Thread Jacky Li
Yes, I agree with your point. The only concern I have is about loading: I have seen many users accidentally make a high-cardinality column a dictionary column, and then the load failed because it ran out of memory or was very slow. I guess they just do not know to use DICTIONARY_EXCLUDE for these

Re: [DISCUSS] For the dimension default should be no dictionary

2017-02-28 Thread Jacky Li
> On 28 February 2017, at 8:35 PM, Liang Chen wrote: > > Hi > > A couple of questions: > > 1) For the SORT_KEY option: only build "MDK index, inverted index, minmax > index" for the columns which are specified in the option (SORT_KEY)? > Yes, build MDK index, inverted index, minmax

Re: [DISCUSS] For the dimension default should be no dictionary

2017-02-28 Thread Ravindra Pesala
Hi Likun, You mentioned that if the user does not specify dictionary columns, then by default those are chosen as no-dictionary columns. But there are many disadvantages, as I mentioned in the above mail, if you keep no dictionary as the default. We initially introduced no-dictionary columns to handle high

Re: [DISCUSS] For the dimension default should be no dictionary

2017-02-28 Thread bill.zhou
Hi Ravindra, It is a good idea to consider the sort columns and dictionary columns together. For DDL usability I have the following suggestions; please share your thoughts. 1. The sort column properties should keep the same style as the dictionary ones, so I suggest changing the keyword to SORT_INCLUDE

Re: [DISCUSS] Graduation to a TLP (Top Level Project)

2017-02-28 Thread Jean-Baptiste Onofré
Hi guys, I created a pull request to add a complete release guide: https://github.com/apache/incubator-carbondata/pull/617 I also updated the maturity self-assessment doc: https://docs.google.com/document/d/12hifkDCfbyramBba1uRHYjwaKEcxAyWMxS9iwJ1_etY/edit?usp=sharing I would like just a

Re: [DISCUSS] For the dimension default should be no dictionary

2017-02-28 Thread Liang Chen
Hi, A couple of questions: 1) For the SORT_KEY option: only build "MDK index, inverted index, minmax index" for the columns which are specified in the option (SORT_KEY)? 2) If users don't specify TABLE_DICTIONARY, then no columns get dictionary encoding, and all shuffle operations are

Re: Block B-tree loading failed

2017-02-28 Thread Ravindra Pesala
Hi, Have you loaded the data freshly and then tried to execute the query? Or are you trying to query an old store that you had already loaded? Regards, Ravindra. On 28 February 2017 at 17:20, ericzgy <1987zhangguang...@163.com> wrote: > Now when I load data into CarbonData table using spark1.6.2 and >

Re: [DISCUSS] For the dimension default should be no dictionary

2017-02-28 Thread Jacky Li
Yes, first we should simplify the DDL options. I propose the following options; please check whether they miss any scenario. 1. SORT_COLUMNS, or SORT_KEY This indicates three things: 1) All columns specified in options will be used to construct Multi-Dimensional Key, which will be sorted along this

[jira] [Created] (CARBONDATA-733) Fixed testcase failure issue

2017-02-28 Thread kumar vishal (JIRA)
kumar vishal created CARBONDATA-733: Summary: Fixed testcase failure issue Key: CARBONDATA-733 URL: https://issues.apache.org/jira/browse/CARBONDATA-733 Project: CarbonData Issue Type:
