Yadong Qi created CARBONDATA-734:
Summary: Can't create parquet/orc table with carbonsession
Key: CARBONDATA-734
URL: https://issues.apache.org/jira/browse/CARBONDATA-734
Project: CarbonData
Hi Likun,
It would be the same case if we use all no-dictionary columns by default: it
will increase the store size and decrease performance, so it also does not
encourage more users if performance is poor.
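As a rough illustration of the trade-off under discussion, the sketch below shows plain dictionary encoding in Python. It is not CarbonData's implementation; the function name and sample columns are hypothetical. A low-cardinality column collapses to a tiny dictionary plus integer codes, while a high-cardinality column produces a dictionary as large as the data itself.

```python
def dictionary_encode(values):
    """Map each distinct value to an integer surrogate key (hypothetical helper)."""
    dictionary = {}
    encoded = []
    for v in values:
        # A new value gets the next free key; a repeated value reuses its key.
        key = dictionary.setdefault(v, len(dictionary))
        encoded.append(key)
    return dictionary, encoded


# Low cardinality: 3 distinct values repeated across 3000 rows.
low_card = ["CN", "IN", "US"] * 1000
# High cardinality: every one of the 3000 rows is distinct.
high_card = [f"user-{i}" for i in range(3000)]

low_dict, low_codes = dictionary_encode(low_card)
high_dict, high_codes = dictionary_encode(high_card)

# The dictionary for the high-cardinality column holds one entry per row,
# so it saves nothing and must be kept in memory during encoding.
print(len(low_dict), len(high_dict))
```

The dictionary size tracks the number of distinct values, which is why a dictionary helps the first column and only adds memory pressure for the second.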
If we need to make no-dictionary columns the default, then we should first
focus on
+1
It is not easy for users to understand the previous options.
The logic of these two options, SORT_COLUMNS and TABLE_DICTIONARY, is very
clear.
I am implementing the SORT_COLUMNS option this way.
Best Regards
David Caiqiang
Yes, I agree with your point. The only concern I have is loading: I have seen
many users accidentally put a high-cardinality column into the dictionary
columns, and then the load failed with out of memory or ran very slowly. I guess
they just do not know to use DICTIONARY_EXCLUDE for these
> On 28 February 2017, at 8:35 PM, Liang Chen wrote:
>
> Hi
>
> A couple of questions:
>
> 1) For the SORT_KEY option: are the "MDK index, inverted index, minmax
> index" built only for the columns specified in the option (SORT_KEY)?
>
Yes, build MDK index, inverted index, minmax
Hi Likun,
You mentioned that if the user does not specify dictionary columns, then by
default they are chosen as no-dictionary columns.
But there are many disadvantages, as I mentioned in the mail above, if you keep
no dictionary as the default. We initially introduced no-dictionary columns
to handle high
Hi Ravindra,
It is a good idea to consider the sort columns and dictionary columns
together.
For DDL usability I have the following suggestions; please share your
thoughts.
1. The sort column properties should keep the same style as the dictionary
ones, so the suggested keyword changes to SORT_INCLUDE
Hi guys,
I created a pull request to add a complete release guide:
https://github.com/apache/incubator-carbondata/pull/617
I also updated the maturity self-assessment doc:
https://docs.google.com/document/d/12hifkDCfbyramBba1uRHYjwaKEcxAyWMxS9iwJ1_etY/edit?usp=sharing
I would like just a
Hi
A couple of questions:
1) For the SORT_KEY option: are the "MDK index, inverted index, minmax
index" built only for the columns specified in the option (SORT_KEY)?
2) If users don't specify TABLE_DICTIONARY, then no columns get
dictionary encoding, and all shuffle operations are
Hi,
Have you loaded the data freshly and then tried to execute the query? Or are
you trying to query an old store that you had already loaded?
Regards,
Ravindra.
On 28 February 2017 at 17:20, ericzgy <1987zhangguang...@163.com> wrote:
> Now when I load data into CarbonData table using spark1.6.2 and
>
Yes, first we should simplify the DDL options. I propose the following options;
please check whether they miss any scenario.
1. SORT_COLUMNS, or SORT_KEY
This indicates three things:
1) All columns specified in the option will be used to construct the
Multi-Dimensional Key, which will be sorted along this
kumar vishal created CARBONDATA-733:
---
Summary: Fixed testcase failure issue
Key: CARBONDATA-733
URL: https://issues.apache.org/jira/browse/CARBONDATA-733
Project: CarbonData
Issue Type:
Yes, first we should simplify the DDL options. I propose the following options;
please check whether they miss any scenario.
1. SORT_COLUMNS, or SORT_KEY
This indicates three things:
1) All columns specified in the option will be used to construct the Multi-Dimensional
Key, which will be sorted along this
13 matches