GitHub user xuchuanyin commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2520#discussion_r204970343
--- Diff: docs/data-management-on-carbondata.md ---
@@ -124,6 +124,41 @@ This tutorial is going to introduce all commands and
data operations on CarbonDa
TBLPROPERTIES ('streaming'='true')
```
+ - **Local Dictionary Configuration**
+
+ Local Dictionary is generated only for no-dictionary string/varchar
datatype columns. It helps in:
+ 1. Getting more compression on dimension columns with less cardinality.
+ 2. Filter queries and full scan queries on No-dictionary columns with
local dictionary will be faster as filter will be done on encoded data.
+ 3. Reducing the store size and memory footprint as only unique values
will be stored as part of local dictionary and corresponding data will be
stored as encoded data.
+
+ By default, Local Dictionary will be enabled and generated for all
no-dictionary string/varchar datatype columns.
+
+ Users will be able to pass following properties in create table
command:
+
+ | Properties | Default value | Description |
+ | ---------- | ------------- | ----------- |
+ | LOCAL_DICTIONARY_ENABLE | true | By default, local dictionary
will be enabled for the table |
+ | LOCAL_DICTIONARY_THRESHOLD | 10000 | The maximum cardinality for
local dictionary generation (range- 1000 to 100000) |
--- End diff --
Add more description for it, such as `If the cardinality exceeds the
threshold, this column will not use local dictionary encoding. In this case,
data loading performance will decrease, since the local dictionary encoding
has to be rolled back.`
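
A short example after the table might also help readers see where these properties go. The following is only a sketch: the table and column names are hypothetical, and the threshold value is an arbitrary pick from the documented 1000 to 100000 range.

```sql
-- Hypothetical table; only the two TBLPROPERTIES keys come from the doc above.
CREATE TABLE IF NOT EXISTS sample_table (
  id INT,
  city STRING,   -- no-dictionary string column, eligible for local dictionary
  name STRING
)
STORED BY 'carbondata'
TBLPROPERTIES (
  'LOCAL_DICTIONARY_ENABLE'='true',
  -- above this cardinality the column would not use local dictionary encoding
  'LOCAL_DICTIONARY_THRESHOLD'='20000'
)
```

Picking a threshold close to the expected cardinality of the columns would presumably limit how often loading hits the rollback case.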
---