[CARBONDATA-2750] Added Documentation for Local Dictionary Support

Added Documentation for Local Dictionary Support

This closes #2520


Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo
Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/e21e494b
Tree: http://git-wip-us.apache.org/repos/asf/carbondata/tree/e21e494b
Diff: http://git-wip-us.apache.org/repos/asf/carbondata/diff/e21e494b

Branch: refs/heads/branch-1.4
Commit: e21e494b6fa14e40eb5fdd9291fb051603644211
Parents: d691d49
Author: praveenmeenakshi56 <[email protected]>
Authored: Wed Jul 25 21:01:37 2018 +0530
Committer: ravipesala <[email protected]>
Committed: Tue Jul 31 00:11:26 2018 +0530

----------------------------------------------------------------------
 docs/data-management-on-carbondata.md | 66 ++++++++++++++++++++++++++++++
 1 file changed, 66 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/carbondata/blob/e21e494b/docs/data-management-on-carbondata.md
----------------------------------------------------------------------
diff --git a/docs/data-management-on-carbondata.md b/docs/data-management-on-carbondata.md
index da259a6..27cdab6 100644
--- a/docs/data-management-on-carbondata.md
+++ b/docs/data-management-on-carbondata.md
@@ -124,6 +124,52 @@ This tutorial is going to introduce all commands and data operations on CarbonDa
      TBLPROPERTIES ('streaming'='true')
      ```
 
+   - **Local Dictionary Configuration**
+     
+     Local Dictionary is generated only for no-dictionary string/varchar datatype columns. It helps in:
+     1. Getting more compression on dimension columns with low cardinality.
+     2. Speeding up filter queries and full scan queries on no-dictionary columns with local dictionary, since filtering is done on encoded data.
+     3. Reducing the store size and memory footprint, since only unique values are stored in the local dictionary and the corresponding data is stored in encoded form.
+   
+     By default, Local Dictionary is enabled and generated for all no-dictionary string/varchar datatype columns.
+          
+     Users can pass the following properties in the CREATE TABLE command:
+          
+     | Properties | Default value | Description |
+     | ---------- | ------------- | ----------- |
+     | LOCAL_DICTIONARY_ENABLE | true | By default, local dictionary will be enabled for the table |
+     | LOCAL_DICTIONARY_THRESHOLD | 10000 | The maximum cardinality for local dictionary generation (range: 1000 to 100000) |
+     | LOCAL_DICTIONARY_INCLUDE | all no-dictionary string/varchar columns | Columns for which Local Dictionary is generated |
+     | LOCAL_DICTIONARY_EXCLUDE | none | Columns for which Local Dictionary is not generated |
+        
+      **NOTE:** If the cardinality of a column exceeds the threshold, that column will not use local dictionary encoding. In this case, data loading performance will decrease, since there is a rollback procedure for local dictionary encoding.
+      
+      **Calculating Memory Usage for Local Dictionary:**
+      
+      Encoded data and actual data are both stored when Local Dictionary is enabled.
+      Suppose 'x' columns are configured for Local Dictionary generation out of a total of 'y' string/varchar columns.
+      
+      The total size will be:
+      
+      Total size = (memory size of the remaining (y - x) columns) + ((4 bytes * number of rows) * x) + (Local Dictionary size of the x columns)
+      
+      Local Dictionary size = ((memory occupied by each unique value * cardinality of the column) * number of columns)
+      
+### Example:
+ 
+   ```
+   CREATE TABLE carbontable(
+             
+               column1 string,
+             
+               column2 string,
+             
+               column3 LONG )
+             
+     STORED BY 'carbondata'
+     TBLPROPERTIES('LOCAL_DICTIONARY_ENABLE'='true','LOCAL_DICTIONARY_THRESHOLD'='1000',
+     'LOCAL_DICTIONARY_INCLUDE'='column1','LOCAL_DICTIONARY_EXCLUDE'='column2')
+   ```
 ### Example:
 
    ```
@@ -390,6 +436,11 @@ This tutorial is going to introduce all commands and data operations on CarbonDa
      ```
       NOTE: Add Complex datatype columns is not supported.
 
+Users can specify which of the newly added columns to include in and exclude from local dictionary generation. These are appended to the existing local dictionary include and exclude columns of the main table, respectively.
+  ```
+     ALTER TABLE carbon ADD COLUMNS (a1 STRING, b1 STRING) TBLPROPERTIES('LOCAL_DICTIONARY_INCLUDE'='a1','LOCAL_DICTIONARY_EXCLUDE'='b1')
+  ```
+
    - **DROP COLUMNS**
    
      This command is used to delete the existing column(s) in a table.
@@ -442,6 +493,21 @@ This tutorial is going to introduce all commands and data operations on CarbonDa
       ```
       **NOTE:**
       * Merge index is not supported on streaming table.
+      
+- **SET and UNSET for Local Dictionary Properties**
+
+   When the SET command is used, each newly set property overrides the corresponding old property, if one exists.
+  
+   Example to SET Local Dictionary Properties:
+    ```
+   ALTER TABLE tablename SET TBLPROPERTIES('LOCAL_DICTIONARY_ENABLE'='false','LOCAL_DICTIONARY_THRESHOLD'='1000','LOCAL_DICTIONARY_INCLUDE'='column1','LOCAL_DICTIONARY_EXCLUDE'='column2')
+    ```
+   When Local Dictionary properties are unset, the corresponding default values are used for these properties.
+      
+   Example to UNSET Local Dictionary Properties:
+    ```
+   ALTER TABLE tablename UNSET TBLPROPERTIES('LOCAL_DICTIONARY_ENABLE','LOCAL_DICTIONARY_THRESHOLD','LOCAL_DICTIONARY_INCLUDE','LOCAL_DICTIONARY_EXCLUDE')
+    ```
 
 ### DROP TABLE
   

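The memory formula in the "Calculating Memory Usage for Local Dictionary" part of the diff above can be worked through with a few lines of Python. This is only an illustrative sketch and is not part of the commit: the function and parameter names are hypothetical, and it assumes that every local-dictionary column has the same cardinality and the same average size per unique value, which real tables will rarely satisfy exactly.

```python
def local_dictionary_size(bytes_per_unique_value, cardinality, num_columns):
    """Local Dictionary size = (bytes per unique value * cardinality) * columns.

    Assumption (not from the commit): all columns share one cardinality and
    one average value size.
    """
    return bytes_per_unique_value * cardinality * num_columns


def estimated_total_size(other_columns_size, num_rows, num_dict_columns,
                         bytes_per_unique_value, cardinality):
    """Total size per the documented formula, in bytes.

    other_columns_size: size of the (y - x) string/varchar columns that do
        not use local dictionary.
    num_dict_columns:   x, the columns configured for local dictionary.
    """
    # Each row stores a 4-byte encoded value per local-dictionary column.
    encoded_data = 4 * num_rows * num_dict_columns
    dictionary = local_dictionary_size(
        bytes_per_unique_value, cardinality, num_dict_columns)
    return other_columns_size + encoded_data + dictionary


if __name__ == "__main__":
    # Hypothetical table: 1,000,000 rows, x = 2 local-dictionary columns,
    # 1000 unique values of about 20 bytes each, and 50,000,000 bytes for
    # the remaining string/varchar columns.
    total = estimated_total_size(50_000_000, 1_000_000, 2, 20, 1000)
    print(total)  # 50,000,000 + 8,000,000 + 40,000 = 58040000
```

With these made-up numbers the dictionaries themselves are small (40,000 bytes); the 4-byte encoded values per row dominate the cost of the local-dictionary columns.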