Repository: carbondata
Updated Branches:
  refs/heads/master 06d38ff4b -> 34e74174e
[CARBONDATA-2648] Documentation for support for COLUMN_META_CACHE and CACHE_LEVEL in create table and alter table properties

Documentation for support for COLUMN_META_CACHE and CACHE_LEVEL in create table and alter table properties

This closes #2558

Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo
Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/34e74174
Tree: http://git-wip-us.apache.org/repos/asf/carbondata/tree/34e74174
Diff: http://git-wip-us.apache.org/repos/asf/carbondata/diff/34e74174

Branch: refs/heads/master
Commit: 34e74174e0e83b00a6dc603eb86bbcc64533d1ac
Parents: 06d38ff
Author: sgururajshetty <[email protected]>
Authored: Wed Jul 25 18:14:07 2018 +0530
Committer: manishgupta88 <[email protected]>
Committed: Wed Jul 25 19:00:15 2018 +0530

----------------------------------------------------------------------
 docs/data-management-on-carbondata.md | 98 +++++++++++++++++++++++++++++-
 1 file changed, 97 insertions(+), 1 deletion(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/carbondata/blob/34e74174/docs/data-management-on-carbondata.md
----------------------------------------------------------------------
diff --git a/docs/data-management-on-carbondata.md b/docs/data-management-on-carbondata.md
index 4532b41..da259a6 100644
--- a/docs/data-management-on-carbondata.md
+++ b/docs/data-management-on-carbondata.md
@@ -141,7 +141,103 @@ This tutorial is going to introduce all commands and data operations on CarbonDa
     'SORT_SCOPE'='NO_SORT')
     ```
   **NOTE:** CarbonData also supports "using carbondata". Find example code at [SparkSessionExample](https://github.com/apache/carbondata/blob/master/examples/spark2/src/main/scala/org/apache/carbondata/examples/SparkSessionExample.scala) in the CarbonData repo.
- 
+
+  - **Caching Min/Max Value for Required Columns**
+    By default, CarbonData caches min and max values of all the columns in schema. As the load increases, the memory required to hold the min and max values increases considerably. This feature enables you to configure min and max values only for the required columns, resulting in optimized memory usage.
+
+    Following are the valid values for COLUMN_META_CACHE:
+    * If you want no column min/max values to be cached in the driver.
+
+    ```
+    COLUMN_META_CACHE=''
+    ```
+
+    * If you want only col1 min/max values to be cached in the driver.
+
+    ```
+    COLUMN_META_CACHE='col1'
+    ```
+
+    * If you want min/max values to be cached in driver for all the specified columns.
+
+    ```
+    COLUMN_META_CACHE='col1,col2,col3,...'
+    ```
+
+    Columns to be cached can be specified either while creating the table or after creation of the table.
+    During the create table operation, specify the columns to be cached in table properties.
+
+    Syntax:
+
+    ```
+    CREATE TABLE [dbName].tableName (col1 String, col2 String, col3 int,...) STORED BY 'carbondata' TBLPROPERTIES ('COLUMN_META_CACHE'='col1,col2,...')
+    ```
+
+    Example:
+
+    ```
+    CREATE TABLE employee (name String, city String, id int) STORED BY 'carbondata' TBLPROPERTIES ('COLUMN_META_CACHE'='name')
+    ```
+
+    After creation of the table, or on already created tables, use the alter table command to configure the columns to be cached.
+
+    Syntax:
+
+    ```
+    ALTER TABLE [dbName].tableName SET TBLPROPERTIES ('COLUMN_META_CACHE'='col1,col2,...')
+    ```
+
+    Example:
+
+    ```
+    ALTER TABLE employee SET TBLPROPERTIES ('COLUMN_META_CACHE'='city')
+    ```
+
+  - **Caching at Block or Blocklet Level**
+    This feature allows you to maintain the cache at Block level, resulting in optimized usage of the memory. The memory consumption is high if the Blocklet level caching is maintained, as a Block can have multiple Blocklets.
+
+    Following are the valid values for CACHE_LEVEL:
+    * Configuration for caching in driver at Block level (default value).
+
+    ```
+    CACHE_LEVEL='BLOCK'
+    ```
+
+    * Configuration for caching in driver at Blocklet level.
+
+    ```
+    CACHE_LEVEL='BLOCKLET'
+    ```
+
+    Cache level can be specified either while creating the table or after creation of the table.
+    During the create table operation, specify the cache level in table properties.
+
+    Syntax:
+
+    ```
+    CREATE TABLE [dbName].tableName (col1 String, col2 String, col3 int,...) STORED BY 'carbondata' TBLPROPERTIES ('CACHE_LEVEL'='Blocklet')
+    ```
+
+    Example:
+
+    ```
+    CREATE TABLE employee (name String, city String, id int) STORED BY 'carbondata' TBLPROPERTIES ('CACHE_LEVEL'='Blocklet')
+    ```
+
+    After creation of the table, or on already created tables, use the alter table command to configure the cache level.
+
+    Syntax:
+
+    ```
+    ALTER TABLE [dbName].tableName SET TBLPROPERTIES ('CACHE_LEVEL'='Blocklet')
+    ```
+
+    Example:
+
+    ```
+    ALTER TABLE employee SET TBLPROPERTIES ('CACHE_LEVEL'='Blocklet')
+    ```
+
  ## CREATE TABLE AS SELECT
  This function allows user to create a Carbon table from any of the Parquet/Hive/Carbon table. This is beneficial when the user wants to create Carbon table from any other Parquet/Hive table and use the Carbon query engine to query and achieve better query results for cases where Carbon is faster than other file formats. Also this feature can be used for backing up the data.
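
For readers who generate these statements from scripts, the two properties documented in the patch can be sketched as a small string builder. This is only an illustration: `cache_tblproperties` and `alter_table_cache` are hypothetical helper names, not part of CarbonData. Two further points are assumptions that the patch does not state: that both properties may be combined in a single `TBLPROPERTIES` clause, and that upper-casing `CACHE_LEVEL` is harmless (the patch lists `BLOCK`/`BLOCKLET` as values but writes `Blocklet` in its examples).

```python
# Hypothetical helpers (not CarbonData APIs): they only render SQL text for
# the COLUMN_META_CACHE and CACHE_LEVEL table properties described above.

VALID_CACHE_LEVELS = ("BLOCK", "BLOCKLET")


def cache_tblproperties(columns=None, cache_level=None):
    """Render a TBLPROPERTIES clause for the given cache settings.

    columns:     None leaves COLUMN_META_CACHE out (all columns cached, the
                 documented default); an empty list renders '' (cache none).
    cache_level: None leaves CACHE_LEVEL out (BLOCK is the documented default).
    """
    props = []
    if columns is not None:
        props.append(("COLUMN_META_CACHE", ",".join(columns)))
    if cache_level is not None:
        # Assumption: the value is normalised to the upper-case spelling
        # used in the list of valid values.
        level = cache_level.upper()
        if level not in VALID_CACHE_LEVELS:
            raise ValueError(f"CACHE_LEVEL must be one of {VALID_CACHE_LEVELS}")
        props.append(("CACHE_LEVEL", level))
    if not props:
        raise ValueError("at least one cache property is required")
    body = ", ".join(f"'{key}'='{value}'" for key, value in props)
    return f"TBLPROPERTIES ({body})"


def alter_table_cache(table, columns=None, cache_level=None):
    """Render an ALTER TABLE ... SET TBLPROPERTIES statement."""
    return f"ALTER TABLE {table} SET {cache_tblproperties(columns, cache_level)}"


if __name__ == "__main__":
    print(alter_table_cache("employee", columns=["city"]))
    print(alter_table_cache("employee", columns=[], cache_level="Blocklet"))
```

Passing an empty list and passing `None` are kept distinct on purpose: the first renders `COLUMN_META_CACHE=''` (cache no column), while the second omits the property so the default applies.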
