[22/26] carbondata git commit: [Documentation] Editorial review
[Documentation] Editorial review Corrected spelling mistakes and grammer This closes #2965 Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/d28f87c2 Tree: http://git-wip-us.apache.org/repos/asf/carbondata/tree/d28f87c2 Diff: http://git-wip-us.apache.org/repos/asf/carbondata/diff/d28f87c2 Branch: refs/heads/branch-1.5 Commit: d28f87c2896f9b50220c673cfb3d20f588042fee Parents: 9c149d7 Author: sgururajshetty Authored: Thu Nov 29 18:44:22 2018 +0530 Committer: ravipesala Committed: Fri Nov 30 21:57:21 2018 +0530 -- docs/configuration-parameters.md | 4 ++-- docs/ddl-of-carbondata.md| 4 ++-- docs/dml-of-carbondata.md| 6 +++--- docs/file-structure-of-carbondata.md | 3 +-- 4 files changed, 8 insertions(+), 9 deletions(-) -- http://git-wip-us.apache.org/repos/asf/carbondata/blob/d28f87c2/docs/configuration-parameters.md -- diff --git a/docs/configuration-parameters.md b/docs/configuration-parameters.md index a41a3d5..4aa2929 100644 --- a/docs/configuration-parameters.md +++ b/docs/configuration-parameters.md @@ -69,9 +69,9 @@ This section provides the details of all the configurations required for the Car | carbon.options.bad.records.logger.enable | false | CarbonData can identify the records that are not conformant to schema and isolate them as bad records. Enabling this configuration will make CarbonData to log such bad records. **NOTE:** If the input data contains many bad records, logging them will slow down the over all data loading throughput. The data load operation status would depend on the configuration in ***carbon.bad.records.action***. | | carbon.bad.records.action | FAIL | CarbonData in addition to identifying the bad records, can take certain actions on such data. This configuration can have four types of actions for bad records namely FORCE, REDIRECT, IGNORE and FAIL. If set to FORCE then it auto-corrects the data by storing the bad records as NULL. 
If set to REDIRECT then bad records are written to the raw CSV instead of being loaded. If set to IGNORE then bad records are neither loaded nor written to the raw CSV. If set to FAIL then data loading fails if any bad records are found. | | carbon.options.is.empty.data.bad.record | false | Based on the business scenarios, empty("" or '' or ,,) data can be valid or invalid. This configuration controls how empty data should be treated by CarbonData. If false, then empty ("" or '' or ,,) data will not be considered as bad record and vice versa. | -| carbon.options.bad.record.path | (none) | Specifies the HDFS path where bad records are to be stored. By default the value is Null. This path must to be configured by the user if ***carbon.options.bad.records.logger.enable*** is **true** or ***carbon.bad.records.action*** is **REDIRECT**. | +| carbon.options.bad.record.path | (none) | Specifies the HDFS path where bad records are to be stored. By default the value is Null. This path must be configured by the user if ***carbon.options.bad.records.logger.enable*** is **true** or ***carbon.bad.records.action*** is **REDIRECT**. | | carbon.blockletgroup.size.in.mb | 64 | Please refer to [file-structure-of-carbondata](./file-structure-of-carbondata.md#carbondata-file-format) to understand the storage format of CarbonData. The data are read as a group of blocklets which are called blocklet groups. This parameter specifies the size of each blocklet group. Higher value results in better sequential IO access. The minimum value is 16MB, any value lesser than 16MB will reset to the default value (64MB). **NOTE:** Configuring a higher value might lead to poor performance as an entire blocklet group will have to read into memory before processing. For filter queries with limit, it is **not advisable** to have a bigger blocklet size. For aggregation queries which need to return more number of rows, bigger blocklet size is advisable. 
| -| carbon.sort.file.write.buffer.size | 16384 | CarbonData sorts and writes data to intermediate files to limit the memory usage. This configuration determines the buffer size to be used for reading and writing such files. **NOTE:** This configuration is useful to tune IO and derive optimal performance. Based on the OS and underlying harddisk type, these values can significantly affect the overall performance. It is ideal to tune the buffersize equivalent to the IO buffer size of the OS. Recommended range is between 10240 and 10485760 bytes. | +| carbon.sort.file.write.buffer.size | 16384 | CarbonData sorts and writes data to intermediate files to limit the memory usage. This configuration determines the buffer size to be used for reading and writing such files. **NOTE:** This
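For reference, the bad-records parameters corrected in this hunk (***carbon.options.bad.records.logger.enable***, ***carbon.bad.records.action***, ***carbon.options.bad.record.path***) can also be supplied per load through LOAD DATA OPTIONS. A minimal sketch; the table name and HDFS paths are illustrative, not taken from the patch:

```sql
-- Redirect malformed rows to a raw CSV under the configured bad-records path
LOAD DATA INPATH 'hdfs://hacluster/data/sales.csv'
INTO TABLE sales
OPTIONS(
  'BAD_RECORDS_LOGGER_ENABLE'='true',
  'BAD_RECORDS_ACTION'='REDIRECT',
  'BAD_RECORD_PATH'='hdfs://hacluster/data/badrecords')
```

With REDIRECT, rows that fail schema validation are written to the bad-record path instead of being loaded, so the load completes while the offending input remains inspectable.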
[28/47] carbondata git commit: [Documentation] Editorial review comment fixed
[Documentation] Editorial review comment fixed Minor issues fixed (spelling, syntax, and missing info) This closes #2603 Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/41bd3593 Tree: http://git-wip-us.apache.org/repos/asf/carbondata/tree/41bd3593 Diff: http://git-wip-us.apache.org/repos/asf/carbondata/diff/41bd3593 Branch: refs/heads/branch-1.4 Commit: 41bd3593d0abc727677ccabd8fc417eb492f3c2c Parents: 37e24f6 Author: sgururajshetty Authored: Thu Aug 2 19:57:31 2018 +0530 Committer: ravipesala Committed: Thu Aug 9 23:43:49 2018 +0530 -- docs/configuration-parameters.md | 2 +- docs/data-management-on-carbondata.md | 39 ++ docs/datamap/bloomfilter-datamap-guide.md | 12 docs/datamap/lucene-datamap-guide.md | 2 +- docs/datamap/timeseries-datamap-guide.md | 2 +- docs/sdk-guide.md | 8 +++--- 6 files changed, 34 insertions(+), 31 deletions(-) -- http://git-wip-us.apache.org/repos/asf/carbondata/blob/41bd3593/docs/configuration-parameters.md -- diff --git a/docs/configuration-parameters.md b/docs/configuration-parameters.md index 6e4dea5..77cf230 100644 --- a/docs/configuration-parameters.md +++ b/docs/configuration-parameters.md @@ -140,7 +140,7 @@ This section provides the details of all the configurations required for CarbonD | carbon.enableMinMax | true | Min max is feature added to enhance query performance. To disable this feature, set it false. | | carbon.dynamicallocation.schedulertimeout | 5 | Specifies the maximum time (unit in seconds) the scheduler can wait for executor to be active. Minimum value is 5 sec and maximum value is 15 sec. | | carbon.scheduler.minregisteredresourcesratio | 0.8 | Specifies the minimum resource (executor) ratio needed for starting the block distribution. The default value is 0.8, which indicates 80% of the requested resource is allocated for starting block distribution. The minimum value is 0.1 min and the maximum value is 1.0. 
| -| carbon.search.enabled | false | If set to true, it will use CarbonReader to do distributed scan directly instead of using compute framework like spark, thus avoiding limitation of compute framework like SQL optimizer and task scheduling overhead. | +| carbon.search.enabled (Alpha Feature) | false | If set to true, it will use CarbonReader to do distributed scan directly instead of using compute framework like spark, thus avoiding limitation of compute framework like SQL optimizer and task scheduling overhead. | * **Global Dictionary Configurations** http://git-wip-us.apache.org/repos/asf/carbondata/blob/41bd3593/docs/data-management-on-carbondata.md -- diff --git a/docs/data-management-on-carbondata.md b/docs/data-management-on-carbondata.md index 836fff9..41fd513 100644 --- a/docs/data-management-on-carbondata.md +++ b/docs/data-management-on-carbondata.md @@ -87,6 +87,25 @@ This tutorial is going to introduce all commands and data operations on CarbonDa * BATCH_SORT: It increases the load performance but decreases the query performance if identified blocks > parallelism. * GLOBAL_SORT: It increases the query performance, especially high concurrent point query. And if you care about loading resources isolation strictly, because the system uses the spark GroupBy to sort data, the resource can be controlled by spark. + + ### Example: + + ``` +CREATE TABLE IF NOT EXISTS productSchema.productSalesTable ( + productNumber INT, + productName STRING, + storeCity STRING, + storeProvince STRING, + productCategory STRING, + productBatch STRING, + saleQuantity INT, + revenue INT) +STORED BY 'carbondata' +TBLPROPERTIES ('SORT_COLUMNS'='productName,storeCity', + 'SORT_SCOPE'='NO_SORT') + ``` + + **NOTE:** CarbonData also supports "using carbondata". Find example code at [SparkSessionExample](https://github.com/apache/carbondata/blob/master/examples/spark2/src/main/scala/org/apache/carbondata/examples/SparkSessionExample.scala) in the CarbonData repo. 
- **Table Block Size Configuration** @@ -170,23 +189,6 @@ This tutorial is going to introduce all commands and data operations on CarbonDa TBLPROPERTIES('LOCAL_DICTIONARY_ENABLE'='true','LOCAL_DICTIONARY_THRESHOLD'='1000',
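For reference, the parameters touched by this patch are set in carbon.properties; a minimal fragment using the defaults documented above, with the alpha-feature flag left at its default:

```
# carbon.properties -- values shown are the documented defaults
carbon.dynamicallocation.schedulertimeout=5
carbon.scheduler.minregisteredresourcesratio=0.8
# Alpha feature: distributed scan via CarbonReader, bypassing the compute
# framework's SQL optimizer and task scheduling
carbon.search.enabled=false
```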
[03/50] [abbrv] carbondata git commit: [Documentation] Editorial Review comment fixed
[Documentation] Editorial Review comment fixed Editorial Review comment fixed This closes #2320 Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/5ad70095 Tree: http://git-wip-us.apache.org/repos/asf/carbondata/tree/5ad70095 Diff: http://git-wip-us.apache.org/repos/asf/carbondata/diff/5ad70095 Branch: refs/heads/carbonstore Commit: 5ad7009573b7a95a181221d6a58df05e1fafbeb6 Parents: 6aadfe7 Author: sgururajshetty Authored: Thu May 31 17:36:26 2018 +0530 Committer: kunal642 Committed: Thu May 31 17:40:30 2018 +0530 -- docs/data-management-on-carbondata.md| 4 ++-- docs/datamap/timeseries-datamap-guide.md | 8 2 files changed, 6 insertions(+), 6 deletions(-) -- http://git-wip-us.apache.org/repos/asf/carbondata/blob/5ad70095/docs/data-management-on-carbondata.md -- diff --git a/docs/data-management-on-carbondata.md b/docs/data-management-on-carbondata.md index 51e98ab..706209c 100644 --- a/docs/data-management-on-carbondata.md +++ b/docs/data-management-on-carbondata.md @@ -35,11 +35,11 @@ This tutorial is going to introduce all commands and data operations on CarbonDa ``` CREATE TABLE [IF NOT EXISTS] [db_name.]table_name[(col_name data_type , ...)] - STORED BY 'carbondata' + STORED AS carbondata [TBLPROPERTIES (property_name=property_value, ...)] [LOCATION 'path'] ``` - **NOTE:** CarbonData also supports "STORED AS carbondata". Find example code at [CarbonSessionExample](https://github.com/apache/carbondata/blob/master/examples/spark2/src/main/scala/org/apache/carbondata/examples/CarbonSessionExample.scala) in the CarbonData repo. + **NOTE:** CarbonData also supports "STORED AS carbondata" and "USING carbondata". Find example code at [CarbonSessionExample](https://github.com/apache/carbondata/blob/master/examples/spark2/src/main/scala/org/apache/carbondata/examples/CarbonSessionExample.scala) in the CarbonData repo. 
### Usage Guidelines Following are the guidelines for TBLPROPERTIES, CarbonData's additional table options can be set via carbon.properties. http://git-wip-us.apache.org/repos/asf/carbondata/blob/5ad70095/docs/datamap/timeseries-datamap-guide.md -- diff --git a/docs/datamap/timeseries-datamap-guide.md b/docs/datamap/timeseries-datamap-guide.md index 7847312..bea5286 100644 --- a/docs/datamap/timeseries-datamap-guide.md +++ b/docs/datamap/timeseries-datamap-guide.md @@ -1,12 +1,12 @@ # CarbonData Timeseries DataMap -* [Timeseries DataMap](#timeseries-datamap-intoduction-(alpha-feature-in-1.3.0)) +* [Timeseries DataMap Introduction](#timeseries-datamap-intoduction) * [Compaction](#compacting-pre-aggregate-tables) * [Data Management](#data-management-with-pre-aggregate-tables) -## Timeseries DataMap Intoduction (Alpha feature in 1.3.0) -Timeseries DataMap a pre-aggregate table implementation based on 'preaggregate' DataMap. -Difference is that Timerseries DataMap has built-in understanding of time hierarchy and +## Timeseries DataMap Introduction (Alpha feature in 1.3.0) +Timeseries DataMap a pre-aggregate table implementation based on 'pre-aggregate' DataMap. +Difference is that Timeseries DataMap has built-in understanding of time hierarchy and levels: year, month, day, hour, minute, so that it supports automatic roll-up in time dimension for query.
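The timeseries DataMap described in this hunk is defined with CREATE DATAMAP ... USING 'timeseries'. A sketch for reference; the table, columns, and aggregates are illustrative, not from the patch:

```sql
CREATE DATAMAP agg_sales_hour
ON TABLE sales
USING 'timeseries'
DMPROPERTIES (
  'event_time'='order_time',   -- column supplying the time dimension
  'hour_granularity'='1')      -- pre-aggregate at one-hour rollup
AS
SELECT order_time, country, sum(quantity), avg(price)
FROM sales
GROUP BY order_time, country
```

Queries that group on the event-time column can then be served from the pre-aggregate table, with automatic roll-up to coarser levels (day, month, year) as described above.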
### Usage Guidelines

Following are the guidelines for TBLPROPERTIES; CarbonData's additional table options can be set via carbon.properties.

http://git-wip-us.apache.org/repos/asf/carbondata/blob/5ad70095/docs/datamap/timeseries-datamap-guide.md
--
diff --git a/docs/datamap/timeseries-datamap-guide.md b/docs/datamap/timeseries-datamap-guide.md
index 7847312..bea5286 100644
--- a/docs/datamap/timeseries-datamap-guide.md
+++ b/docs/datamap/timeseries-datamap-guide.md
@@ -1,12 +1,12 @@
 # CarbonData Timeseries DataMap
-* [Timeseries DataMap](#timeseries-datamap-intoduction-(alpha-feature-in-1.3.0))
+* [Timeseries DataMap Introduction](#timeseries-datamap-intoduction)
 * [Compaction](#compacting-pre-aggregate-tables)
 * [Data Management](#data-management-with-pre-aggregate-tables)
-## Timeseries DataMap Intoduction (Alpha feature in 1.3.0)
-Timeseries DataMap a pre-aggregate table implementation based on 'preaggregate' DataMap.
-Difference is that Timerseries DataMap has built-in understanding of time hierarchy and
+## Timeseries DataMap Introduction (Alpha feature in 1.3.0)
+Timeseries DataMap is a pre-aggregate table implementation based on 'pre-aggregate' DataMap.
+The difference is that Timeseries DataMap has built-in understanding of time hierarchy and
 levels: year, month, day, hour, minute, so that it supports automatic roll-up in time
 dimension for query.
carbondata git commit: [Documentation] Editorial Review
Repository: carbondata Updated Branches: refs/heads/branch-1.3 6eb647144 -> e14c20e65 [Documentation] Editorial Review Spelling correction fixed STATMENT to STATEMENT granualrity to granularity This closes #2079 Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/e14c20e6 Tree: http://git-wip-us.apache.org/repos/asf/carbondata/tree/e14c20e6 Diff: http://git-wip-us.apache.org/repos/asf/carbondata/diff/e14c20e6 Branch: refs/heads/branch-1.3 Commit: e14c20e65426d58507484ae68f860c3e541d8ca9 Parents: 6eb6471 Author: sgururajshettyAuthored: Tue Mar 20 11:01:11 2018 +0530 Committer: manishgupta88 Committed: Tue Mar 27 16:46:34 2018 +0530 -- docs/data-management-on-carbondata.md| 2 +- docs/datamap/timeseries-datamap-guide.md | 16 2 files changed, 9 insertions(+), 9 deletions(-) -- http://git-wip-us.apache.org/repos/asf/carbondata/blob/e14c20e6/docs/data-management-on-carbondata.md -- diff --git a/docs/data-management-on-carbondata.md b/docs/data-management-on-carbondata.md index 2aa4a49..04f2123 100644 --- a/docs/data-management-on-carbondata.md +++ b/docs/data-management-on-carbondata.md @@ -664,7 +664,7 @@ This tutorial is going to introduce all commands and data operations on CarbonDa LOAD DATA [LOCAL] INPATH 'folder_path' INTO TABLE [db_name.]table_name PARTITION (partition_spec) OPTIONS(property_name=property_value, ...) 
- INSERT INTO INTO TABLE [db_name.]table_name PARTITION (partition_spec) + INSERT INTO INTO TABLE [db_name.]table_name PARTITION (partition_spec) ``` Example: http://git-wip-us.apache.org/repos/asf/carbondata/blob/e14c20e6/docs/datamap/timeseries-datamap-guide.md -- diff --git a/docs/datamap/timeseries-datamap-guide.md b/docs/datamap/timeseries-datamap-guide.md index 886c161..7847312 100644 --- a/docs/datamap/timeseries-datamap-guide.md +++ b/docs/datamap/timeseries-datamap-guide.md @@ -27,7 +27,7 @@ ON TABLE sales USING "timeseries" DMPROPERTIES ( 'event_time'='order_time', - 'year_granualrity'='1', + 'year_granularity'='1', ) AS SELECT order_time, country, sex, sum(quantity), max(quantity), count(user_id), sum(price), avg(price) FROM sales GROUP BY order_time, country, sex @@ -37,7 +37,7 @@ ON TABLE sales USING "timeseries" DMPROPERTIES ( 'event_time'='order_time', - 'month_granualrity'='1', + 'month_granularity'='1', ) AS SELECT order_time, country, sex, sum(quantity), max(quantity), count(user_id), sum(price), avg(price) FROM sales GROUP BY order_time, country, sex @@ -47,7 +47,7 @@ ON TABLE sales USING "timeseries" DMPROPERTIES ( 'event_time'='order_time', - 'day_granualrity'='1', + 'day_granularity'='1', ) AS SELECT order_time, country, sex, sum(quantity), max(quantity), count(user_id), sum(price), avg(price) FROM sales GROUP BY order_time, country, sex @@ -57,7 +57,7 @@ ON TABLE sales USING "timeseries" DMPROPERTIES ( 'event_time'='order_time', - 'hour_granualrity'='1', + 'hour_granularity'='1', ) AS SELECT order_time, country, sex, sum(quantity), max(quantity), count(user_id), sum(price), avg(price) FROM sales GROUP BY order_time, country, sex @@ -67,7 +67,7 @@ ON TABLE sales USING "timeseries" DMPROPERTIES ( 'event_time'='order_time', - 'minute_granualrity'='1', + 'minute_granularity'='1', ) AS SELECT order_time, country, sex, sum(quantity), max(quantity), count(user_id), sum(price), avg(price) FROM sales GROUP BY order_time, country, sex @@ -77,7 +77,7 @@ 
ON TABLE sales USING "timeseries" DMPROPERTIES ( 'event_time'='order_time', - 'minute_granualrity'='1', + 'minute_granularity'='1', ) AS SELECT order_time, country, sex, sum(quantity), max(quantity), count(user_id), sum(price), avg(price) FROM sales GROUP BY order_time, country, sex @@ -105,7 +105,7 @@ level and hour level pre-aggregate USING "timeseries" DMPROPERTIES ( 'event_time'='order_time', -'day_granualrity'='1', +'day_granularity'='1', ) AS SELECT order_time, country, sex, sum(quantity), max(quantity), count(user_id), sum(price), avg(price) FROM sales GROUP BY order_time, country, sex @@ -115,7 +115,7 @@ level and hour level pre-aggregate USING "timeseries" DMPROPERTIES ( 'event_time'='order_time', -'hour_granualrity'='1', +'hour_granularity'='1', ) AS SELECT order_time, country, sex, sum(quantity), max(quantity), count(user_id), sum(price), avg(price) FROM sales GROUP BY order_time, country, sex
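Putting the corrected property key from the hunks above to use, here is a hedged sketch of one of the guide's timeseries DataMap definitions. The DataMap name `agg_sales_hour` is an assumption — the quoted hunks elide it — and the trailing comma before `)` in the quoted DMPROPERTIES would be a syntax error, so it is dropped here:

```sql
CREATE DATAMAP agg_sales_hour
ON TABLE sales
USING "timeseries"
DMPROPERTIES (
  'event_time'='order_time',
  'hour_granularity'='1'
)
AS SELECT order_time, country, sex,
          sum(quantity), max(quantity), count(user_id),
          sum(price), avg(price)
   FROM sales
   GROUP BY order_time, country, sex
```

As the commit text notes, one such DataMap can be created per supported level (year, month, day, hour, minute), each with its own `*_granularity` key.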
carbondata git commit: [Documentation] Editorial Review
Repository: carbondata Updated Branches: refs/heads/master 67581cfe6 -> dfc5e8c53 [Documentation] Editorial Review Spelling correction fixed STATMENT to STATEMENT granualrity to granularity This closes #2079 Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/dfc5e8c5 Tree: http://git-wip-us.apache.org/repos/asf/carbondata/tree/dfc5e8c5 Diff: http://git-wip-us.apache.org/repos/asf/carbondata/diff/dfc5e8c5 Branch: refs/heads/master Commit: dfc5e8c53d87b187f358fc27a124f72107231c57 Parents: 67581cf Author: sgururajshettyAuthored: Tue Mar 20 11:01:11 2018 +0530 Committer: manishgupta88 Committed: Wed Mar 21 15:31:48 2018 +0530 -- docs/data-management-on-carbondata.md| 2 +- docs/datamap/timeseries-datamap-guide.md | 16 2 files changed, 9 insertions(+), 9 deletions(-) -- http://git-wip-us.apache.org/repos/asf/carbondata/blob/dfc5e8c5/docs/data-management-on-carbondata.md -- diff --git a/docs/data-management-on-carbondata.md b/docs/data-management-on-carbondata.md index bd4afdc..22db960 100644 --- a/docs/data-management-on-carbondata.md +++ b/docs/data-management-on-carbondata.md @@ -699,7 +699,7 @@ This tutorial is going to introduce all commands and data operations on CarbonDa LOAD DATA [LOCAL] INPATH 'folder_path' INTO TABLE [db_name.]table_name PARTITION (partition_spec) OPTIONS(property_name=property_value, ...) 
- INSERT INTO INTO TABLE [db_name.]table_name PARTITION (partition_spec) + INSERT INTO INTO TABLE [db_name.]table_name PARTITION (partition_spec) ``` Example: http://git-wip-us.apache.org/repos/asf/carbondata/blob/dfc5e8c5/docs/datamap/timeseries-datamap-guide.md -- diff --git a/docs/datamap/timeseries-datamap-guide.md b/docs/datamap/timeseries-datamap-guide.md index 886c161..7847312 100644 --- a/docs/datamap/timeseries-datamap-guide.md +++ b/docs/datamap/timeseries-datamap-guide.md @@ -27,7 +27,7 @@ ON TABLE sales USING "timeseries" DMPROPERTIES ( 'event_time'='order_time', - 'year_granualrity'='1', + 'year_granularity'='1', ) AS SELECT order_time, country, sex, sum(quantity), max(quantity), count(user_id), sum(price), avg(price) FROM sales GROUP BY order_time, country, sex @@ -37,7 +37,7 @@ ON TABLE sales USING "timeseries" DMPROPERTIES ( 'event_time'='order_time', - 'month_granualrity'='1', + 'month_granularity'='1', ) AS SELECT order_time, country, sex, sum(quantity), max(quantity), count(user_id), sum(price), avg(price) FROM sales GROUP BY order_time, country, sex @@ -47,7 +47,7 @@ ON TABLE sales USING "timeseries" DMPROPERTIES ( 'event_time'='order_time', - 'day_granualrity'='1', + 'day_granularity'='1', ) AS SELECT order_time, country, sex, sum(quantity), max(quantity), count(user_id), sum(price), avg(price) FROM sales GROUP BY order_time, country, sex @@ -57,7 +57,7 @@ ON TABLE sales USING "timeseries" DMPROPERTIES ( 'event_time'='order_time', - 'hour_granualrity'='1', + 'hour_granularity'='1', ) AS SELECT order_time, country, sex, sum(quantity), max(quantity), count(user_id), sum(price), avg(price) FROM sales GROUP BY order_time, country, sex @@ -67,7 +67,7 @@ ON TABLE sales USING "timeseries" DMPROPERTIES ( 'event_time'='order_time', - 'minute_granualrity'='1', + 'minute_granularity'='1', ) AS SELECT order_time, country, sex, sum(quantity), max(quantity), count(user_id), sum(price), avg(price) FROM sales GROUP BY order_time, country, sex @@ -77,7 +77,7 @@ 
ON TABLE sales USING "timeseries" DMPROPERTIES ( 'event_time'='order_time', - 'minute_granualrity'='1', + 'minute_granularity'='1', ) AS SELECT order_time, country, sex, sum(quantity), max(quantity), count(user_id), sum(price), avg(price) FROM sales GROUP BY order_time, country, sex @@ -105,7 +105,7 @@ level and hour level pre-aggregate USING "timeseries" DMPROPERTIES ( 'event_time'='order_time', -'day_granualrity'='1', +'day_granularity'='1', ) AS SELECT order_time, country, sex, sum(quantity), max(quantity), count(user_id), sum(price), avg(price) FROM sales GROUP BY order_time, country, sex @@ -115,7 +115,7 @@ level and hour level pre-aggregate USING "timeseries" DMPROPERTIES ( 'event_time'='order_time', -'hour_granualrity'='1', +'hour_granularity'='1', ) AS SELECT order_time, country, sex, sum(quantity), max(quantity), count(user_id), sum(price), avg(price) FROM sales GROUP BY order_time, country, sex
[06/25] carbondata git commit: [Documentation] Editorial review
[Documentation] Editorial review correct some docus description This closes #1992 Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/e5d9802a Tree: http://git-wip-us.apache.org/repos/asf/carbondata/tree/e5d9802a Diff: http://git-wip-us.apache.org/repos/asf/carbondata/diff/e5d9802a Branch: refs/heads/branch-1.3 Commit: e5d9802abe244e24a64fc883690632732d94f306 Parents: 6c25d24 Author: sgururajshettyAuthored: Fri Feb 23 17:05:17 2018 +0530 Committer: ravipesala Committed: Sat Mar 3 17:46:26 2018 +0530 -- docs/data-management-on-carbondata.md | 36 +++--- docs/faq.md | 4 ++-- docs/troubleshooting.md | 4 ++-- docs/useful-tips-on-carbondata.md | 2 +- 4 files changed, 23 insertions(+), 23 deletions(-) -- http://git-wip-us.apache.org/repos/asf/carbondata/blob/e5d9802a/docs/data-management-on-carbondata.md -- diff --git a/docs/data-management-on-carbondata.md b/docs/data-management-on-carbondata.md index f70e0b7..78ab010 100644 --- a/docs/data-management-on-carbondata.md +++ b/docs/data-management-on-carbondata.md @@ -178,7 +178,7 @@ This tutorial is going to introduce all commands and data operations on CarbonDa SHOW TABLES IN defaultdb ``` -### ALTER TALBE +### ALTER TABLE The following section introduce the commands to modify the physical or logical state of the existing table(s). @@ -494,7 +494,7 @@ This tutorial is going to introduce all commands and data operations on CarbonDa [ WHERE { } ] ``` - alternatively the following the command can also be used for updating the CarbonData Table : + alternatively the following command can also be used for updating the CarbonData Table : ``` UPDATE @@ -674,7 +674,7 @@ This tutorial is going to introduce all commands and data operations on CarbonDa Insert OVERWRITE - This command allows you to insert or load overwrite on a spcific partition. + This command allows you to insert or load overwrite on a specific partition. 
```
 INSERT OVERWRITE TABLE table_name
```

@@ -898,50 +898,50 @@ will fetch the data from the main table **sales**
 For existing table with loaded data, data load to pre-aggregate table will be triggered by the
 CREATE DATAMAP statement when user creates the pre-aggregate table.
 For incremental loads after aggregates tables are created, loading data to main table triggers
-the load to pre-aggregate tables once main table loading is complete.These loads are automic
+the load to pre-aggregate tables once main table loading is complete. These loads are atomic,
 meaning that data on main table and aggregate tables are only visible to the user after all
 tables are loaded

 # Querying data from pre-aggregate tables
-Pre-aggregate tables cannot be queries directly.Queries are to be made on main table.Internally
-carbondata will check associated pre-aggregate tables with the main table and if the
+Pre-aggregate tables cannot be queried directly. Queries are to be made on the main table. Internally
+carbondata will check associated pre-aggregate tables with the main table, and if the
 pre-aggregate tables satisfy the query condition, the plan is transformed automatically to use
-pre-aggregate table to fetch the data
+pre-aggregate table to fetch the data.

 # Compacting pre-aggregate tables
 Compaction command (ALTER TABLE COMPACT) need to be run separately on each pre-aggregate table.
 Running Compaction command on main table will **not automatically** compact the pre-aggregate
 tables.Compaction is an optional operation for pre-aggregate table. If compaction is performed
 on main table but not performed on pre-aggregate table, all queries still can benefit from
-pre-aggregate tables.To further improve performance on pre-aggregate tables, compaction can be
+pre-aggregate tables. To further improve performance on pre-aggregate tables, compaction can be
 triggered on pre-aggregate tables directly, it will merge the segments inside pre-aggregate table.
# Update/Delete Operations on pre-aggregate tables
This functionality is not supported.

NOTE (RESTRICTION):
- * Update/Delete operations are not supported on main table which has pre-aggregate tables
- created on it.All the pre-aggregate tables will have to be dropped before update/delete
- operations can be performed on the main table.Pre-aggregate tables can be rebuilt manually
+ Update/Delete operations are not supported on main table which has pre-aggregate tables
+ created on it. All the pre-aggregate tables will have to be dropped before update/delete
+ operations can be performed on the main table. Pre-aggregate tables can be rebuilt manually
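The querying and compaction behavior described in this commit can be sketched in SQL. A hedged example — the table name `sales` follows the commit's own examples, but the child-table name used for compaction is an assumption about CarbonData's naming, not something the diff states:

```sql
-- Queries are issued against the main table only; CarbonData rewrites
-- the plan to a matching pre-aggregate table when the query qualifies.
SELECT country, sum(quantity)
FROM sales
GROUP BY country;

-- Compaction does not cascade from the main table, so each
-- pre-aggregate table must be compacted separately.
ALTER TABLE sales COMPACT 'minor';
ALTER TABLE sales_agg_sales_hour COMPACT 'minor';
```

If compaction is skipped on the pre-aggregate tables, queries still benefit from them; compacting them directly only merges their segments for additional performance.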
carbondata git commit: [Documentation] Editorial review
Repository: carbondata Updated Branches: refs/heads/master eae9064a5 -> 310b06de1 [Documentation] Editorial review correct some docus description This closes #1992 Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/310b06de Tree: http://git-wip-us.apache.org/repos/asf/carbondata/tree/310b06de Diff: http://git-wip-us.apache.org/repos/asf/carbondata/diff/310b06de Branch: refs/heads/master Commit: 310b06de1d2df88a78cbf005ca09f08553c98539 Parents: eae9064 Author: sgururajshettyAuthored: Fri Feb 23 17:05:17 2018 +0530 Committer: chenliang613 Committed: Sat Feb 24 11:23:45 2018 +0800 -- docs/data-management-on-carbondata.md | 36 +++--- docs/faq.md | 4 ++-- docs/troubleshooting.md | 4 ++-- docs/useful-tips-on-carbondata.md | 2 +- 4 files changed, 23 insertions(+), 23 deletions(-) -- http://git-wip-us.apache.org/repos/asf/carbondata/blob/310b06de/docs/data-management-on-carbondata.md -- diff --git a/docs/data-management-on-carbondata.md b/docs/data-management-on-carbondata.md index f70e0b7..78ab010 100644 --- a/docs/data-management-on-carbondata.md +++ b/docs/data-management-on-carbondata.md @@ -178,7 +178,7 @@ This tutorial is going to introduce all commands and data operations on CarbonDa SHOW TABLES IN defaultdb ``` -### ALTER TALBE +### ALTER TABLE The following section introduce the commands to modify the physical or logical state of the existing table(s). @@ -494,7 +494,7 @@ This tutorial is going to introduce all commands and data operations on CarbonDa [ WHERE { } ] ``` - alternatively the following the command can also be used for updating the CarbonData Table : + alternatively the following command can also be used for updating the CarbonData Table : ``` UPDATE @@ -674,7 +674,7 @@ This tutorial is going to introduce all commands and data operations on CarbonDa Insert OVERWRITE - This command allows you to insert or load overwrite on a spcific partition. 
+ This command allows you to insert or load overwrite on a specific partition.

```
 INSERT OVERWRITE TABLE table_name
```

@@ -898,50 +898,50 @@ will fetch the data from the main table **sales**
 For existing table with loaded data, data load to pre-aggregate table will be triggered by the
 CREATE DATAMAP statement when user creates the pre-aggregate table.
 For incremental loads after aggregates tables are created, loading data to main table triggers
-the load to pre-aggregate tables once main table loading is complete.These loads are automic
+the load to pre-aggregate tables once main table loading is complete. These loads are atomic,
 meaning that data on main table and aggregate tables are only visible to the user after all
 tables are loaded

 # Querying data from pre-aggregate tables
-Pre-aggregate tables cannot be queries directly.Queries are to be made on main table.Internally
-carbondata will check associated pre-aggregate tables with the main table and if the
+Pre-aggregate tables cannot be queried directly. Queries are to be made on the main table. Internally
+carbondata will check associated pre-aggregate tables with the main table, and if the
 pre-aggregate tables satisfy the query condition, the plan is transformed automatically to use
-pre-aggregate table to fetch the data
+pre-aggregate table to fetch the data.

 # Compacting pre-aggregate tables
 Compaction command (ALTER TABLE COMPACT) need to be run separately on each pre-aggregate table.
 Running Compaction command on main table will **not automatically** compact the pre-aggregate
 tables.Compaction is an optional operation for pre-aggregate table. If compaction is performed
 on main table but not performed on pre-aggregate table, all queries still can benefit from
-pre-aggregate tables.To further improve performance on pre-aggregate tables, compaction can be
+pre-aggregate tables.
To further improve performance on pre-aggregate tables, compaction can be
 triggered on pre-aggregate tables directly, it will merge the segments inside pre-aggregate table.

 # Update/Delete Operations on pre-aggregate tables
 This functionality is not supported.

 NOTE (RESTRICTION):
- * Update/Delete operations are not supported on main table which has pre-aggregate tables
- created on it.All the pre-aggregate tables will have to be dropped before update/delete
- operations can be performed on the main table.Pre-aggregate tables can be rebuilt manually
+ Update/Delete operations are not supported on main table which has pre-aggregate tables
+ created on it. All the