[CARBONDATA-1702] [Segment-Reading] added doc for query from specified segments feature
add doc for query from specified segments

This closes #1727

Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo
Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/705b1111
Tree: http://git-wip-us.apache.org/repos/asf/carbondata/tree/705b1111
Diff: http://git-wip-us.apache.org/repos/asf/carbondata/diff/705b1111

Branch: refs/heads/branch-1.3
Commit: 705b1111de1e84f1bebb86b6550fcfed477e6d63
Parents: c70e73f
Author: praveenmeenakshi56 <[email protected]>
Authored: Wed Dec 27 19:30:52 2017 +0530
Committer: chenliang613 <[email protected]>
Committed: Mon Jan 8 15:11:05 2018 +0800

----------------------------------------------------------------------
 docs/data-management-on-carbondata.md | 79 ++++++++++++++++++++++++++++++
 1 file changed, 79 insertions(+)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/carbondata/blob/705b1111/docs/data-management-on-carbondata.md
----------------------------------------------------------------------
diff --git a/docs/data-management-on-carbondata.md b/docs/data-management-on-carbondata.md
index 20c70e3..39acfbf 100644
--- a/docs/data-management-on-carbondata.md
+++ b/docs/data-management-on-carbondata.md
@@ -781,3 +781,82 @@ This tutorial is going to introduce all commands and data operations on CarbonDa
   ```
   DELETE FROM TABLE CarbonDatabase.CarbonTable WHERE SEGMENT.STARTTIME BEFORE '2017-06-01 12:05:06'
   ```
+### SEGMENT READING
+
+  This command is used to read data from specified segments during CarbonScan.
+
+
+  Get the segment IDs:
+
+  ```
+  SHOW SEGMENTS FOR TABLE [db_name.]table_name LIMIT number_of_segments
+  ```
+
+  Set the segment IDs:
+
+  ```
+  SET carbon.input.segments.<database_name>.<table_name> = <list of segment IDs>;
+  ```
+
+  **Property:**
+
+  carbon.input.segments: Specifies the segment IDs to be queried. This property lets you query only the specified segments of the specified table; CarbonScan reads data from those segments only.
+
+  ```
+  SET carbon.input.segments.<database_name>.<table_name> = <list of segment IDs>;
+  ```
+
+  To read specified segments in multithreaded mode, use CarbonSession.threadSet instead of the SET query.
+
+  ```
+  CarbonSession.threadSet("carbon.input.segments.<database_name>.<table_name>", "<list of segment IDs>");
+  ```
+
+  Reset the segment IDs:
+
+  ```
+  SET carbon.input.segments.<database_name>.<table_name> = *;
+  ```
+
+  To reset the segment IDs in multithreaded mode, use CarbonSession.threadSet instead of the SET query.
+
+  ```
+  CarbonSession.threadSet("carbon.input.segments.<database_name>.<table_name>", "*");
+  ```
+
+  Reset all properties:
+
+  RESET resets all the properties set for CarbonData. Do not use it if you only want to reset carbon.input.segments and keep the other properties.
+
+  ```
+  RESET
+  ```
+
+  **NOTE**: Setting this property in the carbon.properties file is not recommended, because every session will then use this segment list unless it is overridden at session or thread level.
+
+  **Examples:**
+
+  * Example to show the list of segment IDs, segment status, and other required details, and then specify the list of segments to be read:
+
+  ```
+  SHOW SEGMENTS FOR TABLE carbontable1;
+
+  SET carbon.input.segments.db.carbontable1 = 1,3,9;
+  ```
+
+  * Example to query with segment reading in multithreaded mode:
+
+  ```
+  CarbonSession.threadSet("carbon.input.segments.db.carbontable_Multi_Thread", "1,3");
+  ```
+
+  * Example for threadSet in a multithreaded environment (Scala; assumes `spark` is an existing CarbonSession and that `scala.concurrent.Future` and an implicit `ExecutionContext` are in scope):
+
+  ```
+  def main(args: Array[String]): Unit = {
+    Future {
+      CarbonSession.threadSet("carbon.input.segments.db.carbontable_Multi_Thread", "1")
+      spark.sql("select count(empno) from db.carbontable_Multi_Thread").show()
+    }
+  }
+  ```
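
As a reading aid for the patch above: the set and reset statements in the new section all follow one pattern, a per-table key `carbon.input.segments.<database_name>.<table_name>` whose value is either a comma-separated list of segment IDs or `*`. The Scala sketch below only builds those strings. `SegmentReadingSketch` and its methods are hypothetical helpers introduced for illustration; they are not part of the CarbonData API.

```scala
// Sketch only: these helpers are hypothetical and merely assemble strings.
object SegmentReadingSketch {

  // Property key scoped to one table:
  // carbon.input.segments.<database_name>.<table_name>
  def propertyKey(database: String, table: String): String =
    s"carbon.input.segments.$database.$table"

  // Comma-separated list of segment IDs, used as the property value.
  def segmentList(segmentIds: Seq[String]): String = {
    require(segmentIds.nonEmpty, "at least one segment ID is required")
    segmentIds.mkString(",")
  }

  // SET statement that restricts reads on the table to the given segments.
  def setStatement(database: String, table: String, segmentIds: Seq[String]): String =
    s"SET ${propertyKey(database, table)} = ${segmentList(segmentIds)}"

  // SET statement that makes every segment of the table readable again.
  def resetStatement(database: String, table: String): String =
    s"SET ${propertyKey(database, table)} = *"

  def main(args: Array[String]): Unit = {
    println(setStatement("db", "carbontable1", Seq("1", "3", "9")))
    println(resetStatement("db", "carbontable1"))
  }
}
```

The resulting statements could be passed to `spark.sql(...)`, and the same key and value could be handed to `CarbonSession.threadSet` for the thread-level variant that the documentation describes.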
