chenliang613 commented on a change in pull request #4239: URL: https://github.com/apache/carbondata/pull/4239#discussion_r762373056
##########
File path: docs/addsegment-guide.md
##########
@@ -27,16 +27,144 @@ Heterogeneous format segments aims to solve this problem by avoiding data conver
 ### Add segment with path and format
 Users can add the existing data as a segment to the carbon table provided the schema of the data and the carbon table should be the same.
+
+ Syntax
+
+ ```
+ ALTER TABLE [db_name.]table_name ADD SEGMENT OPTIONS(property_name=property_value, ...)
+ ```
+
+**Supported properties:**
+
+| Property | Description |
+| ------------------------------------------------------------ | ------------------------------------------------------------ |
+| [PATH](#path) | User external old table path |
+| [FORMAT](#format) | User external old table file format |
+| [PARTITION](#partition) | Extract partition info for partition table , should be form of "a:int, b:string" |
+
+
+-
+ You can use the following options to add segment:
+
+ - ##### PATH:
+ User old table path.
+
+ ```
+ OPTIONS('PATH'='hdfs://usr/oldtable')
+ ```
+
+ - ##### FORMAT:
+ User old table file format. eg : parquet, orc
+
+ ```
+ OPTIONS('FORMAT'='parquet')
+ ```
+ - ##### PARTITION:
+ Extract partition info for partition table , should be form of "a:int, b:string"
+
+ ```
+ OPTIONS('PARTITION'='a:int, b:string')
+ ```
+
-```
-alter table table_name add segment options ('path'= 'hdfs://usr/oldtable','format'='parquet')
-```
 In the above command user can add the existing data to the carbon table as a new segment and also can provide the data format. During add segment, it will infer the schema from data and validates the schema against the carbon table. If the schema doesn’t match it throws an exception.
+**Example:**
+
+Exist old hive partition table , stored as orc or parquet file format:
+
+
+```sql
+CREATE TABLE default.log_parquet_par (
+ id BIGINT,
+ event_time BIGINT,
+ ip STRING
+)PARTITIONED BY (
+ day INT,
+ hour INT,
+ type INT
+)
+STORED AS parquet
+LOCATION 'hdfs://bieremayi/user/hive/warehouse/log_parquet_par';
+```
+
+Parquet File Location :
+
+```
+25.1 K 75.2 K /user/hive/warehouse/log_parquet_par/day=20211123/hour=12/type=0

Review comment:
Please remove the size columns from this listing: 25.1 K 75.2 K