carbondata git commit: [CARBONDATA-2369] Add a document for Non Transactional table with SDK writer guide

gvramana Wed, 25 Apr 2018 05:06:58 -0700

Repository: carbondata
Updated Branches:
  refs/heads/master f2bb9f4eb -> 1b8271726



[CARBONDATA-2369] Add a document for Non Transactional table with SDK writer 
guide

This closes #2198


Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo
Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/1b827172
Tree: http://git-wip-us.apache.org/repos/asf/carbondata/tree/1b827172
Diff: http://git-wip-us.apache.org/repos/asf/carbondata/diff/1b827172

Branch: refs/heads/master
Commit: 1b827172678ad1f6a011655d3b8ec3c568256f79
Parents: f2bb9f4
Author: ajantha-bhat <[email protected]>
Authored: Fri Apr 20 16:36:37 2018 +0530
Committer: Venkata Ramana G <[email protected]>
Committed: Wed Apr 25 17:33:17 2018 +0530

----------------------------------------------------------------------
 docs/data-management-on-carbondata.md |  80 +++++--
 docs/faq.md                           |  13 ++
 docs/sdk-writer-guide.md              | 347 +++++++++++++++++++++++++++++
 3 files changed, 422 insertions(+), 18 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/carbondata/blob/1b827172/docs/data-management-on-carbondata.md
----------------------------------------------------------------------
diff --git a/docs/data-management-on-carbondata.md 
b/docs/data-management-on-carbondata.md
index 2eb91bb..8999f32 100644
--- a/docs/data-management-on-carbondata.md
+++ b/docs/data-management-on-carbondata.md
@@ -93,7 +93,7 @@ This tutorial is going to introduce all commands and data 
operations on CarbonDa
      ```
      TBLPROPERTIES ('TABLE_BLOCKSIZE'='512')
      ```
-     NOTE: 512 or 512M both are accepted.
+     **NOTE:** 512 or 512M both are accepted.
 
    - **Table Compaction Configuration**
    
@@ -141,7 +141,7 @@ This tutorial is going to introduce all commands and data 
operations on CarbonDa
 
 ## CREATE TABLE AS SELECT
   This function allows user to create a Carbon table from any of the 
Parquet/Hive/Carbon table. This is beneficial when the user wants to create 
Carbon table from any other Parquet/Hive table and use the Carbon query engine 
to query and achieve better query results for cases where Carbon is faster than 
other file formats. Also this feature can be used for backing up the data.
-### Syntax
+
   ```
   CREATE TABLE [IF NOT EXISTS] [db_name.]table_name 
   STORED BY 'carbondata' 
@@ -174,6 +174,50 @@ This tutorial is going to introduce all commands and data 
operations on CarbonDa
 
   ```
 
+## CREATE EXTERNAL TABLE
+  This function allows the user to create an external table by specifying its location.
+  ```
+  CREATE EXTERNAL TABLE [IF NOT EXISTS] [db_name.]table_name 
+  STORED BY 'carbondata' LOCATION '$FilesPath'
+  ```
+  
+### Create external table on managed table data location.
+  The managed table data location will contain both the FACT and Metadata 
folders. 
+  This data can be generated by creating a normal carbon table; use its 
path as $FilesPath in the above syntax.
+  
+  **Example:**
+  ```
+  sql("CREATE TABLE origin(key INT, value STRING) STORED BY 'carbondata'")
+  sql("INSERT INTO origin select 100,'spark'")
+  sql("INSERT INTO origin select 200,'hive'")
+  // creates a table in $storeLocation/origin
+  
+  sql(s"""
+  |CREATE EXTERNAL TABLE source
+  |STORED BY 'carbondata'
+  |LOCATION '$storeLocation/origin'
+  """.stripMargin)
+  checkAnswer(sql("SELECT count(*) from source"), sql("SELECT count(*) from 
origin"))
+  ```
+  
+### Create external table on Non-Transactional table data location.
+  A Non-Transactional table data location contains only carbondata and 
carbonindex files; there is no metadata folder (table status and 
schema).
+  The SDK module currently supports writing data in this format.
+  
+  **Example:**
+  ```
+  sql(
+  s"""CREATE EXTERNAL TABLE sdkOutputTable STORED BY 'carbondata' LOCATION
+  |'$writerPath' """.stripMargin)
+  ```
+  
+  Here the writer path contains carbondata and index files.
+  These can be SDK output. Refer to the [SDK Writer 
Guide](https://github.com/apache/carbondata/blob/master/docs/sdk-writer-guide.md).
 
+  
+  **Note:**
+  Dropping of the external table should not delete the files present in the 
location.
+
+
 ## CREATE DATABASE 
   This function creates a new database. By default the database is created in 
Carbon store location, but you can also specify custom location.
   ```
@@ -268,7 +312,7 @@ This tutorial is going to introduce all commands and data 
operations on CarbonDa
      Valid Scenarios
      - Invalid scenario - Change of decimal precision from (10,2) to (10,5) is 
invalid as in this case only scale is increased but total number of digits 
remains the same.
      - Valid scenario - Change of decimal precision from (10,2) to (12,3) is 
valid as the total number of digits are increased by 2 but scale is increased 
only by 1 which will not lead to any data loss.
-     - NOTE: The allowed range is 38,38 (precision, scale) and is a valid 
upper case scenario which is not resulting in data loss.
+     - **NOTE:** The allowed range is 38,38 (precision, scale) and is a valid 
upper case scenario which is not resulting in data loss.
 
      Example1:Changing data type of column a1 from INT to BIGINT.
      ```
@@ -303,7 +347,7 @@ This tutorial is going to introduce all commands and data 
operations on CarbonDa
   ```
   REFRESH TABLE dbcarbon.productSalesTable
   ```
-  NOTE: 
+  **NOTE:** 
   * The new database name and the old database name should be same.
   * Before executing this command the old table schema and data should be 
copied into the new database location.
   * If the table is aggregate table, then all the aggregate tables should be 
copied to the new database location.
@@ -385,7 +429,7 @@ This tutorial is going to introduce all commands and data 
operations on CarbonDa
     OPTIONS('HEADER'='false') 
     ```
 
-       NOTE: If the HEADER option exist and is set to 'true', then the 
FILEHEADER option is not required.
+       **NOTE:** If the HEADER option exist and is set to 'true', then the 
FILEHEADER option is not required.
        
   - **FILEHEADER:** Headers can be provided in the LOAD DATA command if 
headers are missing in the source files.
 
@@ -433,21 +477,21 @@ This tutorial is going to introduce all commands and data 
operations on CarbonDa
     ```
     
OPTIONS('COLUMNDICT'='column1:dictionaryFilePath1,column2:dictionaryFilePath2')
     ```
-    NOTE: ALL_DICTIONARY_PATH and COLUMNDICT can't be used together.
+    **NOTE:** ALL_DICTIONARY_PATH and COLUMNDICT can't be used together.
     
   - **DATEFORMAT/TIMESTAMPFORMAT:** Date and Timestamp format for specified 
column.
 
     ```
     OPTIONS('DATEFORMAT' = 'yyyy-MM-dd','TIMESTAMPFORMAT'='yyyy-MM-dd 
HH:mm:ss')
     ```
-    NOTE: Date formats are specified by date pattern strings. The date pattern 
letters in CarbonData are same as in JAVA. Refer to 
[SimpleDateFormat](http://docs.oracle.com/javase/7/docs/api/java/text/SimpleDateFormat.html).
+    **NOTE:** Date formats are specified by date pattern strings. The date 
pattern letters in CarbonData are same as in JAVA. Refer to 
[SimpleDateFormat](http://docs.oracle.com/javase/7/docs/api/java/text/SimpleDateFormat.html).
 
   - **SORT COLUMN BOUNDS:** Range bounds for sort columns.
 
     ```
     OPTIONS('SORT_COLUMN_BOUNDS'='v11,v21,v31;v12,v22,v32;v13,v23,v33')
     ```
-    NOTE:
+    **NOTE:**
     * SORT_COLUMN_BOUNDS will be used only when the SORT_SCOPE is 'local_sort'.
     * Each bound is separated by ';' and each field value in bound is 
separated by ','.
     * Carbondata will use these bounds as ranges to process data concurrently.
@@ -461,7 +505,7 @@ This tutorial is going to introduce all commands and data 
operations on CarbonDa
     OPTIONS('SINGLE_PASS'='TRUE')
    ```
 
-   NOTE:
+   **NOTE:**
    * If this option is set to TRUE then data loading will take less time.
    * If this option is set to some invalid value other than TRUE or FALSE then 
it uses the default value.
 
@@ -489,7 +533,7 @@ This tutorial is going to introduce all commands and data 
operations on CarbonDa
     OPTIONS('BAD_RECORDS_LOGGER_ENABLE'='true', 
'BAD_RECORD_PATH'='hdfs://hacluster/tmp/carbon', 
'BAD_RECORDS_ACTION'='REDIRECT', 'IS_EMPTY_DATA_BAD_RECORD'='false')
     ```
 
-  NOTE:
+  **NOTE:**
   * BAD_RECORDS_ACTION property can have four type of actions for bad records 
FORCE, REDIRECT, IGNORE and FAIL.
   * FAIL option is its Default value. If the FAIL option is used, then data 
loading fails if any bad records are found.
   * If the REDIRECT option is used, CarbonData will add all bad records in to 
a separate CSV file. However, this file must not be used for subsequent data 
loading because the content may not exactly match the source record. You are 
advised to cleanse the original source record for further data ingestion. This 
option is used to remind you which records are bad records.
@@ -530,7 +574,7 @@ This tutorial is going to introduce all commands and data 
operations on CarbonDa
   [ WHERE { <filter_condition> } ]
   ```
 
-  NOTE:
+  **NOTE:**
   * The source table and the CarbonData table must have the same table schema.
   * The data type of source and destination table columns should be same
   * INSERT INTO command does not support partial success if bad records are 
found, it will fail.
@@ -569,7 +613,7 @@ This tutorial is going to introduce all commands and data 
operations on CarbonDa
   [ WHERE { <filter_condition> } ]
   ```
   
-  NOTE:The update command fails if multiple input rows in source table are 
matched with single row in destination table.
+  **NOTE:** The update command fails if multiple input rows in source table 
are matched with single row in destination table.
   
   Examples:
   ```
@@ -778,7 +822,7 @@ This tutorial is going to introduce all commands and data 
operations on CarbonDa
   [TBLPROPERTIES ('PARTITION_TYPE'='HASH',
                   'NUM_PARTITIONS'='N' ...)]
   ```
-  NOTE: N is the number of hash partitions
+  **NOTE:** N is the number of hash partitions
 
 
   Example:
@@ -805,7 +849,7 @@ This tutorial is going to introduce all commands and data 
operations on CarbonDa
                   'RANGE_INFO'='2014-01-01, 2015-01-01, 2016-01-01, ...')]
   ```
 
-  NOTE:
+  **NOTE:**
   * The 'RANGE_INFO' must be defined in ascending order in the table 
properties.
   * The default format for partition column of Date/Timestamp type is 
yyyy-MM-dd. Alternate formats for Date/Timestamp could be defined in 
CarbonProperties.
 
@@ -834,7 +878,7 @@ This tutorial is going to introduce all commands and data 
operations on CarbonDa
   [TBLPROPERTIES ('PARTITION_TYPE'='LIST',
                   'LIST_INFO'='A, B, C, ...')]
   ```
-  NOTE: List partition supports list info in one level group.
+  **NOTE:** List partition supports list info in one level group.
 
   Example:
   ```
@@ -883,7 +927,7 @@ This tutorial is going to introduce all commands and data 
operations on CarbonDa
   ALTER TABLE [db_name].table_name DROP PARTITION(partition_id) WITH DATA
   ```
 
-  NOTE:
+  **NOTE:**
   * Hash partition table is not supported for ADD, SPLIT and DROP commands.
   * Partition Id: in CarbonData like the hive, folders are not used to divide 
partitions instead partition id is used to replace the task id. It could make 
use of the characteristic and meanwhile reduce some metadata.
 
@@ -913,7 +957,7 @@ This tutorial is going to introduce all commands and data 
operations on CarbonDa
   'BUCKETCOLUMNS'='columnname')
   ```
 
-  NOTE:
+  **NOTE:**
   * Bucketing cannot be performed for columns of Complex Data Types.
   * Columns in the BUCKETCOLUMN parameter must be dimensions. The BUCKETCOLUMN 
parameter cannot be a measure or a combination of measures and dimensions.
 
@@ -1004,7 +1048,7 @@ This tutorial is going to introduce all commands and data 
operations on CarbonDa
   SET carbon.input.segments.<database_name>.<table_name> = <list of segment 
IDs>
   ```
   
-  NOTE:
+  **NOTE:**
   carbon.input.segments: Specifies the segment IDs to be queried. This 
property allows you to query specified segments of the specified table. The 
CarbonScan will read data from specified segments only.
   
   If user wants to query with segments reading in multi threading mode, then 
CarbonSession. threadSet can be used instead of SET query.

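The decimal precision scenarios in the data type change hunk above can be read as a simple rule. The sketch below is one plausible reading, not CarbonData's actual validation code: it assumes a change is safe when the scale does not decrease and the number of integer digits (precision minus scale) does not decrease, with 38 as the upper limit. The class and method names are hypothetical.

```java
public class DecimalChangeSketch {

  // Upper limit for precision quoted in the documentation (38,38).
  static final int MAX_PRECISION = 38;

  // Assumed rule: no digit may be lost on either side of the decimal point.
  public static boolean isLosslessChange(int oldPrecision, int oldScale,
                                         int newPrecision, int newScale) {
    if (newPrecision > MAX_PRECISION || newScale > newPrecision) {
      return false;
    }
    int oldIntegerDigits = oldPrecision - oldScale;
    int newIntegerDigits = newPrecision - newScale;
    return newScale >= oldScale && newIntegerDigits >= oldIntegerDigits;
  }

  public static void main(String[] args) {
    // (10,2) -> (10,5): integer digits shrink from 8 to 5, so data could be lost.
    System.out.println(isLosslessChange(10, 2, 10, 5)); // false
    // (10,2) -> (12,3): integer digits grow from 8 to 9 and scale from 2 to 3.
    System.out.println(isLosslessChange(10, 2, 12, 3)); // true
  }
}
```

Both documented scenarios come out as described: the first is rejected and the second is accepted.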
http://git-wip-us.apache.org/repos/asf/carbondata/blob/1b827172/docs/faq.md
----------------------------------------------------------------------
diff --git a/docs/faq.md b/docs/faq.md
index a2f8f59..9f74842 100644
--- a/docs/faq.md
+++ b/docs/faq.md
@@ -27,6 +27,7 @@
 * [How Carbon will behave when execute insert operation in abnormal 
scenarios?](#how-carbon-will-behave-when-execute-insert-operation-in-abnormal-scenarios)
 * [Why aggregate query is not fetching data from aggregate 
table?](#why-aggregate-query-is-not-fetching-data-from-aggregate-table)
 * [Why all executors are showing success in Spark UI even after Dataload 
command failed at Driver 
side?](#Why-all-executors-are-showing-success-in-Spark-UI-even-after-Dataload-command-failed-at-driver-side)
+* [Why different time zone result for select query output when query SDK 
writer 
output?](#Why-different-time-zone-result-for-select-query-output-when-query-SDK-writer-output)
 
 ## What are Bad Records?
 Records that fail to get loaded into the CarbonData due to data type 
incompatibility or are empty or have incompatible format are classified as Bad 
Records.
@@ -182,3 +183,15 @@ select cntry,sum(gdp) from gdp21,pop1 where cntry=ctry 
group by cntry;
 ## Why all executors are showing success in Spark UI even after Dataload 
command failed at Driver side?
 Spark executor shows task as failed after the maximum number of retry 
attempts, but loading the data having bad records and BAD_RECORDS_ACTION 
(carbon.bad.records.action) is set as "FAIL" will attempt only once but 
will send the signal to driver as failed instead of throwing the exception to 
retry, as there is no point to retry if bad record found and BAD_RECORDS_ACTION 
is set to fail. Hence the Spark executor displays this one attempt as 
successful but the command has actually failed to execute. Task attempts or 
executor logs can be checked to observe the failure reason.
 
+## Why different time zone result for select query output when query SDK 
writer output? 
+The SDK writer is an independent entity, so it can generate carbondata 
files on a non-cluster machine whose time zone differs from the cluster's. When 
those files are read on the cluster, the cluster's time zone is always applied, 
so the values of timestamp and date fields differ from the original values.
+To control the time zone of the data while writing, set the cluster's 
time zone in the SDK writer by calling the API below.
+```
+TimeZone.setDefault(timezoneValue)
+```
+**Example:**
+``` 
+// cluster timezone is Asia/Shanghai
+TimeZone.setDefault(TimeZone.getTimeZone("Asia/Shanghai"))
+```
+

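The time zone effect described in the new FAQ entry can be reproduced with plain JDK classes. The sketch below does not use the CarbonData SDK; it only illustrates, under that simplification, how one stored instant renders differently depending on the JVM default time zone. The class and method names are hypothetical.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

public class TimeZoneShiftSketch {

  // Formats an epoch instant using the JVM default time zone at call time.
  public static String formatWithDefaultZone(long epochMillis) {
    SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
    return format.format(new Date(epochMillis));
  }

  public static void main(String[] args) {
    TimeZone original = TimeZone.getDefault();
    try {
      long instant = 0L; // 1970-01-01 00:00:00 UTC

      // Stand-in for a writer machine running in UTC.
      TimeZone.setDefault(TimeZone.getTimeZone("UTC"));
      System.out.println("writer (UTC):            " + formatWithDefaultZone(instant));

      // Stand-in for a cluster running in Asia/Shanghai (UTC+8).
      TimeZone.setDefault(TimeZone.getTimeZone("Asia/Shanghai"));
      System.out.println("cluster (Asia/Shanghai): " + formatWithDefaultZone(instant));
    } finally {
      // Restore the default so the sketch has no lasting side effect.
      TimeZone.setDefault(original);
    }
  }
}
```

The two printed values are eight hours apart, which is why the FAQ suggests calling `TimeZone.setDefault` with the cluster's zone before writing.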
http://git-wip-us.apache.org/repos/asf/carbondata/blob/1b827172/docs/sdk-writer-guide.md
----------------------------------------------------------------------
diff --git a/docs/sdk-writer-guide.md b/docs/sdk-writer-guide.md
new file mode 100644
index 0000000..bfbf997
--- /dev/null
+++ b/docs/sdk-writer-guide.md
@@ -0,0 +1,347 @@
+# SDK Writer Guide
+The carbon jars package contains a 
carbondata-store-sdk-x.x.x-SNAPSHOT.jar.
+This SDK writer writes carbondata and carbonindex files to a given path.
+External clients can use this writer to convert data in other formats, or 
live data, into carbondata and index files.
+The SDK writer output contains only carbondata and carbonindex files. No 
metadata folder will be present.
+
+## Quick example
+
+### Example with csv format 
+
+```java
+ import java.io.IOException;
+ 
+ import org.apache.carbondata.common.exceptions.sql.InvalidLoadOptionException;
+ import org.apache.carbondata.core.metadata.datatype.DataTypes;
+ import org.apache.carbondata.sdk.file.CarbonWriter;
+ import org.apache.carbondata.sdk.file.CarbonWriterBuilder;
+ import org.apache.carbondata.sdk.file.Field;
+ import org.apache.carbondata.sdk.file.Schema;
+ 
+ public class TestSdk {
+ 
+   public static void main(String[] args) throws IOException, 
InvalidLoadOptionException {
+     testSdkWriter();
+   }
+ 
+   public static void testSdkWriter() throws IOException, 
InvalidLoadOptionException {
+     String path = "/home/root1/Documents/ab/temp";
+ 
+     Field[] fields = new Field[2];
+     fields[0] = new Field("name", DataTypes.STRING);
+     fields[1] = new Field("age", DataTypes.INT);
+ 
+     Schema schema = new Schema(fields);
+ 
+     CarbonWriterBuilder builder = 
CarbonWriter.builder().withSchema(schema).outputPath(path);
+ 
+     CarbonWriter writer = builder.buildWriterForCSVInput();
+ 
+     int rows = 5;
+     for (int i = 0; i < rows; i++) {
+       writer.write(new String[] { "robot" + (i % 10), String.valueOf(i) });
+     }
+     writer.close();
+   }
+ }
+```
+
+### Example with Avro format
+```java
+import java.io.IOException;
+
+import org.apache.carbondata.common.exceptions.sql.InvalidLoadOptionException;
+import org.apache.carbondata.core.metadata.datatype.DataTypes;
+import org.apache.carbondata.sdk.file.CarbonWriter;
+import org.apache.carbondata.sdk.file.Field;
+
+import org.apache.avro.generic.GenericData;
+import org.apache.commons.lang.CharEncoding;
+
+import tech.allegro.schema.json2avro.converter.JsonAvroConverter;
+
+public class TestSdkAvro {
+
+  public static void main(String[] args) throws IOException, 
InvalidLoadOptionException {
+    testSdkWriter();
+  }
+
+
+  public static void testSdkWriter() throws IOException, 
InvalidLoadOptionException {
+    String path = "./AvroCarbonWriterSuiteWriteFiles";
+    // Avro schema
+    String avroSchema =
+        "{" +
+            "   \"type\" : \"record\"," +
+            "   \"name\" : \"Acme\"," +
+            "   \"fields\" : ["
+            + "{ \"name\" : \"name\", \"type\" : \"string\" },"
+            + "{ \"name\" : \"age\", \"type\" : \"int\" }]" +
+            "}";
+
+    String json = "{\"name\":\"bob\", \"age\":10}";
+
+    // conversion to GenericData.Record
+    JsonAvroConverter converter = new JsonAvroConverter();
+    GenericData.Record record = converter.convertToGenericDataRecord(
+        json.getBytes(CharEncoding.UTF_8), new 
org.apache.avro.Schema.Parser().parse(avroSchema));
+
+    // for sdk schema
+    Field[] fields = new Field[2];
+    fields[0] = new Field("name", DataTypes.STRING);
+    fields[1] = new Field("age", DataTypes.INT);
+
+    try {
+      CarbonWriter writer = CarbonWriter.builder()
+          .withSchema(new org.apache.carbondata.sdk.file.Schema(fields))
+          .outputPath(path)
+          .buildWriterForAvroInput();
+
+      for (int i = 0; i < 100; i++) {
+        writer.write(record);
+      }
+      writer.close();
+    } catch (Exception e) {
+      e.printStackTrace();
+    }
+  }
+}
+```
+
+## Datatypes Mapping
+Each SQL data type is mapped to an SDK data type. The mapping is as 
follows:
+
+| SQL DataTypes | Mapped SDK DataTypes |
+|---------------|----------------------|
+| BOOLEAN | DataTypes.BOOLEAN |
+| SMALLINT | DataTypes.SHORT |
+| INTEGER | DataTypes.INT |
+| BIGINT | DataTypes.LONG |
+| DOUBLE | DataTypes.DOUBLE |
+| VARCHAR | DataTypes.STRING |
+| DATE | DataTypes.DATE |
+| TIMESTAMP | DataTypes.TIMESTAMP |
+| STRING | DataTypes.STRING |
+| DECIMAL | DataTypes.createDecimalType(precision, scale) |
+
+
+## API List
+
+### Class org.apache.carbondata.sdk.file.CarbonWriterBuilder
+```
+/**
+* prepares the builder with the schema provided
+* @param schema is instance of Schema
+*        This method must be called when building CarbonWriterBuilder
+* @return updated CarbonWriterBuilder
+*/
+public CarbonWriterBuilder withSchema(Schema schema);
+```
+
+```
+/**
+* Sets the output path of the writer builder
+* @param path is the absolute path where output files are written
+*             This method must be called when building CarbonWriterBuilder
+* @return updated CarbonWriterBuilder
+*/
+public CarbonWriterBuilder outputPath(String path);
+```
+
+```
+/**
+* If set false, writes the carbondata and carbonindex files in a flat folder 
structure
+* @param isTransactionalTable is a boolean value
+*             if set to false, then writes the carbondata and carbonindex files
+*                                                            in a flat folder 
structure.
+*             if set to true, then writes the carbondata and carbonindex files
+*                                                            in segment folder 
structure.
+*             By default set to false.
+* @return updated CarbonWriterBuilder
+*/
+public CarbonWriterBuilder isTransactionalTable(boolean isTransactionalTable);
+```
+
+```
+/**
+* to set the timestamp in the carbondata and carbonindex files
+* @param UUID is a timestamp to be used in the carbondata and carbonindex 
files.
+*             By default set to zero.
+* @return updated CarbonWriterBuilder
+*/
+public CarbonWriterBuilder uniqueIdentifier(long UUID);
+```
+
+```
+/**
+* To set the carbondata file size in MB between 1MB-2048MB
+* @param blockSize is size in MB between 1MB to 2048 MB
+*                  default value is 1024 MB
+* @return updated CarbonWriterBuilder
+*/
+public CarbonWriterBuilder withBlockSize(int blockSize);
+```
+
+```
+/**
+* To set the blocklet size of carbondata file
+* @param blockletSize is blocklet size in MB
+*                     default value is 64 MB
+* @return updated CarbonWriterBuilder
+*/
+public CarbonWriterBuilder withBlockletSize(int blockletSize);
+```
+
+```
+/**
+* sets the list of columns that need to be in sorted order
+* @param sortColumns is a string array of columns that need to be sorted.
+*                    If it is null (the default), all dimensions are selected 
for sorting
+*                    If it is empty array, no columns are sorted
+* @return updated CarbonWriterBuilder
+*/
+public CarbonWriterBuilder sortBy(String[] sortColumns);
+```
+
+```
+/**
+* If set, create a schema file in metadata folder.
+* @param persist is a boolean value. If set to true, creates a schema file in 
metadata folder.
+*                By default set to false, in which case no metadata folder is created.
+* @return updated CarbonWriterBuilder
+*/
+public CarbonWriterBuilder persistSchemaFile(boolean persist);
+```
+
+```
+/**
+* sets the taskNo for the writer. SDKs concurrently running
+* will set taskNo in order to avoid conflicts in file's name during write.
+* @param taskNo is the TaskNo user wants to specify.
+*               by default it is system time in nanoseconds.
+* @return updated CarbonWriterBuilder
+*/
+public CarbonWriterBuilder taskNo(String taskNo);
+```
+
+```
+/**
+* To support the load options for sdk writer
+* @param options key,value pair of load options.
+*                supported keys values are
+*                a. bad_records_logger_enable -- true (write into separate 
logs), false
+*                b. bad_records_action -- FAIL, FORCE, IGNORE, REDIRECT
+*                c. bad_record_path -- path
+*                d. dateformat -- same as JAVA SimpleDateFormat
+*                e. timestampformat -- same as JAVA SimpleDateFormat
+*                f. complex_delimiter_level_1 -- value to Split the 
complexTypeData
+*                g. complex_delimiter_level_2 -- value to Split the nested 
complexTypeData
+*                h. quotechar
+*                i. escapechar
+*
+*                Default values are as follows.
+*
+*                a. bad_records_logger_enable -- "false"
+*                b. bad_records_action -- "FAIL"
+*                c. bad_record_path -- ""
+*                d. dateformat -- "" , uses from carbon.properties file
+*                e. timestampformat -- "", uses from carbon.properties file
+*                f. complex_delimiter_level_1 -- "$"
+*                g. complex_delimiter_level_2 -- ":"
+*                h. quotechar -- "\""
+*                i. escapechar -- "\\"
+*
+* @return updated CarbonWriterBuilder
+*/
+public CarbonWriterBuilder withLoadOptions(Map<String, String> options);
+```
+
+```
+/**
+* Build a {@link CarbonWriter}, which accepts row in CSV format object
+* @return CSVCarbonWriter
+* @throws IOException
+* @throws InvalidLoadOptionException
+*/
+public CarbonWriter buildWriterForCSVInput() throws IOException, 
InvalidLoadOptionException;
+```
+
+```  
+/**
+* Build a {@link CarbonWriter}, which accepts Avro format object
+* @return AvroCarbonWriter 
+* @throws IOException
+* @throws InvalidLoadOptionException
+*/
+public CarbonWriter buildWriterForAvroInput() throws IOException, 
InvalidLoadOptionException;
+```
+
+### Class org.apache.carbondata.sdk.file.CarbonWriter
+```
+/**
+* Write an object to the file, the format of the object depends on the 
implementation
+* If AvroCarbonWriter, object is of type 
org.apache.avro.generic.GenericData.Record 
+* If CSVCarbonWriter, object is of type String[] 
+* @param object
+* @throws IOException
+*/
+public abstract void write(Object object) throws IOException;
+```
+
+```
+/**
+* Flush and close the writer
+*/
+public abstract void close() throws IOException;
+```
+
+```
+/**
+* Create a {@link CarbonWriterBuilder} to build a {@link CarbonWriter}
+*/
+public static CarbonWriterBuilder builder() {
+return new CarbonWriterBuilder();
+}
+```
+
+### Class org.apache.carbondata.sdk.file.Field
+```
+/**
+* Field Constructor
+* @param name name of the field
+* @param type datatype of field, specified in strings.
+*/
+public Field(String name, String type);
+```
+
+```
+/**
+* Field constructor
+* @param name name of the field
+* @param type datatype of the field of class DataType
+*/
+public Field(String name, DataType type);  
+```
+
+### Class org.apache.carbondata.sdk.file.Schema
+
+```
+/**
+* construct a schema with fields
+* @param fields
+*/
+public Schema(Field[] fields);
+```
+
+```
+/**
+* Create a Schema using JSON string, for example:
+* [
+*   {"name":"string"},
+*   {"age":"int"}
+* ] 
+* @param json specified as string
+* @return Schema
+*/
+public static Schema parseJson(String json);
+```
\ No newline at end of file

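The load option keys listed under `withLoadOptions` in the guide above are plain string pairs. The sketch below shows how such a map might be assembled, using only the JDK so that it runs without the SDK jar. The class name, method name, and bad record path are hypothetical; the keys and values are taken from the Javadoc in the diff.

```java
import java.util.HashMap;
import java.util.Map;

public class LoadOptionsSketch {

  // Builds a load options map using keys documented for withLoadOptions.
  public static Map<String, String> buildLoadOptions(String badRecordPath) {
    Map<String, String> options = new HashMap<>();
    options.put("bad_records_logger_enable", "true");
    options.put("bad_records_action", "REDIRECT");
    options.put("bad_record_path", badRecordPath);
    options.put("dateformat", "yyyy-MM-dd");
    options.put("timestampformat", "yyyy-MM-dd HH:mm:ss");
    return options;
  }

  public static void main(String[] args) {
    // Hypothetical path, used only for illustration.
    Map<String, String> options = buildLoadOptions("/tmp/carbon/badrecords");

    // With the SDK jar on the classpath, the map would be handed to the builder:
    //   CarbonWriter.builder().withSchema(schema).outputPath(path)
    //       .withLoadOptions(options).buildWriterForCSVInput();
    options.forEach((key, value) -> System.out.println(key + " = " + value));
  }
}
```

Options that are left out of the map fall back to the defaults listed in the same Javadoc.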