[jira] [Issue Comment Deleted] (CARBONDATA-2951) CSDK: Provide C++ interface for SDK

2018-11-23 Thread xubo245 (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xubo245 updated CARBONDATA-2951:

Comment: was deleted

(was: Introducing Apache CarbonData : A new hadoop-native file format for 
faster data analysis, O'Reilly Open Source Convention: OSCON, May 16 - 19, 2016 
:https://www.youtube.com/watch?v=VEckmJuU47g)

> CSDK: Provide C++ interface for SDK
> ---
>
> Key: CARBONDATA-2951
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2951
> Project: CarbonData
>  Issue Type: Task
>  Components: other
>Affects Versions: 1.5.0
>Reporter: xubo245
>Assignee: xubo245
>Priority: Critical
> Fix For: NONE
>
>
> Some users who write C++ code in their projects cannot call the CarbonData 
> interface or integrate CarbonData into their C++ projects. We therefore plan 
> to provide a C++ interface that lets C++ users integrate carbon, including 
> reading and writing CarbonData. This will be more convenient for them.
> We plan to design and develop it as follows:
> 1. Provide CarbonReader for the SDK so that carbon data can be read from C++
>   ##features/interfaces
>   1.1. create CarbonReader
>   1.2. hasNext()
>   1.3. readNextRow()
>   1.4. close()
>   1.5. support OBS (AK/SK/Endpoint)
>   1.6. support batch read (withBatch, readNextBatchRow)
>   1.7. support vector read (default) and carbonrecordreader 
> (withRowRecordReader)
>   1.8. projection
>   
>   ##supported data types:
>    String, Long, Varchar(string), Short, Int, Date(int), timestamp(long), 
> boolean, Decimal(string), Float
>    Array: supported in carbonrecordreader, not supported in vectorreader
>    byte: supported in Java RowUtil, not in the C++ carbon reader
>
>   ## Schema and data
>   Create table tbl_email_form_to_for_XX(
>     Event_Time Timestamp,
>     Ingestion_Time Timestamp,
>     From_Email String,
>     To_Email String,
>     From_To_type String,
>     Event_ID String
>   ) using carbon options(path 'obs://X/tbl_email_form_to_for_XX')
>   ETL 6 columns from 18 columns table
>   
>   example data:
>   from_email_36550_phillip.al...@enron.com
> to_email_36550_stagecoachm...@hotmail.com   from_to 
> <29528303.107585557.JavaMail.evans@thyme>   153801549700
> 975514920
> 2. The performance should reach X million records/s/node
> 3. Provide CarbonWriter for the SDK so that carbon data can be written from C++
>   ##features/interfaces
>   3.1. create CarbonWriter, including creating the schema (withCsvInput), 
> setting outputPath, and build
>   3.2. write()
>   3.3. close()
>   3.4. support OBS (AK/SK/Endpoint) (withHadoopConf)
>   3.5. writtenBy
>   3.6. support withTableProperty, withLoadOption, taskNo, 
> uniqueIdentifier, withThreadSafe, withBlockSize, withBlockletSize, 
> localDictionaryThreshold, enableLocalDictionary in C++ SDK (PR2899, to be 
> reviewed)
>   
>   ##Data types:
>   Carbon needs to support the base data types, including string, float, 
> double, int, long, date, timestamp, bool, array.
>   For other types, we can convert:
>     char array => carbon string
>     enum => carbon string
>     set and list => carbon array
>   ##performance
>   Writing performance is not a requirement for now
>   
> 4. read schema functions
>   readSchema
>   getVersionDetails => TODO
> 5. support carbonproperties
>   5.1. addProperty
>   5.2. getProperty
>   
> 6. TODO:
>   6.1. getVersionDetails => to be reviewed
>   6.2. update the SDK/CSDK reader doc => to be reviewed
>   6.3. support byte (write and read)
>   6.4. support long string columns
>   6.5. support sortBy => to be reviewed
>   6.6. support withCsvInput(Schema schema); create schema (Java)
>   6.7. optimize the write doc => to be reviewed
>   /**
>    * Create a {@link CarbonWriterBuilder} to build a {@link CarbonWriter}
>    */
>   public static CarbonWriterBuilder builder() {
>     return new CarbonWriterBuilder();
>   }
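
The writer section of the description lists three conversions for C++ types that have no direct carbon equivalent (char array => string, enum => string, set and list => array). The following is only a minimal sketch of what such application-side conversions could look like before values are handed to a writer. It does not use or reproduce the CSDK API: the names FromToType, toCarbonString and toCarbonArray are hypothetical and exist purely for illustration, and std::string / std::vector stand in for the carbon string and array types.

```cpp
#include <cassert>
#include <cstddef>
#include <list>    // included so callers can pass std::list
#include <set>     // included so callers can pass std::set
#include <string>
#include <vector>

// Hypothetical application-side enum; not part of CarbonData.
enum class FromToType { From, To };

// enum => string: map each enumerator to a fixed text value.
std::string toCarbonString(FromToType t) {
    switch (t) {
        case FromToType::From: return "from";
        case FromToType::To:   return "to";
    }
    return "";  // only reached for a value outside the declared enumerators
}

// char array => string: copy up to the first NUL byte, and never read
// past `capacity` bytes even if the buffer is not NUL-terminated.
std::string toCarbonString(const char* buf, std::size_t capacity) {
    assert(buf != nullptr);
    std::size_t n = 0;
    while (n < capacity && buf[n] != '\0') {
        ++n;
    }
    return std::string(buf, n);
}

// set / list => array: copy the elements, in the container's iteration
// order, into a contiguous vector that stands in for a carbon array.
template <typename Container>
std::vector<typename Container::value_type> toCarbonArray(const Container& c) {
    return std::vector<typename Container::value_type>(c.begin(), c.end());
}
```

With these helpers, a std::set<int>{3, 1, 2} becomes the vector {1, 2, 3} (a set iterates in sorted order), and a fixed-size char buffer holding "abc" followed by padding becomes the string "abc". How the real CSDK expects such values to be passed is not stated in this thread.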



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Issue Comment Deleted] (CARBONDATA-2951) CSDK: Provide C++ interface for SDK

2018-11-23 Thread xubo245 (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xubo245 updated CARBONDATA-2951:

Comment: was deleted

(was: Apache Carbondata: An Indexed Columnar File Format for Interactive Query 
by Jacky Li/Jihong Ma, Spark summit EAST 
2017:https://www.youtube.com/watch?v=lhsAg2H_GXc)



[jira] [Issue Comment Deleted] (CARBONDATA-2951) CSDK: Provide C++ interface for SDK

2018-11-23 Thread xubo245 (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xubo245 updated CARBONDATA-2951:

Comment: was deleted

(was: Apache Carbondata: An indexed columnar file format for interactive query 
with Spark SQL:https://www.youtube.com/watch?v=yya8-GzRW5M)



[jira] [Issue Comment Deleted] (CARBONDATA-2951) CSDK: Provide C++ interface for SDK

2018-11-23 Thread xubo245 (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xubo245 updated CARBONDATA-2951:

Comment: was deleted

(was: https://www.computerhope.com/issues/ch001002.htm)



[jira] [Issue Comment Deleted] (CARBONDATA-2951) CSDK: Provide C++ interface for SDK

2018-11-20 Thread xubo245 (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xubo245 updated CARBONDATA-2951:

Comment: was deleted

(was: 
https://r2---sn-npoeenek.googlevideo.com/videoplayback?lmt=1521057331061550=1812.178=youtube=25=yes=142.93.137.161=VLv0W5_UNcqagAeN3q7oBA=254556679094EFD17DAC3DAD278E66478407BC49.4E38F21849411EDBDCDBF5990ED176A375B2F06A=cms1=o-AC5nDRva0FaTCRRdBU5bhUeOEws4bx8zmbynLQo0P895=22=video%2Fmp4=1542786997=dur,ei,expire,id,ip,ipbits,itag,lmt,mime,mip,mm,mn,ms,mv,pl,ratebypass,requiressl,source=2=yes=WEB=0_id=lhsAg2H_GXc=Apache+Carbondata-+An+Indexed+Columnar+File+Format+for+Interactive+Query+by+Jacky+Li-Jihong+Ma_counter=1=sn-5hnel77l=23763603_id=7ca3e5fd8923a3ee_redirect=yes=116.66.184.191=34=sn-npoeenek=ltu=1542765298=m)
