[ 
https://issues.apache.org/jira/browse/CARBONDATA-2148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liang Chen updated CARBONDATA-2148:
-----------------------------------
    Affects Version/s:     (was: 1.3.0)
                       1.3.1

> Use Row parser to replace current default parser:CSVStreamParserImp
> -------------------------------------------------------------------
>
>                 Key: CARBONDATA-2148
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-2148
>             Project: CarbonData
>          Issue Type: Improvement
>          Components: data-load, spark-integration
>    Affects Versions: 1.3.1
>            Reporter: Zhichao  Zhang
>            Assignee: Zhichao  Zhang
>            Priority: Minor
>             Fix For: 1.3.1
>
>          Time Spent: 5h
>  Remaining Estimate: 0h
>
> Currently the default value of 'carbon.stream.parser' is CSVStreamParserImp,
> which transforms InternalRow(0) to Array[Object], where InternalRow(0) holds
> the value of one line received from a socket. When data is received from
> Kafka, the schema of InternalRow is different, so we must either assemble the
> fields of the Kafka data Row into a String and store it as InternalRow(0), or
> define a new parser to convert the Kafka data Row to Array[Object]. The
> same work is needed for every table.
> *Solution:*
> Use a new parser called RowStreamParserImpl as the default parser instead of
> CSVStreamParserImp. This new parser automatically converts InternalRow
> to Array[Object] according to the schema. In general, we transform the
> source data into a structured Row object; with this approach, we do not need
> to define a parser for every table.
>  
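
To illustrate the difference described above, here is a minimal, self-contained sketch in plain Java. It is not the CarbonData implementation and does not use Spark's InternalRow. The class RowParserSketch, the FieldType enum, and both method names are hypothetical, and a row is modelled as a plain Object[]. The first method mimics the CSV-style approach, which expects the whole record as one string in column 0. The second converts each column according to a declared schema, so the same code can serve any table.

```java
// Hypothetical sketch: not CarbonData code, names are illustrative only.
final class RowParserSketch {

    // Assumed stand-in for the column types of a row schema.
    enum FieldType { INT, LONG, DOUBLE, STRING }

    // CSV-style parsing: the whole record is expected as one String in row[0].
    // A multi-column source must first be flattened into that single string.
    static Object[] parseCsvLine(Object[] row, String delimiter) {
        String line = (String) row[0];
        // Quote the delimiter so it is not treated as a regex; -1 keeps trailing empty fields.
        String[] parts = line.split(java.util.regex.Pattern.quote(delimiter), -1);
        return java.util.Arrays.copyOf(parts, parts.length, Object[].class);
    }

    // Schema-driven parsing: each column is converted according to its declared
    // type, so no table-specific parser is needed.
    static Object[] parseBySchema(Object[] row, FieldType[] schema) {
        if (row.length != schema.length) {
            throw new IllegalArgumentException(
                "row has " + row.length + " fields, schema has " + schema.length);
        }
        Object[] out = new Object[row.length];
        for (int i = 0; i < row.length; i++) {
            out[i] = convert(row[i], schema[i]);
        }
        return out;
    }

    // Converts a single value to the Java type implied by the schema; null stays null.
    private static Object convert(Object value, FieldType type) {
        if (value == null) {
            return null;
        }
        switch (type) {
            case INT:    return ((Number) value).intValue();
            case LONG:   return ((Number) value).longValue();
            case DOUBLE: return ((Number) value).doubleValue();
            case STRING: return value.toString();
            default:     throw new IllegalStateException("unsupported type: " + type);
        }
    }
}
```

With this sketch, a three-column row such as {1, "alice", 3.5} can be passed directly to parseBySchema together with its schema, whereas parseCsvLine only works once the same data has been joined into a single string like "1,alice,3.5".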



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
