[jira] [Assigned] (CARBONDATA-3161) Pipe "|" delimiter is not working for streaming table

Pawan Malwal (JIRA) Mon, 17 Dec 2018 22:43:19 -0800


     [ 
https://issues.apache.org/jira/browse/CARBONDATA-3161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Pawan Malwal reassigned CARBONDATA-3161:
----------------------------------------

    Assignee:     (was: Pawan Malwal)

> Pipe "|" delimiter is not working for streaming table
> -----------------------------------------------------
>
>                 Key: CARBONDATA-3161
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-3161
>             Project: CarbonData
>          Issue Type: Bug
>          Components: data-load
>            Reporter: Pawan Malwal
>            Priority: Minor
>
> CSV data that uses "|" as the delimiter is not loaded into a streaming table
> correctly.
> *DDL:*
> create table table1_st(begintime TIMESTAMP, deviceid STRING, statcycle INT,
>   topologypath STRING, devicetype STRING, rebootnum INT)
> stored by 'carbondata'
> TBLPROPERTIES('SORT_SCOPE'='GLOBAL_SORT', 'sort_columns'='deviceid,begintime',
>   'streaming'='true');
> *Run in spark shell:*
> import org.apache.spark.sql.SparkSession;
> import org.apache.spark.sql.SparkSession.Builder;
> import org.apache.spark.sql.CarbonSession;
> import org.apache.spark.sql.CarbonSession.CarbonBuilder;
> import org.apache.spark.sql.streaming._
> import org.apache.carbondata.streaming.parser._
> val enableHiveSupport = SparkSession.builder().enableHiveSupport();
> val carbon = new CarbonBuilder(enableHiveSupport).getOrCreateCarbonSession("hdfs://hacluster/user/hive/warehouse/")
> val df = carbon.readStream.text("/user/*.csv")
> // Trailing dots keep the chained call a single statement when pasted into spark-shell.
> val qrymm_0001 = df.writeStream.format("carbondata").
>   option(CarbonStreamParser.CARBON_STREAM_PARSER, CarbonStreamParser.CARBON_STREAM_PARSER_CSV).
>   // The next option was highlighted in red in the original report: it is the one that does not take effect.
>   option("delimiter", "|").
>   option("header", "false").
>   option("dbName", "stdb").
>   option("checkpointLocation", "/tmp/tb1").
>   option("bad_records_action", "FORCE").
>   option("tableName", "table1_st").
>   trigger(ProcessingTime(6000)).
>   option("carbon.streaming.auto.handoff.enabled", "true").
>   option("TIMESTAMPFORMAT", "yyyy-dd-MM HH:mm:ss").
>   start
>  
> *Sample records:*
>  begintime| deviceid| statcycle| topologypath| devicetype| rebootnum
>  2018-10-01 00:00:00|Device1|0|dsad|STB|9
>  2018-10-01 00:05:00|Device1|0|Rsad|STB|4
>  2018-10-01 00:10:00|Device1|0|fsf|STB|6
>  2018-10-01 00:15:00|Device1|0|fdgf|STB|8
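
For reference, the parse that the report expects can be sketched in plain Scala. This is only an illustration of the expected result: `PipeSplitSketch` and `parse` are hypothetical names, the code does not use or reproduce CarbonData's stream parser, and it says nothing about where the defect lies. It assumes each data record carries exactly the six columns declared in the DDL.

```scala
// Hypothetical helper, not CarbonData code: shows the six fields each sample
// record should yield when "|" is honored as the delimiter.
object PipeSplitSketch {
  // Column names taken from the DDL in the report.
  val columns: Seq[String] =
    Seq("begintime", "deviceid", "statcycle", "topologypath", "devicetype", "rebootnum")

  def parse(line: String): Map[String, String] = {
    // java.lang.String.split takes a regular expression, so the pipe is escaped;
    // the limit of -1 keeps trailing empty fields instead of dropping them.
    val fields = line.split("\\|", -1)
    require(fields.length == columns.length,
      s"expected ${columns.length} fields, got ${fields.length}")
    columns.zip(fields).toMap
  }

  def main(args: Array[String]): Unit = {
    // First data record from the sample above.
    val row = parse("2018-10-01 00:00:00|Device1|0|dsad|STB|9")
    columns.foreach(c => println(s"$c = ${row(c)}"))
  }
}
```

A correct load of the sample records should produce rows whose column values match what `parse` returns for each line.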



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
