Re: [Discussion] Support Date/Time format for Timestamp columns to be defined at column level
Hi all,

I have finished this issue. Please check PR 219: https://github.com/apache/incubator-carbondata/pull/219

2016-09-24 16:41 GMT+08:00 向志强 <lionx.hua...@gmail.com>:

> Hi, all
>
> In recent days I have been working on CARBONDATA-37. We are trying to
> support a Date format that can be set at column level.
>
> One open question: when returning query results, should each Date column
> be formatted in its own declared format, or should all Date columns use
> one uniform format?
>
> For example, we create a table and define two columns whose data type is
> Date, but whose formats differ:
>
> col1(Date)   col2(Date)
> 2016-09-24   2016-09-25 00:00:00
>
> When querying, which of the two outputs below should be returned?
>
> col1(Date)   col2(Date)
> 2016-09-24   2016-09-25 00:00:00
>
> or
>
> col1(Date)            col2(Date)
> 2016-09-24 00:00:00   2016-09-25 00:00:00
>
> if we set yyyy-MM-DD HH:MM:SS as the default format.
>
> Best wishes!
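[Editor's note] The two output options in the quoted question can be sketched in plain Python. This is only an illustration of the behaviour under discussion, not CarbonData code; the column names come from the example above, and the strftime patterns are our hypothetical equivalents of the per-column formats.

```python
from datetime import datetime

# Hypothetical per-column formats, as they might be declared at CREATE TABLE time.
column_formats = {"col1": "%Y-%m-%d", "col2": "%Y-%m-%d %H:%M:%S"}
uniform_format = "%Y-%m-%d %H:%M:%S"  # the proposed table-wide default

row = {"col1": datetime(2016, 9, 24), "col2": datetime(2016, 9, 25)}

# Option 1: echo each column in its own declared format.
per_column = {c: v.strftime(column_formats[c]) for c, v in row.items()}

# Option 2: render every Date column in one uniform format.
uniform = {c: v.strftime(uniform_format) for c, v in row.items()}

print(per_column)  # col1 keeps its short date form
print(uniform)     # both columns carry the time component
```

Option 1 preserves the user's declared intent per column; option 2 makes result sets predictable for downstream tools, which is the trade-off debated in the thread.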
Re: Load data command Quote character unexpected behavior.
Hi, Harmeet

From your description above I cannot reproduce the problem; it returns the right result in my environment. Please use the latest version, check again, and give more details.

Lionx

2016-10-21 15:07 GMT+08:00 Harmeet:

> Hey Team,
>
> I am using the load data command with a specific quote character, but after
> loading the data the quote character is not honoured. Below is my example:
>
> *create table one (name string, description string, salary double, age int,
> dob timestamp) stored by 'carbondata';*
>
> csv file >>
>
> name, description, salary, age, dob
> tammy, $my name$, 90, 22, 19/10/2019
>
> 0: jdbc:hive2://127.0.0.1:1> load data local inpath
> 'hdfs://localhost:54310/home/harmeet/dollarquote.csv' into table one
> OPTIONS('QUOTECHAR'="$");
>
> Results >>
>
> 0: jdbc:hive2://127.0.0.1:1> select * from one;
> +--------+--------------+-------+---------+------+
> | name   | description  | dob   | salary  | age  |
> +--------+--------------+-------+---------+------+
> | tammy  | $my name$    | NULL  | 90.0    | 22   |
> +--------+--------------+-------+---------+------+
>
> I expected that only "my name" would be loaded into the description column,
> with the dollar signs excluded, but this is not happening. The same
> behaviour occurs if we use ' (single quote) around the data.
>
> --
> View this message in context: http://apache-carbondata-mailing-list-archive.1130556.n5.nabble.com/Load-data-command-Quote-character-unexpected-behavior-tp2145.html
> Sent from the Apache CarbonData Mailing List archive mailing list archive
> at Nabble.com.
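[Editor's note] For reference, the behaviour Harmeet expects matches what a standard CSV parser does with a custom quote character. The sketch below uses Python's csv module (not the CarbonData loader) on the sample line from the report, with inter-field spaces removed for clarity:

```python
import csv
import io

# Sample record from the report, quoted with '$' instead of the default '"'.
data = "tammy,$my name$,90,22,19/10/2019\n"

# A conforming parser strips the quote character from the field value.
reader = csv.reader(io.StringIO(data), quotechar="$")
row = next(reader)

print(row[1])  # my name
```

So the expected load result for the description column is `my name`; keeping the dollar signs, as the reported output shows, would indeed mean the QUOTECHAR option is not being applied.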
Re: carbondata org.apache.thrift.TBaseHelper.hashCode(segment_id); issue
Hi, Jingwu

Currently Carbon does not support loading data from the local file system; please put the file into HDFS and test it again.

Lionx

2016-10-19 16:55 GMT+08:00 仲景武:

>
> hi, all
>
> I have installed carbondata successfully following the document
> https://cwiki.apache.org/confluence/display/CARBONDATA/
> but when loading data into a carbondata table it throws an exception.
>
> run command:
> cc.sql("load data local inpath '../carbondata/sample.csv' into table test_table")
>
> errors:
>
> org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: /home/bigdata/bigdata/carbondata/sample.csv
> at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:321)
> at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:264)
> at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:385)
> at org.apache.spark.rdd.NewHadoopRDD.getPartitions(NewHadoopRDD.scala:120)
> at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
> at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
> at scala.Option.getOrElse(Option.scala:120)
> at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
> at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
> at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
> at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
> at scala.Option.getOrElse(Option.scala:120)
> at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
> at org.apache.spark.rdd.RDD$$anonfun$take$1.apply(RDD.scala:1307)
> at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
> at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:111)
> at org.apache.spark.rdd.RDD.withScope(RDD.scala:316)
> at org.apache.spark.rdd.RDD.take(RDD.scala:1302)
> at com.databricks.spark.csv.CarbonCsvRelation.firstLine$lzycompute(CarbonCsvRelation.scala:181)
> at com.databricks.spark.csv.CarbonCsvRelation.firstLine(CarbonCsvRelation.scala:176)
> at com.databricks.spark.csv.CarbonCsvRelation.inferSchema(CarbonCsvRelation.scala:144)
> at com.databricks.spark.csv.CarbonCsvRelation.<init>(CarbonCsvRelation.scala:74)
> at com.databricks.spark.csv.newapi.DefaultSource.createRelation(DefaultSource.scala:142)
> at com.databricks.spark.csv.newapi.DefaultSource.createRelation(DefaultSource.scala:44)
> at org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:158)
> at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:119)
> at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:109)
> at org.apache.carbondata.spark.util.GlobalDictionaryUtil$.loadDataFrame(GlobalDictionaryUtil.scala:386)
> at org.apache.carbondata.spark.util.GlobalDictionaryUtil$.generateGlobalDictionary(GlobalDictionaryUtil.scala:767)
> at org.apache.spark.sql.execution.command.LoadTable.run(carbonTableSchema.scala:1170)
> at org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult$lzycompute(commands.scala:58)
> at org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult(commands.scala:56)
> at org.apache.spark.sql.execution.ExecutedCommand.doExecute(commands.scala:70)
> at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:132)
> at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:130)
> at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
> at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:130)
> at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:55)
> at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:55)
> at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:145)
> at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:130)
> at org.apache.carbondata.spark.rdd.CarbonDataFrameRDD.<init>(CarbonDataFrameRDD.scala:23)
> at org.apache.spark.sql.CarbonContext.sql(CarbonContext.scala:137)
> at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:42)
> at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:47)
> at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:49)
> at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:51)
> at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:53)
> at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:55)
> at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:57)
> at $iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:59)
> at $iwC$$iwC$$iwC$$iwC.<init>(<console>:61)
> at $iwC$$iwC$$iwC.<init>(<console>:63)
> at $iwC$$iwC.<init>(<console>:65)
> at $iwC.<init>(<console>:67)
> at <init>(<console>:69)
> at .<init>(<console>:73)
> at .<clinit>(<console>)
> at .<init>(<console>:7)
> at .<clinit>(<console>)
> at $print(<console>)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(
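[Editor's note] The root cause here is that the `load data local inpath` path is handed to Hadoop's FileInputFormat, so a bare relative path like `../carbondata/sample.csv` is resolved against HDFS and fails with `InvalidInputException`. A tiny pre-flight check can catch this before submitting the load; the helper name below is ours, not a CarbonData API:

```python
from urllib.parse import urlparse

def looks_like_hdfs_path(path: str) -> bool:
    """Return True when the path carries an hdfs:// scheme.

    At the time of this thread, Carbon reads the load file through
    Hadoop's FileInputFormat, so a bare local path is resolved against
    HDFS and fails unless the file was copied there first
    (e.g. with `hdfs dfs -put sample.csv /data/`).
    """
    return urlparse(path).scheme == "hdfs"

print(looks_like_hdfs_path("../carbondata/sample.csv"))                # False
print(looks_like_hdfs_path("hdfs://localhost:54310/data/sample.csv"))  # True
```

Rejecting non-HDFS paths up front gives the user an actionable message instead of a long Spark stack trace like the one above.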
Re: [Discussion] Support String Trim For Table Level or Col Level
Trim the data during data loading.

2016-10-17 16:22 GMT+08:00 Ravindra Pesala <ravi.pes...@gmail.com>:

> Hi Lionx,
>
> Can you give more details on this feature?
> Are you talking about a trim() function at query time, or trimming the data
> while loading into Carbon?
>
> Regards,
> Ravi.
>
> On 17 October 2016 at 12:56, 向志强 <lionx.hua...@gmail.com> wrote:
>
> > Hi all,
> > We are trying to support a string trim feature in Carbon.
> > The feature will be set in "Create Table".
> > Please discuss at which level we should support this feature: table or column?
> >
> > Thx,
> > Lionx
>
> --
> Thanks & Regards,
> Ravi
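[Editor's note] To make the table-level vs. column-level choice concrete, load-time trimming could behave roughly like this sketch. It is plain Python; the `trim_columns` flag and helper are hypothetical, not CarbonData's implementation:

```python
def trim_row(row, trim_columns=None):
    """Trim string values at load time.

    trim_columns=None -> table-level trim: every string field is trimmed.
    trim_columns=set  -> column-level trim: only the named fields are trimmed.
    """
    return {
        col: value.strip()
        if isinstance(value, str) and (trim_columns is None or col in trim_columns)
        else value
        for col, value in row.items()
    }

raw = {"name": "  tammy ", "city": " pune ", "age": 22}

print(trim_row(raw))                         # table level: both strings trimmed
print(trim_row(raw, trim_columns={"name"}))  # column level: only name trimmed
```

Column-level gives finer control (some columns may carry significant whitespace), while table-level is the simpler DDL surface; supporting both, with the column list overriding the table default, is one possible resolution of the question above.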
[Discussion] Support Date/Time format for Timestamp columns to be defined at column level
Hi, all

In recent days I have been working on CARBONDATA-37. We are trying to support a Date format that can be set at column level.

One open question: when returning query results, should each Date column be formatted in its own declared format, or should all Date columns use one uniform format?

For example, we create a table and define two columns whose data type is Date, but whose formats differ:

col1(Date)   col2(Date)
2016-09-24   2016-09-25 00:00:00

When querying, which of the two outputs below should be returned?

col1(Date)   col2(Date)
2016-09-24   2016-09-25 00:00:00

or

col1(Date)            col2(Date)
2016-09-24 00:00:00   2016-09-25 00:00:00

if we set yyyy-MM-DD HH:MM:SS as the default format.

Best wishes!