Re: [Discussion] Support Date/Time format for Timestamp columns to be defined at column level

2016-10-28 Thread
Hi, all
I have finished this issue.
Please review:
PR 219
https://github.com/apache/incubator-carbondata/pull/219

2016-09-24 16:41 GMT+08:00 向志强 <lionx.hua...@gmail.com>:

> Hi, all
>
> In recent days, I have been working on CARBONDATA-37: supporting a Date
> format that can be set at the column level.
>
> One open question: when returning query results, should each Date column
> keep its own format, or should all Date columns use a single uniform format?
>
> For example.
>
> Suppose we create a table with two columns of type Date, but with
> different Date formats:
>
> col1(Date)   col2(Date)
> 2016-09-24 2016-09-25 00:00:00
>
> When querying, which of the two results below should be returned?
>
>  col1(Date)   col2(Date)
> 2016-09-24 2016-09-25 00:00:00
>
> or
>
>  col1(Date) col2(Date)
> 2016-09-24 00:00:00 2016-09-25 00:00:00
>
> if we set YYYY-MM-DD HH:MM:SS as the default format.
>
>
> Best wishes!
>
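To make the two options concrete, here is a small Java sketch using SimpleDateFormat (the per-column format objects are an illustration of the idea only, not CarbonData's actual API), showing the difference between echoing each column in its own load format and normalizing everything to one uniform format:

```java
import java.text.SimpleDateFormat;
import java.util.Date;

public class ColumnFormatDemo {
    public static void main(String[] args) throws Exception {
        // Hypothetical per-column formats, mirroring the example in the mail.
        SimpleDateFormat col1Format = new SimpleDateFormat("yyyy-MM-dd");
        SimpleDateFormat uniformFormat = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");

        // col1 was loaded with the date-only format.
        Date col1 = col1Format.parse("2016-09-24");

        // Option 1: echo the column back in its own load format.
        System.out.println(col1Format.format(col1));    // 2016-09-24
        // Option 2: normalize every Date column to the uniform default.
        System.out.println(uniformFormat.format(col1)); // 2016-09-24 00:00:00
    }
}
```

Either way the stored value is the same instant; the question is only which formatter is applied on output.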


Re: Load data command Quote character unexpected behavior.

2016-10-21 Thread
Hi, Harmeet,
From your description above, I cannot reproduce the problem; I get the
correct result in my environment.
Please retry with the latest version, and share more details if it still fails.
Lionx

2016-10-21 15:07 GMT+08:00 Harmeet :

> Hey Team,
>
> I am using the load data command with a specific quote character, but after
> loading the data the quote character does not behave as expected. Below is
> my example:
>
> create table one (name string, description string, salary double, age int, dob timestamp) stored by 'carbondata';
>
> CSV file >>
>
> name, description, salary, age, dob
> tammy, $my name$, 90, 22, 19/10/2019
>
> 0: jdbc:hive2://127.0.0.1:1> load data local inpath
> 'hdfs://localhost:54310/home/harmeet/dollarquote.csv' into table one
> OPTIONS('QUOTECHAR'="$");
>
> Results >>
> 0: jdbc:hive2://127.0.0.1:1> select * from one;
> +-------+--------------+-------+---------+------+
> | name  | description  | dob   | salary  | age  |
> +-------+--------------+-------+---------+------+
> | tammy | $my name$    | NULL  | 90.0    | 22   |
> +-------+--------------+-------+---------+------+
>
> I expected only "my name" to be loaded into the description column, with
> the dollar signs stripped, but that is not happening. The behavior is the
> same if we use ' (single quote) as the quote character.
>
>
>
> --
> View this message in context: http://apache-carbondata-
> mailing-list-archive.1130556.n5.nabble.com/Load-data-
> command-Quote-character-unexpected-behavior-tp2145.html
> Sent from the Apache CarbonData Mailing List archive mailing list archive
> at Nabble.com.
>
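One possible explanation, offered as an assumption rather than a confirmed diagnosis of CarbonData's parser: most CSV parsers honor the quote character only when it is the very first character of a field, and the sample row has a space after each comma, so `$` is not at the start of the description field and is kept literally. A minimal Java sketch of that rule (the parseField helper is hypothetical, not CarbonData code):

```java
public class QuoteDemo {
    // Strip enclosing quotes only when the quote char starts the field.
    static String parseField(String raw, char quote) {
        if (raw.length() >= 2
                && raw.charAt(0) == quote
                && raw.charAt(raw.length() - 1) == quote) {
            return raw.substring(1, raw.length() - 1);
        }
        return raw; // quote not at field start: treated as literal data
    }

    public static void main(String[] args) {
        System.out.println(parseField("$my name$", '$'));  // my name
        System.out.println(parseField(" $my name$", '$')); //  $my name$
    }
}
```

If this is the cause, removing the space after the delimiter in the CSV (i.e. `tammy,$my name$,90,22,...`) should make the QUOTECHAR option take effect.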


Re: carbondata org.apache.thrift.TBaseHelper.hashCode(segment_id); issue

2016-10-19 Thread
Hi, jingwu,

CarbonData does not currently support loading data from a local path. Please
upload the file to HDFS first (e.g. with hdfs dfs -put) and test again.

Lionx

2016-10-19 16:55 GMT+08:00 仲景武 :

>
> Hi, all
>
> I have installed CarbonData successfully by following the document
> https://cwiki.apache.org/confluence/display/CARBONDATA/
>
> but loading data into a CarbonData table throws an exception:
>
>
> run command:
> cc.sql("load data local inpath '../carbondata/sample.csv' into table
> test_table")
>
> errors:
>
> org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path
> does not exist: /home/bigdata/bigdata/carbondata/sample.csv
> at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:321)
> at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:264)
> at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:385)
> at org.apache.spark.rdd.NewHadoopRDD.getPartitions(NewHadoopRDD.scala:120)
> at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
> at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
> at scala.Option.getOrElse(Option.scala:120)
> at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
> at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
> at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
> at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
> at scala.Option.getOrElse(Option.scala:120)
> at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
> at org.apache.spark.rdd.RDD$$anonfun$take$1.apply(RDD.scala:1307)
> at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
> at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:111)
> at org.apache.spark.rdd.RDD.withScope(RDD.scala:316)
> at org.apache.spark.rdd.RDD.take(RDD.scala:1302)
> at com.databricks.spark.csv.CarbonCsvRelation.firstLine$lzycompute(CarbonCsvRelation.scala:181)
> at com.databricks.spark.csv.CarbonCsvRelation.firstLine(CarbonCsvRelation.scala:176)
> at com.databricks.spark.csv.CarbonCsvRelation.inferSchema(CarbonCsvRelation.scala:144)
> at com.databricks.spark.csv.CarbonCsvRelation.<init>(CarbonCsvRelation.scala:74)
> at com.databricks.spark.csv.newapi.DefaultSource.createRelation(DefaultSource.scala:142)
> at com.databricks.spark.csv.newapi.DefaultSource.createRelation(DefaultSource.scala:44)
> at org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:158)
> at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:119)
> at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:109)
> at org.apache.carbondata.spark.util.GlobalDictionaryUtil$.loadDataFrame(GlobalDictionaryUtil.scala:386)
> at org.apache.carbondata.spark.util.GlobalDictionaryUtil$.generateGlobalDictionary(GlobalDictionaryUtil.scala:767)
> at org.apache.spark.sql.execution.command.LoadTable.run(carbonTableSchema.scala:1170)
> at org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult$lzycompute(commands.scala:58)
> at org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult(commands.scala:56)
> at org.apache.spark.sql.execution.ExecutedCommand.doExecute(commands.scala:70)
> at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:132)
> at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:130)
> at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
> at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:130)
> at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:55)
> at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:55)
> at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:145)
> at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:130)
> at org.apache.carbondata.spark.rdd.CarbonDataFrameRDD.<init>(CarbonDataFrameRDD.scala:23)
> at org.apache.spark.sql.CarbonContext.sql(CarbonContext.scala:137)
> at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:42)
> at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:47)
> at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:49)
> at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:51)
> at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:53)
> at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:55)
> at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:57)
> at $iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:59)
> at $iwC$$iwC$$iwC$$iwC.<init>(<console>:61)
> at $iwC$$iwC$$iwC.<init>(<console>:63)
> at $iwC$$iwC.<init>(<console>:65)
> at $iwC.<init>(<console>:67)
> at <init>(<console>:69)
> at .<init>(<console>:73)
> at .<clinit>(<console>)
> at .<init>(<console>:7)
> at .<clinit>(<console>)
> at $print(<console>)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(

Re: [Discussion] Support String Trim For Table Level or Col Level

2016-10-17 Thread
Trim the data during data loading.
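A minimal Java sketch of what trimming during load could look like (the trimRow helper is hypothetical, not the proposed CarbonData API): each string field of a parsed row is trimmed before it is written, whether that applies per table or per column is exactly the open question.

```java
public class TrimDemo {
    // Hypothetical helper: trim the string fields of one parsed CSV row
    // at load time, so trailing/leading whitespace never reaches storage.
    static String[] trimRow(String[] fields) {
        String[] out = new String[fields.length];
        for (int i = 0; i < fields.length; i++) {
            out[i] = (fields[i] == null) ? null : fields[i].trim();
        }
        return out;
    }

    public static void main(String[] args) {
        String[] row = {"  tammy ", " engineer  "};
        for (String f : trimRow(row)) {
            System.out.println("[" + f + "]"); // [tammy] then [engineer]
        }
    }
}
```

A column-level variant would consult per-column metadata inside the loop instead of trimming unconditionally.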

2016-10-17 16:22 GMT+08:00 Ravindra Pesala <ravi.pes...@gmail.com>:

> Hi Lionx,
>
> Can you give more details on this feature?
> Are you talking about trim() function while querying? Or trim the data
> while loading to carbon?
>
> Regards,
> Ravi.
>
> On 17 October 2016 at 12:56, 向志强 <lionx.hua...@gmail.com> wrote:
>
> > Hi all,
> > We are trying to support a string trim feature in Carbon.
> > The feature will be configured in "Create Table".
> > Please discuss at which level we should support this feature: table level or column level?
> >
> > Thx,
> > Lionx
> >
>
>
>
> --
> Thanks & Regards,
> Ravi
>


[Discussion] Support Date/Time format for Timestamp columns to be defined at column level

2016-09-24 Thread
Hi, all

In recent days, I have been working on CARBONDATA-37: supporting a Date
format that can be set at the column level.

One open question: when returning query results, should each Date column
keep its own format, or should all Date columns use a single uniform format?

For example.

Suppose we create a table with two columns of type Date, but with different
Date formats:

col1(Date)   col2(Date)
2016-09-24 2016-09-25 00:00:00

When querying, which of the two results below should be returned?

 col1(Date)   col2(Date)
2016-09-24 2016-09-25 00:00:00

or

 col1(Date) col2(Date)
2016-09-24 00:00:00 2016-09-25 00:00:00

if we set YYYY-MM-DD HH:MM:SS as the default format.


Best wishes!