Re: column auto mapping when loading data from csv file

2017-03-13 Thread Yinwei Li
...my opinion we should let the user decide this behavior. Regards, Manish Gupta. On Mon, Mar 13, 2017 at 7:48 AM, Yinwei Li <251469...@qq.com> wrote: > Hi all, > when loading data from a csv file to carbondata table, w

column auto mapping when loading data from csv file

2017-03-12 Thread Yinwei Li
Hi all, when loading data from a csv file to a carbondata table, we have two choices for mapping the columns from the csv file to the carbondata table: 1. add the columns' names at the start of the csv file; 2. declare the column mapping in the data loading script. Shall we add a feature which makes an
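For concreteness, a sketch of the two existing options against a hypothetical demo.sales table (paths, table, and column names are made up; the load syntax follows the CarbonData 1.0 DML docs):

    // 'carbon' is the SparkSession created via getOrCreateCarbonSession
    // Option 1: the csv file carries its own header row, so no mapping is declared
    carbon.sql("load data inpath 'hdfs://ns/data/sales.csv' into table demo.sales " +
      "OPTIONS('DELIMITER'=',')")

    // Option 2: headerless csv; the column mapping is declared in the load script
    carbon.sql("load data inpath 'hdfs://ns/data/sales_noheader.csv' into table demo.sales " +
      "OPTIONS('DELIMITER'=',', 'FILEHEADER'='s_id,s_name,s_amount')")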

Re: Removing of kettle code from Carbondata

2017-03-10 Thread Yinwei Li
+1. Well, I agree with Ravindra, as a better solution has been born and there seems to be no backward-compatibility issue in the data loading process. ------ Original ------ From: "Ravindra Pesala"; Date: 2017-03-11 (Sat) 10:21; To:

spark + carbondata vs. spark + parquet performance test under benchmark tpc-ds

2017-03-01 Thread Yinwei Li
Hi all, I've made a simple performance test under benchmark TPC-DS using Spark 2.1.0 + CarbonData 1.0.0 and Spark 2.1.0 + Parquet, and I've made a note of the whole process. Given the large amount of text, code, and tables, and for the convenience of updating the note, I put the

carbondata vs. impala performance test under benchmark tpc-ds

2017-02-24 Thread Yinwei Li
...performance with the new format, we have already raised PRs (584 and 586) for the same. They are still under review and will be merged soon. Once these PRs are merged, we will start verifying the TPC-DS performance as well. Regards, Ravindra. On 21 February 2017 at 13:48, Yinwei Li <251469...@qq

Re: carbondata performance test under benchmark tpc-ds

2017-02-21 Thread Yinwei Li
Bump ↑ haha~~~ ------ Original ------ From: "ﻬ.贝壳里的海" <251469...@qq.com>; Date: Mon, Feb 20, 2017 09:52 AM; To: "dev"; Subject: carbondata performance test under benchmark tpc-ds. Hi all, I've made a simple performance test

carbondata performance test under benchmark tpc-ds

2017-02-19 Thread Yinwei Li
Hi all, I've made a simple performance test under benchmark TPC-DS using Spark 2.1.0 + CarbonData 1.0.0, and the results seem unsatisfactory. The details are as follows: About Env: Hadoop 2.7.2 + Spark 2.1.0 + CarbonData 1.0.0. Cluster: 5 nodes, 32G mem per node. About TPC-DS:

Re: data lost when loading data from csv file to carbon table

2017-02-15 Thread Yinwei Li
Hi Ravindra: I added DICTIONARY_INCLUDE for each of them: carbon.sql("create table if not exists _1g.store_returns(sr_returned_date_sk integer, sr_return_time_sk integer, sr_item_sk integer, sr_customer_sk integer, sr_cdemo_sk integer, sr_hdemo_sk integer, sr_addr_sk integer, sr_store_sk
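For reference, a minimal sketch of the DDL pattern being described, with the column list shortened for readability; the STORED BY and TBLPROPERTIES clauses are assumed from the CarbonData 1.0 DDL docs, since the mail is truncated before them:

    // shortened column list; DICTIONARY_INCLUDE forces dictionary encoding on these columns
    carbon.sql(
      "create table if not exists _1g.store_returns(" +
      "sr_returned_date_sk integer, sr_return_time_sk integer, sr_item_sk integer) " +
      "STORED BY 'carbondata' " +
      "TBLPROPERTIES('DICTIONARY_INCLUDE'='sr_returned_date_sk,sr_return_time_sk,sr_item_sk')")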

Re: data lost when loading data from csv file to carbon table

2017-02-15 Thread Yinwei Li
...added to that bad record location. 2. You can alternatively verify by ignoring the bad records using the following command: carbon.sql(s"load data inpath '$src/web_sales.csv' into table _1g.web_sales OPTIONS('DELIMITER'='|', 'bad_records_logger_enable'='true', 'bad_records_action'='ignor
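A complete form of that command, as a sketch: the final option value is cut off in the archive and assumed here to be 'ignore'; the $src path variable comes from the original mail:

    // 'ignore' drops bad records from the load instead of failing it (assumed completion)
    carbon.sql(s"load data inpath '$src/web_sales.csv' into table _1g.web_sales " +
      "OPTIONS('DELIMITER'='|', 'bad_records_logger_enable'='true', " +
      "'bad_records_action'='ignore')")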

Re: data lost when loading data from csv file to carbon table

2017-02-14 Thread Yinwei Li
...10:41; To: "dev" <dev@carbondata.incubator.apache.org>; Subject: Re: data lost when loading data from csv file to carbon table. Hi, please set carbon.badRecords.location in carbon.properties and check whether any bad records are added to that location. Regards, Ravindra. On 14 Februa
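A sketch of the property being referenced, with a hypothetical path:

    # carbon.properties: directory where rejected rows are written during load
    carbon.badRecords.location=hdfs://ns/carbondata/badrecords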

data lost when loading data from csv file to carbon table

2017-02-14 Thread Yinwei Li
Hi all, I met a data loss problem when loading data from a csv file to a carbon table; here are some details: Env: Spark 2.1.0 + Hadoop 2.7.2 + CarbonData 1.0.0. Total records: 719,384; loaded records: 606,305 (SQL: select count(1) from table). My attempts: Attempt 1: Add option
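A minimal sketch of the kind of check described above (the path and table names are hypothetical):

    // load the csv, then compare the loaded count against the source record count
    carbon.sql("load data inpath 'hdfs://ns/tpcds/1g/web_sales.csv' into table _1g.web_sales " +
      "OPTIONS('DELIMITER'='|')")
    carbon.sql("select count(1) from _1g.web_sales").show()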

Re: Error while loading - Table is locked for updation. Please try after some time ( Spark 1.6.2 )

2017-02-06 Thread Yinwei Li
Hi Sanoj, maybe you can try initializing the CarbonContext by setting the storePath parameter as follows: scala> val storePath = "hdfs://localhost:9000/home/hadoop/carbondata/bin/carbonshellstore" scala> val cc = new CarbonContext(sc, storePath) --

discussions about the main content of the next version of the carbondata documentation

2017-01-18 Thread Yinwei Li
Hi all, the documentation of an incubating Apache project may be the most important way for starters to get in touch with it: it can not only help a starter get up to speed quickly, but also help skilled users learn more about the details and in turn contribute to the community. I'm now launching a