My opinion:
> we should let the user decide this behavior.
>
> Regards
> Manish Gupta
>
> On Mon, Mar 13, 2017 at 7:48 AM, Yinwei Li <251469...@qq.com> wrote:
>
> > Hi all,
> >
> >
> > when loading data from a csv file to carbondata table, w
Hi all,
when loading data from a csv file to a carbondata table, we have two choices
for mapping the columns of the csv file to the carbondata table:
1. add the column names at the start of the csv file
2. declare the column mapping in the data loading script
shall we add a feature which makes an
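For reference, the two choices above can be sketched roughly as follows (a sketch only, assuming CarbonData's documented FILEHEADER load option; the table name, file name, and column names here are made-up placeholders):

```scala
// Choice 1: the csv file itself carries a header row as its first line, e.g.
//   id,name,city
//   1,Alice,Beijing
// and is loaded without a column-mapping option.

// Choice 2 (hypothetical names): declare the mapping in the load script via
// the FILEHEADER option, so the csv file needs no header row.
carbon.sql(
  s"load data inpath '$src/customers.csv' into table demo.customers " +
  "OPTIONS('DELIMITER'=',', 'FILEHEADER'='id,name,city')")
```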
+1
Well, I agree with Ravindra, as a better solution has emerged and there seems
to be no backward compatibility issue in the data loading process.
-- Original --
From: "Ravindra Pesala";
Date: Mar 11, 2017, 10:21 AM
To: "dev";
Hi all,
I've made a simple performance test under the TPC-DS benchmark using Spark
2.1.0 + CarbonData 1.0.0 and Spark 2.1.0 + Parquet, and I've made a note of
the whole process.
Given the volume of text, code, and tables, and for the convenience of
updating the note, I put the
…performance with the new format, we have already raised PRs (584 and 586)
for the same. They are still under review and will be merged soon. Once these
PRs are merged we will start verifying the TPC-DS performance as well.
Regards,
Ravindra.
On 21 February 2017 at 13:48, Yinwei Li <251469...@qq
up↑
haha~~~
-- Original --
From: "ﻬ.贝壳里的海";<251469...@qq.com>;
Date: Mon, Feb 20, 2017 09:52 AM
To: "dev";
Subject: carbondata performance test under benchmark tpc-ds
Hi all,
I've made a simple performance test under the TPC-DS benchmark using Spark
2.1.0 + CarbonData 1.0.0, but the result seems unsatisfactory. The details
are as follows:
About Env:
Hadoop 2.7.2 + Spark 2.1.0 + CarbonData 1.0.0
Cluster: 5 nodes, 32G mem per node
About TPC-DS:
Hi Ravindra:
I add DICTIONARY_INCLUDE for each of them:
carbon.sql("create table if not exists _1g.store_returns(sr_returned_date_sk
integer, sr_return_time_sk integer, sr_item_sk integer, sr_customer_sk integer,
sr_cdemo_sk integer, sr_hdemo_sk integer, sr_addr_sk integer, sr_store_sk
…added to that bad record location.
2. You can alternatively verify by ignoring the bad records by using the
following command:
carbon.sql(s"load data inpath '$src/web_sales.csv' into table _1g.web_sales
OPTIONS('DELIMITER'='|', 'bad_records_logger_enable'='true',
'bad_records_action'='ignore')")
-- Original --
Date: 10:41
To: "dev"<dev@carbondata.incubator.apache.org>;
Subject: Re: data lost when loading data from csv file to carbon table
Hi,
Please set carbon.badRecords.location in carbon.properties and check whether
any bad records are added to that location.
Regards,
Ravindra.
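The setup Ravindra describes looks roughly like this (a sketch; the directory path is a placeholder, not from the thread):

```
# carbon.properties (path is a placeholder)
carbon.badRecords.location=/home/hadoop/carbondata/badrecords
```

With that location set, the per-load behavior is then chosen via the `bad_records_action` load option, whose documented values in CarbonData are FORCE, REDIRECT, IGNORE, and FAIL.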
On 14 Februa
Hi all,
I met a data loss problem when loading data from a csv file to a carbon
table; here are some details:
Env: Spark 2.1.0 + Hadoop 2.7.2 + CarbonData 1.0.0
Total Records: 719,384
Loaded Records: 606,305 (SQL: select count(1) from table)
My Attempts:
Attempt 1: Add option
Hi Sanoj,
maybe you can try initializing CarbonContext by setting the parameter
storePath as follows:
scala> val storePath =
"hdfs://localhost:9000/home/hadoop/carbondata/bin/carbonshellstore"
scala> val cc = new CarbonContext(sc, storePath)
--
Hi all,
Documentation may be the most important way for newcomers to get in touch
with an incubating Apache project; it not only helps a starter ramp up
quickly, but also helps skilled users learn more about the details and, in
turn, contribute to the community.
I'm now launching a