Re: [DISCUSSION] Forceful minor Compaction

2017-04-19 Thread Liang Chen
Hi Kunal, Thank you for raising this good topic for discussion. First, let us think about why users would want forceful minor compaction, and in which cases. Can the current "MAJOR compaction" cover the "forceful MINOR compaction" scenarios? As we know, compaction is mainly for optimizing index

[jira] [Created] (CARBONDATA-944) Fix wrong log info during drop table in spark-shell

2017-04-17 Thread Liang Chen (JIRA)
Liang Chen created CARBONDATA-944: - Summary: Fix wrong log info during drop table in spark-shell Key: CARBONDATA-944 URL: https://issues.apache.org/jira/browse/CARBONDATA-944 Project: CarbonData

Re: java.io.FileNotFoundException: file:/data/carbon_data/default/carbon_table/Metadata/schema.write

2017-04-15 Thread Liang Chen
Hi Please check whether you have permission for the directory Constants.METASTORE_DB; you can use "chmod" to add the permission. Regards Liang xm_zzc wrote > Hi all: > Please help. I directly ran a CarbonData demo program on Eclipse, which > copy from >

Re: CarbonData performance benchmarking

2017-04-12 Thread Liang Chen
Hi 1. Did you use the latest master version or 1.0? I suggest you use master to test. 2. Have you tested other TPC-H queries that include where/filter clauses? 3. In your case, is the query slow, or is the "write.format" below slow? write.format("csv").save("hdfs://hdfsmaster/output/carbon/proj1/")

[jira] [Created] (CARBONDATA-895) Fix license header checking issues

2017-04-10 Thread Liang Chen (JIRA)
Liang Chen created CARBONDATA-895: - Summary: Fix license header checking issues Key: CARBONDATA-895 URL: https://issues.apache.org/jira/browse/CARBONDATA-895 Project: CarbonData Issue Type

[jira] [Created] (CARBONDATA-891) Fix compilation issue of AlterTableValidationTestCase generate new folder "carbon.store"

2017-04-09 Thread Liang Chen (JIRA)
Liang Chen created CARBONDATA-891: - Summary: Fix compilation issue of AlterTableValidationTestCase generate new folder "carbon.store" Key: CARBONDATA-891 URL: https://issues.apache.org/jira/browse/CARB

Re: [DISCUSSION]implement delta encoding for numeric type column in SORT_COLUMNS

2017-04-05 Thread Liang Chen
Hi David, Thanks for starting the discussion of this new feature. Can you explain the major benefits of delta encoding for numeric type columns? Regards Liang 2017-04-05 16:01 GMT+05:30 QiangCai : > Hi all, > > Now we plan to implement delta encoding
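For readers following this thread, a generic sketch of what delta encoding does to a sorted numeric column may help. It stores the first value as-is and every later value as its difference from the previous one, so sorted data yields small numbers. This is an illustration only: the object name `DeltaEncodingSketch` and both method names are hypothetical, and nothing here is taken from the CarbonData implementation discussed in the thread.

```scala
// Hypothetical illustration of delta encoding; not CarbonData's encoder.
object DeltaEncodingSketch {
  // Keep the first value, then store each value as (value - previous value).
  def encode(sorted: Seq[Long]): Seq[Long] =
    if (sorted.isEmpty) Seq.empty
    else sorted.head +: sorted.zip(sorted.tail).map { case (prev, next) => next - prev }

  // Rebuild the original values with a running sum over the deltas.
  def decode(deltas: Seq[Long]): Seq[Long] =
    if (deltas.isEmpty) Seq.empty
    else deltas.tail.scanLeft(deltas.head)(_ + _)
}
```

For example, `DeltaEncodingSketch.encode(Seq(1000L, 1003L, 1004L, 1010L))` keeps 1000 and then stores the gaps 3, 1 and 6; `decode` restores the original sequence.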

[jira] [Created] (CARBONDATA-872) Fix comment issues of integration/presto for easier reading

2017-04-05 Thread Liang Chen (JIRA)
Liang Chen created CARBONDATA-872: - Summary: Fix comment issues of integration/presto for easier reading Key: CARBONDATA-872 URL: https://issues.apache.org/jira/browse/CARBONDATA-872 Project

Re: Dimension column of integer type - to exclude from dictionary

2017-04-04 Thread Liang Chen
Hi Sanoj, First, let me check that I understand your requirement: you only want to build an index for column "Account", but don't want to build a dictionary for it, is that right? If my understanding is right, then the "SORT_COLUMNS" feature David mentioned will satisfy your requirements.

[jira] [Created] (CARBONDATA-850) Fix the comment definition issues of CarbonData thrift files

2017-04-04 Thread Liang Chen (JIRA)
Liang Chen created CARBONDATA-850: - Summary: Fix the comment definition issues of CarbonData thrift files Key: CARBONDATA-850 URL: https://issues.apache.org/jira/browse/CARBONDATA-850 Project

Re: Problem with creating a table in Spark 2.

2017-04-03 Thread Liang Chen
Hi Please check whether the path below is correct on your machine: /user/hive/warehouse/carbon/ Regards Liang 2017-04-03 18:05 GMT+05:30 Marek Wiewiorka : > Hi All - I'm trying to follow an example from the quick start guide and in > spark-shell trying to create a

Re: Question about loading the data dictionary

2017-04-01 Thread Liang Chen
t I don't know which side of the generated dictionary file path > > > -- Original message ------ > *From:* "Liang Chen";<chenliang...@apache.org>; > *Sent:* Saturday, April 1, 2017, 4:49 PM > *To:* "于天星"<784606...@qq.com>; > *Subject:* Re: About loading the data dictionary

Re: Need help in configuring dataload.properties

2017-03-30 Thread Liang Chen
Hi Please refer to : https://github.com/apache/incubator-carbondata/blob/master/docs/installation-guide.md Regards Liang 2017-03-30 19:19 GMT+05:30 Srinath Thota : > Hi Team, > > > I have configured Carbon in spark standalone mode as per the documents and > available

Re: Re:Re: Re: Optimize Order By + Limit Query

2017-03-30 Thread Liang Chen
Hi +1 for simafengyun's optimization, it looks good to me. I propose to do "limit" pushdown first, similar to filter pushdown. What is your opinion? @simafengyun For "order by" pushdown, let us work out an ideal solution that considers all aggregation pushdown cases. Ravindra's comment is

Re: [DISCUSSION]: (New Feature) Streaming Ingestion into CarbonData

2017-03-29 Thread Liang Chen
Hi Aniket, Thanks for your great contribution. The feature of ingesting streaming data into CarbonData would be very useful for some real-time query scenarios. Some inputs from my side: 1. I agree with approach 2 for the streaming file format; query performance must be ensured. 2. Whether

Re: carbondata find a bug

2017-03-27 Thread Liang Chen
Hi tianli, First, please send mail to dev-subscr...@carbondata.incubator.apache.org to join the mailing list. Then you can send and receive mail from dev@carbondata.incubator.apache.org. Can you raise one JIRA at https://issues.apache.org/jira/browse/CARBONDATA, and raise one pull request

Re: question about dimension's sort order in blocklet level

2017-03-27 Thread Liang Chen
Hi Can you provide one table to show your info? I can't see it very clearly. A column of high cardinality(>100) would not be dictionary encoded. Regards Liang 2017-03-27 14:32 GMT+05:30 马云 : > Hi DEV, > > I create table according to the below SQL > > cc.sql(""" > >

[jira] [Created] (CARBONDATA-826) Create carbondata-connector of presto for supporting presto query carbon data

2017-03-27 Thread Liang Chen (JIRA)
Liang Chen created CARBONDATA-826: - Summary: Create carbondata-connector of presto for supporting presto query carbon data Key: CARBONDATA-826 URL: https://issues.apache.org/jira/browse/CARBONDATA-826

Re: Re:Re:Re:Re:Re:Re: insert into carbon table failed

2017-03-27 Thread Liang Chen
Hi Please enable the vector reader; it might help the limit query. import org.apache.carbondata.core.util.CarbonProperties import org.apache.carbondata.core.constants.CarbonCommonConstants CarbonProperties.getInstance().addProperty(CarbonCommonConstants.ENABLE_VECTOR_READER, "true") Regards Liang a wrote

Re: Re:Re:Re:Re:Re:Re: insert into carbon table failed

2017-03-26 Thread Liang Chen
Hi 1. In your current test environment (CarbonData 1.0 + Spark 1.6), please divide the data (2 billion) into 4 pieces (0.5 billion each) and load the data again. 2. For CarbonData 1.0 + Spark 1.6 with kettle for loading data, please configure the below 3 parameters in carbon.properties (note: please copy

Re: [DISCUSSION] Initiating Apache CarbonData-1.1.0 incubating Release

2017-03-26 Thread Liang Chen
Hi Yes, the update and delete feature with spark-2.x will be supported after 1.1.0. As planned, 1.2 or an earlier release would support it. Regards Liang xm_zzc wrote > Hi, does this version support for the updating and deleting with > spark-2.1? Seems like it does not support, what time is it planned to >

Re: [DISCUSSION] Initiating Apache CarbonData-1.1.0 incubating Release

2017-03-26 Thread Liang Chen
Hi +1 for starting to prepare new release 1.1 Great progress, new file format V3 would significantly improve performance. Regards Liang 2017-03-26 10:46 GMT+05:30 Ravindra Pesala : > Hi All, > > As planned we are going to release Apache CarbonData-1.1.0. Please discuss >

Re: Questions about dictionary-encoded column and MDK

2017-03-25 Thread Liang Chen
lared in create table statement > > On Thu, Mar 23, 2017 at 11:51 PM, Liang Chen <chenliang6...@gmail.com> > wrote: > > > Hi > > > > 1.System makes MDK index for dimensions(string columns as dimensions, > > numeric > > columns as measures) , so you have to

Re: insert into carbon table failed

2017-03-25 Thread Liang Chen
Hi Please provide all columns' cardinality info(distinct value). Regards Liang ww...@163.com wrote > Hello! > > 0、The failure > When i insert into carbon table,i encounter failure。The failure is as > follow: > Job aborted due to stage failure: Task 0 in stage 2.0 failed 4 times, most >

[jira] [Created] (CARBONDATA-817) Optimize performance by leveraging CarbonData's unique features

2017-03-24 Thread Liang Chen (JIRA)
Liang Chen created CARBONDATA-817: - Summary: Optimize performance by leveraging CarbonData's unique features Key: CARBONDATA-817 URL: https://issues.apache.org/jira/browse/CARBONDATA-817 Project

[jira] [Created] (CARBONDATA-816) Add examples for hive integration under /Examples

2017-03-24 Thread Liang Chen (JIRA)
Liang Chen created CARBONDATA-816: - Summary: Add examples for hive integration under /Examples Key: CARBONDATA-816 URL: https://issues.apache.org/jira/browse/CARBONDATA-816 Project: CarbonData

[jira] [Created] (CARBONDATA-815) Add basic hive integration code

2017-03-24 Thread Liang Chen (JIRA)
Liang Chen created CARBONDATA-815: - Summary: Add basic hive integration code Key: CARBONDATA-815 URL: https://issues.apache.org/jira/browse/CARBONDATA-815 Project: CarbonData Issue Type: Sub

[jira] [Created] (CARBONDATA-813) Fix pom issues and add the correct dependency jar to build success for integration/presto

2017-03-23 Thread Liang Chen (JIRA)
Liang Chen created CARBONDATA-813: - Summary: Fix pom issues and add the correct dependency jar to build success for integration/presto Key: CARBONDATA-813 URL: https://issues.apache.org/jira/browse/CARBONDATA-813

Re: Questions about dictionary-encoded column and MDK

2017-03-23 Thread Liang Chen
Hi 1. The system builds the MDK index on dimensions (string columns are dimensions, numeric columns are measures), so you have to specify at least one dimension (string column) for building the MDK index. 2. You can set a numeric column with DICTIONARY_INCLUDE or DICTIONARY_EXCLUDE to build the MDK index. For case2,

Re: [apache/incubator-carbondata] [CARBONDATA-727][WIP] add hiveintegration for carbon (#672)

2017-03-23 Thread Liang Chen
lter table hive_carbon add columns(name string, scale decimal, country > string, salary double); > > > > > > 6.check table schema > > > execute "show create table hive_carbon" > > > > > > 7. execute "select * from hive_carbon" and "

Re: Questions about dictionary-encoded column and MDK

2017-03-23 Thread Liang Chen
Hi Can you provide your full exception info. Regards Liang 2017-03-23 13:54 GMT+05:30 Jin Zhou : > Hi, > > Recently I'm doing some tests on spark2.1.0+carbondata1.0.0 and have some > questions: > > 1)Exception is thrown when table created without any dictionary column. > Does

[jira] [Created] (CARBONDATA-808) Create PrestoExample

2017-03-23 Thread Liang Chen (JIRA)
Liang Chen created CARBONDATA-808: - Summary: Create PrestoExample Key: CARBONDATA-808 URL: https://issues.apache.org/jira/browse/CARBONDATA-808 Project: CarbonData Issue Type: Sub-task

[jira] [Created] (CARBONDATA-807) Add the basic presto integration code

2017-03-22 Thread Liang Chen (JIRA)
Liang Chen created CARBONDATA-807: - Summary: Add the basic presto integration code Key: CARBONDATA-807 URL: https://issues.apache.org/jira/browse/CARBONDATA-807 Project: CarbonData Issue

[jira] [Created] (CARBONDATA-805) Fix groupid,package name,Class name issues

2017-03-22 Thread Liang Chen (JIRA)
Liang Chen created CARBONDATA-805: - Summary: Fix groupid,package name,Class name issues Key: CARBONDATA-805 URL: https://issues.apache.org/jira/browse/CARBONDATA-805 Project: CarbonData

Re: Removing of kettle code from Carbondata

2017-03-10 Thread Liang Chen
Hi Agree, +1. The new data load (through Spark) is quite stable and performs well, so I agree to remove the kettle flow for data loading. Regards Liang 2017-03-11 9:51 GMT+08:00 Ravindra Pesala : > Hi All, > > I guess it is time to remove the kettle flow from Carbondata

Apache CarbonData got the BLACKDUCK award: https://www.blackducksoftware.com/open-source-rookies-2016

2017-03-10 Thread Liang Chen
Hi ALL *Apache CarbonData got the BLACKDUCK award: * https://www.blackducksoftware.com/open-source-rookies-2016: For nine years, the Black Duck Open Source Rookies of the Year awards have recognized some of the most innovative and influential open source projects launched during the previous

Re: [New Feature] Alter table support in carbondata

2017-03-09 Thread Liang Chen
Hi Thanks for starting this discussion of the alter table feature. A couple of comments: 1. For "change of data type", is only INT to BIGINT supported, or not? 2. Is it supported to adjust the order of columns for MDK and run compaction to re-sort data as per the new order of columns, or

Re: I loaded the data with the timestamp field unsuccessful

2017-03-08 Thread Liang Chen
Hi Has the issue been fixed? BTW, you don't need to add the date column to DICTIONARY_INCLUDE; an index is built for date/timestamp columns. Regards Liang kex wrote > I loaded the data with the timestamp field unsuccessful,and timestamp > field is null. > > my sql: > carbon.sql("create TABLE IF NOT EXISTS

Re: Apache CarbonData online meetup on 13th Mar,2017

2017-03-08 Thread Liang Chen
Hi phalodi Sorry for this. Apache CarbonData community will organize meetup in India soon. Regards Liang phalodi wrote > Hi , I also want to join this meetup but when i register for the meetup > and proceed to pay it will not show the indian banks for payment options. > > On Tue, Mar 7, 2017

[jira] [Created] (CARBONDATA-753) Fix Date and Timestamp format issues

2017-03-08 Thread Liang Chen (JIRA)
Liang Chen created CARBONDATA-753: - Summary: Fix Date and Timestamp format issues Key: CARBONDATA-753 URL: https://issues.apache.org/jira/browse/CARBONDATA-753 Project: CarbonData Issue Type

Apache CarbonData online meetup on 13th Mar,2017

2017-03-06 Thread Liang Chen
Hi all Welcome to attend Apache CarbonData online meetup on 13th Mar,2017, you can register at : http://edu.csdn.net/huiyiCourse/detail/342 This meetup will focus on introducing code modules. Regards Liang -- View this message in context:

[jira] [Created] (CARBONDATA-750) Improve exception information description while user input wrong creation table script

2017-03-06 Thread Liang Chen (JIRA)
Liang Chen created CARBONDATA-750: - Summary: Improve exception information description while user input wrong creation table script Key: CARBONDATA-750 URL: https://issues.apache.org/jira/browse/CARBONDATA-750

[jira] [Created] (CARBONDATA-749) Unexpected error log message while dropping carbon table

2017-03-06 Thread Liang Chen (JIRA)
Liang Chen created CARBONDATA-749: - Summary: Unexpected error log message while dropping carbon table Key: CARBONDATA-749 URL: https://issues.apache.org/jira/browse/CARBONDATA-749 Project: CarbonData

Re: [DISCUSS] For the dimension default should be no dictionary

2017-02-28 Thread Liang Chen
Hi A couple of questions: 1) For the SORT_KEY option: are "MDK index, inverted index, minmax index" built only for the columns specified in the option (SORT_KEY)? 2) If users don't specify TABLE_DICTIONARY, then no columns get dictionary encoding, and all shuffle operations are

Re: carbondata vs. impala performance test under benchmark tpc-ds

2017-02-25 Thread Liang Chen
Hi Thank you for sharing the test result. It would be more reasonable to do the test comparison with the same compute engine: Spark 2.1+parquet, Spark 2.1+carbondata. Are you interested in doing this test (carbondata, parquet) along with us? Regards Liang 李寅威 wrote > Hi all,

Re: [DISCUSS] Graduation to a TLP (Top Level Project)

2017-02-20 Thread Liang Chen
Hi JB, Thanks for starting the discussion and driving it. I will ping you by skype and email to complete some TODO tasks. One query: in the license analysis section, why are there many unknown licenses? Do we need to fix it? Regards Liang -- View this message in context:

Re: data lost when loading data from csv file to carbon table

2017-02-20 Thread Liang Chen
Hi I already raised one JIRA issue on how to handle the bad records: https://issues.apache.org/jira/browse/CARBONDATA-714 Regards Liang -- View this message in context:

Re: Introducing V3 format.

2017-02-15 Thread Liang Chen
ld store. So > backward compatibility works even though we jump to V3 format. > > Regards, > Ravindra. > > On 16 February 2017 at 04:18, Liang Chen > chenliang6136@ > wrote: > >> Hi Ravi >> >> Thank you bringing the discussion to mailing list, i h

Re: Exception throws when I load data using carbondata-1.0.0

2017-02-15 Thread Liang Chen
Hi He xiaoqiao, The quick start uses Spark in local mode. Your case is a yarn cluster, please check : https://github.com/apache/incubator-carbondata/blob/master/docs/installation-guide.md Regards Liang 2017-02-15 3:29 GMT-08:00 Xiaoqiao He : > hi Manish Gupta, > > Thanks for you

Re: Introducing V3 format.

2017-02-15 Thread Liang Chen
Hi Ravi, Thank you for bringing the discussion to the mailing list. I have one question: how do we ensure backward compatibility after introducing the new format? Regards Liang Jean-Baptiste Onofré wrote > Agree. > > +1 > > Regards > JB > > On Feb 15, 2017, 09:09, at 09:09, Kumar Vishal >

[jira] [Created] (CARBONDATA-703) Update build command after optimizing thrift compile issues

2017-02-11 Thread Liang Chen (JIRA)
Liang Chen created CARBONDATA-703: - Summary: Update build command after optimizing thrift compile issues Key: CARBONDATA-703 URL: https://issues.apache.org/jira/browse/CARBONDATA-703 Project

Re: why there are no join in the official benchmark test

2017-02-06 Thread Liang Chen
Hi We are testing based on the TPC-H/TPC-DS benchmarks; the report will be shared soon. Regards Liang 2017-02-07 1:28 GMT-05:00 Yinwei Li <251469...@qq.com>: > Hi all, > > > In Apache CarbonData Performance Benchmark(0.1.0) there are no join in > all SQLs, what's the main reason? > > > I want to

Re: [ANNOUNCE] Apache CarbonData 1.0.0-incubating released

2017-02-06 Thread Liang Chen
-hoc queries. How can > I leverage CarbonData for my business, please? > > On Sun, Feb 5, 2017 at 5:27 PM, Liang Chen <chenliang6...@gmail.com> > wrote: > > > Hi xiaoqiao > > > > Very happy to see that you will keep contributing on CarbonData, "Do

Re: Discussion about getting excution duration about a query when using sparkshell+carbondata

2017-02-06 Thread Liang Chen
Hi I used the below method in spark shell for DEMO, for your reference: import org.apache.spark.sql.catalyst.util._ benchmark { carbondf.filter($"name" === "Allen" and $"gender" === "Male" and $"province" === "NB" and $"singler" === "false").count } Regards Liang 2017-02-06 22:07 GMT-05:00
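If the Spark helper above is not on the classpath, a small stand-in can be defined directly in the shell. The sketch below is an assumption-labelled alternative, not part of CarbonData or Spark: the names `TimingSketch` and `timed` are hypothetical, and it measures a single run by wall clock only.

```scala
// Hypothetical stand-in for a timing helper; uses only the JDK clock.
object TimingSketch {
  // Runs `body` once and returns its result with the elapsed wall-clock milliseconds.
  def timed[A](body: => A): (A, Long) = {
    val start = System.nanoTime()
    val result = body
    val elapsedMs = (System.nanoTime() - start) / 1000000L
    (result, elapsedMs)
  }
}
```

Usage would look like `val (rows, ms) = TimingSketch.timed { carbondf.filter(...).count }`, where `carbondf` is the DataFrame from the message above. A single run includes warm-up cost, so repeating the query a few times gives a fairer number.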

Re: [ANNOUNCE] Apache CarbonData 1.0.0-incubating released

2017-02-05 Thread Liang Chen
Hi xiaoqiao, Very happy to see that you will keep contributing to CarbonData; "Double Array Trie" is really a good feature to improve the dictionary part. Yes, CarbonData's goal is to solve complex and diverse scenarios. Please let us(community) know if you deploy CarbonData on scenario system

[jira] [Created] (CARBONDATA-695) Create DataFrame example in example/spark2, read carbon data to dataframe

2017-02-04 Thread Liang Chen (JIRA)
Liang Chen created CARBONDATA-695: - Summary: Create DataFrame example in example/spark2, read carbon data to dataframe Key: CARBONDATA-695 URL: https://issues.apache.org/jira/browse/CARBONDATA-695

[jira] [Created] (CARBONDATA-694) Optimize quick start document through adding hdfs as storepath

2017-02-04 Thread Liang Chen (JIRA)
Liang Chen created CARBONDATA-694: - Summary: Optimize quick start document through adding hdfs as storepath Key: CARBONDATA-694 URL: https://issues.apache.org/jira/browse/CARBONDATA-694 Project

Re: store location can't be found

2017-02-03 Thread Liang Chen
Hi Have you configured as per the guide : https://github.com/apache/incubator-carbondata/blob/master/docs/installation-guide.md Regards Liang 2017-02-04 10:42 GMT+08:00 Mars Xu : > Hello All, > I met a problem of file not exist. it looks like the store >

[jira] [Created] (CARBONDATA-679) Add examples read CarbonData file to dataframe in Spark 2.1

2017-01-24 Thread Liang Chen (JIRA)
Liang Chen created CARBONDATA-679: - Summary: Add examples read CarbonData file to dataframe in Spark 2.1 Key: CARBONDATA-679 URL: https://issues.apache.org/jira/browse/CARBONDATA-679 Project

Re: Re: Failed to APPEND_FILE, hadoop.hdfs.protocol.AlreadyBeingCreatedException

2017-01-20 Thread Liang Chen
Hi mvn -DskipTests -Pspark-1.5 -Dspark.version=1.5.2 clean package Please refer to build doc: https://github.com/apache/incubator-carbondata/tree/master/build Regards Liang 2017-01-20 16:00 GMT+08:00 彭 : > I build the jar with hadoop2.6, like "mvn package -DskipTests >

Re: question about presto integration

2017-01-17 Thread Liang Chen
Hi 1. Yes, CarbonData would consider broader integration with different engines, including presto. 2. As far as I know, one contributor from ctrip is working on the integration between CarbonData and Presto; once this contributor finishes it, this feature will be considered for the roadmap. Regards

Re: discussion about benchmark standard that carbondata used

2017-01-15 Thread Liang Chen
Hi Agree. Currently we are testing as per TPC-H. In the future we will also test TPC-DS; do you want to join us for the benchmark test work? Regards Liang 2017-01-16 8:58 GMT+08:00 251469031 <251469...@qq.com>: > Hi all, > > > Benchmark test can measure the performance of a system.

[jira] [Created] (CARBONDATA-639) "Delete data" feature doesn't work

2017-01-14 Thread Liang Chen (JIRA)
Liang Chen created CARBONDATA-639: - Summary: "Delete data" feature doesn't work Key: CARBONDATA-639 URL: https://issues.apache.org/jira/browse/CARBONDATA-639 Project: CarbonData

Re: [jira] [Created] (CARBONDATA-624) Complete CarbonData document to be present in git and the same needs to sync with the carbondata.apace.org and for further updates.

2017-01-11 Thread Liang Chen
OK, thank you for starting this work. One thing please note: please only put .md files on github; adding other kinds of files, like pdf, text and so on, is not suggested. Regards Liang -- View this message in context:

Re: [VOTE] Apache CarbonData 1.0.0-incubating (RC1)

2017-01-10 Thread Liang Chen
> Thanks for the wonderful working. > > I am very interesting and want the following features from a customer view. > > > > [+1] Support Spark2.1 > [+1]New load data solution without kettle > [-1] IUD(Supported by Spark 1.5) > [+1]Performance improvement > > > > >

Re: [VOTE] Apache CarbonData 1.0.0-incubating (RC1)

2017-01-10 Thread Liang Chen
[+1] Support Spark2.1 > [+1]New load data solution without kettle > [-1] IUD(Supported by Spark 1.5) > [+1]Performance improvement > > > > > > On Jan 11, 2017, 12:14 AM +0800, Liang Chen , wrote: > > Hi > > > > Please vote on releasing the following

[VOTE] Apache CarbonData 1.0.0-incubating (RC1)

2017-01-10 Thread Liang Chen
Hi Please vote on releasing the following candidate as Apache CarbonData version 1.0.0. The vote will be open at least for 72 hours, If this vote passes (we need at least 3 binding votes, meaning three votes from the PPMC), I will forward to gene...@incubator.apache.org for the IPMC votes. [ ]

[jira] [Created] (CARBONDATA-616) Remove the duplicated class CarbonDataWriterException.java

2017-01-10 Thread Liang Chen (JIRA)
Liang Chen created CARBONDATA-616: - Summary: Remove the duplicated class CarbonDataWriterException.java Key: CARBONDATA-616 URL: https://issues.apache.org/jira/browse/CARBONDATA-616 Project

Re: Problem while copying file from local store to carbon store

2017-01-09 Thread Liang Chen
putStream.open0(Native Method) > ... > INFO 10-01 10:29:59,547 - [test_table: Graph - > MDKeyGentest_table][partitionID:0] > ---logs print by liyinwei end - > ERROR 10-01 10:29:59,547 - [test_table: Graph - > MDKeyGentest_table][partiti

Re: Problem while copying file from local store to carbon store

2017-01-09 Thread Liang Chen
Hi Please use spark-shell to create carboncontext, you can refer to these articles : https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=67635497 Regards Liang -- View this message in context:

Re: minor compact throw err 'IndexBuilderException'

2017-01-05 Thread Liang Chen
Hi 1. I just tested it on my machine with the 0.2 version, and it is working fine. - scala> cc.sql("ALTER TABLE connectdemo1 COMPACT 'MINOR'") INFO 05-01 23:46:54,111 - main Query [ALTER TABLE CONNECTDEMO1 COMPACT 'MINOR'] INFO 05-01

Re: [UT Fail Report] UT can not pass when run with branch master

2017-01-04 Thread Liang Chen
Hi It is fixed, now the master can pass compilation. Thanks for pointing it out. Regards Liang hexiaoqiao wrote > UT fails when run with branch master of carbondata ( > https://github.com/apache/incubator-carbondata/tree/master). > > exception as following: > >>

Re: Re: how to make carbon run faster

2017-01-04 Thread Liang Chen
Hi First: I suggest you reload the data, loading all 35G of data in one go, and check the query effectiveness again. Second: After you finish the above E2E test, you will understand the whole process of Carbon. Then I suggest you start to read the source code and some technical documents for further

Re: how to make carbon run faster

2017-01-01 Thread Liang Chen
Hi Thanks for starting to try the Apache CarbonData project. There may be various reasons for the test result; I assume that you made a time-based partition for the ORC data, right? 1. Can you tell how many rows of data the SQL returned? 2. You can try more SQL queries, for example : select *

[jira] [Created] (CARBONDATA-575) Remove integration-testcases module

2016-12-28 Thread Liang Chen (JIRA)
Liang Chen created CARBONDATA-575: - Summary: Remove integration-testcases module Key: CARBONDATA-575 URL: https://issues.apache.org/jira/browse/CARBONDATA-575 Project: CarbonData Issue Type

Re: Re: Dictionary file is locked for updation

2016-12-27 Thread Liang Chen
Hi Updated, thanks for pointing out the issue. Regards Liang 李寅威 wrote > thx QiangCai, the problem is solved. > > > so, maybe it's better to correct the document at > https://cwiki.apache.org/confluence/display/CARBONDATA/Cluster+deployment+guide, > change the value of

Re: [Discussion]Simplify the deployment of carbondata

2016-12-25 Thread Liang Chen
Hi Thanks for starting a good discussion. For 1 and 2, I agree; the 1.0.0 version will support it. For 3: we need to keep the parameter, so users can specify carbon's store location. If users don't specify the carbon store location, we can use the default location that you suggested:

Re: [jira] [Created] (CARBONDATA-559) Job failed at last step

2016-12-25 Thread Liang Chen
Copied the below information from Apache JIRA. -- Hi Lionel, The global dictionary is generated successfully, but the data loading graph is not started because it seems that kettle home on the executor side is not set properly, as displayed in the logs. INFO 23-12 16:58:47,461 -

Re: [jira] [Created] (CARBONDATA-562) Carbon Context initialization is failed with spark 1.6.3

2016-12-24 Thread Liang Chen
Hi Babulal, CarbonData doesn't support spark 1.6.3 yet; you can try spark 1.6.1 and 1.6.2. Please refer to : https://cwiki.apache.org/confluence/display/CARBONDATA/Building+CarbonData+And+IDE+Configuration Regards Liang 2016-12-25 13:51 GMT+08:00 Babulal (JIRA) : > Babulal created

[jira] [Created] (CARBONDATA-561) Merge the two CarbonOption.scala into one under spark-common

2016-12-24 Thread Liang Chen (JIRA)
Liang Chen created CARBONDATA-561: - Summary: Merge the two CarbonOption.scala into one under spark-common Key: CARBONDATA-561 URL: https://issues.apache.org/jira/browse/CARBONDATA-561 Project

[jira] [Created] (CARBONDATA-560) In QueryExecutionException, can not use executorService.shutdownNow() to shut down immediately.

2016-12-24 Thread Liang Chen (JIRA)
Liang Chen created CARBONDATA-560: - Summary: In QueryExecutionException, can not use executorService.shutdownNow() to shut down immediately. Key: CARBONDATA-560 URL: https://issues.apache.org/jira/browse

Re: etl.DataLoadingException: The input file does not exist

2016-12-22 Thread Liang Chen
Hi This is because you use cluster mode, but the input file is a local file. 1. If you use cluster mode, please load hadoop files. 2. If you just want to load local files, please use local mode. 李寅威 wrote > Hi, > > when i run the following script: > > > scala>val dataFilePath = new >

Re: same query and I change the value than throw a error

2016-12-21 Thread Liang Chen
Hi Are you using hive client to run sql to query carbon table ? jdbc:hive2://172.12.1.24:1> select * from hotel_event_2 where c1 = "key_label_1_10" and c3 > "2005-11-18 00:28:02"; Regards Liang sailingYang wrote > hi I use

Re: carbondata Indenpdent reader

2016-12-20 Thread Liang Chen
Hi For Q1: CarbonData is stored under storePath, which can be specified as any location. Under "storePath" there are two folders: Fact and Metadata. As per the info you provided, you specified the load path as the "storePath"; this is why you cannot find the info in HDFS. For Q2: Please refer to

Re: [Improvement] Carbon query gc problem

2016-12-19 Thread Liang Chen
Hi, +1. Storing data off-heap to avoid the GC problem will help performance more. Kumar Vishal wrote > There are lots of gc when carbon is processing more number of > records during query, which is impacting carbon query performance. To solve > this gc problem happening when query output is

Re: How to compile the latest source code of carbondata

2016-12-18 Thread Liang Chen
ster ~]$ cd carbondata/bin/ > [hadoop@master bin]$ ll > total 8 > -rwxrwxr-x 1 hadoop hadoop 3879 Dec 19 14:54 carbon-spark-shell > -rwxrwxr-x 1 hadoop hadoop 2820 Dec 19 14:54 carbon-spark-sql > > > > is this phenomenon normal ? > > > > > > -

Re: [DISCUSSION] CarbonData loading solution discussion

2016-12-15 Thread Liang Chen
Hi Jacky, Thanks for starting a good discussion. Let me see if I understand your points: Scenario1 is like the current load data solution (0.2.0). 1.0.0 will provide a new solution option of "single-pass data loading" to meet this kind of scenario: For subsequent data loads if the most dictionary code has

Re: carbondata test join question

2016-12-14 Thread Liang Chen
Hi geda As we know, CarbonData's key feature is index. About tuning SQL, you can refer to : https://cwiki.apache.org/confluence/display/CARBONDATA/FAQ Regards Liang -- View this message in context:

Re: save dataframe error, why loading ./TEMPCSV ?

2016-12-13 Thread Liang Chen
Hi tempCSV is just a temp folder; it will be deleted after the data has been loaded into the carbon table. You can set some breakpoints to debug the example DataFrameAPIExample.scala, and you will find the temp folder. Regards Liang 2016-12-14 13:55 GMT+08:00 Li Peng : >

Re: error when save DF to carbondata file

2016-12-13 Thread Liang Chen
Hi As discussed, please use 0.2.0 version, and use load method. 2016-12-13 14:08 GMT+08:00 Lu Cao : > Hi Dev team, > I run spark-shell in my local spark standalone mode. It returned error > > java.io.IOException: No input paths specified in job > > when I was trying

Re: About hive integration

2016-12-08 Thread Liang Chen
Hi Agree. Hive is widely used; this is a consensus. The Apache CarbonData community already plans to support hive integration, and we look forward to seeing your contribution to hive integration as well :) Regards Liang cenyuhai wrote > Hi, all: > Now carbondata is not working in hive

Re: query on carbondata table return error.

2016-12-08 Thread Liang Chen
Hi Can you raise one JIRA to report this issue? Regards Liang Cao Lu 曹鲁 wrote > Hi dev team, > I build the carbondata from master branch and distributed to the spark on > yarn cluster. > The data successfully loaded and count(*) is OK, but when I tried to query > the detail data, it returns

Re: carbondata-0.2 load data failed in yarn molde

2016-12-08 Thread Liang Chen
Hi Have you solved this issue after applying new configurations? Regards Liang geda wrote > hello: > i test data in spark locak model ,then load data inpath to table ,works > well. > but when i use yarn-client modle, with 1w rows , size :940k ,but error > happend ,there is no lock find in

Re: [Discussion] Some confused properties

2016-12-08 Thread Liang Chen
Hi Thanks for starting the discussion. The storelocation is for storing all CarbonData files. Regards Liang cenyuhai wrote > Hi, all: > I am trying to use carbon, but I am confused about the properties as > blow: > > > carbon.storelocation=hdfs://hacluster/Opt/CarbonStore > #Base

Re: [Discussion] Parsing values during data load should adopt a strict check or lenient check mechanism

2016-12-06 Thread Liang Chen
Hi Thank you for starting a good discussion. I propose a strict check mechanism to avoid the problems you mention below. The behavior should be the same for both dimensions and measures. In a word, we need to process the actual data type as per the user's input. Regards Liang
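The two mechanisms named in the subject can be contrasted with a small sketch. It assumes one common reading of the terms, which the excerpt above does not spell out: a lenient check turns an unparsable value into an empty result (a stand-in for null), while a strict check reports it as a bad record. The names `ParseCheckSketch`, `lenientInt` and `strictInt` are hypothetical and do not come from CarbonData.

```scala
import scala.util.Try

// Hypothetical contrast of lenient vs strict parsing for an INT column.
object ParseCheckSketch {
  // Lenient: a value that cannot be parsed becomes None (a stand-in for null).
  def lenientInt(raw: String): Option[Int] =
    Try(raw.trim.toInt).toOption

  // Strict: a value that cannot be parsed is reported as an error message.
  def strictInt(raw: String): Either[String, Int] =
    Try(raw.trim.toInt).toEither.left.map(_ => s"bad record: '$raw' is not an INT")
}
```

With this sketch, "42" parses under both checks, whereas "12.5" becomes `None` under the lenient check and a `Left` carrying an error message under the strict one. Applying the same function to every column type is one way to get the identical behavior for dimensions and measures that the message asks for.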

Re: Hi dev,Apache CarbonData CI now is working for auto-checking all PRs

2016-12-06 Thread Liang Chen
Hi Let me share the full picture of Apache CarbonData CI with all of you. -- 1. CI Environment To support more complex CI tests (like cluster), we built the Apache CarbonData Jenkins CI, which is running on a cloud machine with IP

Hi dev,Apache CarbonData CI now is working for auto-checking all PRs

2016-12-05 Thread Liang Chen
Hi dev, Apache CarbonData CI is now working for auto-checking all PRs. This is a job in Jenkins CI named ApacheCarbonPRBuilder, which is running on a cloud machine with IP http://136.243.101.176:8080/ ; anybody can access this machine and check the build status and result. - When a

Re: CarbonData propose major version number increment for next version (to 1.0.0)

2016-12-01 Thread Liang Chen
Hi Thanks for all of your comments; we will change the current master-SNAPSHOT version to 1.0.0. Regards Liang Venkata Gollamudi wrote > Hi All, > > CarbonData 0.2.0 has been a good work and stable release with lot of > defects fixed and with number of performance improvements. >

Re: carbon data

2016-11-28 Thread Liang Chen
Hi Lionel, You don't need to create the table first; please find the example code in ExampleUtils.scala: df.write.format("carbondata").option("tableName", tableName).option("compress", "true").option("useKettle", "false").mode(mode).save() API docs are in preparation.

Re: Re: CarbonData propose major version number increment for nextversion (to 1.0.0)

2016-11-25 Thread Liang Chen
sh gupta" > tomanishgupta18@ > wrote: > >> +1 >> >> Regards >> Manish Gupta >> >> On Thu, Nov 24, 2016 at 7:30 PM, Kumar Vishal > kumarvishal1802@ > >> wrote: >> >> > +1 >> > >> > -Regards >>
