Re: [ANNOUNCE] Hexiaoqiao as new Apache CarbonData committer

2017-02-20 Thread Xiaoqiao He
Hi PPMC, Liang, It is my honor to receive the invitation, and I am very happy to have the chance to help build the CarbonData community. I will keep contributing to Apache CarbonData and continue promoting practical applications of CarbonData. Thank you again and hope CarbonData have

Re: Exception throws when I load data using carbondata-1.0.0

2017-02-19 Thread Xiaoqiao He
6 PM, Ravindra Pesala <ravi.pes...@gmail.com> wrote: > Hi Xiaoqiao, > > Does the problem still exist? > Can you try a clean build with the "mvn clean -DskipTests -Pspark-1.6 package" command? > > Regards, > Ravindra. > > On 16 February 2017 at 08:36, Xiao

Re: Exception throws when I load data using carbondata-1.0.0

2017-02-15 Thread Xiaoqiao He
master/docs/installation-guide.md > > Regards > Liang > > 2017-02-15 3:29 GMT-08:00 Xiaoqiao He <xq.he2...@gmail.com>: > > > Hi Manish Gupta, > > > > Thanks for your attention. Actually, I tried to load data following > > https://github.com/apache/incubator-

Re: Exception throws when I load data using carbondata-1.0.0

2017-02-15 Thread Xiaoqiao He
compile and check in assembly > jar. > > Regards > Manish Gupta > > On Tue, Feb 14, 2017 at 11:19 AM, Xiaoqiao He <xq.he2...@gmail.com> wrote: > > > hi, dev, > > > > The latest release version apache-carbondata-1.0.0-incubating-rc2

Exception throws when I load data using carbondata-1.0.0

2017-02-13 Thread Xiaoqiao He
Hi dev, The latest release, apache-carbondata-1.0.0-incubating-rc2, built against Spark 1.6.2, throws the exception `java.lang.ClassNotFoundException: org.apache.carbondata.spark.rdd.CarbonBlockDistinctValuesCombineRDD` when I load data following the Quick Start Guide. Env: a.

Re: [ANNOUNCE] Apache CarbonData 1.0.0-incubating released

2017-02-05 Thread Xiaoqiao He
Firstly, congratulations on *Apache CarbonData 1.0.0-incubating released*, and thanks for the great work. Testing CarbonData 1.0.0-incubating showed that this version is better in availability, reliability and performance than previous ones. Especially the performance of loading data improved

Re: [UT Fail Report] UT can not pass when run with branch master

2017-01-04 Thread Xiaoqiao He
checked it and it works well: 1. pull master branch, 2. compile and run UT, all UTs pass. Thanks for the timely fix. On Thu, Jan 5, 2017 at 11:37 AM, Liang Chen wrote: > Hi > > It is fixed, now the master can pass compilation. Thanks for pointing > it out. > > Regards >

[UT Fail Report] UT can not pass when run with branch master

2017-01-04 Thread Xiaoqiao He
UT fails when run on the master branch of CarbonData (https://github.com/apache/incubator-carbondata/tree/master). The exception is as follows: > GrtLtFilterProcessorTestCase: > *** RUN ABORTED *** > java.lang.Exception: DataLoad failure: Due to internal errors, please > check logs for more details.

Re: [Improvement] Use Trie in place of HashMap to reduce memory footprint of Dictionary

2016-11-28 Thread Xiaoqiao He
mong all the > alternatives? > > And please pay closer attention to the license of DAT implementation, as > they are under LGPL, generally speaking, it is not legally allowed to be > included. > > Jihong > > -Original Message- > From: Xiaoqiao He [mailto:xq.he

Re: [Improvement] Use Trie in place of HashMap to reduce memory footprint of Dictionary

2016-11-27 Thread Xiaoqiao He
Hi Kumar Vishal, I'll create a task to track this issue. Thanks for your suggestions. Regards, He Xiaoqiao On Sun, Nov 27, 2016 at 1:41 AM, Kumar Vishal <kumarvishal1...@gmail.com> wrote: > Hi Xiaoqiao He, > > You can go ahead with DAT implementation, based on the result. > I

Re: [Improvement] Use Trie in place of HashMap to reduce memory footprint of Dictionary

2016-11-25 Thread Xiaoqiao He
est result as soon as possible. Thanks for your suggestions again. > > > > Regards, > > Xiaoqiao > > > > On Thu, Nov 24, 2016 at 4:48 PM, Kumar Vishal > > > kumarvishal1802@ > > > > > wrote: > > > >> Hi Xiaoqiao He, > >> +1,

Re: [Improvement] Use Trie in place of HashMap to reduce memory footprint of Dictionary

2016-11-24 Thread Xiaoqiao He
suggestions again. Regards, Xiaoqiao On Thu, Nov 24, 2016 at 4:48 PM, Kumar Vishal <kumarvishal1...@gmail.com> wrote: > Hi Xiaoqiao He, > +1, > For the forward dictionary case it will be a very good optimisation, as our case > is very specific storing byte array to int mapping[data to surro

Re: [Improvement] Use Trie in place of HashMap to reduce memory footprint of Dictionary

2016-11-24 Thread Xiaoqiao He
> Regards, > > He Xiaoqiao > > > > > > On Thu, Nov 24, 2016 at 7:48 AM, Liang Chen > > > chenliang6136@ > > > wrote: > > > >> Hi xiaoqiao > >> > >> This improvement looks great! > >> Can you please explain the belo

Re: [Improvement] Use Trie in place of HashMap to reduce memory footprint of Dictionary

2016-11-23 Thread Xiaoqiao He
what does it mean? > -- > ConcurrentHashMap > ~68MB 14543 > Double Array Trie > ~104MB 12825 > > Regards > Liang > > 2016-11-24 2:04 GMT+08:00 Xiaoqiao He <xq.he2...@gmail.com>: > > > Hi All, > > > > I would like to propose Dictiona

[Improvement] Use Trie in place of HashMap to reduce memory footprint of Dictionary

2016-11-23 Thread Xiaoqiao He
Hi All, I would like to propose a Dictionary improvement that uses a Trie in place of HashMap. In order to speed up aggregation, reduce run-time memory footprint, enable fast distinct count, etc., CarbonData encodes data using a dictionary at file level or table level based on cardinality. It is a

Re: [Feature ]Design Document for Update/Delete support in CarbonData

2016-11-20 Thread Xiaoqiao He
Hi Aniket Adnaik, It is a great design document about update/delete, and a very useful feature for CarbonData. For the solution you proposed, I think the most difficult challenge is Compaction. Without careful attention, rewriting data over and over can lead to some serious network and disk

Re: [Feature] proposal for update and delete support in Carbon data

2016-11-15 Thread Xiaoqiao He
Hi Vinod, It is a feature many people expect, as Jacky mentioned. I think Update/Delete should be a basic module of CarbonData; meanwhile, it is a complex problem for a distributed storage system. The solution you proposed is based on the traditional 'Base + Delta' approach, which is applied on

Re: [Discussion] Please vote and comment for carbon data file format change

2016-11-01 Thread Xiaoqiao He
Hi Kumar Vishal, I couldn't get the figures of the file format; could you re-upload them? Thanks. Best Regards On Tue, Nov 1, 2016 at 7:12 PM, Kumar Vishal wrote: > > Hello All, > > Improving carbon first time query performance > > Reason: > 1. As file system cache is

Re: Beijing Apache CarbonData meetup:https://www.meetup.com/Apache-Carbondata-Meetup/events/235013117/

2016-10-31 Thread Xiaoqiao He
This meetup is really interesting. It's helpful for understanding the architecture and some details of CarbonData: 1. What the advantages of CarbonData are and which scenarios it is good for; 2. How CarbonData achieves its goals, including architecture and implementation, some differences between CarbonData and