RE: [DISCUSS] Graduation to a TLP (Top Level Project)

2017-03-01 Thread Jihong Ma
Hi JB, We are very grateful to have you and appreciate all your hard work on driving this effort! Regards. Jihong -Original Message- From: Jean-Baptiste Onofré [mailto:j...@nanthrax.net] Sent: Wednesday, March 01, 2017 2:20 AM To: dev@carbondata.incubator.apache.org Subject: Re:

RE: [ANNOUNCE] Hexiaoqiao as new Apache CarbonData committer

2017-02-21 Thread Jihong Ma
Xiaoqiao, Welcome on board! Jihong -Original Message- From: Xiaoqiao He [mailto:xq.he2...@gmail.com] Sent: Monday, February 20, 2017 8:46 PM To: Liang Chen Cc: dev@carbondata.incubator.apache.org Subject: Re: [ANNOUNCE] Hexiaoqiao as new Apache CarbonData committer Hi PPMC, Liang, It

[jira] [Created] (CARBONDATA-607) Cleanup ValueCompressionHolder class and all sub-classes

2017-01-07 Thread Jihong MA (JIRA)
Jihong MA created CARBONDATA-607: Summary: Cleanup ValueCompressionHolder class and all sub-classes Key: CARBONDATA-607 URL: https://issues.apache.org/jira/browse/CARBONDATA-607 Project: CarbonData

[jira] [Created] (CARBONDATA-588) cleanup WriterCompressModel

2017-01-03 Thread Jihong MA (JIRA)
Jihong MA created CARBONDATA-588: Summary: cleanup WriterCompressModel Key: CARBONDATA-588 URL: https://issues.apache.org/jira/browse/CARBONDATA-588 Project: CarbonData Issue Type

[jira] [Created] (CARBONDATA-550) Add unit test cases for Bigint, Big decimal value compression

2016-12-21 Thread Jihong MA (JIRA)
Jihong MA created CARBONDATA-550: Summary: Add unit test cases for Bigint, Big decimal value compression Key: CARBONDATA-550 URL: https://issues.apache.org/jira/browse/CARBONDATA-550 Project

[jira] [Created] (CARBONDATA-549) code improvement for bigint compression

2016-12-21 Thread Jihong MA (JIRA)
Jihong MA created CARBONDATA-549: Summary: code improvement for bigint compression Key: CARBONDATA-549 URL: https://issues.apache.org/jira/browse/CARBONDATA-549 Project: CarbonData Issue

[jira] [Created] (CARBONDATA-548) Miscellaneous code improvements

2016-12-21 Thread Jihong MA (JIRA)
Jihong MA created CARBONDATA-548: Summary: Miscellaneous code improvements Key: CARBONDATA-548 URL: https://issues.apache.org/jira/browse/CARBONDATA-548 Project: CarbonData Issue Type

RE: [DISCUSSION] CarbonData loading solution discussion

2016-12-15 Thread Jihong Ma
It is a great idea to have separate OutputFormats for regular Carbon data files, index files, and metadata files (for instance: dictionary file, schema file, global index file, etc.) for writing Carbon-generated files laid out on HDFS, and it is orthogonal to the actual data load process.

RE: [Feature Proposal] Spark 2 integration with CarbonData

2016-11-28 Thread Jihong Ma
Integration with Spark 2.x is a great feature for CarbonData, as Spark 2.x is gradually gaining momentum. This is a big effort ahead, so let's take into consideration all the complexity involved due to the dramatic API-level changes; realizing it in phases is a good idea. Regards. Jihong

RE: [Improvement] Use Trie in place of HashMap to reduce memory footprint of Dictionary

2016-11-28 Thread Jihong Ma
Thank you Xiaoqiao for looking into this issue and sharing your results! Have you tried varied dictionary sizes for comparison among all the alternatives? And please pay closer attention to the license of the DAT implementations, as they are under LGPL; generally speaking, it is not legally allowed

RE: CarbonData propose major version number increment for next version (to 1.0.0)

2016-11-28 Thread Jihong Ma
+1 A rich set of features is planned for inclusion in the next release, and more importantly there will be external API changes introduced as we integrate with Spark 2.x. CarbonData deserves a major version jump as it gets mature/production ready and powerful in terms of rich functionality and

RE: Please vote and advise on building thrift files

2016-11-17 Thread Jihong Ma
+1 for proposal 1. Jihong -Original Message- From: Anurag Srivastava [mailto:anu...@knoldus.com] Sent: Thursday, November 17, 2016 2:32 AM To: dev@carbondata.incubator.apache.org Subject: Re: Please vote and advise on building thrift files +1 for proposal 1 On Thu, Nov 17, 2016 at

RE: Single Pass Data Load Design

2016-11-14 Thread Jihong Ma
Hi Ravindra, Thank you for putting together a proposal for improving the data load process! Please find my comments inline in the Google doc. Jihong -Original Message- From: Ravindra Pesala [mailto:ravi.pes...@gmail.com] Sent: Sunday, November 13, 2016 4:24 AM To: dev Subject: Single

RE: [VOTE] Apache CarbonData 0.2.0-incubating release

2016-11-09 Thread Jihong Ma
+1 binding. Jihong -Original Message- From: Liang Chen [mailto:chenliang6...@gmail.com] Sent: Wednesday, November 09, 2016 3:18 PM To: dev@carbondata.incubator.apache.org Subject: [VOTE] Apache CarbonData 0.2.0-incubating release Hi all, I submit the CarbonData 0.2.0-incubating to

RE: [Discussion] Please vote and comment for carbon data file format change

2016-11-03 Thread Jihong Ma
Hi Kumar, Please send the proposed format changes as an attachment or attach them to the associated JIRA; I would like to take a look. Thanks! Jihong -Original Message- From: Jacky Li [mailto:jacky.li...@qq.com] Sent: Thursday, November 03, 2016 7:54 AM To:

RE: please vote and comment: remove thrift solution

2016-10-24 Thread Jihong Ma
+1. I agree that shipping the generated Java code has drawbacks. We should explore publishing it to the Maven Central repository for each release, so that with the correct artifacts in place for the corresponding release in pom.xml, we are good. Jihong -Original Message- From: Jacky Li

RE: Discussion(New feature) regarding single pass data loading solution.

2016-10-18 Thread Jihong Ma
or map.lock & map.unlock features. Thanks, Ravi. On 18 October 2016 at 00:08, Jihong Ma <jihong...@huawei.com> wrote: > Hi Ravi, > > I took a quick look at Hazelcast; what they offer is a distributed map > across the cluster (on any single node only a portion of the map is stored), to

RE: Discussion(New feature) regarding single pass data loading solution.

2016-10-17 Thread Jihong Ma
2. not introducing more dependencies: we are already using ZooKeeper and HDFS. > 3. performance? since new dictionary values and synchronization are rare. > > What do you think? > > Regards, > Jacky > > > On October 15, 2016, at 2:38 AM, Jihong Ma <jihong...@huawei.com> wrote: > >

RE: Discussion(New feature) regarding single pass data loading solution.

2016-10-14 Thread Jihong Ma
for first time. That solution could be distributed map or KV store. Regards, Ravi. On 14 October 2016 at 23:12, Jihong Ma <jihong...@huawei.com> wrote: > Hi Liang, > > This tool is more or less like the first load, the first time after table > is created, any subsequent loads

RE: Discussion(New feature) regarding single pass data loading solution.

2016-10-13 Thread Jihong Ma
late > materialization when global dictionary is present. > 4. Maybe we should think of some ways to create the global dictionary lazily > as we serve SELECT queries. Implementation may not be that > straightforward. Not sure if it's worth the effort. > > Best Regards, >

RE: Discussion(New feature) regarding single pass data loading solution.

2016-10-11 Thread Jihong Ma
A rather straightforward option is to allow the user to supply a global dictionary generated somewhere else, or we build a separate tool just for generating as well as updating the dictionary. Then the normal data loading process will encode columns with a local dictionary if no global dictionary is supplied. This should cover
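
As a rough illustration of the fallback described in this message (not CarbonData's actual load path), the sketch below encodes a column with a user-supplied global dictionary when one is given and otherwise builds a local dictionary on the fly. The function name `encode_column` and the representation of a dictionary as a plain value-to-integer mapping are assumptions made for this example.

```python
from typing import Dict, List, Optional, Tuple


def encode_column(
    values: List[str],
    global_dictionary: Optional[Dict[str, int]] = None,
) -> Tuple[List[int], Dict[str, int]]:
    """Encode values as integer keys (hypothetical helper, not a CarbonData API).

    If a global dictionary is supplied, it is used as-is and a value missing
    from it raises KeyError. Otherwise a local dictionary is built from the
    values in order of first appearance.
    """
    if global_dictionary is not None:
        return [global_dictionary[v] for v in values], global_dictionary

    local_dictionary: Dict[str, int] = {}
    encoded: List[int] = []
    for v in values:
        # setdefault assigns the next free key the first time a value is seen.
        encoded.append(local_dictionary.setdefault(v, len(local_dictionary)))
    return encoded, local_dictionary
```

Calling `encode_column(values)` without a second argument takes the local-dictionary path; passing a pre-built mapping takes the supplied-dictionary path.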

RE: Abstracting CarbonData's Index Interface

2016-10-04 Thread Jihong Ma
, 2016 9:15 PM To: dev@carbondata.incubator.apache.org Subject: Re: Abstracting CarbonData's Index Interface > On October 4, 2016, at 8:01 AM, Jihong Ma <jihong...@huawei.com> wrote: > > It is a great idea to open the door for more flexible/scalable way of > accessing index to help with query p

RE: Abstracting CarbonData's Index Interface

2016-10-03 Thread Jihong Ma
It is a great idea to open the door for a more flexible/scalable way of accessing the index to help with query processing. If our goals are as follows: > Goal 1: User can choose the place to store Index data, it can be stored in > processing framework's memory space (like in spark driver memory)

RE: [discussion]When table properties is repeated it only set the last one

2016-09-29 Thread Jihong Ma
wrote: > +1 for option-1- should throw exception.. > Regards, > Aniket > > On 28 Sep 2016 7:01 p.m., "Ravindra Pesala" <ravi.pes...@gmail.com> wrote: > > > +1 for option 1 > > > > On Thu, 29 Sep 2016 02:52 Jihong Ma, <jihong...@huawei.co
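
A minimal sketch of what "option 1" from this thread (throw an exception on a repeated table property instead of silently keeping the last value) could look like. The function name `build_table_properties` and the case-insensitive key comparison are assumptions for illustration, not CarbonData's parser.

```python
from typing import Dict, Iterable, Tuple


def build_table_properties(pairs: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Collect key/value pairs, rejecting any key that appears twice.

    Hypothetical helper: keys are compared case-insensitively, which is an
    assumption of this sketch.
    """
    properties: Dict[str, str] = {}
    for key, value in pairs:
        normalized = key.strip().lower()
        if normalized in properties:
            # Fail loudly rather than letting the last value win.
            raise ValueError(f"table property specified more than once: {key!r}")
        properties[normalized] = value
    return properties
```

With this approach a statement that repeats a property fails at parse time, so the user sees the conflict instead of discovering later that only the last value was applied.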

RE: [Discussion]: option to disable multi-layered index scan and use full table scan

2016-09-28 Thread Jihong Ma
com> wrote: > I agree with Jihong. Carbon needs to have smart logic to decide > On Wed, 28 Sep 2016 at 6:12 AM, Jihong Ma <jihong...@huawei.com> wrote: > > > Ideally this should be an internal improvement, not necessarily exposing > > it as a config option, Carbon sho

RE: [VOTE] Apache CarbonData 0.1.1-incubating release

2016-09-26 Thread Jihong Ma
+1 binding. Thanks. Jenny -Original Message- From: Liang Big data [mailto:chenliang6...@gmail.com] Sent: Monday, September 26, 2016 2:22 PM To: dev@carbondata.incubator.apache.org Subject: Re: [VOTE] Apache CarbonData 0.1.1-incubating release +1 Regards Liang 2016-09-26 17:11

RE: [Discussion] Support Date/Time format for Timestamp columns to be defined at column level

2016-09-26 Thread Jihong Ma
I agree we should allow specifying the timestamp format at column level, and respect that at data loading as well as query time. Is there a technical reason that prevents us from doing that? Otherwise, respecting the specification should be the way to go. col1(Date) col2(Date) > > 2016-09-24
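
To make the column-level idea concrete, here is a small sketch in which each column may carry its own timestamp format and columns without one fall back to a table-wide default. The names `parse_timestamp` and `DEFAULT_TIMESTAMP_FORMAT`, the default value, and the use of Python `strptime` patterns are all assumptions for illustration; CarbonData's own format syntax is not shown here.

```python
from datetime import datetime
from typing import Dict, Optional

# Assumed table-wide default, for illustration only.
DEFAULT_TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


def parse_timestamp(
    column: str,
    raw: str,
    column_formats: Optional[Dict[str, str]] = None,
) -> datetime:
    """Parse raw using the column's own format, else the default format."""
    fmt = (column_formats or {}).get(column, DEFAULT_TIMESTAMP_FORMAT)
    return datetime.strptime(raw, fmt)
```

The same lookup would have to be applied both when loading data and when interpreting literals at query time, which is the consistency the message asks for.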

RE: [Discuss]Set block_size for table on table level

2016-09-26 Thread Jihong Ma
+1. To avoid potential compatibility issues, we could introduce this param as an optional field; as long as it is not a required field, we are fine with a defined default block size. Regards. Jihong -Original Message- From: Jacky Li [mailto:jacky.li...@qq.com] Sent: Monday, September
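
The compatibility argument can be sketched as follows: because the property is optional, tables created before it existed simply resolve to the default. The key name `block_size` is taken from the thread subject; the function name, the unit (MB), and the default value are assumptions for this example only.

```python
from typing import Mapping

# Assumed default, for illustration only; not a documented CarbonData value.
DEFAULT_BLOCK_SIZE_MB = 1024


def resolve_block_size_mb(table_properties: Mapping[str, str]) -> int:
    """Return the table's block size, falling back to the default if unset."""
    raw = table_properties.get("block_size")
    if raw is None:
        # Older tables never set the property, so they keep the default.
        return DEFAULT_BLOCK_SIZE_MB
    size = int(raw)
    if size <= 0:
        raise ValueError(f"block_size must be a positive integer, got {raw!r}")
    return size
```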

RE: [VOTE] Apache CarbonData 0.1.0-incubating release

2016-08-19 Thread Jihong Ma
+1 (binding) Great work! Jihong -Original Message- From: chenliang613 [mailto:chenliang6...@gmail.com] Sent: Friday, August 19, 2016 7:33 PM To: dev@carbondata.incubator.apache.org Subject: Re: [VOTE] Apache CarbonData 0.1.0-incubating release +1 (binding) -- View this message in

Re: [PROPOSAL] How to merge a pull request

2016-08-10 Thread Jihong Ma
+1 Great idea and I am sure it will make our life a lot easier as committers!! Jihong Sent from HUAWEI AnyOffice From: Jacky Li To: dev@carbondata.incubator.apache.org; Subject: Re: [PROPOSAL] How to merge a pull request Time: 2016-08-09 20:56:25 definitely +1 > On

[jira] [Commented] (CARBONDATA-11) Support carbon spark sql cli and carbon spark shell in carbondata to simplify operations for first time users

2016-06-27 Thread Jihong MA (JIRA)
[ https://issues.apache.org/jira/browse/CARBONDATA-11?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15351551#comment-15351551 ] Jihong MA commented on CARBONDATA-11: - let's incorporate this work with CarbonContext refinement