Re: [DISCUSSION] Interfaces for index frame work

2017-08-14 Thread Liang Chen
Hi Nice feature, +1. -- View this message in context: http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/DISCUSSION-Interfaces-for-index-frame-work-tp13274p20217.html Sent from the Apache CarbonData Dev Mailing List archive mailing list archive at Nabble.com.

Re: [branch-1.1] delete problem

2017-07-14 Thread Liang Chen
Hi Ashwini K added a comment - 2 days ago: delete is working fine for me. Could you please share your table schema and the data file you are using?

Re: FileNotFoundExceptions while running CarbonData

2017-07-18 Thread Liang Chen
Hi Swapnil, I very much look forward to seeing your PR. Please let me know your Apache JIRA email ID and I will grant you contributor rights. Regards Liang 2017-07-18 6:49 GMT+08:00 Swapnil Shinde : > Thanks. I think I fixed it support maprFS. I will do some more testing

Re: [DISCUSSION] Propose to remove "support spark 1.5" from CarbonData 1.2.0 onwards

2017-07-09 Thread Liang Chen
tegration with spark 1.5 and spark 1.6 almost share > the same code. Is there any overhead or difficulties to maintain spark 1.5 > integration code onward? > > Regards, > Jacky > > > On 8 July 2017, at 11:44 PM, Liang Chen <chenliang...@apache.org> wrote: > > > >

Re: [Discussion] Using Lazy Dictionary Decode for Presto Integration

2017-07-18 Thread Liang Chen
+1. Using lazy decode to take advantage of CarbonData's dictionary would improve aggregation performance. Please consider adding this code to the Presto integration module rather than directly reusing the Spark module code. Regards Liang 2017-07-18 23:46 GMT+08:00 Bhavya Aggarwal : > We were

Re: [question] about new table property "sort_column"

2017-07-21 Thread Liang Chen
Hi Jin Zhou, yes, your understanding is correct. The MDK (multi-dimensional key) index will be created in the order of your specified sort_columns. Regards Liang 2017-07-21 10:51 GMT+08:00 Jin Zhou : > > Hi,all > > I notice there is a new table property: sort_column and want to confirm:

Re: carbon data performance doubts

2017-07-21 Thread Liang Chen
Hi, some more info: release 1.1.1 added a useful improvement, "measure filter optimization", in which the system uses the min/max index to filter on measure columns. So to get good filtering on an INT column: one way is to add the INT column to sort_columns; another way, the system will

Re: carbon data performance doubts

2017-07-21 Thread Liang Chen
Hi Swapnil, actually the current system's behavior is: index and dictionary encoding are decoupled and unrelated. 1. If you want good filter performance on some columns, just add these columns to sort_columns (like tblproperties('sort_columns'='empno')), to build a good MDK index for these
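The sort_columns advice above can be sketched as a small DDL builder. This is only an illustration: the object name SortColumnsDdl, the table emp and its columns are hypothetical, and the STORED BY 'carbondata' clause is an assumption based on the CarbonData 1.x syntax of that period, so check the DDL documentation for your release before relying on it.

```scala
// Hypothetical helper (not part of CarbonData): builds a CREATE TABLE statement
// whose sort_columns property lists the columns to be indexed for filtering.
object SortColumnsDdl {
  def createTable(table: String,
                  columns: Seq[(String, String)],
                  sortColumns: Seq[String]): String = {
    // Every sort column must exist in the schema.
    val known = columns.map(_._1).toSet
    val unknown = sortColumns.filterNot(known.contains).mkString(", ")
    require(unknown.isEmpty, "sort_columns not in schema: " + unknown)

    val columnList = columns.map { case (name, dataType) => name + " " + dataType }.mkString(", ")
    // The order given here is kept as-is in the table property.
    val sortList = sortColumns.mkString(",")
    // STORED BY 'carbondata' is an assumption based on 1.x-era syntax.
    s"CREATE TABLE $table ($columnList) STORED BY 'carbondata' " +
      s"TBLPROPERTIES('sort_columns'='$sortList')"
  }

  def main(args: Array[String]): Unit = {
    val ddl = createTable(
      "emp",
      Seq("empno" -> "INT", "ename" -> "STRING", "salary" -> "DOUBLE"),
      Seq("empno"))
    println(ddl) // in spark-shell this string could be passed to carbon.sql(ddl)
  }
}
```

Building the statement as a plain string keeps the sketch runnable without a Spark session; the schema check only guards against listing a sort column that is not in the table.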

Re: carbon data performance doubts

2017-07-21 Thread Liang Chen
Hi, some more info: release 1.1.1 added a useful improvement, "measure filter optimization", in which the system uses the min/max index to filter on measure columns. So for INT Regards Liang 2017-07-22 9:22 GMT+08:00 Liang Chen <chenliang...@apache.org>: > Hi Swapnil >

Re: carbon data performance doubts

2017-07-23 Thread Liang Chen
Hi simafengyun, can you write an example showing how to use sort_columns and also update the documentation? Thanks. Regards Liang

Re: Presto+CarbonData optimization work discussion

2017-07-19 Thread Liang Chen
spark > integrations. It will tell whether it is really a lazy decoding issue or > not. > > Regards, > Ravindra > > On 20 July 2017 at 08:04, Liang Chen <chenliang6...@gmail.com> wrote: > > > Hi > > > > For -- 4) Lazy decoding of the dictionary, just i tested 1

Re: [Discussion] CarbonOutputFormat Implementation

2017-07-05 Thread Liang Chen
Hi +1 for supporting OutputFormat. Regards Liang Divya Gupta wrote > Thanks Jacky and Venkata for the suggestions. I am working on the design > part and will post on this discussion in case of any queries. I will share > the design soon. > > Regards > Divya Gupta > Project Lead > > >

Re: XOR encoding for floating point

2017-07-05 Thread Liang Chen
Hi Geetika, very happy to see that you are interested in contributing this feature. Please have the design discussion before you start to code. Regards Liang Geetika Gupta wrote > Hi Community, > > I was looking into CARBONDATA-1128 > https://issues.apache.org/jira/browse/CARBONDATA-1128. The

Re: Why is slower that build ChunkRowIterator object in presto plugin of carbondata?

2017-07-05 Thread Liang Chen
Hi, in spark-shell you can use the script below:
    import org.apache.carbondata.core.util.CarbonProperties
    import org.apache.carbondata.core.constants.CarbonCommonConstants
    CarbonProperties.getInstance().addProperty(CarbonCommonConstants.ENABLE_VECTOR_READER, "true")

Re: [DISCUSSION] About partition table query performance

2017-08-17 Thread Liang Chen
Hi, +1. Very nice feature, thanks for your good contribution. Looking forward to seeing the test report. Regards Liang lionel061201 wrote > Hi dev, > Partition feature is now available on master and I just created a guidance > doc in > https://github.com/apache/carbondata/pull/1258 > > I added

Apache CarbonData 6th meetup in Shanghai on 2nd Sep,2017 at : https://jinshuju.net/f/X8x5S9?from=timeline

2017-08-23 Thread Liang Chen

Re: Apache CarbonData 6th meetup in Shanghai on 2nd Sep,2017 at : https://jinshuju.net/f/X8x5S9?from=timeline

2017-08-23 Thread Liang Chen

Re: [VOTE] Apache CarbonData 1.1.0 (RC3) release

2017-05-12 Thread Liang Chen
+1 (binding)
LICENSE and NOTICE are OK
No binary files
Compile is OK with Spark 1.6 and 2.1
*mvn clean -Pspark-1.6 package*
[INFO] Apache CarbonData :: Parent SUCCESS [ 1.520 s]
[INFO] Apache CarbonData :: Common SUCCESS [ 2.546 s]
[INFO] Apache

Re: [DISCUSSION] CarbonData storage service

2017-05-16 Thread Liang Chen
Hi Jacky, one question: can you explain what information the proposed CarbonData Storage Service would store? How should users pre-configure memory resources for the service? As much memory as possible?

Re: update bug with carbondata1.1.0 and spark1.6.0

2017-06-19 Thread Liang Chen
Hi, correcting my earlier info: the update can be done as below, and it is successful.
+---+-----+----+---+
| id| name|city|age|
+---+-----+----+---+
| 10|india|city| 10|
+---+-----+----+---+
Regards Liang

[ANNOUNCE] Ravindra as new Apache CarbonData PMC

2017-05-19 Thread Liang Chen
Hi all We are pleased to announce that the PMC has invited Ravindra as new Apache CarbonData PMC member, and the invite has been accepted ! Congrats to Ravindra and welcome aboard. Thanks The Apache CarbonData team

Re: Comparative testing of CarbonData and Parquet

2017-05-21 Thread Liang Chen
Hi, thank you for sharing the test results. We are very happy to hear that you have already started migrating your business to CarbonData. Two suggestions: 1. Can you test again with the latest release, 1.1.0? It introduced the V3 format, which further improves scan performance (for example: query 6).

Re: Compilation error on presto Branch

2017-05-28 Thread Liang Chen
Hi, this issue has been fixed on master. Please check it again. Regards Liang

Re: when plan to implemnt merge operation

2017-05-29 Thread Liang Chen
Hi, for your case, would delete plus append meet your requirements? Obviously, merge would impact the index, so we should find the best way to implement this feature. Others, please comment as well. Regards Liang 2017-05-27 9:45 GMT+08:00 Mic Sun

Re: [INFO] Jenkins is fixed and GitHub/Jenkins integration is back

2017-06-02 Thread Liang Chen
Yes, it is working fine now. Thanks for your help, JB. Regards Liang 2017-06-02 20:44 GMT+08:00 Jean-Baptiste Onofré : > Hi team, > > I fixed the issue we got on the Apache Jenkins CarbonData jobs. > > We used "Maven (latest)" for our build, and we just got the last Maven >

Re: [DISCUSSION] Whether Carbondata should keep carbon-sql-shell script

2017-06-06 Thread Liang Chen
Hi, to correct the file names, they should be: ./bin/carbon-spark-sql and ./bin/carbon-spark-shell. Are you suggesting removing both files or only carbon-spark-shell? Regards Liang 2017-06-07 0:24 GMT+08:00 Erlu Chen : > Hi community, > > Recently, I viewed the implementation of

Re: About ColumnGroup feature

2017-06-10 Thread Liang Chen
Hi, +1 for removing ColumnGroup. It would help simplify the current system code. Regards Liang

[VOTE] Presto integration version :Re: [DISCUSSION] Whether Carbondata should work with Presto in the next release version(1.2.0)

2017-06-13 Thread Liang Chen
Hi, +1 for supporting Presto integration. I propose supporting 0.166 to match what some community users (Ctrip) already use in production. Devs, please also vote on the Presto version. Regards Liang 2017-06-12 13:56 GMT+08:00 Bhavya Aggarwal : > Hi All, > > We can add the

[DISCUSSION] In 1.2.0, use Spark 2.1 and Hadoop 2.7.2 as default compilation in pom.

2017-06-15 Thread Liang Chen
Hi Dev, in 1.2 many features are being developed on Spark 2.1 and Hadoop 2.7.2, so I propose using Spark 2.1 and Hadoop 2.7.2 as the default compilation profile in the pom. Please discuss and vote. Regards Liang

Re: [jira] [Created] (CARBONDATA-1030) Support reading specified segment or carbondata file

2017-05-07 Thread Liang Chen
Hi +1 for this feature. How about the DDL script as below : carbon.sql("select * from carbontable in segmentid(0,3,5,7) where filter conditions").show() Regards Liang 2017-05-05 22:33 GMT+08:00 Jin Zhou (JIRA) : > Jin Zhou created CARBONDATA-1030: >

Re: introduce complex data-type for query "Alter table tableName change columnName copyColumnName dataType"

2017-05-05 Thread Liang Chen
+1. Regards Liang 2017-05-03 13:46 GMT+08:00 rahulcarbondata : > Hi all,currently "Alter table tableName change columnName copyColumnName > dataType" query supports only primitive type . I propose it should also > support complex data type . e.g. CREATE TABLE >

Re: [VOTE] Apache CarbonData 1.2.0(RC3) release

2017-09-23 Thread Liang Chen
1. Source code can be compiled successfully with script "mvn clean -DskipTests -Pspark-2.1 -Pbuild-with-format package"
2. Can query carbondata file properly in Spark-shell.
3. License file looks good.
4. Signature file looks good.
5. Hash checksum files look good.
6. NOTICE file looks good.
My vote :

Re: [VOTE] Apache CarbonData 1.2.0(RC2) release

2017-09-18 Thread Liang Chen
Hi
1. Source code can be compiled successfully with script "mvn clean -DskipTests -Pspark-2.1 -Pbuild-with-format package"
2. Can query carbondata file properly in Spark-shell.
3. License file looks good.
4. Signature file looks good.
5. Hash checksum files look good.
6. NOTICE file looks good.
My vote :

Re: [DISCUSSION] Support only spark 2 in carbon 1.3.0

2017-10-15 Thread Liang Chen
Hi Lionel, the mailing list discussion raised no objections, so can you create an umbrella JIRA to remove the Spark 1.5 and 1.6 code in 1.3.0? Regards Liang lionel061201 wrote > Hi community, > Currently we have three spark related module in carbondata(spark 1.5, 1.6, > 2.1), the project has

Re: [Discussion] Support pre-aggregate table to improve OLAP performance

2017-10-15 Thread Liang Chen
Hi Jacky, thanks for starting this discussion; this is a great feature for CarbonData. One question: for sub_jar "Handle alter table scenarios for aggregation table", please give more detail. I just viewed the PDF attachment as below, and it looks like nothing needs to be handled for the agg table if users

Re: Query failed after "update" statement interruptted

2017-10-16 Thread Liang Chen
Hi, can you provide the full script? What is your update script, and how can we reproduce the issue? Regards Liang yixu2001 wrote > dev > > On the process of "update" statement execution, interruption happened. > After that, the "select" statement failed. > Sometimes the "select" statement will recover to

Re: [Discussion] Support pre-aggregate table to improve OLAP performance

2017-10-16 Thread Liang Chen
ble as >>> update scenario” >>> User need to drop the associated aggregate table and perform alter >>> table, >>> or data update/delete, or delete segment operation, then he can create >>> the >>> pre-agg table using CTAS command again, and the

[DISCUSSION] Optimize the default value for some parameters

2017-10-11 Thread Liang Chen
Hi All, as you know, the default values of some parameters need adjusting for most cases. This discussion is for collecting which parameters' default values need to be optimized:
1. TABLE_BLOCKSIZE: current default is 1G, propose to adjust to 512M
2. Please append here if you propose to adjust

Re: [DISCUSSION] support user specified segment reading for query

2017-10-11 Thread Liang Chen
Hi Rahul, I suggest only doing the "Query HINT". Please finalize the query script: select * from t1 [in SEGMENTS(1,3,5)] or SELECT /*+SEGMENTS(1,3,5) */ * from t1 Regards Liang -- Sent from: http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/
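To make the two candidate forms concrete, here is a minimal sketch that only composes the query strings. The object and method names are hypothetical, and neither syntax is confirmed by this thread as implemented in CarbonData; the message only asks for one of them to be finalized.

```scala
// Hypothetical helper: renders the two query shapes proposed in the thread.
// It builds strings only and does not talk to Spark or CarbonData.
object SegmentQuerySketch {
  // Candidate 1: a trailing IN SEGMENTS(...) clause after the table name.
  def inSegments(table: String, segmentIds: Seq[Int]): String = {
    val ids = segmentIds.mkString(",")
    s"SELECT * FROM $table IN SEGMENTS($ids)"
  }

  // Candidate 2: a hint comment placed right after SELECT.
  def segmentsHint(table: String, segmentIds: Seq[Int]): String = {
    val ids = segmentIds.mkString(",")
    s"SELECT /*+SEGMENTS($ids) */ * FROM $table"
  }

  def main(args: Array[String]): Unit = {
    println(inSegments("t1", Seq(1, 3, 5)))
    println(segmentsHint("t1", Seq(1, 3, 5)))
  }
}
```

The hint form leaves the statement valid SQL for engines that ignore hint comments, which may be one reason it is the preferred option in the message above.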

Re: Does index be used when doing "join" operation between a big table and a small table?

2017-10-11 Thread Liang Chen
If the index is used, the number of tasks will be lower. Can you share your scripts (the create table script and the query script) so we can check whether you created an effective index for the filter columns? Regards Liang

Re: Does index be used when doing "join" operation between a big table and a small table?

2017-10-11 Thread Liang Chen
Hi, if the index is used for filtering data, the number of tasks will be much lower. Can you share the scripts (create table and query) so we can check whether an effective index was created for the filter columns? Regards Liang Mic Sun wrote > hello, > > I have 2 tables need to do "join" operation by their

Re: Apache CarbonData 6th meetup in Shanghai on 2nd Sep,2017 at : https://jinshuju.net/f/X8x5S9?from=timeline

2017-08-30 Thread Liang Chen
Hi Ohh , Really? a big big welcome! Regards Liang Jean-Baptiste Onofré wrote > Awesome. > > I would love to be there. Let me check if I can. > > Regards > JB > > On Aug 23, 2017, 08:48, at 08:48, Liang Chen > chenliang6136@ > wrote: >>http://apache

Re: Block B-tree loading failed

2017-09-13 Thread Liang Chen
Hi, it looks like the path is invalid. Can you provide the full script showing how you created the CarbonSession? - Caused by: org.apache.carbondata.core.datastore.exception.IndexBuilderException: Invalid carbon data file:

[DISCUSSION] Apache CarbonData 1.3.0 scope

2017-09-29 Thread Liang Chen
Hi all, first, on behalf of the Apache CarbonData community, thanks to all contributors, who come from 20+ different organizations. This mail is for discussing the 1.3.0 scope (around 3-4 months). I propose that the following features be considered. 1) Spark 2.2.0 integration (propose committer Ravindra to

Re: ClassNotFound error when insert carbontable from hive table

2017-08-22 Thread Liang Chen
Hi Lionel, can you share with us how you fixed this issue? Regards Liang lionel061201 wrote > This issue had been fixed. > > On Mon, Aug 21, 2017 at 4:04 PM, Lu Cao > whucaolu@ > wrote: > >> Hi dev, >> >> I'm trying to insert data from a hive table to carbon table: >> >> cc.sql("insert

[ANNOUNCE] Manish Gupta as new Apache CarbonData

2017-08-25 Thread Liang Chen
Hi all We are pleased to announce that the PMC has invited Manish Gupta as new Apache CarbonData committer, and the invite has been accepted ! Congrats to Manish Gupta and welcome aboard. Regards The Apache CarbonData PMC

Re: [ANNOUNCE] Manish Gupta as new Apache CarbonData committer

2017-08-25 Thread Liang Chen
Correcting the title to add the "committer" info. 2017-08-25 23:56 GMT+08:00 Liang Chen <chenliang...@apache.org>: > Hi all > > We are pleased to announce that the PMC has invited Manish Gupta as new > Apache CarbonData committer, and the invite has been accepted !

Re: Presto+CarbonData optimization work discussion

2017-09-01 Thread Liang Chen
QC | 57467886 | 1385076
SK | 57385152 | 1382364
YT | 57377556 | 1383900
(13 rows)
Query 20170902_033821_6_h6g24, FINISHED, 1 node
Splits: 50 total, 50 done (100.00%)
0:03 [18M rows, 0B] [6.62M rows/s, 0B/s]
Regards Liang Liang Chen wrote > Hi > > For -- 4) L

Re: [DISCUSSION] Apache CarbonData 1.3.0 scope

2017-10-11 Thread Liang Chen
Hi Yuhai, I have the same comment as Jacky: please provide more info about this requirement. It would be better to create a new topic to discuss this requirement in detail. Regards Liang Jacky Li wrote > Hi Cenyuhai, > > Can you further describe your requirement? Currently carbon supports

Re: Where could I download ODBC driver for carbondata

2017-11-24 Thread Liang Chen
Hi Kui, actually CarbonData doesn't have an ODBC driver. You can connect to Apache Spark over ODBC and use the Spark engine to read CarbonData. Regards Liang 2017-11-21 10:58 GMT+08:00 高奎 : > Hi CarbonData Team, > > I am now working on technical research about carbondata. > > We need

Re: Blog on how to use Carbondata with Presto

2017-11-26 Thread Liang Chen
Hi bhavya Thanks for your sharing, a nice blog. Regards Liang bhavya411 wrote > Hi All, > > Please look at the blog to see how we can use CarbonData with Presto. > > > https://blog.knoldus.com/2017/11/20/integrating-presto-with-carbondata/ >

Re: [Discussion]support table level compaction configuration

2017-11-16 Thread Liang Chen
Hi Jin Zhou, looking forward to seeing your pull request. Do you have contributor rights on the Apache CarbonData JIRA? If not, please let me know the email ID of your JIRA account. Regards Liang Jin Zhou wrote > @xm_zzc, yes, I'm working on this improvement. > > > > -- > Sent from: >

test if can receive mailing list mail

2017-11-16 Thread Liang Chen

test whether dev@mailing list working fine, or not ?

2017-11-16 Thread Liang Chen

test

2017-11-16 Thread Liang Chen

Re: [Discussion]support user specified segments in major compation

2017-11-20 Thread Liang Chen
Hi Jin Zhou, OK, thanks for your proposal. Can you raise a PR to support the two features? Regards Liang Jin Zhou wrote > @Liang Chen, thank you for your reply. > > > After seriously thinking about your suggestion, I also think the two > problems should be consid

Re: After MAJOR index lost

2017-11-01 Thread Liang Chen
Hi, yes, I checked the log messages and there do appear to be some issues. Can you share the steps to reproduce: how many machines did you use for the data load, and how many times did you load? Regards Liang yixu2001 wrote > dev > environment spark.2.1.1 carbondata 1.1.1 hadoop 2.7.2 > > run ALTER table

Re: [DISCUSSION] Regarding to redundancy code and some issues.

2017-11-04 Thread Liang Chen
+1, all are good proposals. Regards Liang David CaiQiang wrote > Hi All, >Here, I listed the following points to improve the code. > > Redundancy: > 1. CarbonLoadModel.isDirectLoad > It is always true, better to remove the related code. > Now CarbonData doesn't pre-partition the input data

Re: Version upgrade for Presto Integration to 0.186

2017-11-03 Thread Liang Chen
+1 Can you raise one PR for this. Regards Liang bhavya411 wrote > Hi All, > > Presto 0.186 version has as lot of improvements that will increase the > performance and improve the reliability. Some of the major issues and > improvements are listed below. > > >- Fix excessive GC overhead

Re: [DISCUSSION] Optimize the default value for some parameters

2017-10-26 Thread Liang Chen
to configure while creating a table. > > Regards, > Ravindra. > > On 11 October 2017 at 13:36, Liang Chen > chenliang6136@ > wrote: > >> Hi All >> >> As you know, some default value of parameters need to adjust for most of >> cases, this discussio

Re: [Discussion]support user specified segments in major compation

2017-10-26 Thread Liang Chen
Hi Jin Zhou, thanks for starting this discussion. 1. For your first proposal: currently, the segment is an internal system concept and is not exposed externally. Can you describe the exact problems you encountered? We can then find an alternative solution for them.

Re: [Discussion] Merging carbonindex files for each segments and across segments

2017-10-20 Thread Liang Chen
+1 for this proposal and solution, thanks, Ravi Regards Liang 2017-10-20 19:13 GMT+05:30 Ravindra Pesala : > Hi, > > Problem : > The first-time query of carbon becomes very slow. It is because of reading > many small carbonindex files and cache to the driver at the first

Re: [Disscussion] Support Streaming Ingest

2017-10-22 Thread Liang Chen
Hi, one question: why not support Structured Streaming instead of Spark Streaming? --- In first phase implementation, it should support kafka and spark streaming integration. More streaming framework support is preferable in the future. Regards

Re: [VOTE] Apache CarbonData 1.4.0(RC2) release

2018-05-23 Thread Liang Chen
Hi +1(binding) Regards Liang ravipesala wrote > Hi > > > I submit the Apache CarbonData 1.4.0 (RC2) for your vote. > > > 1.Release Notes: > > *https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12320220=1234100 > https://link.getmailspring.com/link/ >

Re: after load data using SaveMode.Overwrite, query through beeline return all null field

2018-05-23 Thread Liang Chen
Hi, thank you for reporting this issue. Let us check it and respond to you ASAP. Regards Liang 喜之郎 wrote > hi dev. > carbon version :1.3.1 > spark version:2.2.1 > 1) First I create a carbon table through beeline. > 2) Then I use spark-submit and dataframe load data to carbon. Query is OK。 > 3)

Re: Question about integrating CarbonData with Presto

2018-06-14 Thread Liang Chen
Hi, please send your questions to the mailing list (cc'ing the mailing list). Currently, "presto read streaming carbondata table" is not supported. Can you share with the community why this feature is needed and what your exact requirements are? Regards Liang On Wed, 13 June 2018 at 9:48 AM, kevintop wrote: > Mr. Chen

Re: Support updating/deleting data for stream table

2018-06-03 Thread Liang Chen
data for one year, > he > need to delete one year ago of data everyday. On the other hand, solution > 2 > is more complicated than solution 1, we need to consider the implement of > solution 2 in depth. > Based on the above reasons, Liang Chen, Jacky, David and I prefered to >

Re: MODERATE for dev@carbondata.apache.org

2018-06-03 Thread Liang Chen
Hi
1. You can get detailed table info with the script below: sql("desc formatted xx_your tablename")
2. You can find more detailed docs about datamap at: ../docs/datamap
Regards Liang 2018-05-31 17:59 GMT+08:00 < dev-reject-1527760781.11669.gamfjekkdhlpcbigj...@carbondata.apache.org>: > > To

Updated release notes . Re: [ANNOUNCE] Apache CarbonData 1.4.0 release

2018-06-05 Thread Liang Chen
Hi Please find the updated 1.4.0 release notes: https://cwiki.apache.org/confluence/display/CARBONDATA/Apache+CarbonData+1.4.0+Release Regards Liang

[ANNOUNCE] Chuanyin Xu as new Apache CarbonData committer

2018-05-01 Thread Liang Chen
Hi all We are pleased to announce that the PMC has invited Chuanyin Xu as new Apache CarbonData committer, and the invite has been accepted! Congrats to Chuanyin Xu and welcome aboard. Regards Apache CarbonData PMC

Re: Change the 'comment' content for column when execute command 'desc formatted table_name'

2018-04-26 Thread Liang Chen
Hi Ravi, good thinking. Because the inverted index columns are by default the same as the sort_column columns, from the user's perspective they only need to set the no_inverted_index columns within the sort_column columns, so I proposed displaying only the no_inverted_index column info that is set by the user.

Re: [Discussion] Merging carbonindex files for each segments and across segments

2017-10-26 Thread Liang Chen
Yes, Jin Zhou. Merging all index files in a segment into one would be a useful feature. It would significantly improve query performance. Regards Liang Jin Zhou wrote > Hi, ravipesala > > Thank you for your proposal, merging index file is a very useful feature > as > we have already met serious

Re: [PROPOSAL] Tag Pull Request with feature tag

2017-10-28 Thread Liang Chen
+1, agree with this proposal. Regards Liang

[ANNOUNCE] Kumar Vishal as new PMC for Apache CarbonData

2018-01-10 Thread Liang Chen
Hi, we are pleased to announce Kumar Vishal as a new PMC member of Apache CarbonData. Congrats to Kumar Vishal! Apache CarbonData PMC

Re: "Select" query failed when executing "COMPACT" and "CLEAN".

2018-01-19 Thread Liang Chen
Hi, I can't reproduce it with "spark2.1+ carbondata1.1.1". Maybe the compaction had not completely finished when you ran the query. Have you tried executing the query in the same shell after "compaction and clean"? Regards Liang yixu2001 wrote > dev > > > spark2.1+ carbondata1.1.1 > > "Select"

Re: CarbonData cannot find method com.univocity.parsers.csv.CsvWriterSettings.setQuoteEscapingEnabled when saving CSV

2018-01-24 Thread Liang Chen
OK, thanks for your feedback. Please modify the pom file under processing and see if it works in 1.2.0.
<dependency>
  <groupId>com.univocity</groupId>
  <artifactId>univocity-parsers</artifactId>
  <version>2.2.1</version>
</dependency>
Regards Liang 2018-01-25 11:56 GMT+08:00 Luo Colin : > Chenliang, > > > > Environment: Apache Spark 2.1,

Re: Data gets deleted after loading data into a partitioned table and then running update

2018-02-02 Thread Liang Chen
set > to LOCAL_SORT > 18/02/02 10:06:55 AUDIT CarbonDataRDDFactory$: > [ubuntu][bigdata][Thread-1]Data update is successful for default.test3 > ++ > || > ++ > ++ > > > scala> carbon.sql("SELECT * FROM test3").show() > +---+-+++ > | id| name| age|city| > +---+-

Re: Help, carbondata issues on spark

2018-02-03 Thread Liang Chen
Hi 1. no multiple levels partitions, we need three levels partitions, like year,day,hour Reply: do year, day and hour belong to one column (field) or three columns? Can you explain your exact scenarios? We can help you design partition + sort columns to solve your specific query

Re: [VOTE] Apache CarbonData 1.3.0(RC2) release

2018-02-03 Thread Liang Chen
Hi +1(binding) Regards Liang 2018-02-04 5:54 GMT+08:00 Ravindra Pesala : > Hi > > I submit the Apache CarbonData 1.3.0 (RC2) for your vote. > > 1.Release Notes: > *https://issues.apache.org/jira/secure/ReleaseNote.jspa? > projectId=12320220=12341004 >

Re: New-bie JIRAs for new contributors

2018-07-31 Thread Liang Chen
vikashtalanki wrote > Hi Vikash > > Welcome to Apache CarbonData community. > 1. Firstly, please let me know your apache jira account(email id), i will > add you as contributor. > 2. Secondly,You can run the simple example as per : >

Re: [VOTE] Apache CarbonData 1.4.1(RC1) release

2018-07-31 Thread Liang Chen
ravipesala wrote > Hi > > > I submit the Apache CarbonData 1.4.1 (RC1) for your vote. > > > 1.Release Notes: > > https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12320220=12343148 > > Some key features and improvements in this release: > >1. Supported Local

Re: [VOTE] Apache CarbonData 1.4.1(RC1) release

2018-07-31 Thread Liang Chen
Hi, it would be better to merge these PRs into 1.4.1 as well: https://github.com/apache/carbondata/pull/2588 https://github.com/apache/carbondata/pull/2565 Regards Liang ravipesala wrote > Hi > > > I submit the Apache CarbonData 1.4.1 (RC1) for your vote. > > > 1.Release Notes: > >

Re: How to look up date segment details in carbon without partition.

2018-08-13 Thread Liang Chen
Hi, in CarbonData the segment concept may differ from other systems: one data load is one segment. CarbonData also currently supports partitioning with global sort, so you can use date as the partition column and check the data size under each partition folder. Regards

Re: [Discussion] Propose to upgrade the version of integration/presto from 0.187 to 0.206

2018-08-13 Thread Liang Chen
now > using the dictionary_aggregation feature for optimization. The other bug > fixes are also important for carbondata integration. > However, they have changed the connector interface as well, so we might > need to change our interface accordingly. > > Thanks and regards > Bhavya > >

Re: [VOTE] Apache CarbonData 1.4.1(RC2) release

2018-08-13 Thread Liang Chen
Hi, +1. Many good improvements and bug fixes. Regards Liang ravipesala wrote > Hi > > > I submit the Apache CarbonData 1.4.1 (RC2) for your vote. > > > 1.Release Notes: > > https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12320220=12343148 > > Some key features and

Re: [DISCUSSION] Support Standard Spark's FileFormat interface in Carbondata

2018-08-23 Thread Liang Chen
Hi, +1. I agree with supporting the standard Spark FileFormat interface in CarbonData; it will significantly help broaden Apache CarbonData's ecosystem. Regards Liang ravipesala wrote > Hi, > > Current Carbondata has deep integration with Spark to provide > optimizations > in performance

Re: [DISCUSSION] Implement file-level Min/Max index for streaming segment

2018-08-26 Thread Liang Chen
Hi +1 for this proposal. Regards Liang David CaiQiang wrote > Hi All, > Currently, the filter queries on the streaming table always scan all > streaming files, even though there are no data in streaming files that > meet > the filter conditions. > So I try to support file-level min/max

Re: Change the 'comment' content for column when execute command 'desc formatted table_name'

2018-08-21 Thread Liang Chen
Hi 1. I agree with Likun's comments (4 points). 2. About 'select sql' for CTAS, you can leave it out; we can consider it later. Regards Liang Jacky Li wrote > Hi ZZC, > > I have checked the doc in CARBONDATA-2595. I have following comments: > 1. In the Table Basic Information section, it is

[Discussion] Propose to upgrade the version of integration/presto from 0.187 to 0.206

2018-07-24 Thread Liang Chen
Hi Dev, the Presto community released 0.206 last week (see the details at https://prestodb.io/docs/current/release/release-0.206.html). This release fixed many issues, so I propose that the Apache CarbonData community upgrade to the latest Presto version for the CarbonData integration. please

Re: [DISCUSSION] Updates to CarbonData documentation and structure

2018-09-04 Thread Liang Chen
Hi Raghu +1, all these optimizations are very good. Regards Liang sraghunandan wrote > Dear All, > > I wanted to propose some updates and changes to our current > documentation,Please let me know your inputs and comments. > > > 1.Split Our carbondata command into DDL and DML > > 2.Add

Re: Index file cache will not work when the table has invalid segment.

2018-07-12 Thread Liang Chen
Hi Currently, CarbonData doesn't support map data type Regards Liang carbondata-newuser wrote > Carbon version is 1.4 rc2. > create table( > col1 string, > col2 int, > col2 string, > date string > ) > > *First step:* > insert into table carbonTest select col1,col2,col3,"20180707" from >

Re: [Discussion] About syntax of compaction on specified segments

2018-03-14 Thread Liang Chen
Hi, thanks to Jin Zhou for starting this discussion. I also propose using the solution proposed by Manish, which does not impact the current Major and Minor compaction behaviors. Regards Liang manishgupta88 wrote > Hi, > > I agree with @gvramana https://github.com/gvramana > >1. We should *not

Re: query on string type return error

2018-04-16 Thread Liang Chen
Hi, from the log message, it seems the data files can't be found. Can you provide more details: 1. How did you create the CarbonSession and how did you load the data? 2. Have you deployed a cluster or only a single machine? Regards Liang 喜之郎 wrote > hi all, when I use carbondata to run a query "select count(*)

Re: Storing Data Frame as CarbonData Table

2018-04-02 Thread Liang Chen
Hi Michael, yes, it is very easy to save any Spark data to CarbonData. You only need a small change to your script, as below:
    myDF.write
      .format("carbondata")
      .option("tableName", "MyTable")
      .mode(SaveMode.Overwrite)
      .save()
For more detail, you can refer to examples:

Re: Problem on carbondata quering performance tuning

2018-04-02 Thread Liang Chen
Hi, which CarbonData and Spark versions are you using? And can you provide the full configuration inside "carbondata.properties"? Mick Yuan wrote > Hi,all > I have a quering performane tuning case on carbondata. > > *Enviroment is as below:*: > spark on yarn > 4 nodemanagers > 102G,55 cores each

Re: Getting [Problem in loading segment blocks] error after doing multi update operations

2018-03-20 Thread Liang Chen
Hi Thanks for your feedback. Let me first reproduce this issue, and check the detail. Regards Liang yixu2001 wrote > I'm using carbondata1.3+spark2.1.1+hadoop2.7.1 to do multi update > operations > here is the replay step: > > import org.apache.spark.sql.SparkSession > import

Re: [VOTE] Apache CarbonData 1.3.1(RC1) release

2018-03-05 Thread Liang Chen
Hi +1(binding) Regards Liang ravipesala wrote > Hi > > I submit the Apache CarbonData 1.3.1 (RC1) for your vote. > > 1.Release Notes: > *https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12320220=12342754 >

Re: [Discuss] Removing search mode

2018-11-06 Thread Liang Chen
Hi, +1, but one suggestion: in the future we can first try such alpha features in a separate branch, and merge them into master once they are confirmed. Regards Liang akashrn5 wrote > +1 > yes, after search mode implementation we didnt get much advantage as > expected and simply code will be

Re: [VOTE] Apache CarbonData 1.5.0(RC2) release

2018-10-10 Thread Liang Chen
+1 Regards Liang On Wed, 10 Oct 2018 at 3:15 AM, Ravindra Pesala wrote: > Hi > > I submit the Apache CarbonData 1.5.0 (RC2) for your vote. > > 1.Release Notes: > > https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12320220=12341006 > > Some key features and improvements in this release: >

Re: error occur when I load data to s3

2018-09-03 Thread Liang Chen
Hi Kunal, can you list all the PRs for S3 issues? We may need to do a 1.4.2 patch release, because Aaron plans to use CarbonData in production this month. To Aaron: please try master first and see if it solves your problems. Regards Liang kunalkapoor wrote > Hi aaron, > Many issues like this have been
