[jira] [Created] (CARBONDATA-465) Spark streaming dataframe support

2016-11-28 Thread WilliamZhu (JIRA)
WilliamZhu created CARBONDATA-465: - Summary: Spark streaming dataframe support Key: CARBONDATA-465 URL: https://issues.apache.org/jira/browse/CARBONDATA-465 Project: CarbonData Issue Type:

[jira] [Created] (CARBONDATA-464) Too many times GC occurs in query if we increase the blocklet size

2016-11-28 Thread suo tong (JIRA)
suo tong created CARBONDATA-464: --- Summary: Too many times GC occurs in query if we increase the blocklet size Key: CARBONDATA-464 URL: https://issues.apache.org/jira/browse/CARBONDATA-464 Project:

Re: [Improvement] Use Trie in place of HashMap to reduce memory footprint of Dictionary

2016-11-28 Thread Xiaoqiao He
Hi Jihong, Thanks for your attention and reply. 1. Actually, I have run benchmarks with English/Chinese dictionaries of sizes {100K,200K,300K,400K,500K,600K} separately, and the test results are basically the same as those mentioned earlier in this mail thread. I will submit and open the benchmark code and dictionary
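
For readers unfamiliar with the idea under discussion, the following is a minimal sketch of a trie used as a dictionary from string values to integer keys. It is not the benchmark code mentioned in this thread and not CarbonData's implementation; the class name `TrieDictionary` and its methods are hypothetical. It only illustrates why a trie can store values with common prefixes in fewer nodes than one hash-map entry per full string.

```java
import java.util.TreeMap;

/** Hypothetical sketch: a trie that maps dictionary values to integer keys. */
public class TrieDictionary {
    private static final class Node {
        // Children are keyed by the next character of the value.
        final TreeMap<Character, Node> children = new TreeMap<>();
        int key = -1; // -1 means no dictionary value ends at this node
    }

    private final Node root = new Node();
    private int nextKey = 1;
    private int nodeCount = 1; // counts the root

    /** Returns the existing key for value, or assigns the next free key. */
    public int getOrAdd(String value) {
        Node node = root;
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            Node child = node.children.get(c);
            if (child == null) {
                child = new Node();
                node.children.put(c, child);
                nodeCount++;
            }
            node = child;
        }
        if (node.key == -1) {
            node.key = nextKey++;
        }
        return node.key;
    }

    /** Returns the key for value, or -1 if the value is not in the dictionary. */
    public int lookup(String value) {
        Node node = root;
        for (int i = 0; i < value.length() && node != null; i++) {
            node = node.children.get(value.charAt(i));
        }
        return node == null ? -1 : node.key;
    }

    /** Number of trie nodes allocated so far, including the root. */
    public int nodeCount() {
        return nodeCount;
    }

    public static void main(String[] args) {
        TrieDictionary dict = new TrieDictionary();
        dict.getOrAdd("carbon");
        dict.getOrAdd("carbondata");
        dict.getOrAdd("car");
        // The three values share the prefix "car", so they reuse the same nodes.
        System.out.println("key of carbondata: " + dict.lookup("carbondata"));
        System.out.println("nodes allocated: " + dict.nodeCount());
    }
}
```

A per-node `TreeMap` is used here only for brevity; it carries its own overhead, so any real memory comparison against `HashMap` would depend on a more compact node layout and should be measured rather than assumed.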

RE: [Feature Proposal] Spark 2 integration with CarbonData

2016-11-28 Thread Jihong Ma
Integration with Spark 2.x is a great feature for CarbonData, as Spark 2.x is gradually gaining momentum. This is a big effort ahead, so let's take into consideration all the complexity involved due to the dramatic API-level changes; realizing it in phases is a good idea. Regards. Jihong

RE: [Improvement] Use Trie in place of HashMap to reduce memory footprint of Dictionary

2016-11-28 Thread Jihong Ma
Thank you Xiaoqiao for looking into this issue and sharing your results! Have you tried varying the dictionary size to compare all the alternatives? And please pay closer attention to the license of the DAT implementations, as they are under LGPL; generally speaking, it is not legally allowed

RE: CarbonData propose major version number increment for next version (to 1.0.0)

2016-11-28 Thread Jihong Ma
+1 A rich set of features is planned for the next release, and more importantly there will be external API changes introduced as we integrate with Spark 2.x. CarbonData deserves a major version jump as it gets mature/production ready and powerful in terms of rich functionality and

Re: [Feature Proposal] Spark 2 integration with CarbonData

2016-11-28 Thread QiangCai
+1 I think I can finish some tasks. Please assign some tasks to me.

Re: carbon data

2016-11-28 Thread Liang Chen
Hi Lionel, you don't need to create the table first. Please find the example code in ExampleUtils.scala:
    df.write
      .format("carbondata")
      .option("tableName", tableName)
      .option("compress", "true")
      .option("useKettle", "false")
      .mode(mode)
      .save()
Preparation of the API docs is in progress.

carbon data

2016-11-28 Thread Lu Cao
Hi team, I'm trying to save a Spark DataFrame to a CarbonData file. I see the example in your wiki uses option("tableName", "carbontable"). Does that mean I have to create a CarbonData table first and then save data into the table? Can I save it directly without creating the CarbonData table? The code is

[jira] [Created] (CARBONDATA-463) Extract spark-common module

2016-11-28 Thread Jacky Li (JIRA)
Jacky Li created CARBONDATA-463: --- Summary: Extract spark-common module Key: CARBONDATA-463 URL: https://issues.apache.org/jira/browse/CARBONDATA-463 Project: CarbonData Issue Type: Sub-task

[jira] [Created] (CARBONDATA-462) Clean up code before moving to spark-common package

2016-11-28 Thread Jacky Li (JIRA)
Jacky Li created CARBONDATA-462: --- Summary: Clean up code before moving to spark-common package Key: CARBONDATA-462 URL: https://issues.apache.org/jira/browse/CARBONDATA-462 Project: CarbonData

[jira] [Created] (CARBONDATA-461) Clean partitioner in RDD package

2016-11-28 Thread Jacky Li (JIRA)
Jacky Li created CARBONDATA-461: --- Summary: Clean partitioner in RDD package Key: CARBONDATA-461 URL: https://issues.apache.org/jira/browse/CARBONDATA-461 Project: CarbonData Issue Type:

[jira] [Created] (CARBONDATA-460) Add Unit Tests For core.writer.sortindex package

2016-11-28 Thread SWATI RAO (JIRA)
SWATI RAO created CARBONDATA-460: Summary: Add Unit Tests For core.writer.sortindex package Key: CARBONDATA-460 URL: https://issues.apache.org/jira/browse/CARBONDATA-460 Project: CarbonData

[jira] [Created] (CARBONDATA-459) Block distribution is wrong in case of dynamic allocation=true

2016-11-28 Thread Manish Gupta (JIRA)
Manish Gupta created CARBONDATA-459: --- Summary: Block distribution is wrong in case of dynamic allocation=true Key: CARBONDATA-459 URL: https://issues.apache.org/jira/browse/CARBONDATA-459 Project:

Re: [Feature Proposal] Spark 2 integration with CarbonData

2016-11-28 Thread Jacky Li
Hi Ramana, Sure, I can work out a list of subtasks and put it under CARBONDATA-322. Regards, Jacky

[jira] [Created] (CARBONDATA-458) Improving carbon first time query performance

2016-11-28 Thread kumar vishal (JIRA)
kumar vishal created CARBONDATA-458: --- Summary: Improving carbon first time query performance Key: CARBONDATA-458 URL: https://issues.apache.org/jira/browse/CARBONDATA-458 Project: CarbonData

[jira] [Created] (CARBONDATA-457) Add Unit Tests For core.writer package

2016-11-28 Thread SWATI RAO (JIRA)
SWATI RAO created CARBONDATA-457: Summary: Add Unit Tests For core.writer package Key: CARBONDATA-457 URL: https://issues.apache.org/jira/browse/CARBONDATA-457 Project: CarbonData Issue