[jira] [Created] (CARBONDATA-441) Add module for spark2

2016-11-23 Thread Jacky Li (JIRA)
Jacky Li created CARBONDATA-441: --- Summary: Add module for spark2 Key: CARBONDATA-441 URL: https://issues.apache.org/jira/browse/CARBONDATA-441 Project: CarbonData Issue Type: Improvement Af

[jira] [Created] (CARBONDATA-442) SELECT querry result mismatched with hive result

2016-11-23 Thread SWATI RAO (JIRA)
SWATI RAO created CARBONDATA-442: Summary: SELECT querry result mismatched with hive result Key: CARBONDATA-442 URL: https://issues.apache.org/jira/browse/CARBONDATA-442 Project: CarbonData I

Re: [Feature ]Design Document for Update/Delete support in CarbonData

2016-11-23 Thread manish gupta
Hi Vimal, I have a few queries regarding the 1st suggestion. 1. Dimensions can be either dictionary or no dictionary. If we update the dictionary file, then we will have to maintain 2 flows: one for dictionary columns and one for no dictionary columns. Will that be ok? 2. We write dictionary

Re: B-Tree LRU cache (New Feature)

2016-11-23 Thread mohdshahidkhan
Please find Design document for B-Tree LRU cache https://drive.google.com/file/d/0B8sQb--59vO7bWxVeWs1ajBiMG8/view?usp=sharing

CarbonData propose major version number increment for next version (to 1.0.0)

2016-11-23 Thread Venkata Gollamudi
Hi All, CarbonData 0.2.0 has been a good, stable release with a lot of defects fixed and a number of performance improvements. https://issues.apache.org/jira/browse/CARBONDATA-320?jql=project%20%3D%20CARBONDATA%20AND%20fixVersion%20%3D%200.2.0-incubating%20ORDER%20BY%20updated%20DESC%2C%2

[Improvement] Use Trie in place of HashMap to reduce memory footprint of Dictionary

2016-11-23 Thread Xiaoqiao He
Hi All, I would like to propose a Dictionary improvement that uses a Trie in place of HashMap. In order to speed up aggregation, reduce run-time memory footprint, enable fast distinct count, etc., CarbonData encodes data using a dictionary at file level or table level based on cardinality. It is a gene

Re: [Feature ]Design Document for Update/Delete support in CarbonData

2016-11-23 Thread Aniket Adnaik
Hi Vimal, Thanks for your suggestions. For the 1st point, I tend to agree with Manish's comments. But it's worth looking into different ways to optimize the performance. I guess query performance may take priority over update performance. Basically, we may need a better compaction approach to merg

Re: [Improvement] Use Trie in place of HashMap to reduce memory footprint of Dictionary

2016-11-23 Thread Liang Chen
Hi xiaoqiao This improvement looks great! Can you please explain the below data, what does it mean? -- ConcurrentHashMap ~68MB 14543 Double Array Trie ~104MB 12825 Regards Liang 2016-11-24 2:04 GMT+08:00 Xiaoqiao He : > Hi All, > > I would like to propose Dictionary improvement which u

Some questions about CarbonSqlParser

2016-11-23 Thread ????
ok thanks On 2016-11-23 09:36, Jay wrote: Hi, for create table sql, because it completely matches hive syntax, there is no need to create another way of parsing. For loadData sql, carbon's sql is like LOAD DATA INTO .OPTIONS(...), and OPTIONS is optional. Because hive's syntax h

[jira] [Created] (CARBONDATA-443) Implement nosort dataloading

2016-11-23 Thread QiangCai (JIRA)
QiangCai created CARBONDATA-443: --- Summary: Implement nosort dataloading Key: CARBONDATA-443 URL: https://issues.apache.org/jira/browse/CARBONDATA-443 Project: CarbonData Issue Type: Improvement

Re: [Improvement] Use Trie in place of HashMap to reduce memory footprint of Dictionary

2016-11-23 Thread Xiaoqiao He
Hi Liang, Thanks for your reply. I need to correct the experiment result because the NO.1 column of the result data table was in the wrong order. In order to compare performance between Trie and HashMap, two different structures are constructed using the same dictionary data, whose size is 600K and each item's

Re: [Improvement] Use Trie in place of HashMap to reduce memory footprint of Dictionary

2016-11-23 Thread Liang Chen
Hi xiaoqiao, For the example below, with 600K dictionary data: is it to say that using "DAT" can save 36M of memory compared with "ConcurrentHashMap", while losing only a little performance (1718ms)? One more question: if the dictionary data size increases, what are the comparison results for "ConcurrentHashMap" VS "

[jira] [Created] (CARBONDATA-444) Improved integration test-case for AllDataTypesTestCase1

2016-11-23 Thread SWATI RAO (JIRA)
SWATI RAO created CARBONDATA-444: Summary: Improved integration test-case for AllDataTypesTestCase1 Key: CARBONDATA-444 URL: https://issues.apache.org/jira/browse/CARBONDATA-444 Project: CarbonData

[jira] [Created] (CARBONDATA-445) Improved integration test-case for AllDataTypesTestCase3

2016-11-23 Thread SWATI RAO (JIRA)
SWATI RAO created CARBONDATA-445: Summary: Improved integration test-case for AllDataTypesTestCase3 Key: CARBONDATA-445 URL: https://issues.apache.org/jira/browse/CARBONDATA-445 Project: CarbonData