[4/4] carbondata git commit: [CARBONDATA-2851][CARBONDATA-2852] Support zstd as column compressor in final store

2018-09-12 Thread jackylk
[CARBONDATA-2851][CARBONDATA-2852] Support zstd as column compressor in final store 1. Add a zstd compressor for compressing column data 2. Add zstd support in thrift 3. Since zstd does not support zero-copy while compressing, offheap will not take effect for zstd 4. Support lazy load for
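
A minimal sketch of the zstd round trip this adds, using the zstd-jni library (the class and method names below are zstd-jni's public API, not code from this commit). Note how compress returns a fresh on-heap array, which is why zero-copy/offheap paths cannot be used while compressing, and how decompression needs the original size up front:

    import com.github.luben.zstd.Zstd

    object ZstdRoundTrip {
      def main(args: Array[String]): Unit = {
        val raw = ("some column page bytes " * 100).getBytes("UTF-8")
        // Zstd.compress allocates and returns a new on-heap array.
        val compressed = Zstd.compress(raw)
        // Decompression requires the uncompressed length to size the output.
        val restored = Zstd.decompress(compressed, raw.length)
        assert(restored.sameElements(raw))
        println(s"raw=${raw.length} bytes, compressed=${compressed.length} bytes")
      }
    }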

[2/4] carbondata git commit: [CARBONDATA-2851][CARBONDATA-2852] Support zstd as column compressor in final store

2018-09-12 Thread jackylk
http://git-wip-us.apache.org/repos/asf/carbondata/blob/8f08c4ab/core/src/main/java/org/apache/carbondata/core/localdictionary/PageLevelDictionary.java -- diff --git

carbondata git commit: [CARBONDATA-2845][BloomDataMap] Merge bloom index files of multi-shards for each index column

2018-09-12 Thread jackylk
Repository: carbondata Updated Branches: refs/heads/master 22958d941 -> 7b31b9168 [CARBONDATA-2845][BloomDataMap] Merge bloom index files of multi-shards for each index column Currently a bloom index file is generated per task per load; query performance will be bad if we have many

carbondata git commit: [CARBONDATA-2929][DataMap] Add block skipped info for explain command

2018-09-12 Thread jackylk
Repository: carbondata Updated Branches: refs/heads/master 83f9f12a8 -> 22958d941 [CARBONDATA-2929][DataMap] Add block skipped info for explain command This PR adds block-skipped info by counting distinct file paths from hit blocklets This closes #2711 Project:
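
For context, the enriched output is what a user sees when running EXPLAIN through Spark SQL against a carbon table; a minimal sketch (session and table names hypothetical):

    // Assumes an existing SparkSession named spark and a carbon table "t".
    spark.sql("EXPLAIN SELECT * FROM t WHERE id = 10").show(false)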

carbondata git commit: [HOTFIX] Removed scala dependency from carbon core module

2018-09-12 Thread jackylk
Repository: carbondata Updated Branches: refs/heads/master 8f1a029b9 -> 83f9f12a8 [HOTFIX] Removed scala dependency from carbon core module Removed scala dependencies from carbon-core and sdk modules This closes #2709 Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo Commit:

carbondata git commit: [CARBONDATA-2923] Log the hit information of streaming segments

2018-09-10 Thread jackylk
Repository: carbondata Updated Branches: refs/heads/master e32551b8f -> 0528a7985 [CARBONDATA-2923] Log the hit information of streaming segments Log the hit information of streaming segments after the hit information of batch segments This closes #2700 Project:

carbondata git commit: [HOTFIX] Fix streaming issue of 2.3 CI

2018-09-10 Thread jackylk
Repository: carbondata Updated Branches: refs/heads/master 2ccdbb78c -> e32551b8f [HOTFIX] Fix streaming issue of 2.3 CI This closes #2701 Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/e32551b8 Tree:

carbondata git commit: [CARBONDATA-2908] Add SORT_SCOPE option support in dataframe API

2018-09-09 Thread jackylk
Repository: carbondata Updated Branches: refs/heads/master b0589e502 -> 0483b46e9 [CARBONDATA-2908] Add SORT_SCOPE option support in dataframe API This closes #2684 Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo Commit:
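
A rough sketch of what the new option could look like from the dataframe API; the format name and option keys follow CarbonData's Spark integration conventions, but treat the exact spellings as assumptions:

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder().appName("sortScopeDemo").getOrCreate()
    val df = spark.range(0, 1000).selectExpr("id", "cast(id as string) as name")

    df.write
      .format("carbondata")
      .option("tableName", "sort_scope_demo") // hypothetical table name
      .option("sort_scope", "local_sort")     // option added by this PR
      .mode("overwrite")
      .save()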

carbondata git commit: [CARBONDATA-2902][DataMap] Fix showing negative pruning result for explain command

2018-09-06 Thread jackylk
Repository: carbondata Updated Branches: refs/heads/master 50248f51b -> f04850f39 [CARBONDATA-2902][DataMap] Fix showing negative pruning result for explain command #2676 used the method ByteBuffer.getShort(int index) to get the number of blocklets in a block, but it used the wrong parameter. The index

carbondata git commit: [CARBONDATA-2888] Support multi level subfolder for SDK read and fileformat read

2018-09-05 Thread jackylk
Repository: carbondata Updated Branches: refs/heads/master 21a72bf2e -> 70fe5144d [CARBONDATA-2888] Support multi level subfolder for SDK read and fileformat read This PR supports multi-level subfolder reads for the SDK reader and Spark's carbon fileformat reader. This closes #2661 Project:

[1/2] carbondata git commit: [CARBONDATA-2853] Implement min/max index for streaming segment

2018-09-05 Thread jackylk
Repository: carbondata Updated Branches: refs/heads/master 526e3bfa1 -> 21a72bf2e http://git-wip-us.apache.org/repos/asf/carbondata/blob/21a72bf2/streaming/src/main/java/org/apache/carbondata/streaming/segment/StreamSegment.java

[2/2] carbondata git commit: [CARBONDATA-2853] Implement min/max index for streaming segment

2018-09-05 Thread jackylk
[CARBONDATA-2853] Implement min/max index for streaming segment Implement file-level min/max index (driver side) and blocklet level min/max index (worker side) on stream files to improve read performance. This closes #2644 Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo

carbondata git commit: [CARBONDATA-2902][DataMap] Fix showing negative pruning result for explain command

2018-09-03 Thread jackylk
Repository: carbondata Updated Branches: refs/heads/master 257672790 -> 3cbabcde0 [CARBONDATA-2902][DataMap] Fix showing negative pruning result for explain command For legacy store, print no pruning info. For cache_level = block, get the number of blocklets of hit blocks and print skipped blocklet

carbondata git commit: [CARBONDATA-2901] Fixed JVM crash in Load scenario when unsafe memory allocation is failed.

2018-09-03 Thread jackylk
Repository: carbondata Updated Branches: refs/heads/master d980d4cd8 -> 257672790 [CARBONDATA-2901] Fixed JVM crash in Load scenario when unsafe memory allocation is failed. When allocation fails, set the row page reference to null so that the next thread will not do any operation. This closes

carbondata git commit: [CARBONDATA-2884] Rename the methods of ByteUtil class to avoid the misuse

2018-08-30 Thread jackylk
Repository: carbondata Updated Branches: refs/heads/master 612552e61 -> f012f5b13 [CARBONDATA-2884] Rename the methods of ByteUtil class to avoid the misuse The method toBytes executes an XOR operation on the data, so the result is not the byte array of the real value. Better to rename the

carbondata git commit: [CARBONDATA-2835] [MVDataMap] Block MV datamap on streaming table

2018-08-30 Thread jackylk
Repository: carbondata Updated Branches: refs/heads/master 7fc0c6b9f -> 612552e61 [CARBONDATA-2835] [MVDataMap] Block MV datamap on streaming table This PR blocks creating an MV datamap on a streaming table and also blocks setting the streaming property for a table which has an MV datamap. This closes

carbondata git commit: [CARBONDATA-2856][BloomDataMap] Fix bug in bloom index on multiple dictionary columns

2018-08-30 Thread jackylk
Repository: carbondata Updated Branches: refs/heads/master de0f54516 -> 7fc0c6b9f [CARBONDATA-2856][BloomDataMap] Fix bug in bloom index on multiple dictionary columns While directly generating the bloom index for dictionary and date columns, we use the MdkKeyGenerator to get the surrogate key.

[carbondata] Git Push Summary

2018-08-29 Thread jackylk
Repository: carbondata Updated Branches: refs/heads/carbonstore [deleted] edb8aa7d7

carbondata git commit: [CARBONDATA-2819] Fixed cannot drop preagg datamap on table which has other type datamaps

2018-08-29 Thread jackylk
Repository: carbondata Updated Branches: refs/heads/master 4df5f2f1b -> d801548aa [CARBONDATA-2819] Fixed cannot drop preagg datamap on table which has other type datamaps As we know, carbon now writes preagg datamap info to the main table schema and writes info for other datamap types like bloom or

carbondata git commit: [CARBONDATA-2862][DataMap] Fix exception message for datamap rebuild command

2018-08-29 Thread jackylk
Repository: carbondata Updated Branches: refs/heads/master 2a9604cd8 -> 4df5f2f1b [CARBONDATA-2862][DataMap] Fix exception message for datamap rebuild command Since the datamap rebuild command supports execution without specifying a table name, and it will scan all datamaps instead, the error message has

carbondata git commit: [CARBONDATA-2826] support select using distributed carbon store

2018-08-16 Thread jackylk
Repository: carbondata Updated Branches: refs/heads/carbonstore 9f10122af -> 94d0b54ac [CARBONDATA-2826] support select using distributed carbon store Provides select support with column pruning and filter pushdown using a new RDD for distributed carbon store This closes #2631

[carbondata] Git Push Summary

2018-08-16 Thread jackylk
Repository: carbondata Updated Branches: refs/heads/store [deleted] 94d0b54ac

carbondata git commit: [CARBONDATA-2826] support select using distributed carbon store

2018-08-16 Thread jackylk
Repository: carbondata Updated Branches: refs/heads/store [created] 94d0b54ac [CARBONDATA-2826] support select using distributed carbon store Provides select support with column pruning and filter pushdown using a new RDD for distributed carbon store This closes #2631 Project:

[45/50] [abbrv] carbondata git commit: [CARBONDATA-2823] Support streaming property with datamap

2018-08-07 Thread jackylk
[CARBONDATA-2823] Support streaming property with datamap Since during query carbondata gets splits from the streaming segment and columnar segments respectively, we can support streaming with index datamap. For preaggregate datamap, it already supported streaming table, so here we will remove the

[39/50] [abbrv] carbondata git commit: [Documentation] Editorial review comment fixed

2018-08-07 Thread jackylk
[Documentation] Editorial review comment fixed Minor issues fixed (spelling, syntax, and missing info) This closes #2603 Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/12725b75 Tree:

[43/50] [abbrv] carbondata git commit: [CARBONDATA-2829][CARBONDATA-2832] Fix creating merge index on older V1 V2 store

2018-08-07 Thread jackylk
[CARBONDATA-2829][CARBONDATA-2832] Fix creating merge index on older V1 V2 store Block merge index creation for the old store V1 V2 versions This closes #2608 Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo Commit:

[46/50] [abbrv] carbondata git commit: [CARBONDATA-2836]Fixed data loading performance issue

2018-08-07 Thread jackylk
[CARBONDATA-2836] Fixed data loading performance issue Problem: Data loading takes more time when the number of records is high (3.5 billion). Root cause: In the final merge, sort temp row conversion is done in the main thread; because of this, the final step became slower.

[50/50] [abbrv] carbondata git commit: [CARBONDATA-2768][CarbonStore] Fix error in tests for external csv format

2018-08-07 Thread jackylk
[CARBONDATA-2768][CarbonStore] Fix error in tests for external csv format In the previous implementation, earlier than PR2495, we only supported csv as an external format for carbondata, and we validated the restriction while creating the table. PR2495 added Kafka support, so it removed the validation,

[35/50] [abbrv] carbondata git commit: [CARBONDATA-2812] Implement freeMemory for complex pages

2018-08-07 Thread jackylk
[CARBONDATA-2812] Implement freeMemory for complex pages Problem: The memory used by the ColumnPageWrapper (for complex data types) is not cleared, so it requires more memory to load and query. Solution: Clear the used memory in the freeMemory method. This closes #2599 Project:

[20/50] [abbrv] carbondata git commit: [CARBONDATA-2625] While BlockletDataMap loading, avoid multiple times listing of files

2018-08-07 Thread jackylk
[CARBONDATA-2625] While BlockletDataMap loading, avoid multiple times listing of files CarbonReader is very slow for many files, as BlockletDataMap lists the files of the folder while loading each segment. This optimization lists once across segment loads. This closes #2441 Project:

[47/50] [abbrv] carbondata git commit: [CARBONDATA-2585] Fix local dictionary for both table level and system level property based on priority

2018-08-07 Thread jackylk
[CARBONDATA-2585] Fix local dictionary for both table level and system level property based on priority Added a system-level property for local dictionary support. The property 'carbon.local.dictionary.enable' can be set to true/false to enable/disable local dictionary at system level. If table
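
A sketch of flipping the system-level switch described above; the property key is quoted from the commit message, and CarbonProperties is the usual carrier for such keys:

    import org.apache.carbondata.core.util.CarbonProperties

    // System-level default; a table-level LOCAL_DICTIONARY_ENABLE property
    // takes priority over this when both are set.
    CarbonProperties.getInstance()
      .addProperty("carbon.local.dictionary.enable", "true")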

[22/50] [abbrv] carbondata git commit: Problem: Insert into select is failing as both are running as a single task, sharing the same TaskContext, and resources are cleared once if any one of the

2018-08-07 Thread jackylk
Problem: Insert into select is failing because both run as a single task and share the same TaskContext; resources are cleared once either RDD (the select query's ScanRDD) completes, so the other running RDD (LoadRDD) crashes as it tries to access the cleared memory.

[38/50] [abbrv] carbondata git commit: [CARBONDATA-2804] fix the bug when bloom filter or preaggregate datamap tried to be created on older V1-V2 version stores

2018-08-07 Thread jackylk
[CARBONDATA-2804] fix the bug when bloom filter or preaggregate datamap tried to be created on older V1-V2 version stores This PR changes reading the carbon file version from the carbondata file header to the carbonindex file header, because the version field of the carbondata file header is not compatible with

[36/50] [abbrv] carbondata git commit: [CARBONDATA-2813] Fixed code to get data size from LoadDetails if size is written there

2018-08-07 Thread jackylk
[CARBONDATA-2813] Fixed code to get data size from LoadDetails if size is written there Problem: In 1.3.x, when index files are merged to form a mergeindex file, a mapping of which index files are merged into which mergeindex is kept in the segments file. In 1.4.x both the index and merge index files

[17/50] [abbrv] carbondata git commit: [CARBONDATA-2798] Fix Dictionary_Include for ComplexDataType

2018-08-07 Thread jackylk
[CARBONDATA-2798] Fix Dictionary_Include for ComplexDataType Problem1: Select filter is throwing a BufferUnderflow exception as cardinality is filled for non-dictionary columns. Solution: Check if a complex column has Encoding => Dictionary and fill cardinality for that column only. Problem2:

[40/50] [abbrv] carbondata git commit: [CARBONDATA-2815][Doc] Add documentation for spilling memory and datamap rebuild

2018-08-07 Thread jackylk
[CARBONDATA-2815][Doc] Add documentation for spilling memory and datamap rebuild Add documentation for: 1. spilling unsafe memory for data loading, 2. datamap rebuild for index datamap This closes #2604 Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo Commit:

[16/50] [abbrv] carbondata git commit: [HotFix][CARBONDATA-2788][BloomDataMap] Fix bugs in incorrect query result with bloom datamap

2018-08-07 Thread jackylk
[HotFix][CARBONDATA-2788][BloomDataMap] Fix bugs in incorrect query result with bloom datamap This PR solves two problems which affect the correctness of queries on bloom. Revert PR2539: after reviewing the code, we found that the modification in PR2539 is not needed, so we revert that PR.

[28/50] [abbrv] carbondata git commit: [CARBONDATA-2753][Compatibility] Merge Index file not getting created with blocklet information for old store

2018-08-07 Thread jackylk
[CARBONDATA-2753][Compatibility] Merge Index file not getting created with blocklet information for old store Problem: Merge index file not getting created with blocklet information for old store. Analysis: In legacy store (store <= 1.1 version), blocklet information is not written in the carbon

[15/50] [abbrv] carbondata git commit: [CARBONDATA-2585]disable local dictionary by default

2018-08-07 Thread jackylk
[CARBONDATA-2585] Disable local dictionary by default; make local dictionary false by default This closes #2570 Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/34ca0214 Tree:

[09/50] [abbrv] carbondata git commit: [HOTFIX] Removed file existence check to improve dataMap loading performance

2018-08-07 Thread jackylk
[HOTFIX] Removed file existence check to improve dataMap loading performance Problem: DataMap loading performance degraded after adding a file existence check. Analysis: When the carbonindex file is read and the map from carbondata file path to its metadata info is prepared, physical file existence is getting

[24/50] [abbrv] carbondata git commit: [CARBONDATA-2800][Doc] Add useful tips about bloomfilter datamap

2018-08-07 Thread jackylk
[CARBONDATA-2800][Doc] Add useful tips about bloomfilter datamap add useful tips about bloomfilter datamap This closes #2581 Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/a302cd1c Tree:

[25/50] [abbrv] carbondata git commit: [CARBONDATA-2806] Delete delete delta files upon clean files for flat folder

2018-08-07 Thread jackylk
[CARBONDATA-2806] Delete delete delta files upon clean files for flat folder Problem: Delete delta files are not removed after clean files operation. Solution: Get the delta files using Segment Status Manager and remove them during clean operation. This closes #2587 Project:

[13/50] [abbrv] carbondata git commit: [CARBONDATA-2801]Added documentation for flat folder

2018-08-07 Thread jackylk
[CARBONDATA-2801] Added documentation for flat folder This closes #2582 Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/790cde87 Tree:

[32/50] [abbrv] carbondata git commit: [CARBONDATA-2799][BloomDataMap] Fix bugs in querying with bloom datamap on preagg with dictionary column

2018-08-07 Thread jackylk
[CARBONDATA-2799][BloomDataMap] Fix bugs in querying with bloom datamap on preagg with dictionary column For preaggregate table, if the groupby column is dictionary column in parent table, the preaggregate table will inherit the dictionary encoding as well as the dictionary file from the parent

[33/50] [abbrv] carbondata git commit: [CARBONDATA-2803]fix wrong datasize calculation and Refactoring for better readability and handle local dictionary for older tables

2018-08-07 Thread jackylk
[CARBONDATA-2803] Fix wrong datasize calculation and refactoring for better readability and handle local dictionary for older tables Changes in this PR: 1. Data size was calculated wrongly; the indexmap contains duplicate paths as it stores all blocklets, so remove duplicates and maintain unique block

[37/50] [abbrv] carbondata git commit: [CARBONDATA-2802][BloomDataMap] Remove clearing cache after rebuilding index datamap

2018-08-07 Thread jackylk
[CARBONDATA-2802][BloomDataMap] Remove clearing cache after rebuilding index datamap There is no need to clear the cache after rebuilding an index datamap, for the following reasons: 1. Currently it will clear the caches for all index datamaps, not only the one currently being rebuilt 2. The life

[12/50] [abbrv] carbondata git commit: [CARBONDATA-2789] Support Hadoop 2.8.3 eco-system integration

2018-08-07 Thread jackylk
[CARBONDATA-2789] Support Hadoop 2.8.3 eco-system integration Add a Hadoop 2.8.3 profile; the compile passes This closes #2566 Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/7b538906 Tree:

[26/50] [abbrv] carbondata git commit: [CARBONDATA-2796][32K]Fix data loading problem when table has complex column and long string column

2018-08-07 Thread jackylk
[CARBONDATA-2796][32K] Fix data loading problem when table has complex column and long string column Currently both the varchar column and the complex column believe they are the last member in the noDictionary group when converting a carbon row from raw format to 3-parted format. Since they need to be

[03/50] [abbrv] carbondata git commit: [CARBONDATA-2775] Adaptive encoding fails for unsafe on-heap if target datatype is SHORT_INT

2018-08-07 Thread jackylk
[CARBONDATA-2775] Adaptive encoding fails for unsafe on-heap if target datatype is SHORT_INT Problem: Adaptive encoding fails for unsafe on-heap if the target data type is SHORT_INT. Solution: If ENABLE_OFFHEAP_SORT = false in carbon properties, UnsafeFixLengthColumnPage.java
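
For context, the failing code path is only reached when offheap sort is disabled; a sketch of that configuration (property key as documented by CarbonData):

    import org.apache.carbondata.core.util.CarbonProperties

    // With offheap sort off, pages fall back to unsafe on-heap storage
    // (UnsafeFixLengthColumnPage), the path this commit fixes.
    CarbonProperties.getInstance()
      .addProperty("enable.offheap.sort", "false")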

[01/50] [abbrv] carbondata git commit: [CARBONDATA-2782]delete dead code in class 'CarbonCleanFilesCommand' [Forced Update!]

2018-08-07 Thread jackylk
Repository: carbondata Updated Branches: refs/heads/external-format ccf64ce5a -> 12ab57992 (forced update) [CARBONDATA-2782] Delete dead code in class 'CarbonCleanFilesCommand' The variables (dms, indexDms) in function processMetadata are never used. This closes #2557 Project:

[19/50] [abbrv] carbondata git commit: [CARBONDATA-2790][BloomDataMap]Optimize default parameter for bloomfilter datamap

2018-08-07 Thread jackylk
[CARBONDATA-2790][BloomDataMap]Optimize default parameter for bloomfilter datamap To provide better query performance for bloomfilter datamap by default, we optimize bloom_size from 32000 to 64 and optimize bloom_fpp from 0.01 to 0.1. This closes #2567 Project:
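
For reference, a sketch of creating a bloomfilter datamap with these knobs set explicitly; the DMPROPERTIES keys follow CarbonData's bloom datamap documentation, and the values below are illustrative, not the new defaults:

    // Assumes an existing SparkSession named spark and a table main_table.
    spark.sql(
      """
        |CREATE DATAMAP bloom_on_id ON TABLE main_table
        |USING 'bloomfilter'
        |DMPROPERTIES(
        |  'INDEX_COLUMNS'='id',
        |  'BLOOM_SIZE'='640000',
        |  'BLOOM_FPP'='0.00001')
      """.stripMargin)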

[10/50] [abbrv] carbondata git commit: [CARBONDATA-2749][dataload] In HDFS an empty tablestatus file is written during dataload, IUD or compaction when disk is full.

2018-08-07 Thread jackylk
[CARBONDATA-2749][dataload] In HDFS an empty tablestatus file is written during dataload, IUD or compaction when disk is full. Problem: When a failure happens due to a full disk during load, IUD or compaction, then while updating the tablestatus file, the tablestatus.tmp file during atomic file

[27/50] [abbrv] carbondata git commit: [CARBONDATA-2478] Added datamap-developer-guide.md file to Readme.md

2018-08-07 Thread jackylk
[CARBONDATA-2478] Added datamap-developer-guide.md file to Readme.md This closes #2305 Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/77642cff

[05/50] [abbrv] carbondata git commit: [HOTFIX] CreateDataMapPost Event was skipped in case of preaggregate datamap

2018-08-07 Thread jackylk
[HOTFIX] CreateDataMapPost Event was skipped in case of preaggregate datamap This closes #2562 Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo Commit:

[08/50] [abbrv] carbondata git commit: [CARBONDATA-2794]Distinct count fails on ArrayOfStruct

2018-08-07 Thread jackylk
[CARBONDATA-2794] Distinct count fails on ArrayOfStruct This PR fixes the code generator error thrown when a select filter contains more than one distinct count of ArrayOfStruct with a group-by clause This closes #2573 Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo Commit:

[14/50] [abbrv] carbondata git commit: [CARBONDATA-2606][Complex DataType Enhancements] Fix null result if projection column has null primitive column and struct

2018-08-07 Thread jackylk
[CARBONDATA-2606][Complex DataType Enhancements] Fix null result if projection column has null primitive column and struct Problem: If the actual value of the primitive data type is null, then by PR#2489 we move all the null values to the end of the collected row without considering

[48/50] [abbrv] carbondata git commit: [CARBONDATA-2613] Support csv based carbon table

2018-08-07 Thread jackylk
http://git-wip-us.apache.org/repos/asf/carbondata/blob/1a26ac16/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonAddSegmentCommand.scala -- diff --git

[23/50] [abbrv] carbondata git commit: [CARBONDATA-2793][32k][Doc] Add 32k support in document

2018-08-07 Thread jackylk
[CARBONDATA-2793][32k][Doc] Add 32k support in document This closes #2572 Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/f9b02a5c Tree: http://git-wip-us.apache.org/repos/asf/carbondata/tree/f9b02a5c Diff:

[41/50] [abbrv] carbondata git commit: [CARBONDATA-2795] Add documentation for S3

2018-08-07 Thread jackylk
[CARBONDATA-2795] Add documentation for S3 This closes #2576 Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/e26a742c Tree: http://git-wip-us.apache.org/repos/asf/carbondata/tree/e26a742c Diff:

[11/50] [abbrv] carbondata git commit: Fixed Spelling

2018-08-07 Thread jackylk
Fixed Spelling This closes #2584 Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/1cf3f398 Tree: http://git-wip-us.apache.org/repos/asf/carbondata/tree/1cf3f398 Diff:

[34/50] [abbrv] carbondata git commit: [Documentation] [Unsafe Configuration] Added carbon.unsafe.driver.working.memory.in.mb parameter to differentiate between driver and executor unsafe memory

2018-08-07 Thread jackylk
[Documentation] [Unsafe Configuration] Added carbon.unsafe.driver.working.memory.in.mb parameter to differentiate between driver and executor unsafe memory Usually in
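
A sketch of setting the new driver-side knob next to the pre-existing executor-side one (the executor key is CarbonData's documented 'carbon.unsafe.working.memory.in.mb'; the sizes are placeholders):

    import org.apache.carbondata.core.util.CarbonProperties

    val props = CarbonProperties.getInstance()
    // Executor-side unsafe working memory (pre-existing property).
    props.addProperty("carbon.unsafe.working.memory.in.mb", "1024")
    // Driver-side unsafe working memory, introduced by this change.
    props.addProperty("carbon.unsafe.driver.working.memory.in.mb", "512")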

[07/50] [abbrv] carbondata git commit: [CARBONDATA-2791]Fix Encoding for Double if exceeds LONG.Max_value

2018-08-07 Thread jackylk
[CARBONDATA-2791]Fix Encoding for Double if exceeds LONG.Max_value If Factor(decimalcount) * absMaxValue exceeds LONG.MAX_VALUE, then go for direct compression. This closes #2569 Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo Commit:
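
The guard described above as a standalone sketch; the names are illustrative, not the actual CarbonData code:

    // Decide between adaptive integral encoding and direct compression
    // for a double column page (illustrative only).
    def useDirectCompression(decimalCount: Int, absMaxValue: Double): Boolean = {
      val factor = math.pow(10, decimalCount)
      // If scaling the max value overflows a long, adaptive encoding
      // cannot represent it, so fall back to direct compression.
      factor * absMaxValue > Long.MaxValue.toDouble
    }

    assert(useDirectCompression(10, 1e12))    // overflows: direct compression
    assert(!useDirectCompression(2, 1000.0))  // fits in a long: adaptive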

[42/50] [abbrv] carbondata git commit: [CARBONDATA-2750] Updated documentation on Local Dictionary Support

2018-08-07 Thread jackylk
[CARBONDATA-2750] Updated documentation on Local Dictionary Support Updated documentation on local dictionary support. Changed the default for local dictionary to false This closes #2590 Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo Commit:

[04/50] [abbrv] carbondata git commit: [HOTFIX] Fixed random test failure

2018-08-07 Thread jackylk
[HOTFIX] Fixed random test failure This closes #2553 Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/f5d3c17b Tree: http://git-wip-us.apache.org/repos/asf/carbondata/tree/f5d3c17b Diff:

[02/50] [abbrv] carbondata git commit: [CARBONDATA-2753][Compatibility] Row count of page is calculated wrong for old store(V2 store)

2018-08-07 Thread jackylk
[CARBONDATA-2753][Compatibility] Row count of page is calculated wrong for old store (V2 store). Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/8d3e8b82 Tree:

[49/50] [abbrv] carbondata git commit: [CARBONDATA-2613] Support csv based carbon table

2018-08-07 Thread jackylk
[CARBONDATA-2613] Support csv based carbon table 1. Create a csv based carbon table using CREATE TABLE fact_table (col1 bigint, col2 string, ..., col100 string) STORED BY 'CarbonData' TBLPROPERTIES( 'format'='csv', 'csv.delimiter'=',', 'csv.header'='col1,col2,col100') 2. Load data to this
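
Spelled out from the snippet above, the DDL reads roughly as follows when issued through Spark SQL (column list shortened for brevity):

    // Assumes an existing SparkSession named spark.
    spark.sql(
      """
        |CREATE TABLE fact_table (col1 BIGINT, col2 STRING, col100 STRING)
        |STORED BY 'CarbonData'
        |TBLPROPERTIES(
        |  'format'='csv',
        |  'csv.delimiter'=',',
        |  'csv.header'='col1,col2,col100')
      """.stripMargin)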

[21/50] [abbrv] carbondata git commit: [CARBONDATA-2781] Added fix for Null Pointer Exception when create datamap is killed from UI

2018-08-07 Thread jackylk
[CARBONDATA-2781] Added fix for Null Pointer Exception when create datamap is killed from UI What was the issue? In undo meta, the datamap was not being dropped. In case of a pre-aggregate table or timeseries table, the datamap was not being dropped from the schema as the undo meta method was not handling the

[44/50] [abbrv] carbondata git commit: [CARBONDATA-2809][DataMap] Block rebuilding for bloom/lucene and preagg datamap

2018-08-07 Thread jackylk
[CARBONDATA-2809][DataMap] Block rebuilding for bloom/lucene and preagg datamap As manual refresh currently only works fine for MV and has some bugs with other types of datamap such as preaggregate, timeseries, lucene and bloomfilter, we will block 'deferred rebuild' for them as well as block

[06/50] [abbrv] carbondata git commit: [CARBONDATA-2784][CARBONDATA-2786][SDK writer] Fixed:Forever blocking wait with more than 21 batch of data

2018-08-07 Thread jackylk
[CARBONDATA-2784][CARBONDATA-2786][SDK writer] Fixed: Forever blocking wait with more than 21 batches of data Problem: [CARBONDATA-2784] [SDK writer] Forever blocking wait with more than 21 batches of data, when the consumer is dead due to a data loading exception (bad record / out of memory) Root cause:

[30/50] [abbrv] carbondata git commit: [HOTFIX][PR 2575] Fixed modular plan creation only if valid datamaps are available

2018-08-07 Thread jackylk
[HOTFIX][PR 2575] Fixed modular plan creation only if valid datamaps are available Update query is failing in a spark-2.2 cluster if MV jars are available, because catalogs are not empty if datamaps are created for other tables also and isValidPlan() inside MVAnalyzerRule returns true. This

carbondata git commit: [CARBONDATA-2539]Fix mv classcast exception issue

2018-08-07 Thread jackylk
Repository: carbondata Updated Branches: refs/heads/master 78438451b -> 3d7fa1276 [CARBONDATA-2539] Fix mv classcast exception issue A class cast exception happens during the min type aggregate function. It is corrected in this PR This closes #2602 Project:

carbondata git commit: [CARBONDATA-2585] Fix local dictionary for both table level and system level property based on priority

2018-08-07 Thread jackylk
Repository: carbondata Updated Branches: refs/heads/master f27efb3e3 -> 78438451b [CARBONDATA-2585] Fix local dictionary for both table level and system level property based on priority Added a system-level property for local dictionary support. The property 'carbon.local.dictionary.enable' can

carbondata git commit: [CARBONDATA-2823] Support streaming property with datamap

2018-08-07 Thread jackylk
Repository: carbondata Updated Branches: refs/heads/master abcd4f6e2 -> b9e510640 [CARBONDATA-2823] Support streaming property with datamap Since during query carbondata gets splits from the streaming segment and columnar segments respectively, we can support streaming with index datamap. For

carbondata git commit: [CARBONDATA-2809][DataMap] Block rebuilding for bloom/lucene and preagg datamap

2018-08-07 Thread jackylk
Repository: carbondata Updated Branches: refs/heads/master b702a1b01 -> abcd4f6e2 [CARBONDATA-2809][DataMap] Block rebuilding for bloom/lucene and preagg datamap As manual refresh currently only works fine for MV and has some bugs with other types of datamap such as preaggregate, timeseries,

carbondata git commit: [CARBONDATA-2802][BloomDataMap] Remove clearing cache after rebuilding index datamap

2018-08-02 Thread jackylk
Repository: carbondata Updated Branches: refs/heads/master 38384cb9f -> 26d9f3d8e [CARBONDATA-2802][BloomDataMap] Remove clearing cache after rebuilding index datamap There is no need to clear the cache after rebuilding an index datamap, for the following reasons: 1. Currently it will clear all the

carbondata git commit: [CARBONDATA-2806] Delete delete delta files upon clean files for flat folder

2018-08-01 Thread jackylk
Repository: carbondata Updated Branches: refs/heads/master a302cd1ce -> af984101e [CARBONDATA-2806] Delete delete delta files upon clean files for flat folder Problem: Delete delta files are not removed after clean files operation. Solution: Get the delta files using Segment Status Manager

carbondata git commit: [CARBONDATA-2800][Doc] Add useful tips about bloomfilter datamap

2018-08-01 Thread jackylk
Repository: carbondata Updated Branches: refs/heads/master f9b02a5c1 -> a302cd1ce [CARBONDATA-2800][Doc] Add useful tips about bloomfilter datamap add useful tips about bloomfilter datamap This closes #2581 Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo Commit:

carbondata git commit: [CARBONDATA-2793][32k][Doc] Add 32k support in document

2018-08-01 Thread jackylk
Repository: carbondata Updated Branches: refs/heads/master de9246066 -> f9b02a5c1 [CARBONDATA-2793][32k][Doc] Add 32k support in document This closes #2572 Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo Commit:

carbondata git commit: [CARBONDATA-2790][BloomDataMap]Optimize default parameter for bloomfilter datamap

2018-08-01 Thread jackylk
Repository: carbondata Updated Branches: refs/heads/master c29aef880 -> 6351c3a07 [CARBONDATA-2790][BloomDataMap]Optimize default parameter for bloomfilter datamap To provide better query performance for bloomfilter datamap by default, we optimize bloom_size from 32000 to 64 and optimize

carbondata git commit: [CARBONDATA-2776][CarbonStore] Support ingesting data from Kafka service

2018-07-31 Thread jackylk
Repository: carbondata Updated Branches: refs/heads/carbonstore 2d4628868 -> a6027ae11 [CARBONDATA-2776][CarbonStore] Support ingesting data from Kafka service This closes #2544 Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo Commit:

carbondata git commit: [CARBONDATA-2782]delete dead code in class 'CarbonCleanFilesCommand'

2018-07-26 Thread jackylk
Repository: carbondata Updated Branches: refs/heads/master d62fe9e65 -> c79fc90d5 [CARBONDATA-2782] Delete dead code in class 'CarbonCleanFilesCommand' The variables (dms, indexDms) in function processMetadata are never used. This closes #2557 Project:

carbondata git commit: [CARBONDATA-2767][CarbonStore] Fix task locality issue

2018-07-25 Thread jackylk
Repository: carbondata Updated Branches: refs/heads/carbonstore 7ad2fd951 -> 2d4628868 [CARBONDATA-2767][CarbonStore] Fix task locality issue If the Spark cluster and the Hadoop cluster are two different machine clusters, the Spark tasks will run in RACK_LOCAL mode. This closes #2528

[1/2] carbondata git commit: [CARBONDATA-2613] Support csv based carbon table

2018-07-25 Thread jackylk
Repository: carbondata Updated Branches: refs/heads/external-format a37a2ff7f -> 0d2769f75 http://git-wip-us.apache.org/repos/asf/carbondata/blob/0d2769f7/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonAddSegmentCommand.scala

[2/2] carbondata git commit: [CARBONDATA-2613] Support csv based carbon table

2018-07-25 Thread jackylk
[CARBONDATA-2613] Support csv based carbon table 1. Create a csv based carbon table using CREATE TABLE fact_table (col1 bigint, col2 string, ..., col100 string) STORED BY 'CarbonData' TBLPROPERTIES( 'format'='csv', 'csv.delimiter'=',', 'csv.header'='col1,col2,col100') 2. Load data to this

[carbondata] Git Push Summary

2018-07-25 Thread jackylk
Repository: carbondata Updated Branches: refs/heads/external-format [created] a37a2ff7f

carbondata git commit: [CARBONDATA-2539][MV] Fix predicate subquery which uses leftsemi join

2018-07-25 Thread jackylk
Repository: carbondata Updated Branches: refs/heads/master 68e5b52c4 -> 9a75ce53b [CARBONDATA-2539][MV] Fix predicate subquery which uses leftsemi join Problem: References to the top plan are not resolved correctly when a predicate subquery is present. Solution: Correct the references. This closes

carbondata git commit: [HOTFIX] Fix a spelling mistake after PR2511 merged

2018-07-25 Thread jackylk
Repository: carbondata Updated Branches: refs/heads/master b0aee53f5 -> 68e5b52c4 [HOTFIX] Fix a spelling mistake after PR2511 merged Spelling mistake: AtomicFileOperationsF; modify to: AtomicFileOperationFactory.getAtomicFileOperations This closes #2551 Project:

carbondata git commit: [CARBONDATA-2531][MV] Fix alias not working on MV query

2018-07-24 Thread jackylk
Repository: carbondata Updated Branches: refs/heads/master bea277f83 -> a75b9db6a [CARBONDATA-2531][MV] Fix alias not working on MV query Problem: When an alias is present in the actual query, the MV match does not happen because the alias is not ignored. Solution: Do a semantic check while matching This

carbondata git commit: [CARBONDATA-2540][CARBONDATA-2560][CARBONDATA-2568][MV] Add validations for unsupported MV queries

2018-07-24 Thread jackylk
Repository: carbondata Updated Branches: refs/heads/master 6f1767b5a -> 498502d2b [CARBONDATA-2540][CARBONDATA-2560][CARBONDATA-2568][MV] Add validations for unsupported MV queries Problem: Validations are missing on the unsupported MV queries while creating MV datamap. Solution: Added

carbondata git commit: [CARBONDATA-2694][32k] Show longstring table property in descformatted

2018-07-24 Thread jackylk
Repository: carbondata Updated Branches: refs/heads/master e5c1568de -> 42a80564c [CARBONDATA-2694][32k] Show longstring table property in descformatted Add the longstring table property to the output of the desc formatted command This closes #2456 Project:

carbondata git commit: [CARBONDATA-2769] Fix bug when getting shard name from data before version 1.4

2018-07-24 Thread jackylk
Repository: carbondata Updated Branches: refs/heads/master 1345dc6a3 -> fb2f9d33b [CARBONDATA-2769] Fix bug when getting shard name from data before version 1.4 Datamap creation needs the shard name. Carbon has included the segment id in the carbondata file since version 1.4. We should return the proper

carbondata git commit: [CARBONDATA-2512][32k] Support writing longstring through SDK

2018-07-23 Thread jackylk
Repository: carbondata Updated Branches: refs/heads/master ce53b48a1 -> 1345dc6a3 [CARBONDATA-2512][32k] Support writing longstring through SDK Support writing longstring through SDK. Users can specify the datatype as 'varchar' for longstring columns. Please note that the 'varchar' column
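
A sketch against the CarbonData SDK of that era; builder method names changed across releases, so treat the exact calls as approximate, and the output path as hypothetical:

    import org.apache.carbondata.core.metadata.datatype.DataTypes
    import org.apache.carbondata.sdk.file.{CarbonWriter, Field, Schema}

    val fields = Array(
      new Field("id", DataTypes.INT),
      new Field("notes", DataTypes.VARCHAR)) // longstring column via 'varchar'

    val writer = CarbonWriter.builder()
      .outputPath("/tmp/carbon_varchar_demo") // hypothetical location
      .withCsvInput(new Schema(fields))
      .build()

    writer.write(Array("1", "a string that may exceed 32000 characters ..."))
    writer.close()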

carbondata git commit: [CARBONDATA-2770][BloomDataMap] Optimize code to get blocklet id when rebuilding datamap

2018-07-23 Thread jackylk
Repository: carbondata Updated Branches: refs/heads/master 83562ae75 -> ce53b48a1 [CARBONDATA-2770][BloomDataMap] Optimize code to get blocklet id when rebuilding datamap We should get the exact blocklet id from the blocklet scanned result instead of building it ourselves. This closes

carbondata git commit: [CARBONDATA-2550][CARBONDATA-2576][MV] Fix limit and average function issue in MV query

2018-07-23 Thread jackylk
Repository: carbondata Updated Branches: refs/heads/master d820e3d51 -> 83562ae75 [CARBONDATA-2550][CARBONDATA-2576][MV] Fix limit and average function issue in MV query Problem: Limit is not working on mv queries and the average is also not working. Solution: Correct the limit queries and

carbondata git commit: [CARBONDATA-2542][MV] Fix the mv query from table with different database

2018-07-23 Thread jackylk
Repository: carbondata Updated Branches: refs/heads/master 2c291d620 -> d820e3d51 [CARBONDATA-2542][MV] Fix the mv query from table with different database Problem: database name is not added to the table name while generating mv query. Solution: Add the database name to the table name while

carbondata git commit: [CARBONDATA-2534][MV] Fix substring expression not working in MV creation

2018-07-23 Thread jackylk
Repository: carbondata Updated Branches: refs/heads/master 4014b0f54 -> 2c291d620 [CARBONDATA-2534][MV] Fix substring expression not working in MV creation Problem: The column generated when a subquery expression column is present is wrong while creating the MV table. Solution: Corrected the

carbondata git commit: [CARBONDATA-2530][MV] Disable the MV datamaps after main table load

2018-07-23 Thread jackylk
Repository: carbondata Updated Branches: refs/heads/master 7ab670652 -> 0ab03f21f [CARBONDATA-2530][MV] Disable the MV datamaps after main table load Problem: MV datamaps are not disabled after the main table load is done, so wrong data is displayed. Solution: Disable the MV datamaps

carbondata git commit: [CARBONDATA-2736][CARBONSTORE] Kafka integration with Carbon StreamSQL

2018-07-18 Thread jackylk
Repository: carbondata Updated Branches: refs/heads/carbonstore 239a6cadb -> 9ac55a5a6 [CARBONDATA-2736][CARBONSTORE] Kafka integration with Carbon StreamSQL Modifications in this PR: 1. Pass source table properties to streamReader.load() 2. Do not pass schema when sparkSession.readStream
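
Point 2 matches how Spark Structured Streaming treats Kafka sources: they carry a fixed schema and reject a user-supplied one. A minimal sketch (broker and topic names hypothetical):

    // Assumes an existing SparkSession named spark.
    val stream = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "localhost:9092")
      .option("subscribe", "carbon_ingest") // hypothetical topic
      .load() // no .schema(...) call: the Kafka source defines its own

    val rows = stream.selectExpr("CAST(value AS STRING)")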

[20/50] [abbrv] carbondata git commit: [CARBONDATA-2722] [CARBONDATA-2721] JsonWriter issue fixes

2018-07-17 Thread jackylk
[CARBONDATA-2722] [CARBONDATA-2721] JsonWriter issue fixes [CARBONDATA-2722][SDK] [JsonWriter] NPE when schema and data are not of the same length or data is null. Problem: Null data is not handled in the JSON object to carbon row conversion. Solution: Add a null check when the object is fetched from

[35/50] [abbrv] carbondata git commit: [CARBONDATA-2609] Change RPC implementation to Hadoop RPC framework

2018-07-17 Thread jackylk
http://git-wip-us.apache.org/repos/asf/carbondata/blob/d9b40bf9/store/search/src/main/scala/org/apache/spark/rpc/Scheduler.scala -- diff --git a/store/search/src/main/scala/org/apache/spark/rpc/Scheduler.scala
