[GitHub] carbondata issue #2612: [CARBONDATA-2834] Remove unnecessary nested looping ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2612 Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6528/ ---
[GitHub] carbondata issue #2612: [CARBONDATA-2834] Remove unnecessary nested looping ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2612 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7804/ ---
[GitHub] carbondata issue #2594: [CARBONDATA-2809][DataMap] Block rebuilding for bloo...
Github user brijoobopanna commented on the issue: https://github.com/apache/carbondata/pull/2594 retest this please ---
[GitHub] carbondata issue #2608: [CARBONDATA-2829][CARBONDATA-2832] Fix creating merg...
Github user manishgupta88 commented on the issue: https://github.com/apache/carbondata/pull/2608 LGTM ---
[GitHub] carbondata issue #2605: [CARBONDATA-2585] Fix local dictionary for both tabl...
Github user akashrn5 commented on the issue: https://github.com/apache/carbondata/pull/2605 retest this please ---
[GitHub] carbondata pull request #2586: [wip]Ui kill
Github user akashrn5 closed the pull request at: https://github.com/apache/carbondata/pull/2586 ---
[GitHub] carbondata issue #2612: [CARBONDATA-2834] Remove unnecessary nested looping ...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/2612 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/6183/ ---
[GitHub] carbondata pull request #2612: [CARBONDATA-2834] Remove unnecessary nested l...
GitHub user kunal642 opened a pull request: https://github.com/apache/carbondata/pull/2612 [CARBONDATA-2834] Remove unnecessary nested looping over loadMetadataDetails.

Removed nested for loop which causes query performance degradation if…

Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily:
- [ ] Any interfaces changed?
- [ ] Any backward compatibility impacted?
- [ ] Document update required?
- [ ] Testing done. Please provide details on:
  - Whether new unit test cases have been added or why no new tests are required?
  - How it is tested? Please attach test report.
  - Is it a performance related change? Please attach the performance test report.
  - Any additional information to help reviewers in testing this change.
- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/kunal642/carbondata nestedloop_fix

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/2612.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2612

commit ebe22d331dc4ea4ef6904e779702801c3eb5d859
Author: kunal642
Date: 2018-08-06T12:47:28Z

    removed nested for loop which causes query performance degradation if number of segments are too many ---
[jira] [Created] (CARBONDATA-2834) Refactor code to remove nested for loop to extract invalidTimestampRange.
Kunal Kapoor created CARBONDATA-2834: Summary: Refactor code to remove nested for loop to extract invalidTimestampRange. Key: CARBONDATA-2834 URL: https://issues.apache.org/jira/browse/CARBONDATA-2834 Project: CarbonData Issue Type: Bug Reporter: Kunal Kapoor Assignee: Kunal Kapoor Refactor the getInvalidTimestampRange method in SegmentUpdateStatusManager because it has an unnecessary nested loop to get the timestamp from invalid segments. This will cause query performance degradation. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
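The refactoring described in CARBONDATA-2834 is essentially an O(n²) to O(n) change: rather than scanning every load metadata detail for each invalid segment, the details can be indexed once by segment id and then looked up. A minimal stand-alone sketch of that idea (hypothetical names, not the actual SegmentUpdateStatusManager code):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class InvalidTimestampLookup {

  // Stand-in for a load metadata entry: a segment id plus its load timestamp.
  public static class LoadDetail {
    public final String segmentId;
    public final long timestamp;

    public LoadDetail(String segmentId, long timestamp) {
      this.segmentId = segmentId;
      this.timestamp = timestamp;
    }
  }

  // O(n) approach: build the index once, then each invalid segment is a
  // single map lookup instead of a full scan over all load details
  // (the nested-loop version degrades as the segment count grows).
  public static List<Long> invalidTimestamps(List<LoadDetail> details,
                                             List<String> invalidSegments) {
    Map<String, Long> byId = new HashMap<>();
    for (LoadDetail d : details) {
      byId.put(d.segmentId, d.timestamp);
    }
    List<Long> result = new ArrayList<>();
    for (String seg : invalidSegments) {
      Long ts = byId.get(seg);
      if (ts != null) {
        result.add(ts);
      }
    }
    return result;
  }
}
```

The total work becomes one pass to build the map plus one pass over the invalid segments, which is why the fix matters when "number of segments are too many".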
[GitHub] carbondata issue #2594: [CARBONDATA-2809][DataMap] Block rebuilding for bloo...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2594 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6527/ ---
[jira] [Closed] (CARBONDATA-2809) Manually rebuilding non-lazy datamap cause error
[ https://issues.apache.org/jira/browse/CARBONDATA-2809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] xuchuanyin closed CARBONDATA-2809. -- Resolution: Duplicate duplicated with CARBONDATA-2821 > Manually rebuilding non-lazy datamap cause error > > > Key: CARBONDATA-2809 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2809 > Project: CarbonData > Issue Type: Bug >Reporter: xuchuanyin >Assignee: xuchuanyin >Priority: Minor > Time Spent: 3h 50m > Remaining Estimate: 0h > > Steps to reproduce: > 1. create base table > 2. load data to base table > 3. create index datamap (such as bloomfilter datamap) on base table > 4. rebuild datamap This will give error > In step3, the data of datamap has already been generated, if we trigger > rebuild, the procedure does not clean the files properly, thus causing the > error. > Actually, the rebuild is not required. We can fix this issue by skipping the > rebuild procedure. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Reopened] (CARBONDATA-2820) Block rebuilding for preagg, bloom and lucene datamap
[ https://issues.apache.org/jira/browse/CARBONDATA-2820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] xuchuanyin reopened CARBONDATA-2820: > Block rebuilding for preagg, bloom and lucene datamap > - > > Key: CARBONDATA-2820 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2820 > Project: CarbonData > Issue Type: Improvement >Reporter: xuchuanyin >Assignee: xuchuanyin >Priority: Major > > currently we will block rebuilding these datamap -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Closed] (CARBONDATA-2820) Block rebuilding for preagg, bloom and lucene datamap
[ https://issues.apache.org/jira/browse/CARBONDATA-2820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] xuchuanyin closed CARBONDATA-2820. -- Resolution: Duplicate > Block rebuilding for preagg, bloom and lucene datamap > - > > Key: CARBONDATA-2820 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2820 > Project: CarbonData > Issue Type: Improvement >Reporter: xuchuanyin >Assignee: xuchuanyin >Priority: Major > > currently we will block rebuilding these datamap -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (CARBONDATA-2820) Block rebuilding for preagg, bloom and lucene datamap
[ https://issues.apache.org/jira/browse/CARBONDATA-2820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16571049#comment-16571049 ] xuchuanyin edited comment on CARBONDATA-2820 at 8/7/18 2:40 AM: duplicated with CARBONDATA-2821 was (Author: xuchuanyin): duplicated with CARBONDATA-2823 > Block rebuilding for preagg, bloom and lucene datamap > - > > Key: CARBONDATA-2820 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2820 > Project: CarbonData > Issue Type: Improvement >Reporter: xuchuanyin >Assignee: xuchuanyin >Priority: Major > > currently we will block rebuilding these datamap -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Closed] (CARBONDATA-2820) Block rebuilding for preagg, bloom and lucene datamap
[ https://issues.apache.org/jira/browse/CARBONDATA-2820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] xuchuanyin closed CARBONDATA-2820. -- Resolution: Duplicate duplicated with CARBONDATA-2823 > Block rebuilding for preagg, bloom and lucene datamap > - > > Key: CARBONDATA-2820 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2820 > Project: CarbonData > Issue Type: Improvement >Reporter: xuchuanyin >Assignee: xuchuanyin >Priority: Major > > currently we will block rebuilding these datamap -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] carbondata issue #2594: [CARBONDATA-2809][DataMap] Block rebuilding for bloo...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2594 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7803/ ---
[GitHub] carbondata issue #2611: [WIP]Fixed data loading performance issue
Github user xuchuanyin commented on the issue: https://github.com/apache/carbondata/pull/2611 @kumarvishal09 can you explain this modification? In previous implementation, we split a record to 'dict-sort', 'nodict-sort' and 'noSortDims & measures'. 'noSortDims & measures' is packed to bytes to avoid serialization-deserialization for them during reading/writing records to sort temp. In previous implementation, we can see about 8% enhancement in data loading. ---
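The packing trick mentioned in the comment above — serializing the no-sort dimensions and measures into a single byte[] so the sort-temp read/write path can copy them as one opaque blob instead of serializing each field — can be sketched roughly as follows. This is an illustrative sketch with hypothetical names, not the actual CarbonData loading code:

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class SortTempPacking {

  // Pack fields that do not participate in sorting into one byte[].
  // The sort-temp writer/reader then treats this as a single length-prefixed
  // blob, avoiding per-field serialization-deserialization for every record.
  public static byte[] packNoSortFields(String[] noSortDims, double[] measures)
      throws IOException {
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    DataOutputStream out = new DataOutputStream(bos);
    out.writeInt(noSortDims.length);
    for (String dim : noSortDims) {
      out.writeUTF(dim);
    }
    out.writeInt(measures.length);
    for (double m : measures) {
      out.writeDouble(m);
    }
    out.flush();
    return bos.toByteArray();
  }
}
```

Since only the sort columns need to be interpreted while writing and merging sort-temp files, everything else can stay opaque until the final write, which is the source of the loading-time gain the comment refers to.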
[jira] [Commented] (CARBONDATA-2833) NPE when we do a insert over a insert failure operation
[ https://issues.apache.org/jira/browse/CARBONDATA-2833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16571028#comment-16571028 ] xuchuanyin commented on CARBONDATA-2833: The steps in the issue description cannot reproduce the problem. I've also tried the following steps, but still cannot reproduce it:
```
test("test") {
  CarbonProperties.getInstance().addProperty("bad_records_logger_enable", "true")
  CarbonProperties.getInstance()
    .addProperty(CarbonCommonConstants.CARBON_BAD_RECORDS_ACTION, "FAIL")
  sql("CREATE DATABASE test1")
  sql("use test1")
  sql("DROP TABLE IF EXISTS ab")
  sql("CREATE TABLE ab (a integer, b string) stored by 'carbondata'")
  sql("CREATE DATAMAP dm ON TABLE ab using 'bloomfilter' DMPROPERTIES('index_columns'='a,b')")
  try {
    sql("insert into ab select 'berb', 'abc', 'ggg', '1'")
  } catch {
    case e: Exception => LOGGER.error(e)
  }
  LOGGER.error("XU second run")
  try {
    sql("insert into ab select 'berb', 'abc', 'ggg', '1'")
  } catch {
    case e: Exception => LOGGER.error(e)
  }
  sql("select * from ab").show(false)
  sql("DROP TABLE IF EXISTS ab")
  sql("DROP DATABASE IF EXISTS test1")
  sql("use default")
  CarbonProperties.getInstance().addProperty("bad_records_logger_enable",
    CarbonLoadOptionConstants.CARBON_OPTIONS_BAD_RECORDS_LOGGER_ENABLE_DEFAULT)
  CarbonProperties.getInstance()
    .addProperty(CarbonCommonConstants.CARBON_BAD_RECORDS_ACTION, "FAIL")
}
```
The load statement complains about the bad_record error; no NPE is reported.
> NPE when we do a insert over a insert failure operation
> -------------------------------------------------------
>
> Key: CARBONDATA-2833
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2833
> Project: CarbonData
> Issue Type: Bug
> Reporter: Brijoo Bopanna
> Priority: Major
>
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] carbondata issue #2594: [CARBONDATA-2809][DataMap] Block rebuilding for bloo...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/2594 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/6182/ ---
[GitHub] carbondata issue #2594: [CARBONDATA-2809][DataMap] Block rebuilding for bloo...
Github user xuchuanyin commented on the issue: https://github.com/apache/carbondata/pull/2594 @ravipesala Fixed. The root cause is that MV is actually 'deferred rebuild', but we didn't specify it while creating the datamap. For consistency, we will enable 'deferred rebuild' for the MV datamap regardless of whether the user has enabled the flag. ---
[GitHub] carbondata issue #2606: [CARBONDATA-2817]Thread Leak in Update and in No sor...
Github user kumarvishal09 commented on the issue: https://github.com/apache/carbondata/pull/2606 @BJangir Please handle thread leak scenario for BatchSortWriter in case of any exception. DataWriterBatchProcessorStepImpl.java ---
[GitHub] carbondata pull request #2606: [CARBONDATA-2817]Thread Leak in Update and in...
Github user kumarvishal09 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2606#discussion_r207961430
--- Diff: processing/src/main/java/org/apache/carbondata/processing/loading/steps/CarbonRowDataWriterProcessorStepImpl.java ---
```
@@ -169,24 +171,36 @@ private void doExecute(Iterator iterator, int iteratorIndex) thr
       if (rowsNotExist) {
         rowsNotExist = false;
         dataHandler = CarbonFactHandlerFactory.createCarbonFactHandler(model);
+        this.carbonFactHandlers.add(dataHandler);
         dataHandler.initialise();
       }
       processBatch(iterator.next(), dataHandler, iteratorIndex);
     }
-    if (!rowsNotExist) {
-      finish(dataHandler, iteratorIndex);
+    try {
+      if (!rowsNotExist) {
+        finish(dataHandler, iteratorIndex);
+      }
+    } finally {
+      carbonFactHandlers.remove(dataHandler);
     }
   }

   @Override protected String getStepName() { return "Data Writer"; }

   private void finish(CarbonFactHandler dataHandler, int iteratorIndex) {
+    CarbonDataWriterException exception = null;
```
--- End diff -- Please handle for closeHandler method as it can also throw exception ---
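The review point above — closeHandler can also throw — is the usual "capture the first failure, still run every cleanup step" pattern. A hedged sketch of that shape, using a hypothetical handler interface rather than the actual CarbonFactHandler API:

```java
public class SafeFinish {

  // Minimal stand-in for a handler whose finish() and closeHandler()
  // can both throw.
  public interface Handler {
    void finish() throws Exception;
    void closeHandler() throws Exception;
  }

  // Run finish, then always run closeHandler. Keep the first exception and
  // attach any later one as suppressed, so the close step always executes
  // (no thread/resource leak) and no failure is silently lost.
  public static void finishAndClose(Handler h) throws Exception {
    Exception first = null;
    try {
      h.finish();
    } catch (Exception e) {
      first = e;
    } finally {
      try {
        h.closeHandler();
      } catch (Exception e) {
        if (first == null) {
          first = e;
        } else {
          first.addSuppressed(e);
        }
      }
    }
    if (first != null) {
      throw first;
    }
  }
}
```

The same shape applies to the BatchSortWriter case mentioned earlier in the thread: cleanup must sit in a finally block so it runs even when the write step throws.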
[jira] [Updated] (CARBONDATA-2827) Refactor Segment Status Manager Interface
[ https://issues.apache.org/jira/browse/CARBONDATA-2827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Venkata Ramana G updated CARBONDATA-2827: - Attachment: Segment Status Management interface design_V1_Ramana_reviewed.docx > Refactor Segment Status Manager Interface > - > > Key: CARBONDATA-2827 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2827 > Project: CarbonData > Issue Type: Improvement >Reporter: Ravindra Pesala >Priority: Major > Attachments: Segment Status Management interface design_V1.docx, > Segment Status Management interface design_V1_Ramana_reviewed.docx > > > Carbon uses the tablestatus file to record segment status and details of each > segment during each load. This tablestatus enables carbon to support > concurrent loads and reads without data inconsistency or corruption. > So it is a very important feature of carbondata and we should have clean > interfaces to maintain it. Current tablestatus updating is scattered across > multiple places and there is no clean interface, so I am proposing to > refactor the current SegmentStatusManager interface and bring all tablestatus > operations into a single interface. > This new interface allows table status to be kept in other storage such as a DB. > This is needed for S3-type object stores as these are eventually consistent. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
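The proposal above — one interface owning every tablestatus operation, pluggable for backends other than the status file (e.g. a database for eventually consistent stores like S3) — might look roughly like this. Illustrative only; the real design lives in the attached documents:

```java
import java.io.IOException;
import java.util.List;

public class SegmentStatusInterfaces {

  // Illustrative per-segment record kept in tablestatus.
  public static class SegmentDetail {
    public String segmentId;
    public String status;      // e.g. SUCCESS, MARKED_FOR_DELETE, COMPACTED
    public long loadStartTime;
  }

  // Single interface for all tablestatus operations, so a DB-backed
  // implementation can replace the file-backed one without callers changing.
  public interface SegmentStatusStore {
    List<SegmentDetail> readAll(String tablePath) throws IOException;
    void write(String tablePath, List<SegmentDetail> details) throws IOException;
    void markForDelete(String tablePath, List<String> segmentIds) throws IOException;
  }
}
```

The benefit of consolidating is exactly what the issue describes: today the status file is updated from many call sites, and a single choke-point interface makes concurrent-safety and alternative storage feasible.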
[GitHub] carbondata issue #2609: [CARBONDATA-2823] Support streaming property with da...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/2609 LGTM ---
[GitHub] carbondata issue #2594: [CARBONDATA-2809][DataMap] Block rebuilding for bloo...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/2594 @xuchuanyin Please check MVTests, it is failing ---
[jira] [Created] (CARBONDATA-2833) NPE when we do a insert over a insert failure operation
Brijoo Bopanna created CARBONDATA-2833: -- Summary: NPE when we do a insert over a insert failure operation Key: CARBONDATA-2833 URL: https://issues.apache.org/jira/browse/CARBONDATA-2833 Project: CarbonData Issue Type: Bug Reporter: Brijoo Bopanna

0: jdbc:hive2://10.18.5.188:23040/default> CREATE TABLE IF NOT EXISTS test_table(
    id string, name string, city string, age Int) STORED BY 'carbondata';
+---------+--+
| Result  |
+---------+--+
+---------+--+
No rows selected (0.191 seconds)

0: jdbc:hive2://10.18.5.188:23040/default> desc test_table;
+-----------+------------+----------+--+
| col_name  | data_type  | comment  |
+-----------+------------+----------+--+
| id        | string     | NULL     |
| name      | string     | NULL     |
| city      | string     | NULL     |
| age       | int        | NULL     |
+-----------+------------+----------+--+
4 rows selected (0.081 seconds)

0: jdbc:hive2://10.18.5.188:23040/default> insert into ab select 'berb','abc','ggg','1';
Error: java.lang.Exception: Data load failed due to bad record: The value with column name a and column data type INT is not a valid INT type. Please enable bad record logger to know the detail reason. (state=,code=0)

0: jdbc:hive2://10.18.5.188:23040/default> insert into ab select 'berb','abc','ggg','1';
*Error: java.lang.NullPointerException (state=,code=0)*

0: jdbc:hive2://10.18.5.188:23040/default> insert into test_table select 'berb','abc','ggg',1;
+---------+--+
| Result  |
+---------+--+
+---------+--+
No rows selected (1.127 seconds)

0: jdbc:hive2://10.18.5.188:23040/default> show tables;
+-----------+-------------+--------------+--+
| database  | tableName   | isTemporary  |
+-----------+-------------+--------------+--+
| praveen   | a           | false        |
| praveen   | ab          | false        |
| praveen   | bbc         | false        |
| praveen   | test_table  | false        |
+-----------+-------------+--------------+--+
4 rows selected (0.041 seconds)

0: jdbc:hive2://10.18.5.188:23040/default> desc ab;
+-----------+------------+----------+--+
| col_name  | data_type  | comment  |
+-----------+------------+----------+--+
| a         | int        | NULL     |
| b         | string     | NULL     |
+-----------+------------+----------+--+
2 rows selected (0.074 seconds)
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] carbondata issue #2611: [WIP]Fixed data loading performance issue
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2611 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6526/ ---
[GitHub] carbondata issue #2611: [WIP]Fixed data loading performance issue
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2611 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7802/ ---
[GitHub] carbondata issue #2610: [CARBONDATA-2831] Added Support Merge index files re...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2610 Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6524/ ---
[GitHub] carbondata issue #2610: [CARBONDATA-2831] Added Support Merge index files re...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2610 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7800/ ---
[GitHub] carbondata issue #2611: [WIP]Fixed data loading performance issue
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/2611 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/6181/ ---
[GitHub] carbondata issue #2594: [CARBONDATA-2809][DataMap] Block rebuilding for bloo...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2594 Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6523/ ---
[GitHub] carbondata issue #2611: [WIP]Fixed data loading performance issue
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/2611 SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/6180/ ---
[GitHub] carbondata issue #2594: [CARBONDATA-2809][DataMap] Block rebuilding for bloo...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2594 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7799/ ---
[GitHub] carbondata issue #2611: [WIP]Fixed data loading performance issue
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2611 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7801/ ---
[GitHub] carbondata issue #2608: [CARBONDATA-2829][CARBONDATA-2832] Fix creating merg...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2608 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6522/ ---
[GitHub] carbondata pull request #2611: [WIP]Fixed data loading performance issue
GitHub user kumarvishal09 opened a pull request: https://github.com/apache/carbondata/pull/2611 [WIP]Fixed data loading performance issue

Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily:
- [ ] Any interfaces changed?
- [ ] Any backward compatibility impacted?
- [ ] Document update required?
- [ ] Testing done. Please provide details on:
  - Whether new unit test cases have been added or why no new tests are required?
  - How it is tested? Please attach test report.
  - Is it a performance related change? Please attach the performance test report.
  - Any additional information to help reviewers in testing this change.
- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/kumarvishal09/incubator-carbondata dataloadPerFix

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/2611.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2611

commit 5a2ebf3d056794387f2622818c9cf7be7ec4ec61
Author: kumarvishal09
Date: 2018-08-06T13:30:27Z

    Fixed data loading performance issue ---
[GitHub] carbondata issue #2608: [CARBONDATA-2829][CARBONDATA-2832] Fix creating merg...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2608 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7798/ ---
[GitHub] carbondata issue #2610: [CARBONDATA-2831] Added Support Merge index files re...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/2610 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/6179/ ---
[jira] [Commented] (CARBONDATA-2822) Carbon Configuration - "carbon.invisible.segments.preserve.count" configuration property is not working as expected.
[ https://issues.apache.org/jira/browse/CARBONDATA-2822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570184#comment-16570184 ] Indhumathi Muthumurugesh commented on CARBONDATA-2822: -- Hi Prasanna, the carbon configuration *"carbon.invisible.segments.preserve.count"* actually applies to the tablestatus file. When this property is set, if the number of invisible segment entries exceeds the given value, those entries are removed from the tablestatus file and written to the tablestatus.history file. Thanks & Regards, Indhumathi M
> Carbon Configuration - "carbon.invisible.segments.preserve.count" configuration property is not working as expected.
> Key: CARBONDATA-2822
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2822
> Project: CarbonData
> Issue Type: Bug
> Components: core, file-format
> Environment: 3 Node ANT cluster.
> Reporter: Prasanna Ravichandran
> Priority: Minor
>
> For the *carbon.invisible.segments.preserve.count* configuration, it is not working as expected.
> *Steps to reproduce:*
> 1) Set *"carbon.invisible.segments.preserve.count=20"* in carbon.properties and restart the thrift server.
> 2) Perform loading 40 times and compaction 4 times.
> 3) Perform clean files, so that the tablestatus.history file is generated with invisible segment details.
> In total 44 segments would be created, including visible and invisible segments: 40 load segments (segment IDs 0, 1, 2, ..., 39) plus 4 new compacted segments (0.1, 20.1, 22.1, 0.2).
> Of these, *41 segments' information is present in the "tablestatus.history" file* (which holds invisible, i.e. marked-for-delete and compacted, segment details) and 3 segments' information is present in the "tablestatus" file (which holds the visible segments: 0.2, the final compacted segment, along with the first segment, 0, and the last segment, 39).
> *But the invisible segment preserve count is configured to 20, which is not honoured for the tablestatus.history file.*
> *Expected result:* tablestatus.history file should preserve only the latest 20 segments, as per the configuration.
> *Actual result:* tablestatus.history file has 41 invisible segments' details (above the configured value of 20).
>
> This is tested with an ANT cluster. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
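Indhumathi's clarification above can be summarized as a carbon.properties entry; the comment below restates the behaviour described in the reply (the property caps invisible entries in tablestatus, not in the history file), which is an interpretation of this thread rather than official documentation:

```properties
# carbon.properties (illustrative): the preserve count applies to the
# tablestatus file, not to tablestatus.history. Once more than 20 invisible
# (compacted / marked-for-delete) segment entries accumulate in tablestatus,
# the excess entries are moved into tablestatus.history; the history file
# itself is not truncated by this setting.
carbon.invisible.segments.preserve.count=20
```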
[GitHub] carbondata issue #2594: [CARBONDATA-2809][DataMap] Block rebuilding for bloo...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/2594 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/6178/ ---
[GitHub] carbondata issue #2608: [CARBONDATA-2829][CARBONDATA-2832] Fix creating merg...
Github user dhatchayani commented on the issue: https://github.com/apache/carbondata/pull/2608 retest this please ---
[GitHub] carbondata issue #2608: [CARBONDATA-2829][CARBONDATA-2832] Fix creating merg...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2608 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7796/ ---
[GitHub] carbondata issue #2608: [CARBONDATA-2829][CARBONDATA-2832] Fix creating merg...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2608 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6520/ ---
[GitHub] carbondata issue #2594: [CARBONDATA-2809][DataMap] Block rebuilding for bloo...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/2594 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/6177/ ---
[GitHub] carbondata issue #2608: [CARBONDATA-2829][CARBONDATA-2832] Fix creating merg...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/2608 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/6176/ ---
[GitHub] carbondata issue #2568: [Presto-integration-Technical-note] created document...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2568 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6519/ ---
[jira] [Assigned] (CARBONDATA-2832) Block loading error for select query executed after merge index command executed on V1/V2 store table
[ https://issues.apache.org/jira/browse/CARBONDATA-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat reassigned CARBONDATA-2832: --- Assignee: dhatchayani > Block loading error for select query executed after merge index command > executed on V1/V2 store table > - > > Key: CARBONDATA-2832 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2832 > Project: CarbonData > Issue Type: Bug > Components: data-query >Affects Versions: 1.4.1 > Environment: Spark 2.1 >Reporter: Chetan Bhat >Assignee: dhatchayani >Priority: Minor > > Steps : > *Create and load data in V1/V2 carbon store:* > create table brinjal (imei string,AMSize string,channelsId > string,ActiveCountry string, Activecity string,gamePointId > double,deviceInformationId double,productionDate Timestamp,deliveryDate > timestamp,deliverycharge double) STORED BY 'org.apache.carbondata.format' > TBLPROPERTIES('table_blocksize'='1'); > LOAD DATA INPATH 'hdfs://hacluster/chetan/vardhandaterestruct.csv' INTO TABLE > brinjal OPTIONS('DELIMITER'=',', 'QUOTECHAR'= > '"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'= > 'imei,deviceInformationId,AMSize,channelsId,ActiveCountry,Activecity,gamePointId,productionDate,deliveryDate,deliverycharge'); > *In 1.4.1* > refresh table brinjal; > alter table brinjal compact 'segment_index'; > select * from brinjal where AMSize='8RAM size'; > > *Issue : Block loading error for select query executed after merge index > command executed on V1/V2 store table.* > 0: jdbc:hive2://10.18.98.101:22550/default> select * from brinjal where > AMSize='8RAM size'; > *Error: java.io.IOException: Problem in loading segment blocks. > (state=,code=0)* > *Expected :* select query executed after merge index command executed on > V1/V2 store table should return correct result set without error** -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] carbondata issue #2568: [Presto-integration-Technical-note] created document...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2568 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7795/ ---
[jira] [Created] (CARBONDATA-2832) Block loading error for select query executed after merge index command executed on V1/V2 store table
Chetan Bhat created CARBONDATA-2832: --- Summary: Block loading error for select query executed after merge index command executed on V1/V2 store table Key: CARBONDATA-2832 URL: https://issues.apache.org/jira/browse/CARBONDATA-2832 Project: CarbonData Issue Type: Bug Components: data-query Affects Versions: 1.4.1 Environment: Spark 2.1 Reporter: Chetan Bhat Steps : *Create and load data in V1/V2 carbon store:* create table brinjal (imei string,AMSize string,channelsId string,ActiveCountry string, Activecity string,gamePointId double,deviceInformationId double,productionDate Timestamp,deliveryDate timestamp,deliverycharge double) STORED BY 'org.apache.carbondata.format' TBLPROPERTIES('table_blocksize'='1'); LOAD DATA INPATH 'hdfs://hacluster/chetan/vardhandaterestruct.csv' INTO TABLE brinjal OPTIONS('DELIMITER'=',', 'QUOTECHAR'= '"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'= 'imei,deviceInformationId,AMSize,channelsId,ActiveCountry,Activecity,gamePointId,productionDate,deliveryDate,deliverycharge'); *In 1.4.1* refresh table brinjal; alter table brinjal compact 'segment_index'; select * from brinjal where AMSize='8RAM size'; *Issue : Block loading error for select query executed after merge index command executed on V1/V2 store table.* 0: jdbc:hive2://10.18.98.101:22550/default> select * from brinjal where AMSize='8RAM size'; *Error: java.io.IOException: Problem in loading segment blocks. (state=,code=0)* *Expected :* select query executed after merge index command executed on V1/V2 store table should return correct result set without error** -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] carbondata issue #2568: [Presto-integration-Technical-note] created document...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/2568 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/6175/ ---
[GitHub] carbondata issue #2568: [Presto-integration-Technical-note] created document...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/2568 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/6174/ ---
[GitHub] carbondata issue #2610: [CARBONDATA-2831] Added Support Merge index files re...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2610 Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6518/ ---
[GitHub] carbondata issue #2610: [CARBONDATA-2831] Added Support Merge index files re...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2610 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7793/ ---
[GitHub] carbondata pull request #2568: [Presto-integration-Technical-note] created d...
Github user vandana7 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2568#discussion_r207833427

--- Diff: integration/presto/presto-integration-technical-note.md ---
@@ -0,0 +1,253 @@
+# Presto Integration Technical Note
+Presto integration with CarbonData includes the below steps:
+
+* Setting up a Presto cluster
+
+* Setting up the cluster to use carbondata as a catalog along with the other catalogs provided by Presto
+
+In this technical note we will first go through the above two points, and after that we will see how to do performance tuning with Presto.
+
+## **Let us begin with the first step, the Presto cluster setup:**
+
+* ### Installing Presto
+
+  1. Download Presto version 0.187:
+     `wget https://repo1.maven.org/maven2/com/facebook/presto/presto-server/0.187/presto-server-0.187.tar.gz`
+
+  2. Extract the Presto tar file: `tar zxvf presto-server-0.187.tar.gz`.
+
+  3. Download the Presto CLI for the coordinator and name it presto:
+
+  ```
+  wget https://repo1.maven.org/maven2/com/facebook/presto/presto-cli/0.187/presto-cli-0.187-executable.jar
+  mv presto-cli-0.187-executable.jar presto
+  chmod +x presto
+  ```
+
+### Create Configuration Files
+
+  1. Create an `etc` folder in the presto-server-0.187 directory.
+  2. Create `config.properties`, `jvm.config`, `log.properties`, and `node.properties` files.
+  3. Install uuid to generate a node.id:
+
+  ```
+  sudo apt-get install uuid
+  uuid
+  ```
+
+# Contents of your node.properties file
+  ```
+  node.environment=production
+  node.id=
+  node.data-dir=/home/ubuntu/data
+  ```
+
+# Contents of your jvm.config file
+  ```
+  -server
+  -Xmx16G
+  -XX:+UseG1GC
+  -XX:G1HeapRegionSize=32M
+  -XX:+UseGCOverheadLimit
+  -XX:+ExplicitGCInvokesConcurrent
+  -XX:+HeapDumpOnOutOfMemoryError
+  -XX:OnOutOfMemoryError=kill -9 %p
+  ```
+
+# Contents of your log.properties file
+  ```
+  com.facebook.presto=INFO
+  ```
+  The default minimum level is `INFO`. There are four levels: `DEBUG`, `INFO`, `WARN` and `ERROR`.
+
+### Coordinator Configurations
+
+# Contents of your config.properties
+  ```
+  coordinator=true
+  node-scheduler.include-coordinator=false
+  http-server.http.port=8086
+  query.max-memory=50GB
+  query.max-memory-per-node=2GB
+  discovery-server.enabled=true
+  discovery.uri=:8086
+  ```
+The options `coordinator=true` and `node-scheduler.include-coordinator=false` indicate that the node is the coordinator and tell the coordinator not to do any of the computation work itself but to use the workers.
+
+**Note**: We recommend setting `query.max-memory-per-node` to half of the JVM config max memory, though if your workload is highly concurrent you may want to use a lower value for `query.max-memory-per-node`.
+
+Also, the relation between the two configuration properties below should be:
+if `query.max-memory-per-node=30GB`,
+then `query.max-memory=<30GB * number of nodes>`.
+
+### Worker Configurations
+
+# Contents of your config.properties
+  ```
+  coordinator=false
+  http-server.http.port=8086
+  query.max-memory=50GB
+  query.max-memory-per-node=2GB
+  discovery.uri=:8086
+  ```
+
+**Note**: The `jvm.config` and `node.properties` files are the same for all the nodes (workers + coordinator), but every node should have a different `node.id` (generated by the uuid command).
+
+### **With this we are ready with the Presto cluster setup, but to integrate with CarbonData the following further steps are required:**
+
+### Catalog Configurations
+
+1. Create a folder named `catalog` in the etc directory of Presto on all the nodes of the cluster, including the coordinator.
+
+# Configuring Carbondata in Presto
+1. Create a file named `carbondata.properties` in the `catalog` folder and set the required properties on all the nodes.
+
+### Add Plugins
+
+1. Create a directory named `carbondata` in the plugin directory of Presto.
+2. Copy the `carbondata` jars to the `plugin/carbondata` directory on all nodes.
+
+### Start Presto Server on all nodes
+
+To run it as a background process:
+```
+./presto-server-0.187/bin/launcher start
+```
+
+To run it in the foreground:
+```
+./presto-server-0.187/bin/launcher run
+```
+
+### Start Presto CLI
+```
+./presto
+```
+To connect to the carbondata catalog use the following command:
+```
+./presto --server :8086 --catalog carbondata --schema
+```
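The memory-sizing rule quoted in the technical note above (`query.max-memory` should equal `query.max-memory-per-node` multiplied by the number of nodes) can be expressed as a quick sanity check. This is only an illustrative sketch; `clusterMaxMemory`, `perNodeGb`, and `nodeCount` are invented names, not Presto configuration APIs.

```java
// Sketch: derive a consistent query.max-memory value from the per-node
// limit and the cluster size, per the sizing rule in the note above.
public class PrestoMemorySizing {
    // query.max-memory = query.max-memory-per-node * number of nodes
    static String clusterMaxMemory(int perNodeGb, int nodeCount) {
        return (perNodeGb * nodeCount) + "GB";
    }

    public static void main(String[] args) {
        // e.g. 30GB per node across 4 workers
        System.out.println("query.max-memory=" + clusterMaxMemory(30, 4));
    }
}
```

The reverse check is equally useful when reviewing a cluster config: divide `query.max-memory` by the node count and confirm the result does not exceed half of `-Xmx` from `jvm.config`.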
[GitHub] carbondata pull request #2568: [Presto-integration-Technical-note] created d...
Github user vandana7 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2568#discussion_r207833200 --- Diff: integration/presto/performance-report-of-presto-with-carbon.md --- @@ -0,0 +1,27 @@
[GitHub] carbondata pull request #2568: [Presto-integration-Technical-note] created d...
Github user vandana7 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2568#discussion_r207832179

--- Diff: integration/presto/presto-integration-in-carbondata.md ---
@@ -0,0 +1,134 @@
+# PRESTO INTEGRATION IN CARBONDATA
+
+1. [Document Purpose](#document-purpose)
+1. [Purpose](#purpose)
+1. [Scope](#scope)
+1. [Definitions and Acronyms](#definitions-and-acronyms)
+1. [Requirements addressed](#requirements-addressed)
+1. [Design Considerations](#design-considerations)
+1. [Row Iterator Implementation](#row-iterator-implementation)
+1. [ColumnarReaders or StreamReaders approach](#columnarreaders-or-streamreaders-approach)
+1. [Module Structure](#module-structure)
+1. [Detailed design](#detailed-design)
+1. [Modules](#modules)
+1. [Functions Developed](#functions-developed)
+1. [Integration Tests](#integration-tests)
+1. [Tools and languages used](#tools-and-languages-used)
+1. [References](#references)
+
+## Document Purpose
+
+ * _Purpose_
+   The purpose of this document is to outline the technical design of the Presto integration in CarbonData.
+   Its main purpose is to:
+   * Provide the link between the Functional Requirement and the detailed Technical Design documents.
+   * Detail the functionality which will be provided by each component or group of components and show how the various components interact in the design.
+
+   This document is not intended to address installation and configuration details of the actual implementation; those are provided in the technology guides on the CarbonData wiki page. As is true with any high-level design, this document will be updated and refined based on changing requirements.
+ * _Scope_
+   Presto integration with CarbonData will allow execution of CarbonData queries on the Presto CLI. CarbonData can easily be added as a data source among the multiple heterogeneous data sources for Presto.
+ * _Definitions and Acronyms_
+   **CarbonData:** CarbonData is a fully indexed columnar and Hadoop-native data store for processing heavy analytical workloads and detailed queries on big data. In customer benchmarks, CarbonData has proven to manage petabytes of data running on extraordinarily low-cost hardware, answering queries around 10 times faster than the current open source solutions (column-oriented SQL-on-Hadoop data stores).
+   **Presto:** Presto is a distributed SQL query engine designed to query large data sets distributed over one or more heterogeneous data sources.
+
+## Requirements addressed
+This integration of Presto mainly serves two purposes:
+ * Support of Apache CarbonData as a data source in Presto.
+ * Execution of Apache CarbonData queries on Presto.
+
+## Design Considerations
--- End diff --

Done

---
[GitHub] carbondata issue #2610: [CARBONDATA-2831] Added Support Merge index files re...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/2610 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/6173/ ---
[GitHub] carbondata issue #2609: [CARBONDATA-2823] Support streaming property with da...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2609 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6517/ ---
[GitHub] carbondata pull request #2590: [CARBONDATA-2750] Updated documentation on Lo...
Github user asfgit closed the pull request at: https://github.com/apache/carbondata/pull/2590 ---
[GitHub] carbondata issue #2609: [CARBONDATA-2823] Support streaming property with da...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2609 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7792/ ---
[GitHub] carbondata pull request #2610: [CARBONDATA-2831] Added Support Merge index f...
GitHub user ajantha-bhat opened a pull request: https://github.com/apache/carbondata/pull/2610

[CARBONDATA-2831] Added support for merge index file read from non-transactional table

Problem: currently, an SDK read / non-transactional table read from an external table gives null output when a carbon merge index file is present instead of carbon index files.
Cause: in LatestFileReadCommitted, merge index files were not considered while taking the snapshot.
Solution: consider the merge index files while taking the snapshot.

Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily:
- [ ] Any interfaces changed? NA
- [ ] Any backward compatibility impacted? NA
- [ ] Document update required? NA
- [ ] Testing done. Added UT
- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. NA

You can merge this pull request into a Git repository by running:
$ git pull https://github.com/ajantha-bhat/carbondata issue_fix
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/carbondata/pull/2610.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:
This closes #2610

commit 4a2ca45bd80e542db4a0e461ffdbfb6e55f29d27
Author: ajantha-bhat
Date: 2018-08-06T08:45:41Z
[CARBONDATA-2831] Added support for merge index file read from non-transactional table. Problem: an SDK read / non-transactional table read from an external table gives null output when a carbon merge index file is present instead of carbon index files. Cause: in LatestFileReadCommitted, merge index files were not considered while taking the snapshot. Solution: consider the merge index files while taking the snapshot.

---
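The fix described in this PR amounts to widening the file filter used when the snapshot of committed files is taken. The following is an editorial sketch of that idea, not the actual CarbonData code: the class and method names are invented, and only the extensions `.carbonindex` and `.carbonindexmerge` reflect CarbonData's naming.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of the fix: when taking the snapshot of committed
// files, accept merge index files (.carbonindexmerge) in addition to plain
// index files (.carbonindex). Before the fix, a segment that contained only
// a merged index file contributed nothing to the snapshot, so reads
// returned null output.
public class SnapshotSketch {
    static final String INDEX_EXT = ".carbonindex";
    static final String MERGE_INDEX_EXT = ".carbonindexmerge";

    static List<String> indexFilesForSnapshot(List<String> segmentFiles) {
        List<String> snapshot = new ArrayList<>();
        for (String file : segmentFiles) {
            // The merge-index check is the added condition.
            if (file.endsWith(MERGE_INDEX_EXT) || file.endsWith(INDEX_EXT)) {
                snapshot.add(file);
            }
        }
        return snapshot;
    }
}
```

With only the `INDEX_EXT` check, a file list such as `["part-0.carbonindexmerge", "part-0.carbondata"]` yields an empty snapshot; with both checks the merged index file is picked up.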
[jira] [Created] (CARBONDATA-2830) Support Merge index files read from non transactional table.
Ajantha Bhat created CARBONDATA-2830:
Summary: Support merge index file read from non-transactional table.
Key: CARBONDATA-2830
URL: https://issues.apache.org/jira/browse/CARBONDATA-2830
Project: CarbonData
Issue Type: Bug
Reporter: Ajantha Bhat
Assignee: Ajantha Bhat

Problem: currently, an SDK read / non-transactional table read from an external table gives null output when a carbon merge index file is present instead of carbon index files.
Cause: in LatestFileReadCommitted, merge index files were not considered while taking the snapshot.
Solution: consider the merge index files while taking the snapshot.
[jira] [Created] (CARBONDATA-2831) Support Merge index files read from non transactional table.
Ajantha Bhat created CARBONDATA-2831:
Summary: Support merge index file read from non-transactional table.
Key: CARBONDATA-2831
URL: https://issues.apache.org/jira/browse/CARBONDATA-2831
Project: CarbonData
Issue Type: Bug
Reporter: Ajantha Bhat
Assignee: Ajantha Bhat

Problem: currently, an SDK read / non-transactional table read from an external table gives null output when a carbon merge index file is present instead of carbon index files.
Cause: in LatestFileReadCommitted, merge index files were not considered while taking the snapshot.
Solution: consider the merge index files while taking the snapshot.
[GitHub] carbondata pull request #2607: [CARBONDATA-2818] Presto Upgrade to 0.206
Github user ajantha-bhat commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2607#discussion_r207812426

--- Diff: integration/presto/src/main/scala/org/apache/carbondata/presto/CarbonDictionaryDecodeReadSupport.scala ---
@@ -84,25 +85,31 @@ class CarbonDictionaryDecodeReadSupport[T] extends CarbonReadSupport[T] {
    * @param dictionaryData
    * @return
    */
-  private def createSliceArrayBlock(dictionaryData: Dictionary): SliceArrayBlock = {
+  private def createSliceArrayBlock(dictionaryData: Dictionary): Block = {
     val chunks: DictionaryChunksWrapper = dictionaryData.getDictionaryChunks
-    val sliceArray = new Array[Slice](chunks.getSize + 1)
-    // Initialize Slice Array with Empty Slice as per Presto's code
-    sliceArray(0) = Slices.EMPTY_SLICE
-    var count = 1
+    val positionCount = chunks.getSize
+    val offsetVector: Array[Int] = new Array[Int](positionCount + 2)
+    val isNullVector: Array[Boolean] = new Array[Boolean](positionCount + 1)
+    isNullVector(0) = true
+    isNullVector(1) = true
--- End diff --

ok.

---
[GitHub] carbondata issue #2590: [CARBONDATA-2750] Updated documentation on Local Dic...
Github user chenliang613 commented on the issue: https://github.com/apache/carbondata/pull/2590 LGTM ---
[jira] [Resolved] (CARBONDATA-2763) Create table with partition and no_inverted_index on long_string column is not blocked
[ https://issues.apache.org/jira/browse/CARBONDATA-2763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] xuchuanyin resolved CARBONDATA-2763.
Resolution: Fixed
Fix Version/s: 1.4.1

> Create table with partition and no_inverted_index on long_string column is not blocked
>
> Key: CARBONDATA-2763
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2763
> Project: CarbonData
> Issue Type: Bug
> Components: data-query
> Affects Versions: 1.4.1
> Environment: Spark 2.1, 2.2
> Reporter: Chetan Bhat
> Priority: Minor
> Fix For: 1.4.1
>
> Steps :
> 1. Create table with no_inverted_index on a long_string column:
> CREATE TABLE local_no_inverted_index(id int, name string, description string,address string, note string) STORED BY 'org.apache.carbondata.format' tblproperties('no_inverted_index'='note','long_string_columns'='note');
> 2. Create table with partition on a long_string column:
> CREATE TABLE local1_partition(id int,name string, description string,address string) partitioned by (note string) STORED BY 'org.apache.carbondata.format' tblproperties('long_string_columns'='note');
>
> Actual Output : The create table with partition and with no_inverted_index on a long_string column is successful.
> 0: jdbc:hive2://10.18.98.101:22550/default> CREATE TABLE local_no_inverted_index(id int, name string, description string,address string, note string) STORED BY 'org.apache.carbondata.format' tblproperties('no_inverted_index'='note','long_string_columns'='note');
> +-+--+
> | Result |
> +-+--+
> +-+--+
> No rows selected (2.604 seconds)
> 0: jdbc:hive2://10.18.98.101:22550/default> CREATE TABLE local1_partition(id int,name string, description string,address string) partitioned by (note string) STORED BY 'org.apache.carbondata.format' tblproperties('long_string_columns'='note');
> +-+--+
> | Result |
> +-+--+
> +-+--+
> No rows selected (1.989 seconds)
>
> Expected Output : The create table with partition and with no_inverted_index on a long_string column should be blocked.
[GitHub] carbondata issue #2609: [CARBONDATA-2823] Support streaming property with da...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/2609 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/6172/ ---
[GitHub] carbondata pull request #2607: [CARBONDATA-2818] Presto Upgrade to 0.206
Github user bhavya411 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2607#discussion_r207804289

--- Diff: integration/presto/src/main/scala/org/apache/carbondata/presto/CarbonDictionaryDecodeReadSupport.scala ---
@@ -84,25 +85,31 @@ class CarbonDictionaryDecodeReadSupport[T] extends CarbonReadSupport[T] {
    * @param dictionaryData
    * @return
    */
-  private def createSliceArrayBlock(dictionaryData: Dictionary): SliceArrayBlock = {
+  private def createSliceArrayBlock(dictionaryData: Dictionary): Block = {
     val chunks: DictionaryChunksWrapper = dictionaryData.getDictionaryChunks
-    val sliceArray = new Array[Slice](chunks.getSize + 1)
-    // Initialize Slice Array with Empty Slice as per Presto's code
-    sliceArray(0) = Slices.EMPTY_SLICE
-    var count = 1
+    val positionCount = chunks.getSize
+    val offsetVector: Array[Int] = new Array[Int](positionCount + 2)
+    val isNullVector: Array[Boolean] = new Array[Boolean](positionCount + 1)
+    isNullVector(0) = true
+    isNullVector(1) = true
--- End diff --

We are talking about the dictionary here. In a dictionary there will be only one null, and the key value is 1 by default in CarbonData, hence the isNullVector is populated only once with a null value; it has no bearing on the actual data. CarbonData keys start from 1, so we need a filler at the 0th position, and index 1 is actually null, to map to CarbonData null values. The offset index will be like:
0th position -> 0 (as it is a filler)
1st position -> 0 (for the actual null)
2nd position -> 0, as the byte[] is still null, so the starting point will be 0 only

---
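The vector layout described in the review comment above (a filler at position 0 because CarbonData dictionary keys start at 1, an explicit null at position 1, then offsets that accumulate the byte lengths of the values) can be sketched as follows. This is an editorial illustration only: the class, field sizes, and constructor are invented and do not mirror Presto's actual Block API or the patch's exact array dimensions.

```java
// Sketch of the dictionary block layout discussed above: positions are
// [filler, null, value0, value1, ...]; offsets[p] is the byte offset where
// position p starts, with one extra entry for the end of the last value.
public class DictionaryBlockLayout {
    final int[] offsets;
    final boolean[] isNull;

    DictionaryBlockLayout(byte[][] values) {
        int positions = values.length + 2;   // filler + null + dictionary values
        offsets = new int[positions + 1];    // one extra slot for the end offset
        isNull = new boolean[positions];
        isNull[0] = true;  // filler: CarbonData dictionary keys start at 1
        isNull[1] = true;  // key 1 maps to CarbonData's null value
        // offsets[0..2] stay 0: the filler and the null contribute no bytes.
        int running = 0;
        for (int i = 0; i < values.length; i++) {
            running += values[i].length;
            offsets[i + 3] = running;        // end offset of values[i]
        }
    }
}
```

This makes the reviewer's point concrete: the two leading null positions carry no bytes, so their offsets are all zero, and real data starts accumulating only from the first dictionary value.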
[jira] [Resolved] (CARBONDATA-2762) Long string column displayed as string in describe formatted
[ https://issues.apache.org/jira/browse/CARBONDATA-2762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] xuchuanyin resolved CARBONDATA-2762.
Resolution: Fixed
Fix Version/s: 1.4.1

> Long string column displayed as string in describe formatted
>
> Key: CARBONDATA-2762
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2762
> Project: CarbonData
> Issue Type: Bug
> Components: data-query
> Affects Versions: 1.4.1
> Reporter: Chetan Bhat
> Priority: Minor
> Fix For: 1.4.1
>
> Steps :
> User creates a table with a long string column and executes the describe formatted table command.
> 0: jdbc:hive2://10.18.98.101:22550/default> create table t2(c1 string, c2 string) stored by 'carbondata' tblproperties('long_string_columns' = 'c2');
> +-+--+
> | Result |
> +-+--+
> +-+--+
> No rows selected (3.034 seconds)
> 0: jdbc:hive2://10.18.98.101:22550/default> desc formatted t2;
>
> Actual Output : The describe formatted output displays the c2 column as string instead of long string.
> +---+---+---+--+
> | col_name | data_type | comment |
> +---+---+---+--+
> | c1 | string | KEY COLUMN,null |
> *| c2 | string | KEY COLUMN,null |*
> | | | |
> | ##Detailed Table Information | | |
> | Database Name | default | |
> | Table Name | t2 | |
> | CARBON Store Path | hdfs://hacluster/user/hive/warehouse/carbon.store/default/t2 | |
> | Comment | | |
> | Table Block Size | 1024 MB | |
> | Table Data Size | 0 | |
> | Table Index Size | 0 | |
> | Last Update Time | 0 | |
> | SORT_SCOPE | LOCAL_SORT | LOCAL_SORT |
> | CACHE_LEVEL | BLOCK | |
> | Streaming | false | |
> | Local Dictionary Enabled | true | |
> | Local Dictionary Threshold | 1 | |
> | Local Dictionary Include | c1,c2 | |
> | | | |
> | ##Detailed Column property | | |
> | ADAPTIVE | | |
> | SORT_COLUMNS | c1 | |
> +---+---+---+--+
> 22 rows selected (2.847 seconds)
>
> Expected Output : The describe formatted output should display the c2 column as long string.
[GitHub] carbondata issue #2608: [CARBONDATA-2829] Fix creating merge index on older ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2608 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6516/ ---
[jira] [Resolved] (CARBONDATA-2796) Fix data loading problem when table has complex column and long string column
[ https://issues.apache.org/jira/browse/CARBONDATA-2796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] xuchuanyin resolved CARBONDATA-2796.
Resolution: Fixed
Fix Version/s: 1.4.1

> Fix data loading problem when table has complex column and long string column
>
> Key: CARBONDATA-2796
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2796
> Project: CarbonData
> Issue Type: Sub-task
> Reporter: jiangmanhua
> Assignee: jiangmanhua
> Priority: Major
> Fix For: 1.4.1
> Time Spent: 3h
> Remaining Estimate: 0h
>
> Currently both the varchar column and the complex column believe they are the last member of the noDictionary group when converting a carbon row from raw format to the 3-parted format. Since they need to be processed in different ways, an exception will occur if we handle the column in the wrong way.
> To fix this, we mark the info of complex columns explicitly, like varchar columns, and keep the order of the noDictionary group as: normal dimensions & varchar & complex.
[GitHub] carbondata issue #2608: [CARBONDATA-2829] Fix creating merge index on older ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2608 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7791/ ---
[GitHub] carbondata pull request #2609: [CARBONDATA-2823] Support streaming property ...
GitHub user xuchuanyin opened a pull request: https://github.com/apache/carbondata/pull/2609

[CARBONDATA-2823] Support streaming property with datamap

Since during query carbondata gets splits from the streaming segment and the columnar segments respectively, we can support streaming with an index datamap. The preaggregate datamap already supports streaming tables, so here we just remove the outdated comments.

Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily:
- [x] Any interfaces changed? `NO`
- [x] Any backward compatibility impacted? `NO`
- [x] Document update required? `NO`
- [x] Testing done Please provide details on - Whether new unit test cases have been added or why no new tests are required? `NO` - How it is tested? Please attach test report. `Tested in local` - Is it a performance related change? Please attach the performance test report. `NO` - Any additional information to help reviewers in testing this change. `NA`
- [x] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

You can merge this pull request into a Git repository by running:
$ git pull https://github.com/xuchuanyin/carbondata issue2823_streaming_support_preagg_index_dm
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/carbondata/pull/2609.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:
This closes #2609

commit a7772d8fd2ece3c362f16925299c63b848657c9a
Author: xuchuanyin
Date: 2018-08-06T07:34:51Z
Support streaming property with datamap. Since during query carbondata gets splits from the streaming segment and the columnar segments respectively, we can support streaming with an index datamap. The preaggregate datamap already supports streaming tables, so here we just remove the outdated comments.

---
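The rationale in this PR, splits being gathered from streaming and columnar segments separately, so an index datamap (which prunes only columnar segments) can coexist with streaming, can be sketched as below. This is an editorial illustration; the class, method, and segment names are invented and do not reflect CarbonData's actual split-planning code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Illustrative sketch: streaming segments are always scanned (they bypass
// index-datamap pruning), while columnar segments are filtered through the
// datamap (e.g. a bloom filter) before splits are produced.
public class SplitPlanningSketch {
    static List<String> planSplits(List<String> streamingSegments,
                                   List<String> columnarSegments,
                                   Predicate<String> datamapPrune) {
        List<String> splits = new ArrayList<>();
        // Streaming segments contribute splits unconditionally.
        splits.addAll(streamingSegments);
        // Columnar segments contribute splits only if the datamap keeps them.
        for (String seg : columnarSegments) {
            if (datamapPrune.test(seg)) {
                splits.add(seg);
            }
        }
        return splits;
    }

    public static void main(String[] args) {
        System.out.println(planSplits(
            Arrays.asList("streaming_0"),
            Arrays.asList("seg_1", "seg_2"),
            seg -> seg.equals("seg_2")));  // pretend only seg_2 passes the bloom filter
    }
}
```

Because the two segment kinds are planned independently, pruning columnar segments with an index datamap never hides rows that live only in the streaming segment, which is the property the PR relies on.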
[jira] [Updated] (CARBONDATA-2823) Alter table set local dictionary include after bloom creation and merge index on old V3 store fails throwing incorrect error
[ https://issues.apache.org/jira/browse/CARBONDATA-2823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat updated CARBONDATA-2823:

Description:
Steps :
# create table
# create bloom/lucene datamap
# load data
# alter table set tblProperties

0: jdbc:hive2://10.18.98.101:22550/default> CREATE TABLE uniqdata_load (CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 decimal(36,36),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 int) STORED BY 'org.apache.carbondata.format';
+-+--+
| Result |
+-+--+
+-+--+
No rows selected (1.43 seconds)
0: jdbc:hive2://10.18.98.101:22550/default> CREATE DATAMAP dm_uniqdata1_tmstmp6 ON TABLE uniqdata_load USING 'bloomfilter' DMPROPERTIES ('INDEX_COLUMNS' = 'DOJ', 'BLOOM_SIZE'='64', 'BLOOM_FPP'='0.1');
+-+--+
| Result |
+-+--+
+-+--+
No rows selected (0.828 seconds)
0: jdbc:hive2://10.18.98.101:22550/default> LOAD DATA INPATH 'hdfs://hacluster/chetan/2000_UniqData.csv' into table uniqdata_load OPTIONS('DELIMITER'=',' , 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');
+-+--+
| Result |
+-+--+
+-+--+
No rows selected (4.903 seconds)
0: jdbc:hive2://10.18.98.101:22550/default> alter table uniqdata_load set tblproperties('local_dictionary_include'='CUST_NAME');
Error: org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException: streaming is not supported for index datamap (state=,code=0)

Issue : Alter table set local dictionary include fails with incorrect error.
0: jdbc:hive2://10.18.98.101:22550/default> alter table uniqdata_load set tblproperties('local_dictionary_include'='CUST_NAME');
*Error: org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException: streaming is not supported for index datamap (state=,code=0)*

Expected : Operation should be success. If the operation is unsupported it should throw correct error message.

was:
Steps :
In old version V3 store create table and load data.
CREATE TABLE uniqdata_load (CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 decimal(36,36),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 int) STORED BY 'org.apache.carbondata.format';
LOAD DATA INPATH 'hdfs://hacluster/chetan/2000_UniqData.csv' into table uniqdata_load OPTIONS('DELIMITER'=',' , 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');
In 1.4.1 version refresh the table of old V3 store.
refresh table uniqdata_load;
Create bloom filter and merge index.
CREATE DATAMAP dm_uniqdata1_tmstmp ON TABLE uniqdata_load USING 'bloomfilter' DMPROPERTIES ('INDEX_COLUMNS' = 'DOJ', 'BLOOM_SIZE'='64', 'BLOOM_FPP'='0.1');
Alter table set local dictionary include.
alter table uniqdata_load set tblproperties('local_dictionary_include'='CUST_NAME');

Issue : Alter table set local dictionary include fails with incorrect error.
0: jdbc:hive2://10.18.98.101:22550/default> alter table uniqdata_load set tblproperties('local_dictionary_include'='CUST_NAME');
*Error: org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException: streaming is not supported for index datamap (state=,code=0)*

Expected : Operation should be success. If the operation is unsupported it should throw correct error message.

> Alter table set local dictionary include after bloom creation and merge index on old V3 store fails throwing incorrect error
>
> Key: CARBONDATA-2823
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2823
> Project: CarbonData
> Issue Type: Bug
> Components: data-query
> Affects Versions: 1.4.1
> Environment: Spark 2.1
> Reporter: Chetan Bhat
> Assignee: xuchuanyin
> Priority: Minor
>
> Steps :
> # create table
> # create bloom/lucene datamap
> # load data
> # alter table set tblProperties
> 0: jdbc:hive2://10.18.98.101:22550/default> CREATE TABLE uniqdata_load (CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 decimal(36,36),Double_COLUMN1 double, >
[jira] [Updated] (CARBONDATA-2823) Alter table set local dictionary include after bloom creation fails throwing incorrect error
[ https://issues.apache.org/jira/browse/CARBONDATA-2823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat updated CARBONDATA-2823:

Summary: Alter table set local dictionary include after bloom creation fails throwing incorrect error (was: Alter table set local dictionary include after bloom creation and merge index on old V3 store fails throwing incorrect error)

> Alter table set local dictionary include after bloom creation fails throwing incorrect error
>
> Key: CARBONDATA-2823
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2823
> Project: CarbonData
> Issue Type: Bug
> Components: data-query
> Affects Versions: 1.4.1
> Environment: Spark 2.1
> Reporter: Chetan Bhat
> Assignee: xuchuanyin
> Priority: Minor
>
> Steps :
> # create table
> # create bloom/lucene datamap
> # load data
> # alter table set tblProperties
> 0: jdbc:hive2://10.18.98.101:22550/default> CREATE TABLE uniqdata_load (CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 decimal(36,36),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 int) STORED BY 'org.apache.carbondata.format';
> +-+--+
> | Result |
> +-+--+
> +-+--+
> No rows selected (1.43 seconds)
> 0: jdbc:hive2://10.18.98.101:22550/default> CREATE DATAMAP dm_uniqdata1_tmstmp6 ON TABLE uniqdata_load USING 'bloomfilter' DMPROPERTIES ('INDEX_COLUMNS' = 'DOJ', 'BLOOM_SIZE'='64', 'BLOOM_FPP'='0.1');
> +-+--+
> | Result |
> +-+--+
> +-+--+
> No rows selected (0.828 seconds)
> 0: jdbc:hive2://10.18.98.101:22550/default> LOAD DATA INPATH 'hdfs://hacluster/chetan/2000_UniqData.csv' into table uniqdata_load OPTIONS('DELIMITER'=',' , 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');
> +-+--+
> | Result |
> +-+--+
> +-+--+
> No rows selected (4.903 seconds)
> 0: jdbc:hive2://10.18.98.101:22550/default> alter table uniqdata_load set tblproperties('local_dictionary_include'='CUST_NAME');
> Error: org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException: streaming is not supported for index datamap (state=,code=0)
>
> Issue : Alter table set local dictionary include fails with incorrect error.
> 0: jdbc:hive2://10.18.98.101:22550/default> alter table uniqdata_load set tblproperties('local_dictionary_include'='CUST_NAME');
> *Error: org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException: streaming is not supported for index datamap (state=,code=0)*
>
> Expected : Operation should be success. If the operation is unsupported it should throw correct error message.
[jira] [Commented] (CARBONDATA-2823) Alter table set local dictionary include after bloom creation and merge index on old V3 store fails throwing incorrect error
[ https://issues.apache.org/jira/browse/CARBONDATA-2823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16569825#comment-16569825 ]

xuchuanyin commented on CARBONDATA-2823:
Since we get the splits from the streaming segment and the columnar segments respectively, we can support streaming with an index datamap.

> Alter table set local dictionary include after bloom creation and merge index on old V3 store fails throwing incorrect error
>
> Key: CARBONDATA-2823
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2823
> Project: CarbonData
> Issue Type: Bug
> Components: data-query
> Affects Versions: 1.4.1
> Environment: Spark 2.1
> Reporter: Chetan Bhat
> Assignee: xuchuanyin
> Priority: Minor
>
> Steps:
> In an old-version V3 store, create the table and load data.
> CREATE TABLE uniqdata_load (CUST_ID int, CUST_NAME String, ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint, BIGINT_COLUMN2 bigint, DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 decimal(36,36), Double_COLUMN1 double, Double_COLUMN2 double, INTEGER_COLUMN1 int) STORED BY 'org.apache.carbondata.format';
> LOAD DATA INPATH 'hdfs://hacluster/chetan/2000_UniqData.csv' into table uniqdata_load OPTIONS('DELIMITER'=',', 'QUOTECHAR'='"', 'BAD_RECORDS_ACTION'='FORCE', 'FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');
> In the 1.4.1 version, refresh the table of the old V3 store.
> refresh table uniqdata_load;
> Create the bloom filter and merge index.
> CREATE DATAMAP dm_uniqdata1_tmstmp ON TABLE uniqdata_load USING 'bloomfilter' DMPROPERTIES ('INDEX_COLUMNS' = 'DOJ', 'BLOOM_SIZE'='64', 'BLOOM_FPP'='0.1');
> Alter table set local dictionary include.
> alter table uniqdata_load set tblproperties('local_dictionary_include'='CUST_NAME');
>
> Issue: Alter table set local dictionary include fails with an incorrect error.
> 0: jdbc:hive2://10.18.98.101:22550/default> alter table uniqdata_load set tblproperties('local_dictionary_include'='CUST_NAME');
> *Error: org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException: streaming is not supported for index datamap (state=,code=0)*
>
> Expected: The operation should succeed; if it is unsupported, a correct error message should be thrown.
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (CARBONDATA-2823) Alter table set local dictionary include after bloom creation and merge index on old V3 store fails throwing incorrect error
[ https://issues.apache.org/jira/browse/CARBONDATA-2823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16569823#comment-16569823 ]

xuchuanyin edited comment on CARBONDATA-2823 at 8/6/18 7:22 AM:
As for CARBONDATA-2823, it can simply be reproduced by
1. create table
2. create bloom/lucene datamap
3. load data
4. alter table set tblProperties

was (Author: xuchuanyin):
As for CARBONDATA-2823, it can simply reproduced by
1. create table
2. create bloom/lucene datamap
3. load data
4. alter table set tblProperties
[jira] [Assigned] (CARBONDATA-2823) Alter table set local dictionary include after bloom creation and merge index on old V3 store fails throwing incorrect error
[ https://issues.apache.org/jira/browse/CARBONDATA-2823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

xuchuanyin reassigned CARBONDATA-2823:
Assignee: xuchuanyin
[jira] [Commented] (CARBONDATA-2823) Alter table set local dictionary include after bloom creation and merge index on old V3 store fails throwing incorrect error
[ https://issues.apache.org/jira/browse/CARBONDATA-2823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16569823#comment-16569823 ]

xuchuanyin commented on CARBONDATA-2823:
As for CARBONDATA-2823, it can simply reproduced by
1. create table
2. create bloom/lucene datamap
3. load data
4. alter table set tblProperties
[GitHub] carbondata issue #2608: [CARBONDATA-2829] Fix creating merge index on older ...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/2608 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/6171/ ---
[GitHub] carbondata pull request #2608: [CARBONDATA-2829] Fix creating merge index on...
GitHub user dhatchayani opened a pull request: https://github.com/apache/carbondata/pull/2608

[CARBONDATA-2829] Fix creating merge index on older V1 V2 store

Block merge index creation for the old store V1 V2 versions.

- [ ] Any interfaces changed?
- [ ] Any backward compatibility impacted?
- [ ] Document update required?
- [x] Testing done
Manual Testing
- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

You can merge this pull request into a Git repository by running:
$ git pull https://github.com/dhatchayani/carbondata CARBONDATA-2829
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/carbondata/pull/2608.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2608

commit 45130534fed38bc4b4ac684c41d1afb1a33770be
Author: dhatchayani
Date: 2018-08-06T06:45:26Z
[CARBONDATA-2829] Fix creating merge index on older V1 V2 store ---
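The fix described above amounts to a format-version guard before merge-index creation: merge index files are a feature of the V3 store, so creation must be skipped for segments written by the older V1/V2 formats. A minimal sketch of the idea follows; the enum and method names here are illustrative assumptions, not CarbonData's actual API (the real version enum is named differently in the codebase):

```java
public class MergeIndexGuard {
    // Illustrative store-format versions for the sketch.
    public enum FormatVersion { V1, V2, V3 }

    // Merge index files only exist for V3 segments, so creation is
    // blocked for segments written by the older V1/V2 formats.
    public static boolean canCreateMergeIndex(FormatVersion version) {
        return version == FormatVersion.V3;
    }

    public static void main(String[] args) {
        // A V1 or V2 segment is skipped; a V3 segment proceeds.
        System.out.println(canCreateMergeIndex(FormatVersion.V1)); // false
        System.out.println(canCreateMergeIndex(FormatVersion.V2)); // false
        System.out.println(canCreateMergeIndex(FormatVersion.V3)); // true
    }
}
```

Gating on the segment's write-format version (rather than failing later while reading index files) keeps merge-index creation a no-op on refreshed legacy stores.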
[jira] [Created] (CARBONDATA-2829) Fix creating merge index on older V1 V2 store
dhatchayani created CARBONDATA-2829:

Summary: Fix creating merge index on older V1 V2 store
Key: CARBONDATA-2829
URL: https://issues.apache.org/jira/browse/CARBONDATA-2829
Project: CarbonData
Issue Type: Improvement
Reporter: dhatchayani
Assignee: dhatchayani

Block creating merge index on older V1 and V2 store versions.
-- This message was sent by Atlassian JIRA (v7.6.3#76005)