[GitHub] carbondata pull request #2540: [CARBONDATA-2649] Handled executor min/max pr...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2540#discussion_r204630970

--- Diff: core/src/main/java/org/apache/carbondata/core/metadata/blocklet/BlockletInfo.java ---
@@ -221,7 +223,30 @@ public void setNumberOfPages(int numberOfPages) {
       output.writeInt(measureChunksLength.get(i));
     }
     writeChunkInfoForOlderVersions(output);
+    serializeMinMaxValues(output);
+  }
+
+  /**
+   * serialize min max values
+   *
+   * @param output
+   * @throws IOException
+   */
+  private void serializeMinMaxValues(DataOutput output) throws IOException {
--- End diff --

I don't think it is required to serialize the min/max from the driver. If columns are not cached, then read the footer on the executor side.
---
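For readers following the discussion, a minimal sketch of what serializing per-column min/max values to a `DataOutput` might look like. The class and method shape here are hypothetical illustrations, not CarbonData's actual `BlockletInfo` implementation:

```java
import java.io.DataOutput;
import java.io.IOException;
import java.util.List;

// Hypothetical helper illustrating min/max serialization; the real
// BlockletInfo.serializeMinMaxValues may use a different layout.
public class MinMaxSerializer {

  // Writes a count followed by length-prefixed byte arrays for min
  // values, then the same for max values.
  public static void serializeMinMaxValues(DataOutput output,
                                           List<byte[]> minValues,
                                           List<byte[]> maxValues) throws IOException {
    output.writeShort(minValues.size());
    for (byte[] min : minValues) {
      output.writeInt(min.length);
      output.write(min);
    }
    output.writeShort(maxValues.size());
    for (byte[] max : maxValues) {
      output.writeInt(max.length);
      output.write(max);
    }
  }
}
```

The reviewer's point is that this work need not happen on the driver at all: if the column min/max is not cached, the executor can read it from the file footer instead.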
[GitHub] carbondata issue #2387: [CARBONDATA-2621][BloomDataMap] Lock problem in inde...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2387 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6186/ ---
[GitHub] carbondata issue #2456: [CARBONDATA-2694][32k] Show longstring table propert...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2456 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6187/ ---
[GitHub] carbondata issue #2530: [CARBONDATA-2753] Fix Compatibility issues
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/2530 SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5973/ ---
[GitHub] carbondata issue #2387: [CARBONDATA-2621][BloomDataMap] Lock problem in inde...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2387 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7427/ ---
[GitHub] carbondata issue #2537: [CARBONDATA-2768][CarbonStore] Fix error in tests fo...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2537 Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6185/ ---
[GitHub] carbondata issue #2456: [CARBONDATA-2694][32k] Show longstring table propert...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2456 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7428/ ---
[GitHub] carbondata issue #2533: [CARBONDATA-2765]handle flat folder support for impl...
Github user akashrn5 commented on the issue: https://github.com/apache/carbondata/pull/2533 retest sdv please ---
[GitHub] carbondata issue #2538: [CARBONDATA-2769] Fix bug when getting shard name fr...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2538 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6184/ ---
[GitHub] carbondata issue #2543: [HOTFIX] Minor optimization for getBlockletNumber to...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2543 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6183/ ---
[GitHub] carbondata issue #2538: [CARBONDATA-2769] Fix bug when getting shard name fr...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2538 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7425/ ---
[GitHub] carbondata issue #2537: [CARBONDATA-2768][CarbonStore] Fix error in tests fo...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2537 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7426/ ---
[GitHub] carbondata issue #2543: [HOTFIX] Minor optimization for getBlockletNumber to...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2543 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7423/ ---
[GitHub] carbondata issue #2533: [CARBONDATA-2765]handle flat folder support for impl...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/2533 SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5972/ ---
[GitHub] carbondata pull request #2456: [CARBONDATA-2694][32k] Show longstring table ...
Github user xuchuanyin commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2456#discussion_r204616746 --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/table/CarbonDescribeFormattedCommand.scala --- @@ -114,6 +114,13 @@ private[sql] case class CarbonDescribeFormattedCommand( CarbonCommonConstants.CACHE_LEVEL_DEFAULT_VALUE), "")) val isStreaming = tblProps.asScala.getOrElse("streaming", "false") results ++= Seq(("Streaming", isStreaming, "")) + +// longstring related info +if (tblProps.containsKey(CarbonCommonConstants.LONG_STRING_COLUMNS)) { --- End diff -- ok~ ---
[GitHub] carbondata pull request #2456: [CARBONDATA-2694][32k] Show longstring table ...
Github user jackylk commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2456#discussion_r204615191 --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/table/CarbonDescribeFormattedCommand.scala --- @@ -114,6 +114,13 @@ private[sql] case class CarbonDescribeFormattedCommand( CarbonCommonConstants.CACHE_LEVEL_DEFAULT_VALUE), "")) val isStreaming = tblProps.asScala.getOrElse("streaming", "false") results ++= Seq(("Streaming", isStreaming, "")) + +// longstring related info +if (tblProps.containsKey(CarbonCommonConstants.LONG_STRING_COLUMNS)) { --- End diff -- ok, we should stick to that ---
[GitHub] carbondata issue #2387: [CARBONDATA-2621][BloomDataMap] Lock problem in inde...
Github user brijoobopanna commented on the issue: https://github.com/apache/carbondata/pull/2387 retest this please ---
[GitHub] carbondata pull request #2538: [CARBONDATA-2769] Fix bug when getting shard ...
Github user kevinjmh commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2538#discussion_r204614072

--- Diff: core/src/main/java/org/apache/carbondata/core/util/path/CarbonTablePath.java ---
@@ -665,9 +665,21 @@ public static String getCarbonIndexFileName(String actualBlockName) {
    * @return
    */
   public static String getShardName(String actualBlockName) {
-    return DataFileUtil.getTaskNo(actualBlockName) + "-" + DataFileUtil.getBucketNo(actualBlockName)
-        + "-" + DataFileUtil.getSegmentNo(actualBlockName) + "-" + DataFileUtil
-        .getTimeStampFromFileName(actualBlockName);
+    String segmentNoStr = DataFileUtil.getSegmentNo(actualBlockName);
+    StringBuilder shardName = new StringBuilder();
+    // data before version 1.4 does not have segmentNo in filename
--- End diff --

Fixed
---
[GitHub] carbondata issue #2544: [WIP][CarbonStore] Support ingesting data from DIS
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2544 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7424/ ---
[GitHub] carbondata pull request #2537: [CARBONDATA-2768][CarbonStore] Fix error in t...
Github user xuchuanyin commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2537#discussion_r204613470 --- Diff: core/src/main/java/org/apache/carbondata/core/statusmanager/FileFormatProperties.java --- @@ -17,10 +17,19 @@ package org.apache.carbondata.core.statusmanager; +import java.util.HashSet; +import java.util.Set; + /** * Provides the constant name for the file format properties */ public class FileFormatProperties { + public static Set SUPPORTED_EXTERNAL_FORMAT = new HashSet() { --- End diff -- OK ---
[GitHub] carbondata pull request #2544: [WIP][CarbonStore] Support ingesting data fro...
GitHub user QiangCai opened a pull request: https://github.com/apache/carbondata/pull/2544

[WIP][CarbonStore] Support ingesting data from DIS

Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily:

 - [ ] Any interfaces changed?
 - [ ] Any backward compatibility impacted?
 - [ ] Document update required?
 - [ ] Testing done
       Please provide details on
       - Whether new unit test cases have been added or why no new tests are required?
       - How it is tested? Please attach test report.
       - Is it a performance related change? Please attach the performance test report.
       - Any additional information to help reviewers in testing this change.
 - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/QiangCai/carbondata support_dis

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/2544.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #2544

commit 0adf03ba5f666a79c73a73ef1b9bb1e34dfee814
Author: QiangCai
Date: 2018-07-19T06:50:38Z

    fix task locality issue dependency

commit 0f0f6be778a10cd8d8443bb13455bb4883d7823b
Author: QiangCai
Date: 2018-07-24T03:18:59Z

    support ingesting data from DIS
---
[GitHub] carbondata pull request #2537: [CARBONDATA-2768][CarbonStore] Fix error in t...
Github user xuchuanyin commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2537#discussion_r204612717 --- Diff: core/src/main/java/org/apache/carbondata/core/statusmanager/FileFormatProperties.java --- @@ -17,10 +17,19 @@ package org.apache.carbondata.core.statusmanager; +import java.util.HashSet; +import java.util.Set; + /** * Provides the constant name for the file format properties */ public class FileFormatProperties { + public static Set SUPPORTED_EXTERNAL_FORMAT = new HashSet() { --- End diff -- OK ---
[GitHub] carbondata pull request #2528: [CARBONDATA-2767][CarbonStore] Fix task local...
Github user jackylk commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2528#discussion_r204612669

--- Diff: core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java ---
@@ -1882,6 +1882,13 @@
   public static final String CARBON_MERGE_INDEX_IN_SEGMENT_DEFAULT = "true";
 
+  /**
+   * config carbon scan task locality
--- End diff --

Please provide more detail, like what scheduling behavior will be used for true and false
---
[jira] [Assigned] (CARBONDATA-2773) Load one file for multiple times in one load command cause wrong query result
[ https://issues.apache.org/jira/browse/CARBONDATA-2773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

xuchuanyin reassigned CARBONDATA-2773:
--------------------------------------
    Assignee: xuchuanyin

> Load one file for multiple times in one load command cause wrong query result
> -----------------------------------------------------------------------------
>
>                 Key: CARBONDATA-2773
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-2773
>             Project: CarbonData
>          Issue Type: Bug
>            Reporter: xuchuanyin
>            Assignee: xuchuanyin
>            Priority: Major
>
> CarbonData now supports loading multiple files in one load command; the file paths can be comma separated.
> But when I try to load one file multiple times in one load command, the query result is wrong.
> The load command looks like below:
> ```
> LOAD DATA LOCAL INPATH 'file1,file1,file1' INTO TABLE test_table;
> ```
> The expected result should be the file content tripled, but actually the result is the file content not tripled.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] carbondata pull request #2456: [CARBONDATA-2694][32k] Show longstring table ...
Github user jackylk commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2456#discussion_r204611831

--- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/table/CarbonDescribeFormattedCommand.scala ---
@@ -114,6 +114,13 @@ private[sql] case class CarbonDescribeFormattedCommand(
       CarbonCommonConstants.CACHE_LEVEL_DEFAULT_VALUE), ""))
     val isStreaming = tblProps.asScala.getOrElse("streaming", "false")
     results ++= Seq(("Streaming", isStreaming, ""))
+
+    // longstring related info
+    if (tblProps.containsKey(CarbonCommonConstants.LONG_STRING_COLUMNS)) {
--- End diff --

Please use a more human readable string like `Long String Columns`
---
[jira] [Created] (CARBONDATA-2773) Load one file for multiple times in one load command cause wrong query result
xuchuanyin created CARBONDATA-2773:
--------------------------------------

             Summary: Load one file for multiple times in one load command cause wrong query result
                 Key: CARBONDATA-2773
                 URL: https://issues.apache.org/jira/browse/CARBONDATA-2773
             Project: CarbonData
          Issue Type: Bug
            Reporter: xuchuanyin

CarbonData now supports loading multiple files in one load command; the file paths can be comma separated.
But when I try to load one file multiple times in one load command, the query result is wrong.
The load command looks like below:
```
LOAD DATA LOCAL INPATH 'file1,file1,file1' INTO TABLE test_table;
```
The expected result should be the file content tripled, but actually the result is the file content not tripled.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (CARBONDATA-2512) Support long_string_columns in sdk
[ https://issues.apache.org/jira/browse/CARBONDATA-2512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jacky Li resolved CARBONDATA-2512.
----------------------------------
    Resolution: Fixed
    Fix Version/s: 1.4.1
                   1.5.0

> Support long_string_columns in sdk
> ----------------------------------
>
>                 Key: CARBONDATA-2512
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-2512
>             Project: CarbonData
>          Issue Type: Sub-task
>            Reporter: xuchuanyin
>            Assignee: xuchuanyin
>            Priority: Major
>             Fix For: 1.5.0, 1.4.1
>
>          Time Spent: 2h
>  Remaining Estimate: 0h
>

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] carbondata pull request #2455: [CARBONDATA-2512][32k] Support writing longst...
Github user asfgit closed the pull request at: https://github.com/apache/carbondata/pull/2455 ---
[GitHub] carbondata issue #2455: [CARBONDATA-2512][32k] Support writing longstring th...
Github user jackylk commented on the issue: https://github.com/apache/carbondata/pull/2455 LGTM ---
[GitHub] carbondata pull request #2537: [CARBONDATA-2768][CarbonStore] Fix error in t...
Github user jackylk commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2537#discussion_r204610009

--- Diff: core/src/main/java/org/apache/carbondata/core/statusmanager/FileFormatProperties.java ---
@@ -17,10 +17,19 @@
 package org.apache.carbondata.core.statusmanager;
 
+import java.util.HashSet;
+import java.util.Set;
+
 /**
  * Provides the constant name for the file format properties
  */
 public class FileFormatProperties {
+  public static Set SUPPORTED_EXTERNAL_FORMAT = new HashSet() {
--- End diff --

better to make it private and provide a function to validate whether the user input format is supported
---
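A minimal sketch of the refactoring the reviewer suggests — keep the set private and expose a validation method. The format names below are placeholders, since the actual entries of `SUPPORTED_EXTERNAL_FORMAT` are elided in the diff:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Sketch of the suggested design: private set plus public validator.
// The set contents ("csv", "parquet") are illustrative assumptions.
public class FileFormatProperties {
  private static final Set<String> SUPPORTED_EXTERNAL_FORMAT =
      new HashSet<>(Arrays.asList("csv", "parquet"));

  // Returns true if the user-specified format is supported
  // (case-insensitive, null-safe).
  public static boolean isFormatSupported(String format) {
    return format != null
        && SUPPORTED_EXTERNAL_FORMAT.contains(format.toLowerCase(Locale.ROOT));
  }
}
```

Hiding the set behind a validator keeps callers from mutating a public mutable collection and gives one place to normalize user input.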
[GitHub] carbondata pull request #2538: [CARBONDATA-2769] Fix bug when getting shard ...
Github user jackylk commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2538#discussion_r204609104

--- Diff: core/src/main/java/org/apache/carbondata/core/util/path/CarbonTablePath.java ---
@@ -665,9 +665,21 @@ public static String getCarbonIndexFileName(String actualBlockName) {
    * @return
    */
   public static String getShardName(String actualBlockName) {
-    return DataFileUtil.getTaskNo(actualBlockName) + "-" + DataFileUtil.getBucketNo(actualBlockName)
-        + "-" + DataFileUtil.getSegmentNo(actualBlockName) + "-" + DataFileUtil
-        .getTimeStampFromFileName(actualBlockName);
+    String segmentNoStr = DataFileUtil.getSegmentNo(actualBlockName);
+    StringBuilder shardName = new StringBuilder();
+    // data before version 1.4 does not have segmentNo in filename
--- End diff --

And please add a comment in the function header to describe the components of the shard name string
---
[GitHub] carbondata pull request #2538: [CARBONDATA-2769] Fix bug when getting shard ...
Github user jackylk commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2538#discussion_r204608952

--- Diff: core/src/main/java/org/apache/carbondata/core/util/path/CarbonTablePath.java ---
@@ -665,9 +665,21 @@ public static String getCarbonIndexFileName(String actualBlockName) {
    * @return
    */
   public static String getShardName(String actualBlockName) {
-    return DataFileUtil.getTaskNo(actualBlockName) + "-" + DataFileUtil.getBucketNo(actualBlockName)
-        + "-" + DataFileUtil.getSegmentNo(actualBlockName) + "-" + DataFileUtil
-        .getTimeStampFromFileName(actualBlockName);
+    String segmentNoStr = DataFileUtil.getSegmentNo(actualBlockName);
+    StringBuilder shardName = new StringBuilder();
+    // data before version 1.4 does not have segmentNo in filename
--- End diff --

Should this be moved to else block?
---
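To illustrate the compatibility pattern under review — conditionally including the segment number in the shard name for files written before version 1.4 — here is a hedged, standalone sketch. It takes pre-extracted components as arguments; the real code derives them from the block file name via `DataFileUtil`, which is not reproduced here:

```java
// Illustrative only: builds a shard name of the form
// taskNo-bucketNo[-segmentNo]-timestamp, omitting segmentNo when it
// is absent (data files written before CarbonData 1.4).
public class ShardNameBuilder {
  public static String buildShardName(String taskNo, String bucketNo,
                                      String segmentNo, String timestamp) {
    StringBuilder shardName = new StringBuilder();
    shardName.append(taskNo).append('-').append(bucketNo).append('-');
    // pre-1.4 file names carry no segment number
    if (segmentNo != null && !segmentNo.isEmpty()) {
      shardName.append(segmentNo).append('-');
    }
    shardName.append(timestamp);
    return shardName.toString();
  }
}
```

This makes the reviewer's question concrete: the comment about pre-1.4 data explains the empty-segment branch, so it reads most naturally next to that condition.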
[GitHub] carbondata issue #2540: [CARBONDATA-2649] Handled executor min/max pruning w...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/2540 SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5971/ ---
[GitHub] carbondata issue #2543: [HOTFIX] Minor optimization for getBlockletNumber to...
Github user kevinjmh commented on the issue: https://github.com/apache/carbondata/pull/2543 LGTM ---
[GitHub] carbondata pull request #2543: [HOTFIX] Minor optimization for getBlockletNu...
GitHub user jackylk opened a pull request: https://github.com/apache/carbondata/pull/2543

[HOTFIX] Minor optimization for getBlockletNumber to return integer

Change BlockletScannedResult.getBlockletNumber to return integer

 - [X] Any interfaces changed? No
 - [X] Any backward compatibility impacted? No
 - [X] Document update required? No
 - [X] Testing done
       Please provide details on
       - Whether new unit test cases have been added or why no new tests are required?
       - How it is tested? Please attach test report.
       - Is it a performance related change? Please attach the performance test report.
       - Any additional information to help reviewers in testing this change.
       No testcase is added, rerun existing test suites
 - [X] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. NA

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jackylk/incubator-carbondata parseint

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/2543.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #2543

commit 44dbb6632b075cb869966dbb756c845818091fac
Author: Jacky Li
Date: 2018-07-24T02:33:14Z

    change getBlockletNumber to return integer
---
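The change described above — returning the blocklet number as an `int` instead of a `String` — can be sketched as below. The field and class names are hypothetical stand-ins for `BlockletScannedResult`, whose internals are not shown in the PR body:

```java
// Hypothetical sketch of the getBlockletNumber change: parse the
// stored string once and return a primitive int to callers, so they
// no longer need to call Integer.parseInt themselves.
public class BlockletNumberExample {
  private final String blockletNumberStr;

  public BlockletNumberExample(String blockletNumberStr) {
    this.blockletNumberStr = blockletNumberStr;
  }

  // Returns the blocklet number as an int.
  public int getBlockletNumber() {
    return Integer.parseInt(blockletNumberStr);
  }
}
```

Returning a primitive avoids repeated string parsing and boxing on the scan hot path, which is presumably the optimization the HOTFIX title refers to.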
[GitHub] carbondata pull request #2539: [CARBONDATA-2770][BloomDataMap] Optimize code...
Github user asfgit closed the pull request at: https://github.com/apache/carbondata/pull/2539 ---
[GitHub] carbondata issue #2539: [CARBONDATA-2770][BloomDataMap] Optimize code to get...
Github user jackylk commented on the issue: https://github.com/apache/carbondata/pull/2539 LGTM ---
[GitHub] carbondata issue #2539: [CARBONDATA-2770][BloomDataMap] Optimize code to get...
Github user jackylk commented on the issue: https://github.com/apache/carbondata/pull/2539 retest sdv please ---
[GitHub] carbondata issue #2538: [CARBONDATA-2769] Fix bug when getting shard name fr...
Github user xuchuanyin commented on the issue: https://github.com/apache/carbondata/pull/2538 LGTM ---
[GitHub] carbondata issue #2539: [CARBONDATA-2770][BloomDataMap] Optimize code to get...
Github user xuchuanyin commented on the issue: https://github.com/apache/carbondata/pull/2539 LGTM Waiting for the build ---
[GitHub] carbondata issue #2539: [CARBONDATA-2770][BloomDataMap] Optimize code to get...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/2539 SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5970/ ---
[GitHub] carbondata issue #2464: [CARBONDATA-2618][32K] Split to multiple pages if va...
Github user xuchuanyin commented on the issue: https://github.com/apache/carbondata/pull/2464 retest sdv please ---
[jira] [Resolved] (CARBONDATA-2550) [MV] Limit is ignored when data fetched from MV, Query rewrite is Wrong
[ https://issues.apache.org/jira/browse/CARBONDATA-2550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jacky Li resolved CARBONDATA-2550.
----------------------------------
    Resolution: Fixed
    Fix Version/s: 1.4.1
                   1.5.0

> [MV] Limit is ignored when data fetched from MV, Query rewrite is Wrong
> -----------------------------------------------------------------------
>
>                 Key: CARBONDATA-2550
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-2550
>             Project: CarbonData
>          Issue Type: Bug
>            Reporter: Babulal
>            Assignee: xubo245
>            Priority: Major
>             Fix For: 1.5.0, 1.4.1
>
>          Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> 0: jdbc:hive2://10.18.222.231:23040> create table mvtable1(name string,age int,salary int) stored by 'carbondata';
> No rows selected (0.279 seconds)
> 0: jdbc:hive2://10.18.222.231:23040> insert into mvtable1 select 'n1',12,12;
> No rows selected (11.973 seconds)
> 0: jdbc:hive2://10.18.222.231:23040> insert into mvtable1 select 'n1',12,12;
> No rows selected (9.92 seconds)
> 0: jdbc:hive2://10.18.222.231:23040> insert into mvtable1 select 'n3',12,12;
> No rows selected (9.883 seconds)
> 0: jdbc:hive2://10.18.222.231:23040> insert into mvtable1 select 'n4',12,12;
> No rows selected (10.488 seconds)
> 0: jdbc:hive2://10.18.222.231:23040> select name,sum(salary) from mvtable1 group by name;
> | name | sum(salary) |
> | n3   | 12          |
> | n1   | 24          |
> | n4   | 12          |
> 0: jdbc:hive2://10.18.222.231:23040> select name,sum(salary) from mvtable1 group by name limit 2;
> | name | sum(salary) |
> | n3   | 12          |
> | n1   | 24          |
> 2 rows selected (4.175 seconds)
> 0: jdbc:hive2://10.18.222.231:23040> create datamap map1 using 'mv' as select name,sum(salary) from mvtable1 group by name;
> No rows selected (0.396 seconds)
> 0: jdbc:hive2://10.18.222.231:23040> rebuild datamap map1;
> No rows selected (13.246 seconds)
>
> *0: jdbc:hive2://10.18.222.231:23040> select name,sum(salary) from mvtable1 group by name limit 2;*
> | mvtable1_name | sum_salary |
> | n3            | 12         |
> | n1            | 24         |
> | n4            | 12         |
> 3 rows selected (2.453 seconds)
> *0: jdbc:hive2://10.18.222.231:23040> select name,sum(salary) from mvtable1 group by name limit 1;*
> | mvtable1_name | sum_salary |
> | n3            | 12         |
> | n1            | 24         |
> | n4            | 12         |
> 3 rows selected (0.347 seconds)
>
> Even when limit is given, MV returns all the records from the MV table.
> Cause:
> When rewriting the MV query, limit is ignored.
> 0: jdbc:hive2://10.18.222.231:23040> explain select name,sum(salary) from mvtable1 *group by name limit 2;*
> | plan |
> | == CarbonData Profiler ==
> Table Scan on map1_table
>  - total blocklets: 2
>  - filter: none
>  - pruned by Main DataMap
>  - skipped blocklets: 0
> |
> | == Physical Plan ==
> *BatchedScan CarbonDatasourceHadoopRelation [ Database name :default, Table name :map1_table, Schema :Some(StructType(StructField(mvtable1_name,StringType,true), StructField(sum_salary,LongType,true))) ] default.map1_table[mvtable1_name#4438,sum_salary#4614L] |
> 2 rows selected (0.36

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] carbondata pull request #2480: [CARBONDATA-2550][CARBONDATA-2576][MV] Fix li...
Github user asfgit closed the pull request at: https://github.com/apache/carbondata/pull/2480 ---
[jira] [Resolved] (CARBONDATA-2542) MV creation is failed for other than default database
[ https://issues.apache.org/jira/browse/CARBONDATA-2542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jacky Li resolved CARBONDATA-2542.
----------------------------------
    Resolution: Fixed
    Assignee: Ravindra Pesala (was: xubo245)
    Fix Version/s: 1.4.1
                   1.5.0

> MV creation is failed for other than default database
> -----------------------------------------------------
>
>                 Key: CARBONDATA-2542
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-2542
>             Project: CarbonData
>          Issue Type: Bug
>            Reporter: Babulal
>            Assignee: Ravindra Pesala
>            Priority: Major
>             Fix For: 1.5.0, 1.4.1
>
>          Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> 0: jdbc:hive2://10.18.222.231:23040> CREATE TABLE fact10 (empname String, designation String, doj Timestamp, workgroupcategory int, workgroupcategoryname String, deptno int, deptname String, projectcode int, projectjoindate Timestamp, projectenddate Timestamp, attendance int, utilization int, salary int) STORED BY 'org.apache.carbondata.format' TBLPROPERTIES('table_blocksize'='256');
> No rows selected (0.962 seconds)
> 0: jdbc:hive2://10.18.222.231:23040> LOAD DATA local inpath '/tmp/babu/data_big_1.csv' INTO TABLE fact10 OPTIONS('DELIMITER'= ',', 'QUOTECHAR'= '"','timestampformat'='dd-MM-','FILEHEADER'='empno,empname,designation,doj,workgroupcategory,workgroupcategoryname,deptno,deptname,projectcode,projectjoindate,projectenddate,attendance,utilization,salary');
> No rows selected (6.188 seconds)
> 0: jdbc:hive2://10.18.222.231:23040> create datamap datamap66 using 'mv' as select doj,sum(salary) from *babu.fact10* group by doj;
> No rows selected (0.893 seconds)
> 0: jdbc:hive2://10.18.222.231:23040> create datamap datamap68 using 'mv' as select doj,sum(salary) *from fact10* group by doj;
> Error: org.apache.spark.sql.AnalysisException: *Table or view not found: fact10*; line 1 pos 49 (state=,code=0)

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] carbondata issue #2480: [CARBONDATA-2550][CARBONDATA-2576][MV] Fix limit and...
Github user jackylk commented on the issue: https://github.com/apache/carbondata/pull/2480 The conflict is due to a test case added in #2476. It is not a code problem and I have resolved the conflict. Merging this to master ---
[GitHub] carbondata pull request #2479: [CARBONDATA-2542][MV] Fix the mv query from t...
Github user asfgit closed the pull request at: https://github.com/apache/carbondata/pull/2479 ---
[GitHub] carbondata issue #2479: [CARBONDATA-2542][MV] Fix the mv query from table wi...
Github user jackylk commented on the issue: https://github.com/apache/carbondata/pull/2479 The conflict is due to a test case added in #2476. I have resolved the conflict and am merging this to master ---
[jira] [Resolved] (CARBONDATA-2534) MV Dataset - MV creation is not working with the substring()
[ https://issues.apache.org/jira/browse/CARBONDATA-2534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jacky Li resolved CARBONDATA-2534.
----------------------------------
    Resolution: Fixed
    Fix Version/s: 1.4.1
                   1.5.0

> MV Dataset - MV creation is not working with the substring()
> ------------------------------------------------------------
>
>                 Key: CARBONDATA-2534
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-2534
>             Project: CarbonData
>          Issue Type: Bug
>          Components: data-query
>         Environment: 3 node opensource ANT cluster
>            Reporter: Prasanna Ravichandran
>            Priority: Minor
>              Labels: CarbonData, MV, Materialistic_Views
>             Fix For: 1.5.0, 1.4.1
>         Attachments: MV_substring.docx, data.csv
>
>          Time Spent: 3h
>  Remaining Estimate: 0h
>
> MV creation is not working with the sub string function. We are getting the spark.sql.AnalysisException while trying to create a MV with the substring and aggregate function.
>
> *Spark-shell test queries:*
> scala> carbon.sql("create datamap mv_substr using 'mv' as select sum(salary),substring(empname,2,5),designation from originTable group by substring(empname,2,5),designation").show(200,false)
>
> *org.apache.spark.sql.AnalysisException: Cannot create a table having a column whose name contains commas in Hive metastore. Table: `default`.`mv_substr_table`; Column: substring_empname,_2,_5;*
> at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$org$apache$spark$sql$hive$HiveExternalCatalog$$verifyDataSchema$2.apply(HiveExternalCatalog.scala:150)
> at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$org$apache$spark$sql$hive$HiveExternalCatalog$$verifyDataSchema$2.apply(HiveExternalCatalog.scala:148)
> at scala.collection.immutable.List.foreach(List.scala:381)
> at org.apache.spark.sql.hive.HiveExternalCatalog.org$apache$spark$sql$hive$HiveExternalCatalog$$verifyDataSchema(HiveExternalCatalog.scala:148)
> at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$doCreateTable$1.apply$mcV$sp(HiveExternalCatalog.scala:222)
> at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$doCreateTable$1.apply(HiveExternalCatalog.scala:216)
> at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$doCreateTable$1.apply(HiveExternalCatalog.scala:216)
> at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:97)
> at org.apache.spark.sql.hive.HiveExternalCatalog.doCreateTable(HiveExternalCatalog.scala:216)
> at org.apache.spark.sql.catalyst.catalog.ExternalCatalog.createTable(ExternalCatalog.scala:110)
> at org.apache.spark.sql.catalyst.catalog.SessionCatalog.createTable(SessionCatalog.scala:316)
> at org.apache.spark.sql.execution.command.CreateDataSourceTableCommand.run(createDataSourceTables.scala:119)
> at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58)
> at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56)
> at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:67)
> at org.apache.spark.sql.Dataset.<init>(Dataset.scala:183)
> at org.apache.spark.sql.CarbonSession$$anonfun$sql$1.apply(CarbonSession.scala:108)
> at org.apache.spark.sql.CarbonSession$$anonfun$sql$1.apply(CarbonSession.scala:97)
> at org.apache.spark.sql.CarbonSession.withProfiler(CarbonSession.scala:155)
> at org.apache.spark.sql.CarbonSession.sql(CarbonSession.scala:95)
> at org.apache.spark.sql.execution.command.table.CarbonCreateTableCommand.processMetadata(CarbonCreateTableCommand.scala:126)
> at org.apache.spark.sql.execution.command.MetadataCommand.run(package.scala:68)
> at org.apache.carbondata.mv.datamap.MVHelper$.createMVDataMap(MVHelper.scala:103)
> at org.apache.carbondata.mv.datamap.MVDataMapProvider.initMeta(MVDataMapProvider.scala:53)
> at org.apache.spark.sql.execution.command.datamap.CarbonCreateDataMapCommand.processMetadata(CarbonCreateDataMapCommand.scala:118)
> at org.apache.spark.sql.execution.command.AtomicRunnableCommand.run(package.scala:90)
> at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58)
> at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56)
> at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:67)
> at org.apache.spark.sql.Dataset.<init>(Dataset.scala:183)
> at org.apache.spark.sql.CarbonSession$$anonfun$sql$1.apply(CarbonSession.scala:108)
> at org.apache.spark.sql.CarbonSession$$anonfun$sql$1.apply(CarbonSession.scala:97)
> at org.apache.spark.sql.CarbonSession.withProfiler(CarbonSession.scala:155)
> at
[GitHub] carbondata pull request #2476: [CARBONDATA-2534][MV] Fix substring expressio...
Github user asfgit closed the pull request at: https://github.com/apache/carbondata/pull/2476 ---
[GitHub] carbondata issue #2441: [CARBONDATA-2625] optimize CarbonReader performance
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/2441 SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5969/ ---
[GitHub] carbondata issue #2480: [CARBONDATA-2550][CARBONDATA-2576][MV] Fix limit and...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2480 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6182/ ---
[GitHub] carbondata issue #2476: [CARBONDATA-2534][MV] Fix substring expression not w...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2476 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6178/ ---
[GitHub] carbondata issue #2484: [HOTFIX] added hadoop conf to thread local
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/2484 SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5968/ ---
[GitHub] carbondata issue #2475: [CARBONDATA-2531][MV] Fix alias not working on MV qu...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2475 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6179/ ---
[GitHub] carbondata issue #2539: [CARBONDATA-2770][BloomDataMap] Optimize code to get...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/2539 SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5967/ ---
[GitHub] carbondata issue #2524: [CARBONDATA-2532][Integration] Carbon to support spa...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2524 Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6181/ ---
[GitHub] carbondata issue #2542: [CARBONDATA-2772] Size based dictionary fallback is ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2542 Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6180/ ---
[GitHub] carbondata issue #2477: [CARBONDATA-2539][MV] Fix predicate subquery which u...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2477 Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6177/ ---
[GitHub] carbondata issue #2477: [CARBONDATA-2539][MV] Fix predicate subquery which u...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2477 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7420/ ---
[GitHub] carbondata issue #2478: [CARBONDATA-2540][CARBONDATA-2560][CARBONDATA-2568][...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2478 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7419/ ---
[GitHub] carbondata issue #2479: [CARBONDATA-2542][MV] Fix the mv query from table wi...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2479 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6175/ ---
[GitHub] carbondata issue #2478: [CARBONDATA-2540][CARBONDATA-2560][CARBONDATA-2568][...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2478 Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6176/ ---
[GitHub] carbondata issue #2542: [CARBONDATA-2772] Size based dictionary fallback is ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2542 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7422/ ---
[GitHub] carbondata issue #2524: [CARBONDATA-2532][Integration] Carbon to support spa...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2524 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7421/ ---
[GitHub] carbondata issue #2479: [CARBONDATA-2542][MV] Fix the mv query from table wi...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2479 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7418/ ---
[GitHub] carbondata issue #2517: [CARBONDATA-2749][dataload] In HDFS Empty tablestatu...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/2517 SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5966/ ---
[GitHub] carbondata issue #2480: [CARBONDATA-2550][CARBONDATA-2576][MV] Fix limit and...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2480 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7417/ ---
[GitHub] carbondata issue #2542: [CARBONDATA-2772] Size based dictionary fallback is ...
Github user kumarvishal09 commented on the issue: https://github.com/apache/carbondata/pull/2542 retest this please ---
[GitHub] carbondata issue #2511: [CARBONDATA-2745] Added atomic file operations for S...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2511 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6172/ ---
[GitHub] carbondata issue #2541: [CARBONDATA-2771]block update and delete on table if...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2541 Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6174/ ---
[GitHub] carbondata issue #2475: [CARBONDATA-2531][MV] Fix alias not working on MV qu...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2475 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7415/ ---
[GitHub] carbondata issue #2533: [CARBONDATA-2765]handle flat folder support for impl...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2533 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6171/ ---
[GitHub] carbondata issue #2476: [CARBONDATA-2534][MV] Fix substring expression not w...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2476 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7414/ ---
[GitHub] carbondata issue #2542: [CARBONDATA-2772] Size based dictionary fallback is ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2542 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7416/ ---
[GitHub] carbondata issue #2511: [CARBONDATA-2745] Added atomic file operations for S...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2511 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7413/ ---
[GitHub] carbondata issue #2542: [CARBONDATA-2772] Size based dictionary fallback is ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2542 Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6173/ ---
[GitHub] carbondata issue #2484: [HOTFIX] added hadoop conf to thread local
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2484 Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6166/ ---
[GitHub] carbondata issue #2533: [CARBONDATA-2765]handle flat folder support for impl...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2533 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7412/ ---
[GitHub] carbondata issue #2525: [CARBONDATA-2756] added BSD license for zstd-jni dep...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/2525 SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5965/ ---
[GitHub] carbondata issue #2535: [CARBONDATA-2606]Fix Complex array Pushdown
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2535 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7411/ ---
[GitHub] carbondata issue #2535: [CARBONDATA-2606]Fix Complex array Pushdown
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2535 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6170/ ---
[GitHub] carbondata pull request #2480: [CARBONDATA-2550][CARBONDATA-2576][MV] Fix li...
Github user ravipesala commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2480#discussion_r204481327

--- Diff: datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVHelper.scala ---
@@ -429,7 +436,7 @@ object MVHelper {

   /**
    * Updates the flagspec of given select plan with attributes of relation select plan
    */
-  private def updateSortOrder(keepAlias: Boolean,
+  private def updateFlagSpec(keepAlias: Boolean,
       select: Select,
       relation: Select,
       aliasMap: Map[AttributeKey, NamedExpression]) = {
--- End diff --

ok

---
[GitHub] carbondata issue #2540: [CARBONDATA-2649] Handled executor min/max pruning w...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2540 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6169/ ---
[GitHub] carbondata pull request #2478: [CARBONDATA-2540][CARBONDATA-2560][CARBONDATA...
Github user ravipesala commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2478#discussion_r204479397

--- Diff: datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVHelper.scala ---
@@ -118,6 +122,43 @@ object MVHelper {
     DataMapStoreManager.getInstance().saveDataMapSchema(dataMapSchema)
   }

+  private def validateMVQuery(sparkSession: SparkSession,
+      logicalPlan: LogicalPlan) {
+    val dataMapProvider = DataMapManager.get().getDataMapProvider(null,
+      new DataMapSchema("", DataMapClassProvider.MV.getShortName), sparkSession)
+    dataMapProvider
+    var catalog = DataMapStoreManager.getInstance().getDataMapCatalog(dataMapProvider,
+      DataMapClassProvider.MV.getShortName).asInstanceOf[SummaryDatasetCatalog]
+    if (catalog == null) {
+      catalog = new SummaryDatasetCatalog(sparkSession)
+    }
+    val modularPlan =
+      catalog.mvSession.sessionState.modularizer.modularize(
+        catalog.mvSession.sessionState.optimizer.execute(logicalPlan)).next().semiHarmonized
+
+    val isValid = modularPlan match {
+      case g: GroupBy =>
+        // Make sure all predicates are present in projections.
+        g.predicateList.forall{p =>
+          g.outputList.exists{
+            case a: Alias =>
+              a.semanticEquals(p) || a.child.semanticEquals(p)
+            case other => other.semanticEquals(p)
+          }
+        }
+      case _ => true
+    }
+    if (!isValid) {
+      throw new UnsupportedOperationException("Group by columns must be present in project columns")
+    }
+    if(catalog.isMVWithSameQueryPresent(logicalPlan)) {
--- End diff --

ok

---
[GitHub] carbondata pull request #2478: [CARBONDATA-2540][CARBONDATA-2560][CARBONDATA...
Github user ravipesala commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2478#discussion_r204479352

--- Diff: datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVHelper.scala ---
@@ -118,6 +122,43 @@ object MVHelper {
     DataMapStoreManager.getInstance().saveDataMapSchema(dataMapSchema)
   }

+  private def validateMVQuery(sparkSession: SparkSession,
+      logicalPlan: LogicalPlan) {
--- End diff --

ok

---
[GitHub] carbondata pull request #2478: [CARBONDATA-2540][CARBONDATA-2560][CARBONDATA...
Github user ravipesala commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2478#discussion_r204479283

--- Diff: datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVHelper.scala ---
@@ -118,6 +122,43 @@ object MVHelper {
     DataMapStoreManager.getInstance().saveDataMapSchema(dataMapSchema)
   }

+  private def validateMVQuery(sparkSession: SparkSession,
+      logicalPlan: LogicalPlan) {
+    val dataMapProvider = DataMapManager.get().getDataMapProvider(null,
+      new DataMapSchema("", DataMapClassProvider.MV.getShortName), sparkSession)
+    dataMapProvider
+    var catalog = DataMapStoreManager.getInstance().getDataMapCatalog(dataMapProvider,
+      DataMapClassProvider.MV.getShortName).asInstanceOf[SummaryDatasetCatalog]
--- End diff --

ok

---
[GitHub] carbondata pull request #2478: [CARBONDATA-2540][CARBONDATA-2560][CARBONDATA...
Github user ravipesala commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2478#discussion_r204479337

--- Diff: datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVHelper.scala ---
@@ -118,6 +122,43 @@ object MVHelper {
     DataMapStoreManager.getInstance().saveDataMapSchema(dataMapSchema)
   }

+  private def validateMVQuery(sparkSession: SparkSession,
+      logicalPlan: LogicalPlan) {
+    val dataMapProvider = DataMapManager.get().getDataMapProvider(null,
+      new DataMapSchema("", DataMapClassProvider.MV.getShortName), sparkSession)
+    dataMapProvider
--- End diff --

ok

---
[GitHub] carbondata pull request #2478: [CARBONDATA-2540][CARBONDATA-2560][CARBONDATA...
Github user ravipesala commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2478#discussion_r204479261

--- Diff: datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVHelper.scala ---
@@ -118,6 +122,43 @@ object MVHelper {
     DataMapStoreManager.getInstance().saveDataMapSchema(dataMapSchema)
   }

+  private def validateMVQuery(sparkSession: SparkSession,
+      logicalPlan: LogicalPlan) {
+    val dataMapProvider = DataMapManager.get().getDataMapProvider(null,
+      new DataMapSchema("", DataMapClassProvider.MV.getShortName), sparkSession)
+    dataMapProvider
+    var catalog = DataMapStoreManager.getInstance().getDataMapCatalog(dataMapProvider,
+      DataMapClassProvider.MV.getShortName).asInstanceOf[SummaryDatasetCatalog]
+    if (catalog == null) {
+      catalog = new SummaryDatasetCatalog(sparkSession)
+    }
+    val modularPlan =
+      catalog.mvSession.sessionState.modularizer.modularize(
+        catalog.mvSession.sessionState.optimizer.execute(logicalPlan)).next().semiHarmonized
+
+    val isValid = modularPlan match {
+      case g: GroupBy =>
+        // Make sure all predicates are present in projections.
+        g.predicateList.forall{p =>
+          g.outputList.exists{
+            case a: Alias =>
+              a.semanticEquals(p) || a.child.semanticEquals(p)
+            case other => other.semanticEquals(p)
+          }
+        }
+      case _ => true
+    }
+    if (!isValid) {
+      throw new UnsupportedOperationException("Group by columns must be present in project columns")
+    }
+    if(catalog.isMVWithSameQueryPresent(logicalPlan)) {
+      throw new UnsupportedOperationException("MV with same query present")
+    }
+    if (!modularPlan.isSPJGH) {
--- End diff --

ok

---
[GitHub] carbondata pull request #2478: [CARBONDATA-2540][CARBONDATA-2560][CARBONDATA...
Github user ravipesala commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2478#discussion_r204479220

--- Diff: datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVHelper.scala ---
@@ -118,6 +122,43 @@ object MVHelper {
     DataMapStoreManager.getInstance().saveDataMapSchema(dataMapSchema)
   }

+  private def validateMVQuery(sparkSession: SparkSession,
+      logicalPlan: LogicalPlan) {
+    val dataMapProvider = DataMapManager.get().getDataMapProvider(null,
+      new DataMapSchema("", DataMapClassProvider.MV.getShortName), sparkSession)
+    dataMapProvider
+    var catalog = DataMapStoreManager.getInstance().getDataMapCatalog(dataMapProvider,
+      DataMapClassProvider.MV.getShortName).asInstanceOf[SummaryDatasetCatalog]
+    if (catalog == null) {
+      catalog = new SummaryDatasetCatalog(sparkSession)
+    }
+    val modularPlan =
+      catalog.mvSession.sessionState.modularizer.modularize(
+        catalog.mvSession.sessionState.optimizer.execute(logicalPlan)).next().semiHarmonized
+
+    val isValid = modularPlan match {
+      case g: GroupBy =>
+        // Make sure all predicates are present in projections.
+        g.predicateList.forall{p =>
+          g.outputList.exists{
+            case a: Alias =>
+              a.semanticEquals(p) || a.child.semanticEquals(p)
+            case other => other.semanticEquals(p)
+          }
+        }
+      case _ => true
+    }
+    if (!isValid) {
+      throw new UnsupportedOperationException("Group by columns must be present in project columns")
+    }
+    if(catalog.isMVWithSameQueryPresent(logicalPlan)) {
+      throw new UnsupportedOperationException("MV with same query present")
+    }
+    if (!modularPlan.isSPJGH) {
+      throw new UnsupportedOperationException("MV is not supported for this query")
--- End diff --

ok

---
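The validateMVQuery diff quoted above rejects an MV whose group-by/predicate columns are missing from the projection list. Stripped of Spark's expression classes, the core rule can be sketched over plain column names (a hypothetical helper, simplified to string matching rather than `semanticEquals`):

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class GroupByValidationSketch {

  /**
   * Returns true only when every group-by column also appears among the
   * projected columns, mirroring the "Group by columns must be present
   * in project columns" check in the diff above.
   */
  public static boolean isValidMvQuery(List<String> projectColumns,
      List<String> groupByColumns) {
    Set<String> projected = new HashSet<>(projectColumns);
    return projected.containsAll(groupByColumns);
  }
}
```

So an MV defined as `select sum(salary) from t group by empname` would be rejected, while `select empname, sum(salary) from t group by empname` would pass.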
[GitHub] carbondata pull request #2542: [CARBONDATA-2772] Size based dictionary fallb...
GitHub user BJangir opened a pull request:

https://github.com/apache/carbondata/pull/2542

[CARBONDATA-2772] Size based dictionary fallback is failing even thre…

Issue: Size-based fallback happens even though the threshold is not reached.
Root cause: The current size calculation is wrong; it is updated for every incoming value instead of only for generated dictionary data.
Solution: The current size should be calculated only for generated dictionary data.

Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily:
- [ ] Any interfaces changed?
- [ ] Any backward compatibility impacted?
- [ ] Document update required?
- [ ] Testing done: tested manually in a 3-node setup with 2 billion records.
- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/BJangir/incubator-carbondata CARBONDATA-2772

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/2542.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

This closes #2542

commit 13219ad527a0e8d0cd6c1f46b4e132696562f5f1
Author: BJangir
Date: 2018-07-23T16:44:12Z

[CARBONDATA-2772] Size based dictionary fallback is failing even threshold is not reached.

---
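The fix described above, counting size only for newly generated dictionary entries rather than for every incoming value, can be sketched as follows. This is a hypothetical simplification, not CarbonData's actual implementation; the class name and threshold are invented for illustration:

```java
import java.util.HashMap;
import java.util.Map;

public class DictionarySizeSketch {
  // Illustrative threshold; the real limit in the JIRA report is 2GB.
  static final long FALLBACK_THRESHOLD_BYTES = 100;

  private final Map<String, Integer> dictionary = new HashMap<>();
  private long dictionarySizeBytes = 0;

  /**
   * Adds a value, growing the tracked size only when a new dictionary
   * entry is generated (the bug was incrementing on every incoming value,
   * so duplicates inflated the size and triggered a premature fallback).
   * Returns false when the threshold is crossed and fallback must occur.
   */
  public boolean addValue(String value) {
    if (!dictionary.containsKey(value)) {
      dictionarySizeBytes += value.getBytes().length;
      dictionary.put(value, dictionary.size());
    }
    return dictionarySizeBytes <= FALLBACK_THRESHOLD_BYTES;
  }

  public long currentSize() {
    return dictionarySizeBytes;
  }
}
```

With this accounting, loading 2 billion rows of a low-cardinality column keeps the tracked size proportional to the number of distinct values, not the row count.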
[jira] [Created] (CARBONDATA-2772) Size based dictionary fallback is failing even threshold is not reached.
Babulal created CARBONDATA-2772:
---
Summary: Size based dictionary fallback is failing even threshold is not reached.
Key: CARBONDATA-2772
URL: https://issues.apache.org/jira/browse/CARBONDATA-2772
Project: CarbonData
Issue Type: Bug
Reporter: Babulal

Create a table and load ~2 billion records, then check the fallback logs. For some columns, fallback happens with the message below even though the threshold is not reached:

"Unable to generate dictionary. Dictionary Size crossed 2GB limit"

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] carbondata issue #2541: [CARBONDATA-2771]block update and delete on table if...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2541 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7410/ ---
[GitHub] carbondata pull request #2477: [CARBONDATA-2539][MV] Fix predicate subquery ...
Github user ravipesala commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2477#discussion_r204474392

--- Diff: datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVHelper.scala ---
@@ -463,6 +485,23 @@ object MVHelper {
     }
   }

+  // Create the aliases using two plan outputs mappings.
+  def createAliases(mappings: Seq[(NamedExpression, NamedExpression)]): Seq[NamedExpression] = {
+    val oList = for ((o1, o2) <- mappings) yield {
--- End diff --

ok

---
[GitHub] carbondata issue #2387: [CARBONDATA-2621][BloomDataMap] Lock problem in inde...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2387 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7409/ ---
[GitHub] carbondata issue #2533: [CARBONDATA-2765]handle flat folder support for impl...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2533 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7408/ ---
[GitHub] carbondata issue #2538: [CARBONDATA-2769] Fix bug when getting shard name fr...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/2538 SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5964/ ---
[GitHub] carbondata pull request #2517: [CARBONDATA-2749][dataload] In HDFS Empty tab...
Github user KanakaKumar commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2517#discussion_r204456375

--- Diff: core/src/main/java/org/apache/carbondata/core/fileoperations/AtomicFileOperationsImpl.java ---
@@ -70,12 +78,20 @@ public AtomicFileOperationsImpl(String filePath, FileType fileType) {
     if (null != dataOutStream) {
       CarbonUtil.closeStream(dataOutStream);
       CarbonFile tempFile = FileFactory.getCarbonFile(tempWriteFilePath, fileType);
-      if (!tempFile.renameForce(filePath)) {
-        throw new IOException("temporary file renaming failed, src="
-            + tempFile.getPath() + ", dest=" + filePath);
+      if (!this.setFailed) {
+        if (!tempFile.renameForce(filePath)) {
+          throw new IOException(
+              "temporary file renaming failed, src=" + tempFile.getPath() + ", dest=" + filePath);
+        }
       }
+    } else {
+      LOGGER.warn("The temporary file renaming skipped due to I/O error, deleting file "
--- End diff --

Delete code is actually added here. AtomicFileOperationsImpl already takes care of overwriting the temp file even if it exists, so please correct the message accordingly.

---
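The diff above guards the temp-file-to-real-file rename in AtomicFileOperationsImpl. The general pattern it implements, write to a temp file, promote it with an atomic rename only on success, and remove it on failure so a partial file never replaces the real one, can be sketched with java.nio (a hypothetical helper, not CarbonData's API):

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class AtomicWriteSketch {

  /** Writes content to target atomically via a sibling temp file. */
  public static void writeAtomically(Path target, String content) throws IOException {
    Path temp = target.resolveSibling(target.getFileName() + ".tmp");
    boolean failed = false;
    try {
      Files.write(temp, content.getBytes(StandardCharsets.UTF_8));
    } catch (IOException e) {
      failed = true;
      throw e;
    } finally {
      if (failed) {
        // Mirrors the fix in the diff: a failed write must not rename a
        // partial temp file over the real one; delete the leftover instead.
        Files.deleteIfExists(temp);
      } else {
        // ATOMIC_MOVE ensures concurrent readers observe either the old
        // file or the new one, never a half-written state.
        Files.move(temp, target, StandardCopyOption.ATOMIC_MOVE,
            StandardCopyOption.REPLACE_EXISTING);
      }
    }
  }
}
```

This is why an empty tablestatus file (the symptom in CARBONDATA-2749) points at the rename being attempted after a failed write rather than being skipped.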
[GitHub] carbondata issue #2535: [CARBONDATA-2606]Fix Complex array Pushdown
Github user Indhumathi27 commented on the issue: https://github.com/apache/carbondata/pull/2535 Retest sdv please ---
[GitHub] carbondata issue #2535: [CARBONDATA-2606]Fix Complex array Pushdown
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2535 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7407/ ---