[GitHub] carbondata pull request #2478: [CARBONDATA-2540][CARBONDATA-2560][CARBONDATA...
Github user asfgit closed the pull request at: https://github.com/apache/carbondata/pull/2478 ---
[GitHub] carbondata pull request #2478: [CARBONDATA-2540][CARBONDATA-2560][CARBONDATA...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2478#discussion_r204479397 --- Diff: datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVHelper.scala --- @@ -118,6 +122,43 @@ object MVHelper { DataMapStoreManager.getInstance().saveDataMapSchema(dataMapSchema) } + private def validateMVQuery(sparkSession: SparkSession, + logicalPlan: LogicalPlan) { +val dataMapProvider = DataMapManager.get().getDataMapProvider(null, + new DataMapSchema("", DataMapClassProvider.MV.getShortName), sparkSession) +dataMapProvider +var catalog = DataMapStoreManager.getInstance().getDataMapCatalog(dataMapProvider, + DataMapClassProvider.MV.getShortName).asInstanceOf[SummaryDatasetCatalog] +if (catalog == null) { + catalog = new SummaryDatasetCatalog(sparkSession) +} +val modularPlan = + catalog.mvSession.sessionState.modularizer.modularize( + catalog.mvSession.sessionState.optimizer.execute(logicalPlan)).next().semiHarmonized + +val isValid = modularPlan match { + case g: GroupBy => +// Make sure all predicates are present in projections. +g.predicateList.forall{p => + g.outputList.exists{ +case a: Alias => + a.semanticEquals(p) || a.child.semanticEquals(p) +case other => other.semanticEquals(p) + } +} + case _ => true +} +if (!isValid) { + throw new UnsupportedOperationException("Group by columns must be present in project columns") +} +if(catalog.isMVWithSameQueryPresent(logicalPlan)) { --- End diff -- ok ---
[GitHub] carbondata pull request #2478: [CARBONDATA-2540][CARBONDATA-2560][CARBONDATA...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2478#discussion_r204479352 --- Diff: datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVHelper.scala --- @@ -118,6 +122,43 @@ object MVHelper { DataMapStoreManager.getInstance().saveDataMapSchema(dataMapSchema) } + private def validateMVQuery(sparkSession: SparkSession, + logicalPlan: LogicalPlan) { --- End diff -- ok ---
[GitHub] carbondata pull request #2478: [CARBONDATA-2540][CARBONDATA-2560][CARBONDATA...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2478#discussion_r204479283 --- Diff: datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVHelper.scala --- @@ -118,6 +122,43 @@ object MVHelper { DataMapStoreManager.getInstance().saveDataMapSchema(dataMapSchema) } + private def validateMVQuery(sparkSession: SparkSession, + logicalPlan: LogicalPlan) { +val dataMapProvider = DataMapManager.get().getDataMapProvider(null, + new DataMapSchema("", DataMapClassProvider.MV.getShortName), sparkSession) +dataMapProvider +var catalog = DataMapStoreManager.getInstance().getDataMapCatalog(dataMapProvider, + DataMapClassProvider.MV.getShortName).asInstanceOf[SummaryDatasetCatalog] --- End diff -- ok ---
[GitHub] carbondata pull request #2478: [CARBONDATA-2540][CARBONDATA-2560][CARBONDATA...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2478#discussion_r204479337 --- Diff: datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVHelper.scala --- @@ -118,6 +122,43 @@ object MVHelper { DataMapStoreManager.getInstance().saveDataMapSchema(dataMapSchema) } + private def validateMVQuery(sparkSession: SparkSession, + logicalPlan: LogicalPlan) { +val dataMapProvider = DataMapManager.get().getDataMapProvider(null, + new DataMapSchema("", DataMapClassProvider.MV.getShortName), sparkSession) +dataMapProvider --- End diff -- ok ---
[GitHub] carbondata pull request #2478: [CARBONDATA-2540][CARBONDATA-2560][CARBONDATA...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2478#discussion_r204479261 --- Diff: datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVHelper.scala --- @@ -118,6 +122,43 @@ object MVHelper { DataMapStoreManager.getInstance().saveDataMapSchema(dataMapSchema) } + private def validateMVQuery(sparkSession: SparkSession, + logicalPlan: LogicalPlan) { +val dataMapProvider = DataMapManager.get().getDataMapProvider(null, + new DataMapSchema("", DataMapClassProvider.MV.getShortName), sparkSession) +dataMapProvider +var catalog = DataMapStoreManager.getInstance().getDataMapCatalog(dataMapProvider, + DataMapClassProvider.MV.getShortName).asInstanceOf[SummaryDatasetCatalog] +if (catalog == null) { + catalog = new SummaryDatasetCatalog(sparkSession) +} +val modularPlan = + catalog.mvSession.sessionState.modularizer.modularize( + catalog.mvSession.sessionState.optimizer.execute(logicalPlan)).next().semiHarmonized + +val isValid = modularPlan match { + case g: GroupBy => +// Make sure all predicates are present in projections. +g.predicateList.forall{p => + g.outputList.exists{ +case a: Alias => + a.semanticEquals(p) || a.child.semanticEquals(p) +case other => other.semanticEquals(p) + } +} + case _ => true +} +if (!isValid) { + throw new UnsupportedOperationException("Group by columns must be present in project columns") +} +if(catalog.isMVWithSameQueryPresent(logicalPlan)) { + throw new UnsupportedOperationException("MV with same query present") +} +if (!modularPlan.isSPJGH) { --- End diff -- ok ---
[GitHub] carbondata pull request #2478: [CARBONDATA-2540][CARBONDATA-2560][CARBONDATA...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2478#discussion_r204479220 --- Diff: datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVHelper.scala --- @@ -118,6 +122,43 @@ object MVHelper { DataMapStoreManager.getInstance().saveDataMapSchema(dataMapSchema) } + private def validateMVQuery(sparkSession: SparkSession, + logicalPlan: LogicalPlan) { +val dataMapProvider = DataMapManager.get().getDataMapProvider(null, + new DataMapSchema("", DataMapClassProvider.MV.getShortName), sparkSession) +dataMapProvider +var catalog = DataMapStoreManager.getInstance().getDataMapCatalog(dataMapProvider, + DataMapClassProvider.MV.getShortName).asInstanceOf[SummaryDatasetCatalog] +if (catalog == null) { + catalog = new SummaryDatasetCatalog(sparkSession) +} +val modularPlan = + catalog.mvSession.sessionState.modularizer.modularize( + catalog.mvSession.sessionState.optimizer.execute(logicalPlan)).next().semiHarmonized + +val isValid = modularPlan match { + case g: GroupBy => +// Make sure all predicates are present in projections. +g.predicateList.forall{p => + g.outputList.exists{ +case a: Alias => + a.semanticEquals(p) || a.child.semanticEquals(p) +case other => other.semanticEquals(p) + } +} + case _ => true +} +if (!isValid) { + throw new UnsupportedOperationException("Group by columns must be present in project columns") +} +if(catalog.isMVWithSameQueryPresent(logicalPlan)) { + throw new UnsupportedOperationException("MV with same query present") +} +if (!modularPlan.isSPJGH) { + throw new UnsupportedOperationException("MV is not supported for this query") --- End diff -- ok ---
[GitHub] carbondata pull request #2478: [CARBONDATA-2540][CARBONDATA-2560][CARBONDATA...
Github user jackylk commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2478#discussion_r204436654 --- Diff: datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVHelper.scala --- @@ -118,6 +122,43 @@ object MVHelper { DataMapStoreManager.getInstance().saveDataMapSchema(dataMapSchema) } + private def validateMVQuery(sparkSession: SparkSession, + logicalPlan: LogicalPlan) { +val dataMapProvider = DataMapManager.get().getDataMapProvider(null, + new DataMapSchema("", DataMapClassProvider.MV.getShortName), sparkSession) +dataMapProvider +var catalog = DataMapStoreManager.getInstance().getDataMapCatalog(dataMapProvider, + DataMapClassProvider.MV.getShortName).asInstanceOf[SummaryDatasetCatalog] +if (catalog == null) { + catalog = new SummaryDatasetCatalog(sparkSession) +} +val modularPlan = + catalog.mvSession.sessionState.modularizer.modularize( + catalog.mvSession.sessionState.optimizer.execute(logicalPlan)).next().semiHarmonized + +val isValid = modularPlan match { + case g: GroupBy => +// Make sure all predicates are present in projections. +g.predicateList.forall{p => + g.outputList.exists{ +case a: Alias => + a.semanticEquals(p) || a.child.semanticEquals(p) +case other => other.semanticEquals(p) + } +} + case _ => true +} +if (!isValid) { + throw new UnsupportedOperationException("Group by columns must be present in project columns") +} +if(catalog.isMVWithSameQueryPresent(logicalPlan)) { + throw new UnsupportedOperationException("MV with same query present") +} +if (!modularPlan.isSPJGH) { + throw new UnsupportedOperationException("MV is not supported for this query") --- End diff -- Can you explain for specificlly like `Only Select-Predicate-Join-Groupby-Having query is supported for MV' ---
[GitHub] carbondata pull request #2478: [CARBONDATA-2540][CARBONDATA-2560][CARBONDATA...
Github user jackylk commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2478#discussion_r204435991 --- Diff: datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVHelper.scala --- @@ -118,6 +122,43 @@ object MVHelper { DataMapStoreManager.getInstance().saveDataMapSchema(dataMapSchema) } + private def validateMVQuery(sparkSession: SparkSession, + logicalPlan: LogicalPlan) { +val dataMapProvider = DataMapManager.get().getDataMapProvider(null, + new DataMapSchema("", DataMapClassProvider.MV.getShortName), sparkSession) +dataMapProvider +var catalog = DataMapStoreManager.getInstance().getDataMapCatalog(dataMapProvider, + DataMapClassProvider.MV.getShortName).asInstanceOf[SummaryDatasetCatalog] +if (catalog == null) { + catalog = new SummaryDatasetCatalog(sparkSession) +} +val modularPlan = + catalog.mvSession.sessionState.modularizer.modularize( + catalog.mvSession.sessionState.optimizer.execute(logicalPlan)).next().semiHarmonized + +val isValid = modularPlan match { + case g: GroupBy => +// Make sure all predicates are present in projections. +g.predicateList.forall{p => + g.outputList.exists{ +case a: Alias => + a.semanticEquals(p) || a.child.semanticEquals(p) +case other => other.semanticEquals(p) + } +} + case _ => true +} +if (!isValid) { + throw new UnsupportedOperationException("Group by columns must be present in project columns") +} +if(catalog.isMVWithSameQueryPresent(logicalPlan)) { + throw new UnsupportedOperationException("MV with same query present") +} +if (!modularPlan.isSPJGH) { --- End diff -- I think it is better to move this before line 139 ---
[GitHub] carbondata pull request #2478: [CARBONDATA-2540][CARBONDATA-2560][CARBONDATA...
Github user jackylk commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2478#discussion_r204435669 --- Diff: datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVHelper.scala --- @@ -118,6 +122,43 @@ object MVHelper { DataMapStoreManager.getInstance().saveDataMapSchema(dataMapSchema) } + private def validateMVQuery(sparkSession: SparkSession, + logicalPlan: LogicalPlan) { +val dataMapProvider = DataMapManager.get().getDataMapProvider(null, + new DataMapSchema("", DataMapClassProvider.MV.getShortName), sparkSession) +dataMapProvider +var catalog = DataMapStoreManager.getInstance().getDataMapCatalog(dataMapProvider, + DataMapClassProvider.MV.getShortName).asInstanceOf[SummaryDatasetCatalog] --- End diff -- Does casting `null` to `SummaryDatasetCatalog` ok? ---
[GitHub] carbondata pull request #2478: [CARBONDATA-2540][CARBONDATA-2560][CARBONDATA...
Github user jackylk commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2478#discussion_r204435022 --- Diff: datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVHelper.scala --- @@ -118,6 +122,43 @@ object MVHelper { DataMapStoreManager.getInstance().saveDataMapSchema(dataMapSchema) } + private def validateMVQuery(sparkSession: SparkSession, + logicalPlan: LogicalPlan) { +val dataMapProvider = DataMapManager.get().getDataMapProvider(null, + new DataMapSchema("", DataMapClassProvider.MV.getShortName), sparkSession) +dataMapProvider --- End diff -- useless statement ---
[GitHub] carbondata pull request #2478: [CARBONDATA-2540][CARBONDATA-2560][CARBONDATA...
Github user jackylk commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2478#discussion_r204434793 --- Diff: datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVHelper.scala --- @@ -118,6 +122,43 @@ object MVHelper { DataMapStoreManager.getInstance().saveDataMapSchema(dataMapSchema) } + private def validateMVQuery(sparkSession: SparkSession, + logicalPlan: LogicalPlan) { --- End diff -- add `: Unit` ---
[GitHub] carbondata pull request #2478: [CARBONDATA-2540][CARBONDATA-2560][CARBONDATA...
Github user jackylk commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2478#discussion_r204432239 --- Diff: datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVHelper.scala --- @@ -118,6 +122,43 @@ object MVHelper { DataMapStoreManager.getInstance().saveDataMapSchema(dataMapSchema) } + private def validateMVQuery(sparkSession: SparkSession, + logicalPlan: LogicalPlan) { +val dataMapProvider = DataMapManager.get().getDataMapProvider(null, + new DataMapSchema("", DataMapClassProvider.MV.getShortName), sparkSession) +dataMapProvider +var catalog = DataMapStoreManager.getInstance().getDataMapCatalog(dataMapProvider, + DataMapClassProvider.MV.getShortName).asInstanceOf[SummaryDatasetCatalog] +if (catalog == null) { + catalog = new SummaryDatasetCatalog(sparkSession) +} +val modularPlan = + catalog.mvSession.sessionState.modularizer.modularize( + catalog.mvSession.sessionState.optimizer.execute(logicalPlan)).next().semiHarmonized + +val isValid = modularPlan match { + case g: GroupBy => +// Make sure all predicates are present in projections. +g.predicateList.forall{p => + g.outputList.exists{ +case a: Alias => + a.semanticEquals(p) || a.child.semanticEquals(p) +case other => other.semanticEquals(p) + } +} + case _ => true +} +if (!isValid) { + throw new UnsupportedOperationException("Group by columns must be present in project columns") +} +if(catalog.isMVWithSameQueryPresent(logicalPlan)) { --- End diff -- add space after `if` ---
[GitHub] carbondata pull request #2478: [CARBONDATA-2540][CARBONDATA-2560][CARBONDATA...
GitHub user ravipesala opened a pull request: https://github.com/apache/carbondata/pull/2478 [CARBONDATA-2540][CARBONDATA-2560][CARBONDATA-2568][MV] Add validations for unsupported MV queries This PR depends on https://github.com/apache/carbondata/pull/2453 Problem: Validations are missing on the unsupported MV queries while creating MV datamap. Solution: Added validation for the unsupported MV queries while creating datamap. Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily: - [ ] Any interfaces changed? - [ ] Any backward compatibility impacted? - [ ] Document update required? - [ ] Testing done Please provide details on - Whether new unit test cases have been added or why no new tests are required? - How it is tested? Please attach test report. - Is it a performance related change? Please attach the performance test report. - Any additional information to help reviewers in testing this change. - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. You can merge this pull request into a Git repository by running: $ git pull https://github.com/ravipesala/incubator-carbondata pr-2540 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/2478.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2478 commit 844cff764e94f18ff8d115ccd636a06a0931bffb Author: ravipesala Date: 2018-06-14T06:10:07Z Fixed order by in mv and aggregation functions inside projection expressions are fixed commit 642d05b9dab3a001c2338a5dca920d952f0ccd5c Author: ravipesala Date: 2018-07-09T10:19:57Z Added validation for unsupported MV queries ---