[GitHub] carbondata pull request #2478: [CARBONDATA-2540][CARBONDATA-2560][CARBONDATA...

2018-07-24 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/carbondata/pull/2478


---


[GitHub] carbondata pull request #2478: [CARBONDATA-2540][CARBONDATA-2560][CARBONDATA...

2018-07-23 Thread ravipesala
Github user ravipesala commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2478#discussion_r204479397
  
--- Diff: 
datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVHelper.scala 
---
@@ -118,6 +122,43 @@ object MVHelper {
 DataMapStoreManager.getInstance().saveDataMapSchema(dataMapSchema)
   }
 
+  private def validateMVQuery(sparkSession: SparkSession,
+  logicalPlan: LogicalPlan) {
+val dataMapProvider = DataMapManager.get().getDataMapProvider(null,
+  new DataMapSchema("", DataMapClassProvider.MV.getShortName), 
sparkSession)
+dataMapProvider
+var catalog = 
DataMapStoreManager.getInstance().getDataMapCatalog(dataMapProvider,
+  
DataMapClassProvider.MV.getShortName).asInstanceOf[SummaryDatasetCatalog]
+if (catalog == null) {
+  catalog = new SummaryDatasetCatalog(sparkSession)
+}
+val modularPlan =
+  catalog.mvSession.sessionState.modularizer.modularize(
+
catalog.mvSession.sessionState.optimizer.execute(logicalPlan)).next().semiHarmonized
+
+val isValid = modularPlan match {
+  case g: GroupBy =>
+// Make sure all predicates are present in projections.
+g.predicateList.forall{p =>
+  g.outputList.exists{
+case a: Alias =>
+  a.semanticEquals(p) || a.child.semanticEquals(p)
+case other => other.semanticEquals(p)
+  }
+}
+  case _ => true
+}
+if (!isValid) {
+  throw new UnsupportedOperationException("Group by columns must be 
present in project columns")
+}
+if(catalog.isMVWithSameQueryPresent(logicalPlan)) {
--- End diff --

ok


---


[GitHub] carbondata pull request #2478: [CARBONDATA-2540][CARBONDATA-2560][CARBONDATA...

2018-07-23 Thread ravipesala
Github user ravipesala commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2478#discussion_r204479352
  
--- Diff: 
datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVHelper.scala 
---
@@ -118,6 +122,43 @@ object MVHelper {
 DataMapStoreManager.getInstance().saveDataMapSchema(dataMapSchema)
   }
 
+  private def validateMVQuery(sparkSession: SparkSession,
+  logicalPlan: LogicalPlan) {
--- End diff --

ok


---


[GitHub] carbondata pull request #2478: [CARBONDATA-2540][CARBONDATA-2560][CARBONDATA...

2018-07-23 Thread ravipesala
Github user ravipesala commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2478#discussion_r204479283
  
--- Diff: 
datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVHelper.scala 
---
@@ -118,6 +122,43 @@ object MVHelper {
 DataMapStoreManager.getInstance().saveDataMapSchema(dataMapSchema)
   }
 
+  private def validateMVQuery(sparkSession: SparkSession,
+  logicalPlan: LogicalPlan) {
+val dataMapProvider = DataMapManager.get().getDataMapProvider(null,
+  new DataMapSchema("", DataMapClassProvider.MV.getShortName), 
sparkSession)
+dataMapProvider
+var catalog = 
DataMapStoreManager.getInstance().getDataMapCatalog(dataMapProvider,
+  
DataMapClassProvider.MV.getShortName).asInstanceOf[SummaryDatasetCatalog]
--- End diff --

ok


---


[GitHub] carbondata pull request #2478: [CARBONDATA-2540][CARBONDATA-2560][CARBONDATA...

2018-07-23 Thread ravipesala
Github user ravipesala commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2478#discussion_r204479337
  
--- Diff: 
datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVHelper.scala 
---
@@ -118,6 +122,43 @@ object MVHelper {
 DataMapStoreManager.getInstance().saveDataMapSchema(dataMapSchema)
   }
 
+  private def validateMVQuery(sparkSession: SparkSession,
+  logicalPlan: LogicalPlan) {
+val dataMapProvider = DataMapManager.get().getDataMapProvider(null,
+  new DataMapSchema("", DataMapClassProvider.MV.getShortName), 
sparkSession)
+dataMapProvider
--- End diff --

ok


---


[GitHub] carbondata pull request #2478: [CARBONDATA-2540][CARBONDATA-2560][CARBONDATA...

2018-07-23 Thread ravipesala
Github user ravipesala commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2478#discussion_r204479261
  
--- Diff: 
datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVHelper.scala 
---
@@ -118,6 +122,43 @@ object MVHelper {
 DataMapStoreManager.getInstance().saveDataMapSchema(dataMapSchema)
   }
 
+  private def validateMVQuery(sparkSession: SparkSession,
+  logicalPlan: LogicalPlan) {
+val dataMapProvider = DataMapManager.get().getDataMapProvider(null,
+  new DataMapSchema("", DataMapClassProvider.MV.getShortName), 
sparkSession)
+dataMapProvider
+var catalog = 
DataMapStoreManager.getInstance().getDataMapCatalog(dataMapProvider,
+  
DataMapClassProvider.MV.getShortName).asInstanceOf[SummaryDatasetCatalog]
+if (catalog == null) {
+  catalog = new SummaryDatasetCatalog(sparkSession)
+}
+val modularPlan =
+  catalog.mvSession.sessionState.modularizer.modularize(
+
catalog.mvSession.sessionState.optimizer.execute(logicalPlan)).next().semiHarmonized
+
+val isValid = modularPlan match {
+  case g: GroupBy =>
+// Make sure all predicates are present in projections.
+g.predicateList.forall{p =>
+  g.outputList.exists{
+case a: Alias =>
+  a.semanticEquals(p) || a.child.semanticEquals(p)
+case other => other.semanticEquals(p)
+  }
+}
+  case _ => true
+}
+if (!isValid) {
+  throw new UnsupportedOperationException("Group by columns must be 
present in project columns")
+}
+if(catalog.isMVWithSameQueryPresent(logicalPlan)) {
+  throw new UnsupportedOperationException("MV with same query present")
+}
+if (!modularPlan.isSPJGH)  {
--- End diff --

ok


---


[GitHub] carbondata pull request #2478: [CARBONDATA-2540][CARBONDATA-2560][CARBONDATA...

2018-07-23 Thread ravipesala
Github user ravipesala commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2478#discussion_r204479220
  
--- Diff: 
datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVHelper.scala 
---
@@ -118,6 +122,43 @@ object MVHelper {
 DataMapStoreManager.getInstance().saveDataMapSchema(dataMapSchema)
   }
 
+  private def validateMVQuery(sparkSession: SparkSession,
+  logicalPlan: LogicalPlan) {
+val dataMapProvider = DataMapManager.get().getDataMapProvider(null,
+  new DataMapSchema("", DataMapClassProvider.MV.getShortName), 
sparkSession)
+dataMapProvider
+var catalog = 
DataMapStoreManager.getInstance().getDataMapCatalog(dataMapProvider,
+  
DataMapClassProvider.MV.getShortName).asInstanceOf[SummaryDatasetCatalog]
+if (catalog == null) {
+  catalog = new SummaryDatasetCatalog(sparkSession)
+}
+val modularPlan =
+  catalog.mvSession.sessionState.modularizer.modularize(
+
catalog.mvSession.sessionState.optimizer.execute(logicalPlan)).next().semiHarmonized
+
+val isValid = modularPlan match {
+  case g: GroupBy =>
+// Make sure all predicates are present in projections.
+g.predicateList.forall{p =>
+  g.outputList.exists{
+case a: Alias =>
+  a.semanticEquals(p) || a.child.semanticEquals(p)
+case other => other.semanticEquals(p)
+  }
+}
+  case _ => true
+}
+if (!isValid) {
+  throw new UnsupportedOperationException("Group by columns must be 
present in project columns")
+}
+if(catalog.isMVWithSameQueryPresent(logicalPlan)) {
+  throw new UnsupportedOperationException("MV with same query present")
+}
+if (!modularPlan.isSPJGH)  {
+  throw new UnsupportedOperationException("MV is not supported for 
this query")
--- End diff --

ok


---


[GitHub] carbondata pull request #2478: [CARBONDATA-2540][CARBONDATA-2560][CARBONDATA...

2018-07-23 Thread jackylk
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2478#discussion_r204436654
  
--- Diff: 
datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVHelper.scala 
---
@@ -118,6 +122,43 @@ object MVHelper {
 DataMapStoreManager.getInstance().saveDataMapSchema(dataMapSchema)
   }
 
+  private def validateMVQuery(sparkSession: SparkSession,
+  logicalPlan: LogicalPlan) {
+val dataMapProvider = DataMapManager.get().getDataMapProvider(null,
+  new DataMapSchema("", DataMapClassProvider.MV.getShortName), 
sparkSession)
+dataMapProvider
+var catalog = 
DataMapStoreManager.getInstance().getDataMapCatalog(dataMapProvider,
+  
DataMapClassProvider.MV.getShortName).asInstanceOf[SummaryDatasetCatalog]
+if (catalog == null) {
+  catalog = new SummaryDatasetCatalog(sparkSession)
+}
+val modularPlan =
+  catalog.mvSession.sessionState.modularizer.modularize(
+
catalog.mvSession.sessionState.optimizer.execute(logicalPlan)).next().semiHarmonized
+
+val isValid = modularPlan match {
+  case g: GroupBy =>
+// Make sure all predicates are present in projections.
+g.predicateList.forall{p =>
+  g.outputList.exists{
+case a: Alias =>
+  a.semanticEquals(p) || a.child.semanticEquals(p)
+case other => other.semanticEquals(p)
+  }
+}
+  case _ => true
+}
+if (!isValid) {
+  throw new UnsupportedOperationException("Group by columns must be 
present in project columns")
+}
+if(catalog.isMVWithSameQueryPresent(logicalPlan)) {
+  throw new UnsupportedOperationException("MV with same query present")
+}
+if (!modularPlan.isSPJGH)  {
+  throw new UnsupportedOperationException("MV is not supported for 
this query")
--- End diff --

Can you explain for specificlly like `Only 
Select-Predicate-Join-Groupby-Having query is supported for MV'


---


[GitHub] carbondata pull request #2478: [CARBONDATA-2540][CARBONDATA-2560][CARBONDATA...

2018-07-23 Thread jackylk
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2478#discussion_r204435991
  
--- Diff: 
datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVHelper.scala 
---
@@ -118,6 +122,43 @@ object MVHelper {
 DataMapStoreManager.getInstance().saveDataMapSchema(dataMapSchema)
   }
 
+  private def validateMVQuery(sparkSession: SparkSession,
+  logicalPlan: LogicalPlan) {
+val dataMapProvider = DataMapManager.get().getDataMapProvider(null,
+  new DataMapSchema("", DataMapClassProvider.MV.getShortName), 
sparkSession)
+dataMapProvider
+var catalog = 
DataMapStoreManager.getInstance().getDataMapCatalog(dataMapProvider,
+  
DataMapClassProvider.MV.getShortName).asInstanceOf[SummaryDatasetCatalog]
+if (catalog == null) {
+  catalog = new SummaryDatasetCatalog(sparkSession)
+}
+val modularPlan =
+  catalog.mvSession.sessionState.modularizer.modularize(
+
catalog.mvSession.sessionState.optimizer.execute(logicalPlan)).next().semiHarmonized
+
+val isValid = modularPlan match {
+  case g: GroupBy =>
+// Make sure all predicates are present in projections.
+g.predicateList.forall{p =>
+  g.outputList.exists{
+case a: Alias =>
+  a.semanticEquals(p) || a.child.semanticEquals(p)
+case other => other.semanticEquals(p)
+  }
+}
+  case _ => true
+}
+if (!isValid) {
+  throw new UnsupportedOperationException("Group by columns must be 
present in project columns")
+}
+if(catalog.isMVWithSameQueryPresent(logicalPlan)) {
+  throw new UnsupportedOperationException("MV with same query present")
+}
+if (!modularPlan.isSPJGH)  {
--- End diff --

I think it is better to move this before line 139


---


[GitHub] carbondata pull request #2478: [CARBONDATA-2540][CARBONDATA-2560][CARBONDATA...

2018-07-23 Thread jackylk
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2478#discussion_r204435669
  
--- Diff: 
datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVHelper.scala 
---
@@ -118,6 +122,43 @@ object MVHelper {
 DataMapStoreManager.getInstance().saveDataMapSchema(dataMapSchema)
   }
 
+  private def validateMVQuery(sparkSession: SparkSession,
+  logicalPlan: LogicalPlan) {
+val dataMapProvider = DataMapManager.get().getDataMapProvider(null,
+  new DataMapSchema("", DataMapClassProvider.MV.getShortName), 
sparkSession)
+dataMapProvider
+var catalog = 
DataMapStoreManager.getInstance().getDataMapCatalog(dataMapProvider,
+  
DataMapClassProvider.MV.getShortName).asInstanceOf[SummaryDatasetCatalog]
--- End diff --

Does casting `null` to `SummaryDatasetCatalog` ok?


---


[GitHub] carbondata pull request #2478: [CARBONDATA-2540][CARBONDATA-2560][CARBONDATA...

2018-07-23 Thread jackylk
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2478#discussion_r204435022
  
--- Diff: 
datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVHelper.scala 
---
@@ -118,6 +122,43 @@ object MVHelper {
 DataMapStoreManager.getInstance().saveDataMapSchema(dataMapSchema)
   }
 
+  private def validateMVQuery(sparkSession: SparkSession,
+  logicalPlan: LogicalPlan) {
+val dataMapProvider = DataMapManager.get().getDataMapProvider(null,
+  new DataMapSchema("", DataMapClassProvider.MV.getShortName), 
sparkSession)
+dataMapProvider
--- End diff --

useless statement


---


[GitHub] carbondata pull request #2478: [CARBONDATA-2540][CARBONDATA-2560][CARBONDATA...

2018-07-23 Thread jackylk
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2478#discussion_r204434793
  
--- Diff: 
datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVHelper.scala 
---
@@ -118,6 +122,43 @@ object MVHelper {
 DataMapStoreManager.getInstance().saveDataMapSchema(dataMapSchema)
   }
 
+  private def validateMVQuery(sparkSession: SparkSession,
+  logicalPlan: LogicalPlan) {
--- End diff --

add `: Unit`


---


[GitHub] carbondata pull request #2478: [CARBONDATA-2540][CARBONDATA-2560][CARBONDATA...

2018-07-23 Thread jackylk
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2478#discussion_r204432239
  
--- Diff: 
datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVHelper.scala 
---
@@ -118,6 +122,43 @@ object MVHelper {
 DataMapStoreManager.getInstance().saveDataMapSchema(dataMapSchema)
   }
 
+  private def validateMVQuery(sparkSession: SparkSession,
+  logicalPlan: LogicalPlan) {
+val dataMapProvider = DataMapManager.get().getDataMapProvider(null,
+  new DataMapSchema("", DataMapClassProvider.MV.getShortName), 
sparkSession)
+dataMapProvider
+var catalog = 
DataMapStoreManager.getInstance().getDataMapCatalog(dataMapProvider,
+  
DataMapClassProvider.MV.getShortName).asInstanceOf[SummaryDatasetCatalog]
+if (catalog == null) {
+  catalog = new SummaryDatasetCatalog(sparkSession)
+}
+val modularPlan =
+  catalog.mvSession.sessionState.modularizer.modularize(
+
catalog.mvSession.sessionState.optimizer.execute(logicalPlan)).next().semiHarmonized
+
+val isValid = modularPlan match {
+  case g: GroupBy =>
+// Make sure all predicates are present in projections.
+g.predicateList.forall{p =>
+  g.outputList.exists{
+case a: Alias =>
+  a.semanticEquals(p) || a.child.semanticEquals(p)
+case other => other.semanticEquals(p)
+  }
+}
+  case _ => true
+}
+if (!isValid) {
+  throw new UnsupportedOperationException("Group by columns must be 
present in project columns")
+}
+if(catalog.isMVWithSameQueryPresent(logicalPlan)) {
--- End diff --

add space after `if`


---


[GitHub] carbondata pull request #2478: [CARBONDATA-2540][CARBONDATA-2560][CARBONDATA...

2018-07-10 Thread ravipesala
GitHub user ravipesala opened a pull request:

https://github.com/apache/carbondata/pull/2478

[CARBONDATA-2540][CARBONDATA-2560][CARBONDATA-2568][MV] Add validations for 
unsupported MV queries 

This PR depends on https://github.com/apache/carbondata/pull/2453
Problem: Validations are missing on the unsupported MV queries while 
creating MV datamap.
Solution: Added validation for the unsupported MV queries while creating 
datamap.

Be sure to do all of the following checklist to help us incorporate 
your contribution quickly and easily:

 - [ ] Any interfaces changed?
 
 - [ ] Any backward compatibility impacted?
 
 - [ ] Document update required?

 - [ ] Testing done
Please provide details on 
- Whether new unit test cases have been added or why no new tests 
are required?
- How it is tested? Please attach test report.
- Is it a performance related change? Please attach the performance 
test report.
- Any additional information to help reviewers in testing this 
change.
   
 - [ ] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA. 



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ravipesala/incubator-carbondata pr-2540

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/2478.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2478


commit 844cff764e94f18ff8d115ccd636a06a0931bffb
Author: ravipesala 
Date:   2018-06-14T06:10:07Z

Fixed order by in mv and aggregation functions inside projection 
expressions are fixed

commit 642d05b9dab3a001c2338a5dca920d952f0ccd5c
Author: ravipesala 
Date:   2018-07-09T10:19:57Z

Added validation for unsupported MV queries




---