[GitHub] carbondata pull request #2453: [CARBONDATA-2528][MV] Fixed order by in mv an...

2018-07-15 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/carbondata/pull/2453


---


[GitHub] carbondata pull request #2453: [CARBONDATA-2528][MV] Fixed order by in mv an...

2018-07-13 Thread jackylk
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2453#discussion_r202504966
  
--- Diff: 
datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVHelper.scala 
---
@@ -147,6 +149,34 @@ object MVHelper {
 }.filter(_.isDefined).map(_.get)
   }
 
+
+  /**
+   * Check if we can do incremental load on the mv table. Some cases like 
aggregation functions
+   * which are present inside other expressions like sum(a)+sum(b) cannot 
be incremental loaded.
+   */
+  private def isFullReload(logicalPlan: LogicalPlan): Boolean = {
+var isFullReload = false
+logicalPlan.transformAllExpressions {
+  case a: Alias =>
+a
+  case agg: AggregateExpression => agg
+  case c: Cast =>
+isFullReload = c.child.find {
+  case agg: AggregateExpression => false
+  case _ => false
+}.isDefined || isFullReload
+c
+  case exp: Expression =>
+// Check any aggregation function present inside other expression.
+isFullReload = exp.find {
+  case agg: AggregateExpression => true
--- End diff --

Can you modify comment to tell like `return true if  `



---


[GitHub] carbondata pull request #2453: [CARBONDATA-2528][MV] Fixed order by in mv an...

2018-07-12 Thread ravipesala
Github user ravipesala commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2453#discussion_r202244579
  
--- Diff: 
datamap/mv/core/src/main/scala/org/apache/carbondata/mv/rewrite/SummaryDatasetCatalog.scala
 ---
@@ -39,6 +40,12 @@ private[mv] case class SummaryDataset(signature: 
Option[Signature],
 dataMapSchema: DataMapSchema,
 relation: ModularPlan)
 
+case class MVPlanWrapper(plan: ModularPlan, dataMapSchema: DataMapSchema) 
extends ModularPlan {
--- End diff --

No, it is to provide datamap schema information along with relation. Added 
comment as well.


---


[GitHub] carbondata pull request #2453: [CARBONDATA-2528][MV] Fixed order by in mv an...

2018-07-12 Thread ravipesala
Github user ravipesala commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2453#discussion_r202244376
  
--- Diff: 
datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVHelper.scala 
---
@@ -290,16 +327,18 @@ object MVHelper {
* @return Updated modular plan.
*/
   def updateDataMap(subsumer: ModularPlan, rewrite: QueryRewrite): 
ModularPlan = {
-subsumer match {
+ subsumer match {
--- End diff --

ok


---


[GitHub] carbondata pull request #2453: [CARBONDATA-2528][MV] Fixed order by in mv an...

2018-07-12 Thread ravipesala
Github user ravipesala commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2453#discussion_r202244323
  
--- Diff: 
datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVHelper.scala 
---
@@ -179,7 +209,7 @@ object MVHelper {
   case _ => false
 }
 
-override def hashCode: Int = exp.hashCode
+override def hashCode: Int = 1
--- End diff --

added


---


[GitHub] carbondata pull request #2453: [CARBONDATA-2528][MV] Fixed order by in mv an...

2018-07-12 Thread ravipesala
Github user ravipesala commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2453#discussion_r202244238
  
--- Diff: 
datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVHelper.scala 
---
@@ -147,6 +149,34 @@ object MVHelper {
 }.filter(_.isDefined).map(_.get)
   }
 
+
+  /**
+   * Check if we can do incremental load on the mv table. Some cases like 
aggregation functions
--- End diff --

Yes, it checks whether a logical plan can support full refresh or 
incremental refresh. 


---


[GitHub] carbondata pull request #2453: [CARBONDATA-2528][MV] Fixed order by in mv an...

2018-07-12 Thread ravipesala
Github user ravipesala commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2453#discussion_r202244060
  
--- Diff: 
datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVHelper.scala 
---
@@ -147,6 +149,34 @@ object MVHelper {
 }.filter(_.isDefined).map(_.get)
   }
 
+
+  /**
+   * Check if we can do incremental load on the mv table. Some cases like 
aggregation functions
+   * which are present inside other expressions like sum(a)+sum(b) cannot 
be incremental loaded.
+   */
+  private def isFullReload(logicalPlan: LogicalPlan): Boolean = {
+var isFullReload = false
+logicalPlan.transformAllExpressions {
+  case a: Alias =>
+a
+  case agg: AggregateExpression => agg
+  case c: Cast =>
+isFullReload = c.child.find {
+  case agg: AggregateExpression => false
+  case _ => false
+}.isDefined || isFullReload
+c
+  case exp: Expression =>
+// Check any aggregation function present inside other expression.
+isFullReload = exp.find {
+  case agg: AggregateExpression => true
--- End diff --

no, it should be true. If the aggregation function present inside any other 
expression like add(sum(a) + sum(b)) then we should always do full refresh 
instead of incremental refresh


---


[GitHub] carbondata pull request #2453: [CARBONDATA-2528][MV] Fixed order by in mv an...

2018-07-12 Thread jackylk
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2453#discussion_r202224158
  
--- Diff: 
datamap/mv/core/src/main/scala/org/apache/carbondata/mv/rewrite/SummaryDatasetCatalog.scala
 ---
@@ -39,6 +40,12 @@ private[mv] case class SummaryDataset(signature: 
Option[Signature],
 dataMapSchema: DataMapSchema,
 relation: ModularPlan)
 
+case class MVPlanWrapper(plan: ModularPlan, dataMapSchema: DataMapSchema) 
extends ModularPlan {
--- End diff --

Why is this needed? package visibility problem?


---


[GitHub] carbondata pull request #2453: [CARBONDATA-2528][MV] Fixed order by in mv an...

2018-07-12 Thread jackylk
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2453#discussion_r202224057
  
--- Diff: 
datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVHelper.scala 
---
@@ -290,16 +327,18 @@ object MVHelper {
* @return Updated modular plan.
*/
   def updateDataMap(subsumer: ModularPlan, rewrite: QueryRewrite): 
ModularPlan = {
-subsumer match {
+ subsumer match {
--- End diff --

incorrect identation


---


[GitHub] carbondata pull request #2453: [CARBONDATA-2528][MV] Fixed order by in mv an...

2018-07-12 Thread jackylk
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2453#discussion_r202224012
  
--- Diff: 
datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVHelper.scala 
---
@@ -179,7 +209,7 @@ object MVHelper {
   case _ => false
 }
 
-override def hashCode: Int = exp.hashCode
+override def hashCode: Int = 1
--- End diff --

please provide comment for 1


---


[GitHub] carbondata pull request #2453: [CARBONDATA-2528][MV] Fixed order by in mv an...

2018-07-12 Thread jackylk
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2453#discussion_r202223973
  
--- Diff: 
datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVHelper.scala 
---
@@ -147,6 +149,34 @@ object MVHelper {
 }.filter(_.isDefined).map(_.get)
   }
 
+
+  /**
+   * Check if we can do incremental load on the mv table. Some cases like 
aggregation functions
--- End diff --

A bit confused here, the function name is `isFullReload`, this function 
will return true if incremental load is not supported on input logicPlan?


---


[GitHub] carbondata pull request #2453: [CARBONDATA-2528][MV] Fixed order by in mv an...

2018-07-12 Thread jackylk
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2453#discussion_r202223776
  
--- Diff: 
datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVHelper.scala 
---
@@ -147,6 +149,34 @@ object MVHelper {
 }.filter(_.isDefined).map(_.get)
   }
 
+
+  /**
+   * Check if we can do incremental load on the mv table. Some cases like 
aggregation functions
+   * which are present inside other expressions like sum(a)+sum(b) cannot 
be incremental loaded.
+   */
+  private def isFullReload(logicalPlan: LogicalPlan): Boolean = {
+var isFullReload = false
+logicalPlan.transformAllExpressions {
+  case a: Alias =>
+a
+  case agg: AggregateExpression => agg
+  case c: Cast =>
+isFullReload = c.child.find {
+  case agg: AggregateExpression => false
+  case _ => false
+}.isDefined || isFullReload
+c
+  case exp: Expression =>
+// Check any aggregation function present inside other expression.
+isFullReload = exp.find {
+  case agg: AggregateExpression => true
--- End diff --

shouldn't it be false?


---


[GitHub] carbondata pull request #2453: [CARBONDATA-2528][MV] Fixed order by in mv an...

2018-07-05 Thread ravipesala
GitHub user ravipesala opened a pull request:

https://github.com/apache/carbondata/pull/2453

[CARBONDATA-2528][MV] Fixed order by in mv and aggregation functions inside 
projection expressions are fixed

Problem: 
Order by queries and the queries with functions like sum(a)+sum(b) are not 
working in MV.
Please check jira for more details.
Solution:
The queries which have projection functions like sum(a)+sum(b) cannot be 
incrementally loaded, so introduced a new internal DM property to avoid group 
by on the final query. In this way the queries like below
```
select empname,sum(salary)+sum(utilization) as total from fact_table1 group 
by empname order by empname DESC
```
the above query will be rewritten as follows.
```
SELECT mv_order_table.`fact_table1_empname` AS `empname`, 
mv_order_table.`total` 
FROM
  (SELECT mv_order_table.`fact_table1_empname`, mv_order_table.`total` 
  FROM
MV_order_table) MV_order_table 
ORDER BY `empname` DESC NULLS LAST
```


Be sure to do all of the following checklist to help us incorporate 
your contribution quickly and easily:

 - [ ] Any interfaces changed?
 
 - [ ] Any backward compatibility impacted?
 
 - [ ] Document update required?

 - [ ] Testing done
Please provide details on 
- Whether new unit test cases have been added or why no new tests 
are required?
- How it is tested? Please attach test report.
- Is it a performance related change? Please attach the performance 
test report.
- Any additional information to help reviewers in testing this 
change.
   
 - [ ] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA. 



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ravipesala/incubator-carbondata mv-2522

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/2453.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2453


commit 7da36c39c6a2354b8f9e0810c13814cba6b61339
Author: ravipesala 
Date:   2018-06-14T06:10:07Z

Fixed order by in mv and aggregation functions inside projection 
expressions are fixed




---