[GitHub] carbondata pull request #2453: [CARBONDATA-2528][MV] Fixed order by in mv an...
Github user asfgit closed the pull request at: https://github.com/apache/carbondata/pull/2453 ---
[GitHub] carbondata pull request #2453: [CARBONDATA-2528][MV] Fixed order by in mv an...
Github user jackylk commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2453#discussion_r202504966 --- Diff: datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVHelper.scala --- @@ -147,6 +149,34 @@ object MVHelper { }.filter(_.isDefined).map(_.get) } + + /** + * Check if we can do incremental load on the mv table. Some cases like aggregation functions + * which are present inside other expressions like sum(a)+sum(b) cannot be incremental loaded. + */ + private def isFullReload(logicalPlan: LogicalPlan): Boolean = { +var isFullReload = false +logicalPlan.transformAllExpressions { + case a: Alias => +a + case agg: AggregateExpression => agg + case c: Cast => +isFullReload = c.child.find { + case agg: AggregateExpression => false + case _ => false +}.isDefined || isFullReload +c + case exp: Expression => +// Check any aggregation function present inside other expression. +isFullReload = exp.find { + case agg: AggregateExpression => true --- End diff -- Can you modify comment to tell like `return true if ` ---
[GitHub] carbondata pull request #2453: [CARBONDATA-2528][MV] Fixed order by in mv an...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2453#discussion_r202244579 --- Diff: datamap/mv/core/src/main/scala/org/apache/carbondata/mv/rewrite/SummaryDatasetCatalog.scala --- @@ -39,6 +40,12 @@ private[mv] case class SummaryDataset(signature: Option[Signature], dataMapSchema: DataMapSchema, relation: ModularPlan) +case class MVPlanWrapper(plan: ModularPlan, dataMapSchema: DataMapSchema) extends ModularPlan { --- End diff -- No, it is to provide datamap schema information along with relation. Added comment as well. ---
[GitHub] carbondata pull request #2453: [CARBONDATA-2528][MV] Fixed order by in mv an...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2453#discussion_r202244376 --- Diff: datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVHelper.scala --- @@ -290,16 +327,18 @@ object MVHelper { * @return Updated modular plan. */ def updateDataMap(subsumer: ModularPlan, rewrite: QueryRewrite): ModularPlan = { -subsumer match { + subsumer match { --- End diff -- ok ---
[GitHub] carbondata pull request #2453: [CARBONDATA-2528][MV] Fixed order by in mv an...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2453#discussion_r202244323 --- Diff: datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVHelper.scala --- @@ -179,7 +209,7 @@ object MVHelper { case _ => false } -override def hashCode: Int = exp.hashCode +override def hashCode: Int = 1 --- End diff -- added ---
[GitHub] carbondata pull request #2453: [CARBONDATA-2528][MV] Fixed order by in mv an...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2453#discussion_r202244238 --- Diff: datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVHelper.scala --- @@ -147,6 +149,34 @@ object MVHelper { }.filter(_.isDefined).map(_.get) } + + /** + * Check if we can do incremental load on the mv table. Some cases like aggregation functions --- End diff -- Yes, it checks whether a logical plan can support full refresh or incremental refresh. ---
[GitHub] carbondata pull request #2453: [CARBONDATA-2528][MV] Fixed order by in mv an...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2453#discussion_r202244060 --- Diff: datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVHelper.scala --- @@ -147,6 +149,34 @@ object MVHelper { }.filter(_.isDefined).map(_.get) } + + /** + * Check if we can do incremental load on the mv table. Some cases like aggregation functions + * which are present inside other expressions like sum(a)+sum(b) cannot be incremental loaded. + */ + private def isFullReload(logicalPlan: LogicalPlan): Boolean = { +var isFullReload = false +logicalPlan.transformAllExpressions { + case a: Alias => +a + case agg: AggregateExpression => agg + case c: Cast => +isFullReload = c.child.find { + case agg: AggregateExpression => false + case _ => false +}.isDefined || isFullReload +c + case exp: Expression => +// Check any aggregation function present inside other expression. +isFullReload = exp.find { + case agg: AggregateExpression => true --- End diff -- no, it should be true. If the aggregation function present inside any other expression like add(sum(a) + sum(b)) then we should always do full refresh instead of incremental refresh ---
[GitHub] carbondata pull request #2453: [CARBONDATA-2528][MV] Fixed order by in mv an...
Github user jackylk commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2453#discussion_r202224158 --- Diff: datamap/mv/core/src/main/scala/org/apache/carbondata/mv/rewrite/SummaryDatasetCatalog.scala --- @@ -39,6 +40,12 @@ private[mv] case class SummaryDataset(signature: Option[Signature], dataMapSchema: DataMapSchema, relation: ModularPlan) +case class MVPlanWrapper(plan: ModularPlan, dataMapSchema: DataMapSchema) extends ModularPlan { --- End diff -- Why is this needed? package visibility problem? ---
[GitHub] carbondata pull request #2453: [CARBONDATA-2528][MV] Fixed order by in mv an...
Github user jackylk commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2453#discussion_r202224057 --- Diff: datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVHelper.scala --- @@ -290,16 +327,18 @@ object MVHelper { * @return Updated modular plan. */ def updateDataMap(subsumer: ModularPlan, rewrite: QueryRewrite): ModularPlan = { -subsumer match { + subsumer match { --- End diff -- incorrect identation ---
[GitHub] carbondata pull request #2453: [CARBONDATA-2528][MV] Fixed order by in mv an...
Github user jackylk commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2453#discussion_r202224012 --- Diff: datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVHelper.scala --- @@ -179,7 +209,7 @@ object MVHelper { case _ => false } -override def hashCode: Int = exp.hashCode +override def hashCode: Int = 1 --- End diff -- please provide comment for 1 ---
[GitHub] carbondata pull request #2453: [CARBONDATA-2528][MV] Fixed order by in mv an...
Github user jackylk commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2453#discussion_r202223973 --- Diff: datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVHelper.scala --- @@ -147,6 +149,34 @@ object MVHelper { }.filter(_.isDefined).map(_.get) } + + /** + * Check if we can do incremental load on the mv table. Some cases like aggregation functions --- End diff -- A bit confused here, the function name is `isFullReload`, this function will return true if incremental load is not supported on input logicPlan? ---
[GitHub] carbondata pull request #2453: [CARBONDATA-2528][MV] Fixed order by in mv an...
Github user jackylk commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2453#discussion_r202223776 --- Diff: datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVHelper.scala --- @@ -147,6 +149,34 @@ object MVHelper { }.filter(_.isDefined).map(_.get) } + + /** + * Check if we can do incremental load on the mv table. Some cases like aggregation functions + * which are present inside other expressions like sum(a)+sum(b) cannot be incremental loaded. + */ + private def isFullReload(logicalPlan: LogicalPlan): Boolean = { +var isFullReload = false +logicalPlan.transformAllExpressions { + case a: Alias => +a + case agg: AggregateExpression => agg + case c: Cast => +isFullReload = c.child.find { + case agg: AggregateExpression => false + case _ => false +}.isDefined || isFullReload +c + case exp: Expression => +// Check any aggregation function present inside other expression. +isFullReload = exp.find { + case agg: AggregateExpression => true --- End diff -- shouldn't it be false? ---
[GitHub] carbondata pull request #2453: [CARBONDATA-2528][MV] Fixed order by in mv an...
GitHub user ravipesala opened a pull request: https://github.com/apache/carbondata/pull/2453 [CARBONDATA-2528][MV] Fixed order by in mv and aggregation functions inside projection expressions are fixed Problem: Order by queries and the queries with functions like sum(a)+sum(b) are not working in MV. Please check jira for more details. Solution: The queries which have projection functions like sum(a)+sum(b) cannot be incrementally loaded, so introduced a new internal DM property to avoid group by on the final query. In this way the queries like below ``` select empname,sum(salary)+sum(utilization) as total from fact_table1 group by empname order by empname DESC ``` the above query will be rewritten as follows. ``` SELECT mv_order_table.`fact_table1_empname` AS `empname`, mv_order_table.`total` FROM (SELECT mv_order_table.`fact_table1_empname`, mv_order_table.`total` FROM MV_order_table) MV_order_table ORDER BY `empname` DESC NULLS LAST ``` Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily: - [ ] Any interfaces changed? - [ ] Any backward compatibility impacted? - [ ] Document update required? - [ ] Testing done Please provide details on - Whether new unit test cases have been added or why no new tests are required? - How it is tested? Please attach test report. - Is it a performance related change? Please attach the performance test report. - Any additional information to help reviewers in testing this change. - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. You can merge this pull request into a Git repository by running: $ git pull https://github.com/ravipesala/incubator-carbondata mv-2522 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/2453.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2453 commit 7da36c39c6a2354b8f9e0810c13814cba6b61339 Author: ravipesala Date: 2018-06-14T06:10:07Z Fixed order by in mv and aggregation functions inside projection expressions are fixed ---