[GitHub] carbondata issue #2335: [CARBONDATA-2573] integrate carbonstore mv branch

2018-06-11 Thread eychih
Github user eychih commented on the issue:

https://github.com/apache/carbondata/pull/2335
  
@ravipesala @jackylk It looks like the discussion become uninteresting and 
unproductive now.  I just try to make some constructive comments.  The comments 
I made in the previous discussions remain true.  In terms of MV, the 
performance bottleneck comes from the amount of data we need to process and how 
we process them, e.g. using cache, rather than matching time of the logical 
plan.  The reason I take the approach in chowbranch because the old code is 
difficult to understand and to extend.  This is not just my opinion.  I will 
stop here and not make any further comment.   



---


[GitHub] carbondata issue #2335: [CARBONDATA-2573] integrate carbonstore mv branch

2018-06-11 Thread jackylk
Github user jackylk commented on the issue:

https://github.com/apache/carbondata/pull/2335
  
LGTM


---


[GitHub] carbondata issue #2335: [CARBONDATA-2573] integrate carbonstore mv branch

2018-06-11 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/2335
  
@eychih The code added to rewrite with MV table is done outside so there 
would not be any harm to the purpose of MV. And moreover checking the 
sameresult multiple times gives performance problems while doing matching. And 
we have tested the sameresult and gives the problem if the session changes 
unless we maintain the cache for each session, I think that is not a good idea.


---


[GitHub] carbondata issue #2335: [CARBONDATA-2573] integrate carbonstore mv branch

2018-06-10 Thread eychih
Github user eychih commented on the issue:

https://github.com/apache/carbondata/pull/2335
  
@ravipesala Incremental loading of MV tables should be done separately from 
query rewrite.  Mixing of query rewrite with incremental loading defeats the 
purpose of MV.  Also, the method sameresult is independent of sessions.  The 
method can be applied to logical plans created in different sessions.  Finally, 
the way used in the chowbranch will have better performance when an MV is 
cached in memory because Spark replaces the corresponding subplan with an 
in-memory table first.


---


[GitHub] carbondata issue #2335: [CARBONDATA-2573] integrate carbonstore mv branch

2018-06-09 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/2335
  
@eychih I have merged the necessary changes from chowbranch to this PR. And 
left outer join changes will be taken in a different PR.  Factored out MV 
subplan to table is handled in `withMVTable` method. I cannot use the 
chowbranch's way it give issues related to incremental loading of table and 
sameresult cannot work if the session is changed.And also it has performnace 
impact if we use the chowbranch's way. Dropping of preAgg function is handled 
in MVHelper class



---


[GitHub] carbondata issue #2335: [CARBONDATA-2573] integrate carbonstore mv branch

2018-06-07 Thread eychih
Github user eychih commented on the issue:

https://github.com/apache/carbondata/pull/2335
  
In addition, for consistency, please use the following way to remove 
PreAggFunction:

val planToRegister = 
sparkSession.sql(updatedQuery).queryExecution.analyzed.
  transform {
case p @ logical.Project(head :: tail, _) if 
head.isInstanceOf[Alias] && head.name.equals("preAgg") => p.copy(projectList = 
tail)
case a @ logical.Aggregate(_, head :: tail, _) if 
head.isInstanceOf[Alias] && head.name.equals("preAgg") => 
a.copy(aggregateExpressions = tail)
  }


---


[GitHub] carbondata issue #2335: [CARBONDATA-2573] integrate carbonstore mv branch

2018-06-07 Thread eychih
Github user eychih commented on the issue:

https://github.com/apache/carbondata/pull/2335
  
For 2 in the previous comment,  in SummaryDatasetCatalog.scala, I add a 
method useSummaryDataset(plan: LogicalPlan) to replace a subplan of an MV with 
the corresponding MV table.



---


[GitHub] carbondata issue #2335: [CARBONDATA-2573] integrate carbonstore mv branch

2018-06-07 Thread eychih
Github user eychih commented on the issue:

https://github.com/apache/carbondata/pull/2335
  
I saw you already integrate some code in chowbranch.  However, in 
MVAnalyzerRule.scala, there are some issues:
1. At line 68, you call withMVTable directly.  The issue is, if the method 
throws exceptions, there is no way to handle them.
2. where do you replace the factored out MV subplan with the corresponding 
MV table? (please refer the same file in chowbranch, to see how this get 
handled)   


---


[GitHub] carbondata issue #2335: [CARBONDATA-2573] integrate carbonstore mv branch

2018-06-07 Thread eychih
Github user eychih commented on the issue:

https://github.com/apache/carbondata/pull/2335
  
can you merge changes in chowbranch into master?  code in chowbranch has 
passed test cases in both old and new test cases and has support for the left 
outer join view?


---


[GitHub] carbondata issue #2335: [CARBONDATA-2573] integrate carbonstore mv branch

2018-06-05 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/2335
  
@jackylk Please review and merge


---


[GitHub] carbondata issue #2335: [CARBONDATA-2573] integrate carbonstore mv branch

2018-06-04 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/2335
  
SDV Build Success , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/5222/



---


[GitHub] carbondata issue #2335: [CARBONDATA-2573] integrate carbonstore mv branch

2018-06-04 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2335
  
Build Success with Spark 2.2.1, Please check CI 
http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5090/



---


[GitHub] carbondata issue #2335: [CARBONDATA-2573] integrate carbonstore mv branch

2018-06-04 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2335
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6252/



---


[GitHub] carbondata issue #2335: [CARBONDATA-2573] integrate carbonstore mv branch

2018-06-04 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/2335
  
SDV Build Success , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/5215/



---


[GitHub] carbondata issue #2335: [CARBONDATA-2573] integrate carbonstore mv branch

2018-06-04 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2335
  
Build Success with Spark 2.2.1, Please check CI 
http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5083/



---


[GitHub] carbondata issue #2335: [CARBONDATA-2573] integrate carbonstore mv branch

2018-06-04 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2335
  
Build Failed  with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6245/



---


[GitHub] carbondata issue #2335: [CARBONDATA-2573] integrate carbonstore mv branch

2018-06-04 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/2335
  
SDV Build Success , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/5214/



---