[GitHub] carbondata issue #2207: [CARBONDATA-2428] Support flat folder for managed ca...

2018-06-04 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2207
  
Build Success with Spark 2.2.1, Please check CI 
http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5089/



---


[GitHub] carbondata issue #2207: [CARBONDATA-2428] Support flat folder for managed ca...

2018-06-04 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2207
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6251/



---


[GitHub] carbondata issue #2252: [CARBONDATA-2420] Support string longer than 32000 c...

2018-06-04 Thread kumarvishal09
Github user kumarvishal09 commented on the issue:

https://github.com/apache/carbondata/pull/2252
  
@xuchuanyin 
1. Still use 'long_string_columns' instead of a varchar datatype, to keep it 
consistent with Spark/Hive.
   Are you facing any problem with varchar?
2. Use an integer (previously short) to store the length of the byte content.
   Only for the text data type?
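For context, CarbonData exposes long-string support as a table property rather than a new datatype; a hedged spark-shell sketch (assumes a CarbonSession bound to `carbon`; the table and column names are invented for illustration):

```scala
// Sketch only: enabling strings longer than 32000 characters per column
// via the LONG_STRING_COLUMNS table property, while the column's declared
// type stays STRING for Spark/Hive consistency.
carbon.sql(
  """CREATE TABLE long_string_demo (id INT, note STRING)
    |STORED BY 'carbondata'
    |TBLPROPERTIES('LONG_STRING_COLUMNS'='note')""".stripMargin)
```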



---


[GitHub] carbondata issue #2207: [CARBONDATA-2428] Support flat folder for managed ca...

2018-06-04 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/2207
  
SDV Build Fail , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/5221/



---


[GitHub] carbondata pull request #2335: [CARBONDATA-2573] integrate carbonstore mv br...

2018-06-04 Thread ravipesala
Github user ravipesala commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2335#discussion_r192945546
  
--- Diff: 
datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVState.scala 
---
@@ -31,25 +31,25 @@ private[mv] class MVState(summaryDatasetCatalog: 
SummaryDatasetCatalog) {
   // Note: These are all lazy vals because they depend on each other (e.g. 
conf) and we
   // want subclasses to override some of the fields. Otherwise, we would 
get a lot of NPEs.
 
-  /**
-   * Modular query plan modularizer
-   */
-  lazy val modularizer = SimpleModularizer
-
-  /**
-   * Logical query plan optimizer.
-   */
-  lazy val optimizer = BirdcageOptimizer
-
-  lazy val matcher = DefaultMatchMaker
-
-  lazy val navigator: Navigator = new Navigator(summaryDatasetCatalog, 
this)
+//  /**
+//   * Modular query plan modularizer
+//   */
+//  lazy val modularizer = SimpleModularizer
+//
+//  /**
+//   * Logical query plan optimizer.
+//   */
--- End diff --

Not required, removed
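The comment quoted in the diff above (fields kept as lazy vals because they depend on each other; otherwise "a lot of NPEs") describes a general Scala initialization-order pitfall; a minimal standalone sketch, not CarbonData code and with invented names:

```scala
// With a plain `val`, Base's constructor would evaluate `conf.length`
// before Sub's override of `conf` is initialized, so the virtual access
// to `conf` would return null and throw an NPE. Declaring the dependent
// field lazy defers evaluation until after construction completes.
class Base {
  val conf: String = "base-conf"
  lazy val size: Int = conf.length // safe: forced only on first access
}

class Sub extends Base {
  override val conf: String = "sub-conf"
}

object LazyDemo extends App {
  println(new Sub().size) // prints 8 ("sub-conf".length)
}
```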


---


[GitHub] carbondata pull request #2335: [CARBONDATA-2573] integrate carbonstore mv br...

2018-06-04 Thread ravipesala
Github user ravipesala commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2335#discussion_r192945440
  
--- Diff: 
datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVHelper.scala 
---
@@ -373,5 +373,25 @@ object MVHelper {
   case other => other
 }
   }
+
+  def rewriteWithMVTable(rewrittenPlan: ModularPlan, rewrite: 
QueryRewrite): ModularPlan = {
--- End diff --

ok


---


[GitHub] carbondata issue #2361: [WIP] Avro logical type issue fixes

2018-06-04 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2361
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6250/



---


[GitHub] carbondata issue #2361: [WIP] Avro logical type issue fixes

2018-06-04 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2361
  
Build Success with Spark 2.2.1, Please check CI 
http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5088/



---


[GitHub] carbondata issue #2361: [WIP] Avro logical type issue fixes

2018-06-04 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/2361
  
SDV Build Success , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/5220/



---


[GitHub] carbondata issue #2345: [CARBONDATA-2557] [CARBONDATA-2472] [CARBONDATA-2570...

2018-06-04 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/2345
  
LGTM


---


[GitHub] carbondata issue #2360: [CARBONDATA-2575] Add document to explain DataMap Ma...

2018-06-04 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2360
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6249/



---


[GitHub] carbondata issue #2360: [CARBONDATA-2575] Add document to explain DataMap Ma...

2018-06-04 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2360
  
Build Success with Spark 2.2.1, Please check CI 
http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5087/



---


[GitHub] carbondata pull request #2361: [WIP] Avro logical type issue fixes

2018-06-04 Thread ajantha-bhat
GitHub user ajantha-bhat opened a pull request:

https://github.com/apache/carbondata/pull/2361

[WIP] Avro logical type issue fixes

**This PR is dependent on #2347** 

Be sure to do all of the following checklist to help us incorporate 
your contribution quickly and easily:

 - [ ] Any interfaces changed?
 
 - [ ] Any backward compatibility impacted?
 
 - [ ] Document update required?

 - [ ] Testing done
Please provide details on 
- Whether new unit test cases have been added or why no new tests 
are required?
- How it is tested? Please attach test report.
- Is it a performance related change? Please attach the performance 
test report.
- Any additional information to help reviewers in testing this 
change.
   
 - [ ] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA. 



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ajantha-bhat/carbondata 
avro_logical_type_support

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/2361.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2361


commit 94c80a53941128e44cc72899b65a453c61c2eb81
Author: kunal642 
Date:   2018-05-28T06:11:59Z

added support for logical type

commit 34df6c627cb82b2852e2b27f36042abfc069427b
Author: ajantha-bhat 
Date:   2018-06-04T10:42:48Z

issue fix




---


[GitHub] carbondata issue #2360: [CARBONDATA-2575] Add document to explain DataMap Ma...

2018-06-04 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/2360
  
SDV Build Success , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/5219/



---


[jira] [Updated] (CARBONDATA-2576) MV Datamap - MV is not working fine if there is more than 3 aggregate function in the same datamap.

2018-06-04 Thread Prasanna Ravichandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanna Ravichandran updated CARBONDATA-2576:
--
Description: 
MV is not working if there are more than 3 aggregate functions in the same 
datamap. It works fine with up to 3 aggregate functions on the same MV. Please 
see the attached document for more details.

Test queries:

 

scala> carbon.sql("create datamap datamap_comp_maxsumminavg using 'mv' as 
select 
empno,max(projectenddate),sum(salary),min(projectjoindate),avg(attendance) from 
originTable group by empno").show(200,false)

++
||
++
++

 

 

scala> carbon.sql("rebuild datamap datamap_comp_maxsumminavg").show(200,false)

++
||
++
++

 

 

scala> carbon.sql("explain select 
empno,max(projectenddate),sum(salary),min(projectjoindate),avg(attendance) from 
originTable group by empno").show(200,false)

org.apache.spark.sql.AnalysisException: expression 
'datamap_comp_maxsumminavg_table.`avg_attendance`' is neither present in the 
group by, nor is it an aggregate function. Add to group by or wrap in first() 
(or first_value) if you don't care which value you get.;;

Aggregate [origintable_empno#2925|#2925], [origintable_empno#2925 AS 
empno#3002, max(max_projectenddate#2926) AS max(projectenddate)#3003, 
sum(sum_salary#2927L) AS sum(salary)#3004L, min(min_projectjoindate#2928) AS 
min(projectjoindate)#3005, avg_attendance#2929 AS avg(attendance)#3006|#2925 AS 
empno#3002, max(max_projectenddate#2926) AS max(projectenddate)#3003, 
sum(sum_salary#2927L) AS sum(salary)#3004L, min(min_projectjoindate#2928) AS 
min(projectjoindate)#3005, avg_attendance#2929 AS avg(attendance)#3006]

+- SubqueryAlias datamap_comp_maxsumminavg_table

   +- 
Relation[origintable_empno#2925,max_projectenddate#2926,sum_salary#2927L,min_projectjoindate#2928,avg_attendance#2929|#2925,max_projectenddate#2926,sum_salary#2927L,min_projectjoindate#2928,avg_attendance#2929]
 CarbonDatasourceHadoopRelation [ Database name :default, Table name 
:datamap_comp_maxsumminavg_table, Schema 
:Some(StructType(StructField(origintable_empno,IntegerType,true), 
StructField(max_projectenddate,TimestampType,true), 
StructField(sum_salary,LongType,true), 
StructField(min_projectjoindate,TimestampType,true), 
StructField(avg_attendance,DoubleType,true))) ]

 

  at 
org.apache.spark.sql.catalyst.analysis.CheckAnalysis$class.failAnalysis(CheckAnalysis.scala:39)

  at 
org.apache.spark.sql.catalyst.analysis.Analyzer.failAnalysis(Analyzer.scala:91)

  at 
org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.org$apache$spark$sql$catalyst$analysis$CheckAnalysis$class$$anonfun$$checkValidAggregateExpression$1(CheckAnalysis.scala:247)

  at 
org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1$$anonfun$org$apache$spark$sql$catalyst$analysis$CheckAnalysis$class$$anonfun$$checkValidAggregateExpression$1$5.apply(CheckAnalysis.scala:253)

  at 
org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1$$anonfun$org$apache$spark$sql$catalyst$analysis$CheckAnalysis$class$$anonfun$$checkValidAggregateExpression$1$5.apply(CheckAnalysis.scala:253)

  at scala.collection.immutable.List.foreach(List.scala:381)

  at 
org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.org$apache$spark$sql$catalyst$analysis$CheckAnalysis$class$$anonfun$$checkValidAggregateExpression$1(CheckAnalysis.scala:253)

  at 
org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1$$anonfun$apply$9.apply(CheckAnalysis.scala:280)

  at 
org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1$$anonfun$apply$9.apply(CheckAnalysis.scala:280)

  at scala.collection.immutable.List.foreach(List.scala:381)

  at 
org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.apply(CheckAnalysis.scala:280)

  at 
org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.apply(CheckAnalysis.scala:78)

  at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:127)

  at 
org.apache.spark.sql.catalyst.analysis.CheckAnalysis$class.checkAnalysis(CheckAnalysis.scala:78)

  at 
org.apache.spark.sql.catalyst.analysis.Analyzer.checkAnalysis(Analyzer.scala:91)

  at 
org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:52)

  at org.apache.spark.sql.CarbonSession.withProfiler(CarbonSession.scala:148)

  at org.apache.spark.sql.CarbonSession.sql(CarbonSession.scala:95)

  at 
org.apache.carbondata.mv.datamap.MVAnalyzerRule.apply(MVAnalyzerRule.scala:72)

  at 
org.apache.carbondata.mv.datamap.MVAnalyzerRule.apply(MVAnalyzerRule.scala:38)

  at org.apache.spark.sql.hive.CarbonAnalyzer.execute(CarbonAnalyzer.scala:46)

  at org.apache.spark.sql.hive.CarbonAnalyzer.execute(CarbonAnalyzer.scala:27)

  at 

[jira] [Updated] (CARBONDATA-2576) MV Datamap - MV is not working fine if there is more than 3 aggregate function in the same datamap.

2018-06-04 Thread Prasanna Ravichandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanna Ravichandran updated CARBONDATA-2576:
--
Attachment: data.csv

> MV Datamap - MV is not working fine if there is more than 3 aggregate 
> function in the same datamap.
> ---
>
> Key: CARBONDATA-2576
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2576
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Reporter: Prasanna Ravichandran
>Priority: Minor
>  Labels: CARBONDATA., MV, Materialistic_Views
> Attachments: From 4th aggregate function -error shown.docx, data.csv
>
>
> MV is not working fine if there is more than 3 aggregate function in the same 
> datamap.
> Test queries:
>  
> scala> carbon.sql("create datamap datamap_comp_maxsumminavg using 'mv' as 
> select 
> empno,max(projectenddate),sum(salary),min(projectjoindate),avg(attendance) 
> from originTable group by empno").show(200,false)
> ++
> ||
> ++
> ++
>  
>  
> scala> carbon.sql("rebuild datamap datamap_comp_maxsumminavg").show(200,false)
> ++
> ||
> ++
> ++
>  
>  
> scala> carbon.sql("explain select 
> empno,max(projectenddate),sum(salary),min(projectjoindate),avg(attendance) 
> from originTable group by empno").show(200,false)
> org.apache.spark.sql.AnalysisException: expression 
> 'datamap_comp_maxsumminavg_table.`avg_attendance`' is neither present in the 
> group by, nor is it an aggregate function. Add to group by or wrap in first() 
> (or first_value) if you don't care which value you get.;;
> Aggregate [origintable_empno#2925], [origintable_empno#2925 AS empno#3002, 
> max(max_projectenddate#2926) AS max(projectenddate)#3003, 
> sum(sum_salary#2927L) AS sum(salary)#3004L, min(min_projectjoindate#2928) AS 
> min(projectjoindate)#3005, avg_attendance#2929 AS avg(attendance)#3006]
> +- SubqueryAlias datamap_comp_maxsumminavg_table
>    +- 
> Relation[origintable_empno#2925,max_projectenddate#2926,sum_salary#2927L,min_projectjoindate#2928,avg_attendance#2929]
>  CarbonDatasourceHadoopRelation [ Database name :default, Table name 
> :datamap_comp_maxsumminavg_table, Schema 
> :Some(StructType(StructField(origintable_empno,IntegerType,true), 
> StructField(max_projectenddate,TimestampType,true), 
> StructField(sum_salary,LongType,true), 
> StructField(min_projectjoindate,TimestampType,true), 
> StructField(avg_attendance,DoubleType,true))) ]
>  
>   at 
> org.apache.spark.sql.catalyst.analysis.CheckAnalysis$class.failAnalysis(CheckAnalysis.scala:39)
>   at 
> org.apache.spark.sql.catalyst.analysis.Analyzer.failAnalysis(Analyzer.scala:91)
>   at 
> org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.org$apache$spark$sql$catalyst$analysis$CheckAnalysis$class$$anonfun$$checkValidAggregateExpression$1(CheckAnalysis.scala:247)
>   at 
> org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1$$anonfun$org$apache$spark$sql$catalyst$analysis$CheckAnalysis$class$$anonfun$$checkValidAggregateExpression$1$5.apply(CheckAnalysis.scala:253)
>   at 
> org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1$$anonfun$org$apache$spark$sql$catalyst$analysis$CheckAnalysis$class$$anonfun$$checkValidAggregateExpression$1$5.apply(CheckAnalysis.scala:253)
>   at scala.collection.immutable.List.foreach(List.scala:381)
>   at 
> org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.org$apache$spark$sql$catalyst$analysis$CheckAnalysis$class$$anonfun$$checkValidAggregateExpression$1(CheckAnalysis.scala:253)
>   at 
> org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1$$anonfun$apply$9.apply(CheckAnalysis.scala:280)
>   at 
> org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1$$anonfun$apply$9.apply(CheckAnalysis.scala:280)
>   at scala.collection.immutable.List.foreach(List.scala:381)
>   at 
> org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.apply(CheckAnalysis.scala:280)
>   at 
> org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.apply(CheckAnalysis.scala:78)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:127)
>   at 
> org.apache.spark.sql.catalyst.analysis.CheckAnalysis$class.checkAnalysis(CheckAnalysis.scala:78)
>   at 
> org.apache.spark.sql.catalyst.analysis.Analyzer.checkAnalysis(Analyzer.scala:91)
>   at 
> org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:52)
>   at org.apache.spark.sql.CarbonSession.withProfiler(CarbonSession.scala:148)
>   at org.apache.spark.sql.CarbonSession.sql(CarbonSession.scala:95)
>   at 
> 

[GitHub] carbondata pull request #2335: [CARBONDATA-2573] integrate carbonstore mv br...

2018-06-04 Thread jackylk
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2335#discussion_r192744678
  
--- Diff: 
datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVState.scala 
---
@@ -31,25 +31,25 @@ private[mv] class MVState(summaryDatasetCatalog: 
SummaryDatasetCatalog) {
   // Note: These are all lazy vals because they depend on each other (e.g. 
conf) and we
   // want subclasses to override some of the fields. Otherwise, we would 
get a lot of NPEs.
 
-  /**
-   * Modular query plan modularizer
-   */
-  lazy val modularizer = SimpleModularizer
-
-  /**
-   * Logical query plan optimizer.
-   */
-  lazy val optimizer = BirdcageOptimizer
-
-  lazy val matcher = DefaultMatchMaker
-
-  lazy val navigator: Navigator = new Navigator(summaryDatasetCatalog, 
this)
+//  /**
+//   * Modular query plan modularizer
+//   */
+//  lazy val modularizer = SimpleModularizer
+//
+//  /**
+//   * Logical query plan optimizer.
+//   */
--- End diff --

Is it for debugging purposes?


---


[jira] [Updated] (CARBONDATA-2576) MV Datamap - MV is not working fine if there is more than 3 aggregate function in the same datamap.

2018-06-04 Thread Prasanna Ravichandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanna Ravichandran updated CARBONDATA-2576:
--
Description: 
MV is not working if there are more than 3 aggregate functions in the same 
datamap. It works fine with up to 3 aggregate functions on the same MV.

Test queries:

 

scala> carbon.sql("create datamap datamap_comp_maxsumminavg using 'mv' as 
select 
empno,max(projectenddate),sum(salary),min(projectjoindate),avg(attendance) from 
originTable group by empno").show(200,false)

++
||
++
++

 

 

scala> carbon.sql("rebuild datamap datamap_comp_maxsumminavg").show(200,false)

++
||
++
++

 

 

scala> carbon.sql("explain select 
empno,max(projectenddate),sum(salary),min(projectjoindate),avg(attendance) from 
originTable group by empno").show(200,false)

org.apache.spark.sql.AnalysisException: expression 
'datamap_comp_maxsumminavg_table.`avg_attendance`' is neither present in the 
group by, nor is it an aggregate function. Add to group by or wrap in first() 
(or first_value) if you don't care which value you get.;;

Aggregate [origintable_empno#2925|#2925], [origintable_empno#2925 AS 
empno#3002, max(max_projectenddate#2926) AS max(projectenddate)#3003, 
sum(sum_salary#2927L) AS sum(salary)#3004L, min(min_projectjoindate#2928) AS 
min(projectjoindate)#3005, avg_attendance#2929 AS avg(attendance)#3006|#2925 AS 
empno#3002, max(max_projectenddate#2926) AS max(projectenddate)#3003, 
sum(sum_salary#2927L) AS sum(salary)#3004L, min(min_projectjoindate#2928) AS 
min(projectjoindate)#3005, avg_attendance#2929 AS avg(attendance)#3006]

+- SubqueryAlias datamap_comp_maxsumminavg_table

   +- 
Relation[origintable_empno#2925,max_projectenddate#2926,sum_salary#2927L,min_projectjoindate#2928,avg_attendance#2929|#2925,max_projectenddate#2926,sum_salary#2927L,min_projectjoindate#2928,avg_attendance#2929]
 CarbonDatasourceHadoopRelation [ Database name :default, Table name 
:datamap_comp_maxsumminavg_table, Schema 
:Some(StructType(StructField(origintable_empno,IntegerType,true), 
StructField(max_projectenddate,TimestampType,true), 
StructField(sum_salary,LongType,true), 
StructField(min_projectjoindate,TimestampType,true), 
StructField(avg_attendance,DoubleType,true))) ]

 

  at 
org.apache.spark.sql.catalyst.analysis.CheckAnalysis$class.failAnalysis(CheckAnalysis.scala:39)

  at 
org.apache.spark.sql.catalyst.analysis.Analyzer.failAnalysis(Analyzer.scala:91)

  at 
org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.org$apache$spark$sql$catalyst$analysis$CheckAnalysis$class$$anonfun$$checkValidAggregateExpression$1(CheckAnalysis.scala:247)

  at 
org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1$$anonfun$org$apache$spark$sql$catalyst$analysis$CheckAnalysis$class$$anonfun$$checkValidAggregateExpression$1$5.apply(CheckAnalysis.scala:253)

  at 
org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1$$anonfun$org$apache$spark$sql$catalyst$analysis$CheckAnalysis$class$$anonfun$$checkValidAggregateExpression$1$5.apply(CheckAnalysis.scala:253)

  at scala.collection.immutable.List.foreach(List.scala:381)

  at 
org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.org$apache$spark$sql$catalyst$analysis$CheckAnalysis$class$$anonfun$$checkValidAggregateExpression$1(CheckAnalysis.scala:253)

  at 
org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1$$anonfun$apply$9.apply(CheckAnalysis.scala:280)

  at 
org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1$$anonfun$apply$9.apply(CheckAnalysis.scala:280)

  at scala.collection.immutable.List.foreach(List.scala:381)

  at 
org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.apply(CheckAnalysis.scala:280)

  at 
org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.apply(CheckAnalysis.scala:78)

  at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:127)

  at 
org.apache.spark.sql.catalyst.analysis.CheckAnalysis$class.checkAnalysis(CheckAnalysis.scala:78)

  at 
org.apache.spark.sql.catalyst.analysis.Analyzer.checkAnalysis(Analyzer.scala:91)

  at 
org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:52)

  at org.apache.spark.sql.CarbonSession.withProfiler(CarbonSession.scala:148)

  at org.apache.spark.sql.CarbonSession.sql(CarbonSession.scala:95)

  at 
org.apache.carbondata.mv.datamap.MVAnalyzerRule.apply(MVAnalyzerRule.scala:72)

  at 
org.apache.carbondata.mv.datamap.MVAnalyzerRule.apply(MVAnalyzerRule.scala:38)

  at org.apache.spark.sql.hive.CarbonAnalyzer.execute(CarbonAnalyzer.scala:46)

  at org.apache.spark.sql.hive.CarbonAnalyzer.execute(CarbonAnalyzer.scala:27)

  at 
org.apache.spark.sql.execution.QueryExecution.analyzed$lzycompute(QueryExecution.scala:69)

  at 

[jira] [Commented] (CARBONDATA-2576) MV Datamap - MV is not working fine if there is more than 3 aggregate function in the same datamap.

2018-06-04 Thread Prasanna Ravichandran (JIRA)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-2576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16500225#comment-16500225
 ] 

Prasanna Ravichandran commented on CARBONDATA-2576:
---

Please find the queries for the base table creation:

CREATE TABLE originTable (empno int, empname String, designation String, doj 
Timestamp,
workgroupcategory int, workgroupcategoryname String, deptno int, deptname 
String,
projectcode int, projectjoindate Timestamp, projectenddate Timestamp,attendance 
int,
utilization int,salary int)
STORED BY 'org.apache.carbondata.format';

LOAD DATA local inpath 'hdfs://hacluster/user/prasanna/data.csv' INTO TABLE 
originTable OPTIONS('DELIMITER'= ',', 'QUOTECHAR'= 
'"','timestampformat'='dd-MM-');

 Also attached the data.csv.
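Putting the pieces of this report together, the reproduction sequence is roughly the following (spark-shell with a CarbonSession bound to `carbon`; assumes originTable has been created and loaded as above):

```scala
// Consolidated from the queries quoted in this issue. The first two
// statements succeed; the EXPLAIN on the matching user query then fails
// with the AnalysisException on `avg_attendance` once a 4th aggregate
// (here avg) is part of the MV definition.
carbon.sql(
  """CREATE DATAMAP datamap_comp_maxsumminavg USING 'mv' AS
    |SELECT empno, max(projectenddate), sum(salary),
    |       min(projectjoindate), avg(attendance)
    |FROM originTable GROUP BY empno""".stripMargin)
carbon.sql("REBUILD DATAMAP datamap_comp_maxsumminavg")
carbon.sql(
  """EXPLAIN SELECT empno, max(projectenddate), sum(salary),
    |       min(projectjoindate), avg(attendance)
    |FROM originTable GROUP BY empno""".stripMargin).show(200, false)
```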

 

> MV Datamap - MV is not working fine if there is more than 3 aggregate 
> function in the same datamap.
> ---
>
> Key: CARBONDATA-2576
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2576
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Reporter: Prasanna Ravichandran
>Priority: Minor
>  Labels: CARBONDATA., MV, Materialistic_Views
> Attachments: From 4th aggregate function -error shown.docx
>
>
> MV is not working fine if there is more than 3 aggregate function in the same 
> datamap.
> Test queries:
>  
> scala> carbon.sql("create datamap datamap_comp_maxsumminavg using 'mv' as 
> select 
> empno,max(projectenddate),sum(salary),min(projectjoindate),avg(attendance) 
> from originTable group by empno").show(200,false)
> ++
> ||
> ++
> ++
>  
>  
> scala> carbon.sql("rebuild datamap datamap_comp_maxsumminavg").show(200,false)
> ++
> ||
> ++
> ++
>  
>  
> scala> carbon.sql("explain select 
> empno,max(projectenddate),sum(salary),min(projectjoindate),avg(attendance) 
> from originTable group by empno").show(200,false)
> org.apache.spark.sql.AnalysisException: expression 
> 'datamap_comp_maxsumminavg_table.`avg_attendance`' is neither present in the 
> group by, nor is it an aggregate function. Add to group by or wrap in first() 
> (or first_value) if you don't care which value you get.;;
> Aggregate [origintable_empno#2925], [origintable_empno#2925 AS empno#3002, 
> max(max_projectenddate#2926) AS max(projectenddate)#3003, 
> sum(sum_salary#2927L) AS sum(salary)#3004L, min(min_projectjoindate#2928) AS 
> min(projectjoindate)#3005, avg_attendance#2929 AS avg(attendance)#3006]
> +- SubqueryAlias datamap_comp_maxsumminavg_table
>    +- 
> Relation[origintable_empno#2925,max_projectenddate#2926,sum_salary#2927L,min_projectjoindate#2928,avg_attendance#2929]
>  CarbonDatasourceHadoopRelation [ Database name :default, Table name 
> :datamap_comp_maxsumminavg_table, Schema 
> :Some(StructType(StructField(origintable_empno,IntegerType,true), 
> StructField(max_projectenddate,TimestampType,true), 
> StructField(sum_salary,LongType,true), 
> StructField(min_projectjoindate,TimestampType,true), 
> StructField(avg_attendance,DoubleType,true))) ]
>  
>   at 
> org.apache.spark.sql.catalyst.analysis.CheckAnalysis$class.failAnalysis(CheckAnalysis.scala:39)
>   at 
> org.apache.spark.sql.catalyst.analysis.Analyzer.failAnalysis(Analyzer.scala:91)
>   at 
> org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.org$apache$spark$sql$catalyst$analysis$CheckAnalysis$class$$anonfun$$checkValidAggregateExpression$1(CheckAnalysis.scala:247)
>   at 
> org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1$$anonfun$org$apache$spark$sql$catalyst$analysis$CheckAnalysis$class$$anonfun$$checkValidAggregateExpression$1$5.apply(CheckAnalysis.scala:253)
>   at 
> org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1$$anonfun$org$apache$spark$sql$catalyst$analysis$CheckAnalysis$class$$anonfun$$checkValidAggregateExpression$1$5.apply(CheckAnalysis.scala:253)
>   at scala.collection.immutable.List.foreach(List.scala:381)
>   at 
> org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.org$apache$spark$sql$catalyst$analysis$CheckAnalysis$class$$anonfun$$checkValidAggregateExpression$1(CheckAnalysis.scala:253)
>   at 
> org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1$$anonfun$apply$9.apply(CheckAnalysis.scala:280)
>   at 
> org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1$$anonfun$apply$9.apply(CheckAnalysis.scala:280)
>   at scala.collection.immutable.List.foreach(List.scala:381)
>   at 
> org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.apply(CheckAnalysis.scala:280)
>   at 
> org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.apply(CheckAnalysis.scala:78)
>   at 
> 

[jira] [Created] (CARBONDATA-2576) MV Datamap - MV is not working fine if there is more than 3 aggregate function in the same datamap.

2018-06-04 Thread Prasanna Ravichandran (JIRA)
Prasanna Ravichandran created CARBONDATA-2576:
-

 Summary: MV Datamap - MV is not working fine if there is more than 
3 aggregate function in the same datamap.
 Key: CARBONDATA-2576
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2576
 Project: CarbonData
  Issue Type: Bug
  Components: data-query
Reporter: Prasanna Ravichandran
 Attachments: From 4th aggregate function -error shown.docx

MV is not working if there are more than 3 aggregate functions in the same 
datamap.

Test queries:

 

scala> carbon.sql("create datamap datamap_comp_maxsumminavg using 'mv' as 
select 
empno,max(projectenddate),sum(salary),min(projectjoindate),avg(attendance) from 
originTable group by empno").show(200,false)

++

||

++

++

 

 

scala> carbon.sql("rebuild datamap datamap_comp_maxsumminavg").show(200,false)

++

||

++

++

 

 

scala> carbon.sql("explain select 
empno,max(projectenddate),sum(salary),min(projectjoindate),avg(attendance) from 
originTable group by empno").show(200,false)

org.apache.spark.sql.AnalysisException: expression 
'datamap_comp_maxsumminavg_table.`avg_attendance`' is neither present in the 
group by, nor is it an aggregate function. Add to group by or wrap in first() 
(or first_value) if you don't care which value you get.;;

Aggregate [origintable_empno#2925], [origintable_empno#2925 AS empno#3002, 
max(max_projectenddate#2926) AS max(projectenddate)#3003, sum(sum_salary#2927L) 
AS sum(salary)#3004L, min(min_projectjoindate#2928) AS 
min(projectjoindate)#3005, avg_attendance#2929 AS avg(attendance)#3006]

+- SubqueryAlias datamap_comp_maxsumminavg_table

   +- 
Relation[origintable_empno#2925,max_projectenddate#2926,sum_salary#2927L,min_projectjoindate#2928,avg_attendance#2929]
 CarbonDatasourceHadoopRelation [ Database name :default, Table name 
:datamap_comp_maxsumminavg_table, Schema 
:Some(StructType(StructField(origintable_empno,IntegerType,true), 
StructField(max_projectenddate,TimestampType,true), 
StructField(sum_salary,LongType,true), 
StructField(min_projectjoindate,TimestampType,true), 
StructField(avg_attendance,DoubleType,true))) ]

 

  at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$class.failAnalysis(CheckAnalysis.scala:39)
  at org.apache.spark.sql.catalyst.analysis.Analyzer.failAnalysis(Analyzer.scala:91)
  at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.org$apache$spark$sql$catalyst$analysis$CheckAnalysis$class$$anonfun$$checkValidAggregateExpression$1(CheckAnalysis.scala:247)
  at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1$$anonfun$org$apache$spark$sql$catalyst$analysis$CheckAnalysis$class$$anonfun$$checkValidAggregateExpression$1$5.apply(CheckAnalysis.scala:253)
  at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1$$anonfun$org$apache$spark$sql$catalyst$analysis$CheckAnalysis$class$$anonfun$$checkValidAggregateExpression$1$5.apply(CheckAnalysis.scala:253)
  at scala.collection.immutable.List.foreach(List.scala:381)
  at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.org$apache$spark$sql$catalyst$analysis$CheckAnalysis$class$$anonfun$$checkValidAggregateExpression$1(CheckAnalysis.scala:253)
  at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1$$anonfun$apply$9.apply(CheckAnalysis.scala:280)
  at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1$$anonfun$apply$9.apply(CheckAnalysis.scala:280)
  at scala.collection.immutable.List.foreach(List.scala:381)
  at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.apply(CheckAnalysis.scala:280)
  at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.apply(CheckAnalysis.scala:78)
  at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:127)
  at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$class.checkAnalysis(CheckAnalysis.scala:78)
  at org.apache.spark.sql.catalyst.analysis.Analyzer.checkAnalysis(Analyzer.scala:91)
  at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:52)
  at org.apache.spark.sql.CarbonSession.withProfiler(CarbonSession.scala:148)
  at org.apache.spark.sql.CarbonSession.sql(CarbonSession.scala:95)
  at org.apache.carbondata.mv.datamap.MVAnalyzerRule.apply(MVAnalyzerRule.scala:72)
  at org.apache.carbondata.mv.datamap.MVAnalyzerRule.apply(MVAnalyzerRule.scala:38)
  at org.apache.spark.sql.hive.CarbonAnalyzer.execute(CarbonAnalyzer.scala:46)
  at org.apache.spark.sql.hive.CarbonAnalyzer.execute(CarbonAnalyzer.scala:27)
  at org.apache.spark.sql.execution.QueryExecution.analyzed$lzycompute(QueryExecution.scala:69)
  at org.apache.spark.sql.execution.QueryExecution.analyzed(QueryExecution.scala:67)

  at 

[GitHub] carbondata pull request #2335: [CARBONDATA-2573] integrate carbonstore mv br...

2018-06-04 Thread jackylk
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2335#discussion_r192743275
  
--- Diff: 
datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVHelper.scala 
---
@@ -373,5 +373,25 @@ object MVHelper {
   case other => other
 }
   }
+
+  def rewriteWithMVTable(rewrittenPlan: ModularPlan, rewrite: 
QueryRewrite): ModularPlan = {
--- End diff --

please add comment for this func


---


[GitHub] carbondata pull request #2357: [CARBONDATA-2569] Change the strategy of Sear...

2018-06-04 Thread jackylk
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2357#discussion_r192742793
  
--- Diff: 
integration/spark2/src/main/scala/org/apache/spark/sql/CarbonSession.scala ---
@@ -171,19 +170,24 @@ class CarbonSession(@transient val sc: SparkContext,
*/
   private def trySearchMode(qe: QueryExecution, sse: SQLStart): DataFrame 
= {
 val analyzed = qe.analyzed
+val LOG: LogService = 
LogServiceFactory.getLogService(classOf[CarbonSession].getName)
 analyzed match {
   case _@Project(columns, _@Filter(expr, s: SubqueryAlias))
 if s.child.isInstanceOf[LogicalRelation] &&
s.child.asInstanceOf[LogicalRelation].relation
  .isInstanceOf[CarbonDatasourceHadoopRelation] =>
+LOG.info(s"Search service started and supports: ${sse.sqlText}")
 runSearch(analyzed, columns, expr, 
s.child.asInstanceOf[LogicalRelation])
   case gl@GlobalLimit(_, ll@LocalLimit(_, p@Project(columns, 
_@Filter(expr, s: SubqueryAlias
 if s.child.isInstanceOf[LogicalRelation] &&
s.child.asInstanceOf[LogicalRelation].relation
  .isInstanceOf[CarbonDatasourceHadoopRelation] =>
 val logicalRelation = s.child.asInstanceOf[LogicalRelation]
+LOG.info(s"Search service started and supports: ${sse.sqlText}")
--- End diff --

put this log into `runSearch`


---


[GitHub] carbondata pull request #2357: [CARBONDATA-2569] Change the strategy of Sear...

2018-06-04 Thread jackylk
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2357#discussion_r192742562
  
--- Diff: 
integration/spark2/src/main/scala/org/apache/spark/sql/CarbonSession.scala ---
@@ -171,19 +170,24 @@ class CarbonSession(@transient val sc: SparkContext,
*/
   private def trySearchMode(qe: QueryExecution, sse: SQLStart): DataFrame 
= {
 val analyzed = qe.analyzed
+val LOG: LogService = 
LogServiceFactory.getLogService(classOf[CarbonSession].getName)
--- End diff --

change `classOf[CarbonSession]` to `this.getClass`


---


[jira] [Created] (CARBONDATA-2575) Add document to explain DataMap Management

2018-06-04 Thread Jacky Li (JIRA)
Jacky Li created CARBONDATA-2575:


 Summary: Add document to explain DataMap Management
 Key: CARBONDATA-2575
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2575
 Project: CarbonData
  Issue Type: Improvement
Reporter: Jacky Li
 Fix For: 1.5.0, 1.4.1






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] carbondata pull request #2360: [CARBONDATA-2575] Add document to explain Dat...

2018-06-04 Thread jackylk
GitHub user jackylk opened a pull request:

https://github.com/apache/carbondata/pull/2360

[CARBONDATA-2575] Add document to explain DataMap Management

Add document to explain DataMap Management

 - [X] Any interfaces changed?
 No
 - [X] Any backward compatibility impacted?
 No
 - [X] Document update required?
No
 - [X] Testing done
Please provide details on 
- Whether new unit test cases have been added or why no new tests 
are required?
- How it is tested? Please attach test report.
- Is it a performance related change? Please attach the performance 
test report.
- Any additional information to help reviewers in testing this 
change.
   
 - [X] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA. 
NA

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jackylk/incubator-carbondata 
datamap-management-doc

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/2360.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2360


commit f5d3fb0df64e5b7c05c425680f139b9d4f1b3688
Author: Jacky Li 
Date:   2018-06-04T13:18:31Z

add doc




---


[jira] [Created] (CARBONDATA-2574) MV Datamap - MV is not working if there is aggregate function with group by and without any projections.

2018-06-04 Thread Prasanna Ravichandran (JIRA)
Prasanna Ravichandran created CARBONDATA-2574:
-

 Summary: MV Datamap - MV is not working if there is aggregate 
function with group by and without any projections.
 Key: CARBONDATA-2574
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2574
 Project: CarbonData
  Issue Type: Bug
  Components: data-query
 Environment: 3 Node Opensource ANT cluster.
Reporter: Prasanna Ravichandran
 Attachments: MV_aggregate_without_projection_and_with_groupby.docx, 
data.csv

The user query is not fetching data from the MV datamap when it contains an aggregate function with GROUP BY and no other projections.

Test queries:(In Spark-shell)

 

scala> carbon.sql("CREATE TABLE originTable (empno int, empname String, 
designation String, doj Timestamp,workgroupcategory int, workgroupcategoryname 
String, deptno int, deptname String,projectcode int, projectjoindate Timestamp, 
projectenddate Timestamp,attendance int,utilization int,salary int) STORED BY 
'org.apache.carbondata.format'").show(200,false)

++

||

++

++

 

scala> carbon.sql("LOAD DATA local inpath 
'hdfs://hacluster/user/prasanna/data.csv' INTO TABLE originTable 
OPTIONS('DELIMITER'= ',', 'QUOTECHAR'= 
'\"','timestampformat'='dd-MM-')").show(200,false)

++

||

++

++

 

 

scala> carbon.sql("create datamap Mv_misscol using 'mv' as select sum(salary) 
from origintable group by empno").show(200,false)

++

||

++

++

 

 

scala> carbon.sql("rebuild datamap Mv_misscol").show(200,false)

++

||

++

++

 

 

scala> carbon.sql("explain select sum(salary) from origintable group by 
empno").show(200,false)

+---+

|plan   











|

+---+

|== CarbonData Profiler ==

Table Scan on origintable

 - total blocklets: 1

 - filter: none

 - pruned by Main DataMap

    - skipped blocklets: 0

  

[GitHub] carbondata issue #2345: [CARBONDATA-2557] [CARBONDATA-2472] [CARBONDATA-2570...

2018-06-04 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2345
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6248/



---


[GitHub] carbondata issue #2345: [CARBONDATA-2557] [CARBONDATA-2472] [CARBONDATA-2570...

2018-06-04 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2345
  
Build Success with Spark 2.2.1, Please check CI 
http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5086/



---


[GitHub] carbondata issue #2345: [CARBONDATA-2557] [CARBONDATA-2472] [CARBONDATA-2570...

2018-06-04 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/2345
  
SDV Build Success , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/5218/



---


[GitHub] carbondata pull request #2351: [CARBONDATA-2559] task id set for each carbon...

2018-06-04 Thread rahulforallp
Github user rahulforallp closed the pull request at:

https://github.com/apache/carbondata/pull/2351


---


[jira] [Resolved] (CARBONDATA-2559) task id is not being set for CarbonReader

2018-06-04 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-2559.
--
Resolution: Fixed

> task id is not being set for CarbonReader
> -
>
> Key: CARBONDATA-2559
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2559
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Rahul Kumar
>Assignee: Rahul Kumar
>Priority: Major
>  Time Spent: 5h 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] carbondata issue #2351: [CARBONDATA-2559] task id set for each carbonReader ...

2018-06-04 Thread kumarvishal09
Github user kumarvishal09 commented on the issue:

https://github.com/apache/carbondata/pull/2351
  
LGTM


---


[GitHub] carbondata issue #2351: [CARBONDATA-2559] task id set for each carbonReader ...

2018-06-04 Thread rahulforallp
Github user rahulforallp commented on the issue:

https://github.com/apache/carbondata/pull/2351
  
@kumarvishal09  done


---


[GitHub] carbondata issue #2351: [CARBONDATA-2559] task id set for each carbonReader ...

2018-06-04 Thread kumarvishal09
Github user kumarvishal09 commented on the issue:

https://github.com/apache/carbondata/pull/2351
  
@rahulforallp Please update the root cause/issue for this PR.


---


[jira] [Updated] (CARBONDATA-2541) MV Dataset - When MV satisfy filter condition but not exact same condition given during MV creation, then the user query is not accessing the data from MV.

2018-06-04 Thread Prasanna Ravichandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanna Ravichandran updated CARBONDATA-2541:
--
Attachment: data.csv

> MV Dataset - When MV satisfy filter condition but not exact same condition 
> given during MV creation, then the user query is not accessing the data from 
> MV.
> ---
>
> Key: CARBONDATA-2541
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2541
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Reporter: Prasanna Ravichandran
>Priority: Minor
>  Labels: Carbondata, MV, Materialistic_Views
> Attachments: data.csv
>
>
> MV Dataset - When MV satisfy filter condition but not exact same condition 
> given during MV creation, then the user query is not accessing the data from 
> MV.
> Test queries - spark shell:
> scala>carbon.sql("CREATE TABLE originTable (empno int, empname String, 
> designation String, doj Timestamp, workgroupcategory int, 
> workgroupcategoryname String, deptno int, deptname String, projectcode int, 
> projectjoindate Timestamp, projectenddate Timestamp,attendance int, 
> utilization int,salary int) STORED BY 'org.apache.carbondata.format'").show()
> ++
> ||
> ++
> ++
>  
> scala>carbon.sql("LOAD DATA local inpath 
> 'hdfs://hacluster/user/prasanna/data.csv' INTO TABLE originTable 
> OPTIONS('DELIMITER'= ',', 'QUOTECHAR'= 
> '\"','timestampformat'='dd-MM-')").show()
> ++
> ||
> ++
> ++
>  
> scala> carbon.sql("create datamap mv_project3 using 'mv' as select 
> projectenddate,empno from originTable where empno>10").show(200,false)
> ++
> ||
> ++
> ++
> scala> carbon.sql(" rebuild datamap mv_project3").show(200,false)
> ++
> ||
> ++
> ++
> scala> carbon.sql(" explain select projectenddate,empno from originTable 
> where empno>15").show(200,false)
> +-+
> |plan |
> +-+
> |== CarbonData Profiler ==
> Table Scan on origintable
>  - total blocklets: 2
>  - filter: (empno <> null and empno > 15)
>  - pruned by Main DataMap
>  - skipped blocklets: 0
>  |
> |== Physical Plan ==
> *BatchedScan CarbonDatasourceHadoopRelation [ Database name :default, Table 
> name :origintable, Schema 
> :Some(StructType(StructField(empno,IntegerType,true), 
> StructField(empname,StringType,true), 
> StructField(designation,StringType,true), 
> StructField(doj,TimestampType,true), 
> StructField(workgroupcategory,IntegerType,true), 
> StructField(workgroupcategoryname,StringType,true), 
> StructField(deptno,IntegerType,true), StructField(deptname,StringType,true), 
> StructField(projectcode,IntegerType,true), 
> StructField(projectjoindate,TimestampType,true), 
> StructField(projectenddate,TimestampType,true), 
> StructField(attendance,IntegerType,true), 
> StructField(utilization,IntegerType,true), 
> StructField(salary,IntegerType,true))) ] 
> default.origintable[projectenddate#3095,empno#3085] PushedFilters: 
> [IsNotNull(empno), GreaterThan(empno,15)]|
> 
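
The missed rewrite above is a predicate-subsumption case: the MV materializes rows with `empno > 10`, and the query asks for `empno > 15`, a strict subset. A minimal sketch of the containment check such a rewriter needs for a simple `>` bound (illustrative only, not CarbonData's implementation):

```python
def gt_predicate_subsumes(mv_bound, query_bound):
    """An MV storing rows with col > mv_bound can answer a query
    filtering col > query_bound iff the query range is contained
    in the MV range, i.e. query_bound >= mv_bound."""
    return query_bound >= mv_bound

# empno > 15 is contained in the MV's empno > 10, so the MV could serve it
print(gt_predicate_subsumes(10, 15))   # True
# the reverse containment does not hold
print(gt_predicate_subsumes(15, 10))   # False
```

When the containment holds but the predicates are not textually identical, the rewriter must also re-apply the residual query filter on top of the MV scan.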

[jira] [Updated] (CARBONDATA-2539) MV Dataset - Subqueries is not accessing the data from the MV datamap.

2018-06-04 Thread Prasanna Ravichandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanna Ravichandran updated CARBONDATA-2539:
--
Attachment: data.csv

> MV Dataset - Subqueries is not accessing the data from the MV datamap.
> --
>
> Key: CARBONDATA-2539
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2539
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
> Environment: 3 node opensource ANT cluster.
>Reporter: Prasanna Ravichandran
>Priority: Minor
> Attachments: data.csv
>
>
> Inner subquery is not accessing the data from the MV datamap. It is accessing 
> the data from the main table.
> Test queries - Spark shell:
> scala> carbon.sql("drop table if exists origintable").show()
> ++
> ||
> ++
> ++
>  scala> carbon.sql("CREATE TABLE originTable (empno int, empname String, 
> designation String, doj Timestamp, workgroupcategory int, 
> workgroupcategoryname String, deptno int, deptname String, projectcode int, 
> projectjoindate Timestamp, projectenddate Timestamp,attendance int, 
> utilization int,salary int) STORED BY 
> 'org.apache.carbondata.format'").show(200,false)
> ++
> ||
> ++
> ++
> scala> carbon.sql("LOAD DATA local inpath 
> 'hdfs://hacluster/user/prasanna/data.csv' INTO TABLE originTable 
> OPTIONS('DELIMITER'= ',', 'QUOTECHAR'= 
> '\"','timestampformat'='dd-MM-')").show(200,false)
> ++
> ||
> ++
> ++
>  
> scala> carbon.sql("drop datamap datamap_subqry").show(200,false)
> ++
> ||
> ++
> ++
> scala> carbon.sql("create datamap datamap_subqry using 'mv' as select 
> min(salary) from originTable group by empno").show(200,false)
> ++
> ||
> ++
> ++
> scala> carbon.sql("explain SELECT max(empno) FROM originTable WHERE salary IN 
> (select min(salary) from originTable group by empno ) group by 
> empname").show(200,false)
> ++
> |plan |
> 

[jira] [Updated] (CARBONDATA-2537) MV Dataset - User queries with 'having' condition is not accessing the data from the MV datamap.

2018-06-04 Thread Prasanna Ravichandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanna Ravichandran updated CARBONDATA-2537:
--
Attachment: data.csv

> MV Dataset - User queries with 'having' condition is not accessing the data 
> from the MV datamap.
> 
>
> Key: CARBONDATA-2537
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2537
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
> Environment: 3 Node Opensource ANT cluster.
>Reporter: Prasanna Ravichandran
>Priority: Minor
>  Labels: Carbondata, MV, Materialistic_Views
> Attachments: data.csv, image-2018-05-25-15-50-23-903.png
>
>
> User queries with 'having' condition is not accessing the data from the MV 
> datamap. It is accessing the data from the Main table.
> Test queries - spark shell:
> scala>carbon.sql("CREATE TABLE originTable (empno int, empname String, 
> designation String, doj Timestamp, workgroupcategory int, 
> workgroupcategoryname String, deptno int, deptname String, projectcode int, 
> projectjoindate Timestamp, projectenddate Timestamp,attendance int, 
> utilization int,salary int) STORED BY 'org.apache.carbondata.format'").show()
> ++
> ||
> ++
> ++
> scala>carbon.sql("LOAD DATA local inpath 
> 'hdfs://hacluster/user/prasanna/data.csv' INTO TABLE originTable 
> OPTIONS('DELIMITER'= ',', 'QUOTECHAR'= 
> '\"','timestampformat'='dd-MM-')").show()
> ++
> ||
> ++
> ++
> scala> carbon.sql("select empno from originTable having 
> salary>1").show(200,false)
> +-+
> |empno|
> +-+
> |14 |
> |15 |
> |20 |
> |19 |
> +-+
> scala> carbon.sql("create datamap mv_hav using 'mv' as select empno from 
> originTable having salary>1").show(200,false)
> ++
> ||
> ++
> ++
> scala> carbon.sql("explain select empno from originTable having 
> salary>1").show(200,false)
> +---+
> |plan |
> +---+
> |== CarbonData Profiler ==
> Table Scan on origintable
>  - total blocklets: 1
>  - filter: (salary <> null and salary > 1)
>  - pruned by Main DataMap
>  - skipped blocklets: 0
>  |
> |== Physical Plan ==
> *Project [empno#1131]
> +- *BatchedScan CarbonDatasourceHadoopRelation [ Database name :default, 
> Table name :origintable, Schema 
> :Some(StructType(StructField(empno,IntegerType,true), 
> StructField(empname,StringType,true), 
> StructField(designation,StringType,true), 
> StructField(doj,TimestampType,true), 
> StructField(workgroupcategory,IntegerType,true), 
> StructField(workgroupcategoryname,StringType,true), 
> StructField(deptno,IntegerType,true), StructField(deptname,StringType,true), 
> StructField(projectcode,IntegerType,true), 
> StructField(projectjoindate,TimestampType,true), 
> StructField(projectenddate,TimestampType,true), 
> StructField(attendance,IntegerType,true), 
> StructField(utilization,IntegerType,true), 
> StructField(salary,IntegerType,true))) ] default.origintable[empno#1131] 
> PushedFilters: [IsNotNull(salary), GreaterThan(salary,1)]|
> 

[jira] [Updated] (CARBONDATA-2536) MV Dataset - When user query has substring() of column under group by, which is same as the MV group by column, then the user query is not accessing the data from th

2018-06-04 Thread Prasanna Ravichandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanna Ravichandran updated CARBONDATA-2536:
--
Attachment: data.csv

> MV Dataset - When user query has substring() of column under group by, which 
> is same as the MV group by column, then the user query is not accessing the 
> data from the MV datamap table.
> 
>
> Key: CARBONDATA-2536
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2536
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
> Environment: 3 node opensource ANT Cluster
>Reporter: Prasanna Ravichandran
>Priority: Minor
>  Labels: Carbondata, MV, Materialistic_Views
> Attachments: data.csv
>
>
> MV Dataset - When user query has substring() of column under group by, which 
> is same as the  MV group by column, then the user query is not accessing the 
> data from the MV datamap table. It is accessing the data from the main table 
> only.
> Test query:
> carbon.sql("CREATE TABLE originTable (empno int, empname String, designation 
> String, doj Timestamp, workgroupcategory int, workgroupcategoryname String, 
> deptno int, deptname String, projectcode int, projectjoindate Timestamp, 
> projectenddate Timestamp,attendance int, utilization int,salary int) STORED 
> BY 'org.apache.carbondata.format'").show()
> ++
>  
> ++
>  ++
>  
> carbon.sql("LOAD DATA local inpath 'hdfs://hacluster/user/prasanna/data.csv' 
> INTO TABLE originTable OPTIONS('DELIMITER'= ',', 'QUOTECHAR'= 
> '\"','timestampformat'='dd-MM-')").show()
> ++
>  
> ++
>  ++
>  
> scala> carbon.sql("Create datamap m2 using 'mv' as select sum(salary) from 
> originTable group by deptname").show(200,false)
>  ++
>  
> ++
>  ++
> scala> carbon.sql("rebuild datamap m2").show(200,false)
>  ++
>  
> ++
>  ++
>  
> scala> carbon.sql("explain select sum(salary) from originTable group by 
> substring(deptname,2,2)")
>  res60: org.apache.spark.sql.DataFrame = [plan: string]
> scala> carbon.sql("explain select sum(salary) from originTable group by 
> substring(deptname,2,2)").show(200,false)
>  
> +-+
> |plan|
> +-+
> |== CarbonData Profiler ==
>  Table Scan on origintable
>  - total blocklets: 1
>  - filter: none
>  - pruned by Main DataMap
>  - skipped blocklets: 0|
> |== Physical Plan ==
> *HashAggregate(keys=[substring(deptname#1138, 2, 2)#1255], 
> 

[jira] [Updated] (CARBONDATA-2534) MV Dataset - MV creation is not working with the substring()

2018-06-04 Thread Prasanna Ravichandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanna Ravichandran updated CARBONDATA-2534:
--
Attachment: data.csv

> MV Dataset - MV creation is not working with the substring() 
> -
>
> Key: CARBONDATA-2534
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2534
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
> Environment: 3 node opensource ANT cluster
>Reporter: Prasanna Ravichandran
>Priority: Minor
>  Labels: CarbonData, MV, Materialistic_Views
> Attachments: MV_substring.docx, data.csv
>
>
> MV creation is not working with the sub string function. We are getting the 
> spark.sql.AnalysisException while trying to create a MV with the substring 
> and aggregate function. 
> *Spark -shell test queries:*
>  scala> carbon.sql("create datamap mv_substr using 'mv' as select 
> sum(salary),substring(empname,2,5),designation from originTable group by 
> substring(empname,2,5),designation").show(200,false)
> *org.apache.spark.sql.AnalysisException: Cannot create a table having a 
> column whose name contains commas in Hive metastore. Table: 
> `default`.`mv_substr_table`; Column: substring_empname,_2,_5;*
>  *at* 
> org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$org$apache$spark$sql$hive$HiveExternalCatalog$$verifyDataSchema$2.apply(HiveExternalCatalog.scala:150)
>  at 
> org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$org$apache$spark$sql$hive$HiveExternalCatalog$$verifyDataSchema$2.apply(HiveExternalCatalog.scala:148)
>  at scala.collection.immutable.List.foreach(List.scala:381)
>  at 
> org.apache.spark.sql.hive.HiveExternalCatalog.org$apache$spark$sql$hive$HiveExternalCatalog$$verifyDataSchema(HiveExternalCatalog.scala:148)
>  at 
> org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$doCreateTable$1.apply$mcV$sp(HiveExternalCatalog.scala:222)
>  at 
> org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$doCreateTable$1.apply(HiveExternalCatalog.scala:216)
>  at 
> org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$doCreateTable$1.apply(HiveExternalCatalog.scala:216)
>  at 
> org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:97)
>  at 
> org.apache.spark.sql.hive.HiveExternalCatalog.doCreateTable(HiveExternalCatalog.scala:216)
>  at 
> org.apache.spark.sql.catalyst.catalog.ExternalCatalog.createTable(ExternalCatalog.scala:110)
>  at 
> org.apache.spark.sql.catalyst.catalog.SessionCatalog.createTable(SessionCatalog.scala:316)
>  at 
> org.apache.spark.sql.execution.command.CreateDataSourceTableCommand.run(createDataSourceTables.scala:119)
>  at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58)
>  at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56)
>  at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:67)
>  at org.apache.spark.sql.Dataset.<init>(Dataset.scala:183)
>  at 
> org.apache.spark.sql.CarbonSession$$anonfun$sql$1.apply(CarbonSession.scala:108)
>  at 
> org.apache.spark.sql.CarbonSession$$anonfun$sql$1.apply(CarbonSession.scala:97)
>  at org.apache.spark.sql.CarbonSession.withProfiler(CarbonSession.scala:155)
>  at org.apache.spark.sql.CarbonSession.sql(CarbonSession.scala:95)
>  at 
> org.apache.spark.sql.execution.command.table.CarbonCreateTableCommand.processMetadata(CarbonCreateTableCommand.scala:126)
>  at 
> org.apache.spark.sql.execution.command.MetadataCommand.run(package.scala:68)
>  at 
> org.apache.carbondata.mv.datamap.MVHelper$.createMVDataMap(MVHelper.scala:103)
>  at 
> org.apache.carbondata.mv.datamap.MVDataMapProvider.initMeta(MVDataMapProvider.scala:53)
>  at 
> org.apache.spark.sql.execution.command.datamap.CarbonCreateDataMapCommand.processMetadata(CarbonCreateDataMapCommand.scala:118)
>  at 
> org.apache.spark.sql.execution.command.AtomicRunnableCommand.run(package.scala:90)
>  at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58)
>  at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56)
>  at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:67)
>  at org.apache.spark.sql.Dataset.<init>(Dataset.scala:183)
>  at 
> org.apache.spark.sql.CarbonSession$$anonfun$sql$1.apply(CarbonSession.scala:108)
>  at 
> org.apache.spark.sql.CarbonSession$$anonfun$sql$1.apply(CarbonSession.scala:97)
>  at org.apache.spark.sql.CarbonSession.withProfiler(CarbonSession.scala:155)
>  at org.apache.spark.sql.CarbonSession.sql(CarbonSession.scala:95)
>  ... 48 elided



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
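
The AnalysisException above arises because the MV's output column is named after the raw expression text (`substring_empname,_2,_5`), which still contains commas that the Hive metastore rejects in column names. A sketch of a stricter normalization that would avoid this (hypothetical helper, not CarbonData's actual code):

```python
import re

def mv_column_name(expr):
    # Replace every non-alphanumeric character (including the commas
    # that the Hive metastore rejects in column names) with '_',
    # collapsing runs so the generated name stays readable.
    name = re.sub(r"[^0-9A-Za-z]+", "_", expr)
    return name.strip("_").lower()

print(mv_column_name("substring(empname,2,5)"))  # substring_empname_2_5
```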


[jira] [Commented] (CARBONDATA-2534) MV Dataset - MV creation is not working with the substring()

2018-06-04 Thread Prasanna Ravichandran (JIRA)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-2534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16500032#comment-16500032
 ] 

Prasanna Ravichandran commented on CARBONDATA-2534:
---

Base table queries:

CREATE TABLE originTable (empno int, empname String, designation String, doj 
Timestamp,
workgroupcategory int, workgroupcategoryname String, deptno int, deptname 
String,
projectcode int, projectjoindate Timestamp, projectenddate Timestamp,attendance 
int,
utilization int,salary int)
STORED BY 'org.apache.carbondata.format';

LOAD DATA local inpath 'hdfs://hacluster/user/prasanna/data.csv' INTO TABLE 
originTable OPTIONS('DELIMITER'= ',', 'QUOTECHAR'= 
'"','timestampformat'='dd-MM-');

> MV Dataset - MV creation is not working with the substring() 
> -
>
> Key: CARBONDATA-2534
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2534
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
> Environment: 3 node opensource ANT cluster
>Reporter: Prasanna Ravichandran
>Priority: Minor
>  Labels: CarbonData, MV, Materialistic_Views
> Attachments: MV_substring.docx
>
>
> MV creation is not working with the sub string function. We are getting the 
> spark.sql.AnalysisException while trying to create a MV with the substring 
> and aggregate function. 
> *Spark -shell test queries:*
>  scala> carbon.sql("create datamap mv_substr using 'mv' as select 
> sum(salary),substring(empname,2,5),designation from originTable group by 
> substring(empname,2,5),designation").show(200,false)
> *org.apache.spark.sql.AnalysisException: Cannot create a table having a 
> column whose name contains commas in Hive metastore. Table: 
> `default`.`mv_substr_table`; Column: substring_empname,_2,_5;*
>  *at* 
> org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$org$apache$spark$sql$hive$HiveExternalCatalog$$verifyDataSchema$2.apply(HiveExternalCatalog.scala:150)
>  at 
> org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$org$apache$spark$sql$hive$HiveExternalCatalog$$verifyDataSchema$2.apply(HiveExternalCatalog.scala:148)
>  at scala.collection.immutable.List.foreach(List.scala:381)
>  at 
> org.apache.spark.sql.hive.HiveExternalCatalog.org$apache$spark$sql$hive$HiveExternalCatalog$$verifyDataSchema(HiveExternalCatalog.scala:148)
>  at 
> org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$doCreateTable$1.apply$mcV$sp(HiveExternalCatalog.scala:222)
>  at 
> org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$doCreateTable$1.apply(HiveExternalCatalog.scala:216)
>  at 
> org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$doCreateTable$1.apply(HiveExternalCatalog.scala:216)
>  at 
> org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:97)
>  at 
> org.apache.spark.sql.hive.HiveExternalCatalog.doCreateTable(HiveExternalCatalog.scala:216)
>  at 
> org.apache.spark.sql.catalyst.catalog.ExternalCatalog.createTable(ExternalCatalog.scala:110)
>  at 
> org.apache.spark.sql.catalyst.catalog.SessionCatalog.createTable(SessionCatalog.scala:316)
>  at 
> org.apache.spark.sql.execution.command.CreateDataSourceTableCommand.run(createDataSourceTables.scala:119)
>  at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58)
>  at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56)
>  at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:67)
>  at org.apache.spark.sql.Dataset.<init>(Dataset.scala:183)
>  at 
> org.apache.spark.sql.CarbonSession$$anonfun$sql$1.apply(CarbonSession.scala:108)
>  at 
> org.apache.spark.sql.CarbonSession$$anonfun$sql$1.apply(CarbonSession.scala:97)
>  at org.apache.spark.sql.CarbonSession.withProfiler(CarbonSession.scala:155)
>  at org.apache.spark.sql.CarbonSession.sql(CarbonSession.scala:95)
>  at 
> org.apache.spark.sql.execution.command.table.CarbonCreateTableCommand.processMetadata(CarbonCreateTableCommand.scala:126)
>  at 
> org.apache.spark.sql.execution.command.MetadataCommand.run(package.scala:68)
>  at 
> org.apache.carbondata.mv.datamap.MVHelper$.createMVDataMap(MVHelper.scala:103)
>  at 
> org.apache.carbondata.mv.datamap.MVDataMapProvider.initMeta(MVDataMapProvider.scala:53)
>  at 
> org.apache.spark.sql.execution.command.datamap.CarbonCreateDataMapCommand.processMetadata(CarbonCreateDataMapCommand.scala:118)
>  at 
> org.apache.spark.sql.execution.command.AtomicRunnableCommand.run(package.scala:90)
>  at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58)
>  at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56)
>  at 
> 

[jira] [Comment Edited] (CARBONDATA-2533) MV Datamap - MV with Expression in the Aggregation is not fetching data from the MV datamap table.

2018-06-04 Thread Prasanna Ravichandran (JIRA)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-2533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16500028#comment-16500028
 ] 

Prasanna Ravichandran edited comment on CARBONDATA-2533 at 6/4/18 10:43 AM:


Main table queries:

CREATE TABLE fact_table1 (empname String, designation String, doj Timestamp,

workgroupcategory int, workgroupcategoryname String, deptno int, deptname 
String,
projectcode int, projectjoindate Timestamp, projectenddate Timestamp,attendance 
int,
utilization int,salary int)
STORED BY 'org.apache.carbondata.format';
LOAD DATA local inpath 'hdfs://hacluster/user/prasanna/data_mv.csv' INTO TABLE 
fact_table1 OPTIONS('DELIMITER'= ',', 'QUOTECHAR'= 
'"','timestampformat'='dd-MM-');
LOAD DATA local inpath 'hdfs://hacluster/user/prasanna/data_mv.csv' INTO TABLE 
fact_table1 OPTIONS('DELIMITER'= ',', 'QUOTECHAR'= 
'"','timestampformat'='dd-MM-');

 


was (Author: prasanna ravichandran):
Main table queries:

CREATE TABLE originTable (empno int, empname String, designation String, doj 
Timestamp,
workgroupcategory int, workgroupcategoryname String, deptno int, deptname 
String,
projectcode int, projectjoindate Timestamp, projectenddate Timestamp,attendance 
int,
utilization int,salary int)
STORED BY 'org.apache.carbondata.format';

LOAD DATA local inpath 'hdfs://hacluster/user/prasanna/data.csv' INTO TABLE 
originTable OPTIONS('DELIMITER'= ',', 'QUOTECHAR'= 
'"','timestampformat'='dd-MM-');

> MV Datamap - MV with Expression in the Aggregation is not fetching data from 
> the MV datamap table.
> --
>
> Key: CARBONDATA-2533
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2533
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
> Environment: 3 Node Opensource ANT cluster.
>Reporter: Prasanna Ravichandran
>Priority: Minor
>  Labels: CarbonData, MV, Materialistic_Views
> Attachments: data_mv.csv
>
>
> MV with Expression in the Aggregation is not fetching data from the MV 
> datamap table. It is fetching the data from the main table. Please see the 
> table name in the explain query output below for more details.
> *Test queries from Spark -shell:*
> scala> carbon.sql("create datamap MV_expr1 using 'MV' as select sum(case when 
> deptno=11 and (utilization=92) then salary else 0 end) as t from fact_table1 
> group by empno").show(200,false)
> ++
> ||
> ++
> ++
> scala> carbon.sql("rebuild datamap MV_expr1").show(200,false)
> ++
> ||
> ++
> ++
> scala> carbon.sql("explain select sum(case when deptno=11 and 
> (utilization=92) then salary else 0 end) as t from fact_table1 group by 
> empno").show(200,false)
> +-+
> |plan |
> 

[jira] [Updated] (CARBONDATA-2533) MV Datamap - MV with Expression in the Aggregation is not fetching data from the MV datamap table.

2018-06-04 Thread Prasanna Ravichandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanna Ravichandran updated CARBONDATA-2533:
--
Attachment: data_mv.csv

> MV Datamap - MV with Expression in the Aggregation is not fetching data from 
> the MV datamap table.
> --
>
> Key: CARBONDATA-2533
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2533
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
> Environment: 3 Node Opensource ANT cluster.
>Reporter: Prasanna Ravichandran
>Priority: Minor
>  Labels: CarbonData, MV, Materialistic_Views
> Attachments: data_mv.csv
>
>
> MV with Expression in the Aggregation is not fetching data from the MV 
> datamap table. It is fetching the data from the main table. Please see the 
> table name in the explain query output below for more details.
> *Test queries from Spark -shell:*
> scala> carbon.sql("create datamap MV_expr1 using 'MV' as select sum(case when 
> deptno=11 and (utilization=92) then salary else 0 end) as t from fact_table1 
> group by empno").show(200,false)
> ++
> ||
> ++
> ++
> scala> carbon.sql("rebuild datamap MV_expr1").show(200,false)
> ++
> ||
> ++
> ++
> scala> carbon.sql("explain select sum(case when deptno=11 and 
> (utilization=92) then salary else 0 end) as t from fact_table1 group by 
> empno").show(200,false)
> +-+
> |plan |
> +-+
> |== CarbonData Profiler ==
> Table Scan on fact_table1
>  - total blocklets: 2
>  - filter: none
>  - pruned by Main DataMap
>  - skipped blocklets: 0
>  |
> |== Physical Plan ==
> *HashAggregate(keys=[empno#252], functions=[sum(cast(CASE WHEN ((deptno#258 = 
> 11) && (utilization#264 = 92)) THEN salary#265 ELSE 0 END as bigint))])
> +- Exchange hashpartitioning(empno#252, 200)
>  +- *HashAggregate(keys=[empno#252], functions=[partial_sum(cast(CASE WHEN 
> ((deptno#258 = 11) && (utilization#264 = 92)) THEN salary#265 ELSE 0 END as 
> bigint))])
>  +- *BatchedScan CarbonDatasourceHadoopRelation [ Database name :default, 
> *Table name :fact_table1,* Schema 
> :Some(StructType(StructField(empno,IntegerType,true), 
> StructField(empname,StringType,true), 
> StructField(designation,StringType,true), 
> StructField(doj,TimestampType,true), 
> StructField(workgroupcategory,IntegerType,true), 
> StructField(workgroupcategoryname,StringType,true), 
> StructField(deptno,IntegerType,true), 
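
The explain output above shows the query scanning fact_table1 even though the query text matches the MV's defining query. One plausible failure mode — an assumption for illustration, not confirmed against CarbonData's actual matcher — is that expressions are compared with their Spark attribute ids intact (e.g. `salary#265`): the same query parsed twice gets fresh ids, so raw equality fails unless ids are canonicalized first. A minimal Python sketch of id-insensitive comparison:

```python
import re

def canonicalize(expr: str) -> str:
    # Strip Spark-style "#exprId" suffixes (salary#265 -> salary) so that
    # semantically identical expressions compare equal.
    return re.sub(r"#\d+", "", expr)

# Hypothetical expression strings in the style of the plan printed above.
mv_expr = ("sum(CASE WHEN ((deptno#12 = 11) && (utilization#18 = 92)) "
           "THEN salary#19 ELSE 0 END)")
query_expr = ("sum(CASE WHEN ((deptno#258 = 11) && (utilization#264 = 92)) "
              "THEN salary#265 ELSE 0 END)")

assert mv_expr != query_expr                            # raw comparison fails
assert canonicalize(mv_expr) == canonicalize(query_expr)  # canonical forms match
```

Any real rewriter would compare expression trees rather than strings, but the canonicalization step is the same idea.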

[jira] [Commented] (CARBONDATA-2533) MV Datamap - MV with Expression in the Aggregation is not fetching data from the MV datamap table.

2018-06-04 Thread Prasanna Ravichandran (JIRA)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-2533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16500028#comment-16500028
 ] 

Prasanna Ravichandran commented on CARBONDATA-2533:
---

Main table queries:

CREATE TABLE originTable (empno int, empname String, designation String, doj 
Timestamp,
workgroupcategory int, workgroupcategoryname String, deptno int, deptname 
String,
projectcode int, projectjoindate Timestamp, projectenddate Timestamp,attendance 
int,
utilization int,salary int)
STORED BY 'org.apache.carbondata.format';

LOAD DATA local inpath 'hdfs://hacluster/user/prasanna/data.csv' INTO TABLE 
originTable OPTIONS('DELIMITER'= ',', 'QUOTECHAR'= 
'"','timestampformat'='dd-MM-');

> MV Datamap - MV with Expression in the Aggregation is not fetching data from 
> the MV datamap table.
> --
>
> Key: CARBONDATA-2533
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2533
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
> Environment: 3 Node Opensource ANT cluster.
>Reporter: Prasanna Ravichandran
>Priority: Minor
>  Labels: CarbonData, MV, Materialistic_Views
>
> MV with Expression in the Aggregation is not fetching data from the MV 
> datamap table. It is fetching the data from the main table. Please see the 
> table name in the explain query output below for more details.
> *Test queries from Spark -shell:*
> scala> carbon.sql("create datamap MV_expr1 using 'MV' as select sum(case when 
> deptno=11 and (utilization=92) then salary else 0 end) as t from fact_table1 
> group by empno").show(200,false)
> ++
> ||
> ++
> ++
> scala> carbon.sql("rebuild datamap MV_expr1").show(200,false)
> ++
> ||
> ++
> ++
> scala> carbon.sql("explain select sum(case when deptno=11 and 
> (utilization=92) then salary else 0 end) as t from fact_table1 group by 
> empno").show(200,false)
> +-+
> |plan |
> +-+
> |== CarbonData Profiler ==
> Table Scan on fact_table1
>  - total blocklets: 2
>  - filter: none
>  - pruned by Main DataMap
>  - skipped blocklets: 0
>  |
> |== Physical Plan ==
> *HashAggregate(keys=[empno#252], functions=[sum(cast(CASE WHEN ((deptno#258 = 
> 11) && (utilization#264 = 92)) THEN salary#265 ELSE 0 END as bigint))])
> +- Exchange hashpartitioning(empno#252, 200)
>  +- *HashAggregate(keys=[empno#252], functions=[partial_sum(cast(CASE WHEN 
> ((deptno#258 = 11) && (utilization#264 = 92)) THEN salary#265 ELSE 0 

[jira] [Commented] (CARBONDATA-2528) MV Datamap - When the MV is created with the order by, then when we execute the corresponding query defined in MV with order by, then the data is not accessed from

2018-06-04 Thread Prasanna Ravichandran (JIRA)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-2528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16500026#comment-16500026
 ] 

Prasanna Ravichandran commented on CARBONDATA-2528:
---

Main table queries:

CREATE TABLE originTable (empno int, empname String, designation String, doj 
Timestamp,
workgroupcategory int, workgroupcategoryname String, deptno int, deptname 
String,
projectcode int, projectjoindate Timestamp, projectenddate Timestamp,attendance 
int,
utilization int,salary int)
STORED BY 'org.apache.carbondata.format';

LOAD DATA local inpath 'hdfs://hacluster/user/prasanna/data.csv' INTO TABLE 
originTable OPTIONS('DELIMITER'= ',', 'QUOTECHAR'= 
'"','timestampformat'='dd-MM-');

> MV Datamap - When the MV is created with the order by, then when we execute 
> the corresponding query defined in MV with order by, then the data is not 
> accessed from the MV. 
> 
>
> Key: CARBONDATA-2528
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2528
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
> Environment: 3 node Opensource ANT cluster. (Opensource Hadoop 2.7.2+ 
> Opensource Spark 2.2.1+ Opensource Carbondata 1.3.1)
>Reporter: Prasanna Ravichandran
>Priority: Minor
>  Labels: CarbonData, MV, Materialistic_Views
> Attachments: MV_orderby.docx, data.csv
>
>
> When the MV is created with the order by condition and we execute the 
> corresponding query defined in the MV along with order by, the data is not 
> accessed from the MV. The data is being accessed from the main table only. 
> Test queries:
> create datamap MV_order using 'mv' as select 
> empno,sum(salary)+sum(utilization) as total from originTable group by empno 
> order by empno;
> create datamap MV_desc_order using 'mv' as select 
> empno,sum(salary)+sum(utilization) as total from originTable group by empno 
> order by empno DESC;
> rebuild datamap MV_order;
> rebuild datamap MV_desc_order;
> explain select empno,sum(salary)+sum(utilization) as total from originTable 
> group by empno order by empno;
> explain select empno,sum(salary)+sum(utilization) as total from originTable 
> group by empno order by empno DESC;
> Expected result: MV with order by condition should access data from the MV 
> table only.
>  
> Please see the attached document for more details.
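
Why an ORDER BY on top of an otherwise matching aggregate can defeat MV rewriting: a naive matcher compares whole plan trees, so Sort(Aggregate(...)) never equals the MV's Aggregate(...), even though ORDER BY changes only row order, not the rows produced. A minimal Python sketch of this idea (an assumption for illustration — plan nodes here are plain tuples, not CarbonData's actual matcher):

```python
# Plans modeled as nested tuples; the child is always the last element.
mv_plan = ("Aggregate", ("empno",),
           ("sum(salary)+sum(utilization) AS total",), "originTable")
query_plan = ("Sort", ("empno ASC",), mv_plan)

def naive_match(query, mv):
    # Whole-tree equality: fails because of the Sort wrapper.
    return query == mv

def match_modulo_sort(query, mv):
    # Peel Sort nodes before comparing: ORDER BY does not change which
    # rows/values are produced, so the MV can still answer the query
    # (with a re-sort applied on top of the MV scan).
    while query[0] == "Sort":
        query = query[-1]
    return query == mv

assert not naive_match(query_plan, mv_plan)
assert match_modulo_sort(query_plan, mv_plan)
```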



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (CARBONDATA-2528) MV Datamap - When the MV is created with the order by, then when we execute the corresponding query defined in MV with order by, then the data is not accessed from t

2018-06-04 Thread Prasanna Ravichandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanna Ravichandran updated CARBONDATA-2528:
--
Attachment: data.csv

> MV Datamap - When the MV is created with the order by, then when we execute 
> the corresponding query defined in MV with order by, then the data is not 
> accessed from the MV. 
> 
>
> Key: CARBONDATA-2528
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2528
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
> Environment: 3 node Opensource ANT cluster. (Opensource Hadoop 2.7.2+ 
> Opensource Spark 2.2.1+ Opensource Carbondata 1.3.1)
>Reporter: Prasanna Ravichandran
>Priority: Minor
>  Labels: CarbonData, MV, Materialistic_Views
> Attachments: MV_orderby.docx, data.csv
>
>
> When the MV is created with the order by condition and we execute the 
> corresponding query defined in the MV along with order by, the data is not 
> accessed from the MV. The data is being accessed from the main table only. 
> Test queries:
> create datamap MV_order using 'mv' as select 
> empno,sum(salary)+sum(utilization) as total from originTable group by empno 
> order by empno;
> create datamap MV_desc_order using 'mv' as select 
> empno,sum(salary)+sum(utilization) as total from originTable group by empno 
> order by empno DESC;
> rebuild datamap MV_order;
> rebuild datamap MV_desc_order;
> explain select empno,sum(salary)+sum(utilization) as total from originTable 
> group by empno order by empno;
> explain select empno,sum(salary)+sum(utilization) as total from originTable 
> group by empno order by empno DESC;
> Expected result: MV with order by condition should access data from the MV 
> table only.
>  
> Please see the attached document for more details.





[GitHub] carbondata issue #2351: [CARBONDATA-2559] task id set for each carbonReader ...

2018-06-04 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2351
  
Build Success with Spark 2.2.1, Please check CI 
http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5085/



---


[GitHub] carbondata issue #2351: [CARBONDATA-2559] task id set for each carbonReader ...

2018-06-04 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2351
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6247/



---


[jira] [Updated] (CARBONDATA-2526) MV datamap - When the MV datamap is created for the operators: sum(col1)+sum(col2) then when we execute a query of sum(col1+col2) it is not accessing the data from t

2018-06-04 Thread Prasanna Ravichandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanna Ravichandran updated CARBONDATA-2526:
--
Attachment: data.csv

> MV datamap - When the MV datamap is created for the operators: 
> sum(col1)+sum(col2) then when we execute a query of sum(col1+col2) it is not 
> accessing the data from the MV.
> ---
>
> Key: CARBONDATA-2526
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2526
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
> Environment: 3 Node Opensource ANT cluster.
>Reporter: Prasanna Ravichandran
>Priority: Minor
>  Labels: CarbonData, MV, Materialistic_Views
> Attachments: Arithmetic.docx, Screenshot_MV.png, data.csv
>
>
> When the MV datamap is created for an expression like sum(col1)+sum(col2) 
> and we then execute a query like sum(col1+col2), it is not accessing 
> the data from the created MV.
> Test queries:
> CREATE TABLE originTable (empno int, empname String, designation String, doj 
> Timestamp,
>  workgroupcategory int, workgroupcategoryname String, deptno int, deptname 
> String,
>  projectcode int, projectjoindate Timestamp, projectenddate 
> Timestamp,attendance int,
>  utilization int,salary int)
>  STORED BY 'org.apache.carbondata.format';
> LOAD DATA local inpath 'hdfs://hacluster/user/prasanna/data.csv' INTO TABLE 
> originTable OPTIONS('DELIMITER'= ',', 'QUOTECHAR'= 
> '"','timestampformat'='dd-MM-');
> create datamap arithmetic_op using 'mv' as select 
> empno,sum(salary)+sum(utilization) as total , sum(salary)/sum(utilization) as 
> updownratio from originTable where empno>10 group by empno;
> rebuild datamap arithmetic_op; 
>  explain select empno,sum(salary)+sum(utilization) as total , 
> sum(salary)/sum(utilization) as updownratio from originTable where empno>10 
> group by empno;
> explain select empno,sum(salary+utilization) as total from originTable where 
> empno>10 group by empno;
> As sum(col1)+sum(col2) and sum(col1+col2) are equal, the query should point 
> to the same datamap.
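
One caveat to the equivalence claimed in the report: under standard SQL semantics, SUM skips NULLs per column, while col1+col2 is NULL whenever either side is NULL, so sum(col1)+sum(col2) and sum(col1+col2) agree only on NULL-free data. A minimal Python sketch of these semantics (illustrative only, not CarbonData code; None stands for NULL):

```python
def sql_sum(values):
    """SUM over a column: NULLs (None) are skipped; all-NULL input -> NULL."""
    non_null = [v for v in values if v is not None]
    return sum(non_null) if non_null else None

def sum_then_add(col1, col2):
    # sum(col1) + sum(col2): NULL-propagating addition of the two sums.
    a, b = sql_sum(col1), sql_sum(col2)
    return None if a is None or b is None else a + b

def add_then_sum(col1, col2):
    # sum(col1 + col2): row-wise addition is NULL if either side is NULL.
    return sql_sum([None if (x is None or y is None) else x + y
                    for x, y in zip(col1, col2)])

# NULL-free columns: both forms agree, so rewriting to the MV is safe.
salary, utilization = [100, 200, 300], [90, 92, 95]
assert sum_then_add(salary, utilization) == add_then_sum(salary, utilization) == 877

# With a NULL present the forms diverge, which a rewriter must account for.
salary_with_null = [100, None, 300]
assert sum_then_add(salary_with_null, utilization) == 677  # 400 + 277
assert add_then_sum(salary_with_null, utilization) == 585  # (100+90) + (300+95)
```

So the rewrite in the report is valid for these NOT-NULL-behaving int columns, but it is not an unconditional algebraic identity.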





[jira] [Updated] (CARBONDATA-2522) MV dataset when created with Joins, then it is not pointing towards the MV, while executing that join query.

2018-06-04 Thread Prasanna Ravichandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanna Ravichandran updated CARBONDATA-2522:
--
Attachment: data_mv.csv

> MV dataset when created with Joins, then it is not pointing towards the MV, 
> while executing that join query.
> 
>
> Key: CARBONDATA-2522
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2522
> Project: CarbonData
>  Issue Type: Bug
> Environment: 3 Node Opensource ANT Cluster.
>Reporter: Prasanna Ravichandran
>Priority: Minor
>  Labels: MV, Materialistic_Views
> Attachments: MV_joins.docx, data_mv.csv
>
>
> When an MV is created on joined tables, the explain output of that join query 
> points to the main table instead of the created MV datamap.  
> Queries:
> drop table if exists fact_table1;
> CREATE TABLE fact_table1 (empno int, empname String, designation String, doj 
> Timestamp,
> workgroupcategory int, workgroupcategoryname String, deptno int, deptname 
> String,
> projectcode int, projectjoindate Timestamp, projectenddate 
> Timestamp,attendance int,
> utilization int,salary int)
> STORED BY 'org.apache.carbondata.format';
> LOAD DATA local inpath 'hdfs://hacluster/user/prasanna/data_mv.csv' INTO 
> TABLE fact_table1 OPTIONS('DELIMITER'= ',', 'QUOTECHAR'= 
> '"','timestampformat'='dd-MM-');
> LOAD DATA local inpath 'hdfs://hacluster/user/prasanna/data_mv.csv' INTO 
> TABLE fact_table1 OPTIONS('DELIMITER'= ',', 'QUOTECHAR'= 
> '"','timestampformat'='dd-MM-');
> drop table if exists fact_table2;
> CREATE TABLE fact_table2 (empno int, empname String, designation String, doj 
> Timestamp,
> workgroupcategory int, workgroupcategoryname String, deptno int, deptname 
> String,
> projectcode int, projectjoindate Timestamp, projectenddate 
> Timestamp,attendance int,
> utilization int,salary int)
> STORED BY 'org.apache.carbondata.format';
> LOAD DATA local inpath 'hdfs://hacluster/user/prasanna/data_mv.csv' INTO 
> TABLE fact_table2 OPTIONS('DELIMITER'= ',', 'QUOTECHAR'= 
> '"','timestampformat'='dd-MM-');
> LOAD DATA local inpath 'hdfs://hacluster/user/prasanna/data_mv.csv' INTO 
> TABLE fact_table2 OPTIONS('DELIMITER'= ',', 'QUOTECHAR'= 
> '"','timestampformat'='dd-MM-');
> drop table if exists fact_table3;
> CREATE TABLE fact_table3 (empno int, empname String, designation String, doj 
> Timestamp,
> workgroupcategory int, workgroupcategoryname String, deptno int, deptname 
> String,
> projectcode int, projectjoindate Timestamp, projectenddate 
> Timestamp,attendance int,
> utilization int,salary int)
> STORED BY 'org.apache.carbondata.format';
> LOAD DATA local inpath 'hdfs://hacluster/user/prasanna/data_mv.csv' INTO 
> TABLE fact_table3 OPTIONS('DELIMITER'= ',', 'QUOTECHAR'= 
> '"','timestampformat'='dd-MM-');
> LOAD DATA local inpath 'hdfs://hacluster/user/prasanna/data_mv.csv' INTO 
> TABLE fact_table3 OPTIONS('DELIMITER'= ',', 'QUOTECHAR'= 
> '"','timestampformat'='dd-MM-');
> create datamap datamap25 using 'mv' as select t1.empname as c1, 
> t2.designation from fact_table1 t1,fact_table2 t2,fact_table3 t3  where 
> t1.empname = t2.empname and t1.empname=t3.empname;
> explain create datamap datamap25 using 'mv' as select t1.empname as c1, 
> t2.designation from fact_table1 t1,fact_table2 t2,fact_table3 t3  where 
> t1.empname = t2.empname and t1.empname=t3.empname;
>  
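
All of the reports above verify MV usage the same way: run EXPLAIN and read which table the scan hits. A small hypothetical helper (not part of CarbonData; the function name and plan snippets are illustrative) that automates that check over the profiler/plan text:

```python
def uses_datamap(explain_text: str, datamap_table: str) -> bool:
    """True if the printed plan scans the given datamap's backing table."""
    scan_lines = [line for line in explain_text.splitlines()
                  if "Table Scan on" in line or "Table name :" in line]
    return any(datamap_table.lower() in line.lower() for line in scan_lines)

# Plan snippet in the style of the EXPLAIN output quoted in these reports:
plan = """== CarbonData Profiler ==
Table Scan on fact_table1
 - total blocklets: 2
 - filter: none"""

# The scan hits the main table, so the (hypothetical) MV table was not used.
assert not uses_datamap(plan, "datamap25_table")
assert uses_datamap("Table Scan on datamap25_table", "datamap25_table")
```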





[GitHub] carbondata issue #2351: [CARBONDATA-2559] task id set for each carbonReader ...

2018-06-04 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/2351
  
SDV Build Success , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/5217/



---


[GitHub] carbondata issue #2347: [CARBONDATA-2554] Added support for logical type

2018-06-04 Thread kunal642
Github user kunal642 commented on the issue:

https://github.com/apache/carbondata/pull/2347
  
@kumarvishal09 Please review


---


[GitHub] carbondata issue #2345: [CARBONDATA-2557] [CARBONDATA-2472] [CARBONDATA-2570...

2018-06-04 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2345
  
Build Success with Spark 2.2.1, Please check CI 
http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5084/



---


[GitHub] carbondata issue #2345: [CARBONDATA-2557] [CARBONDATA-2472] [CARBONDATA-2570...

2018-06-04 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2345
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6246/



---


[GitHub] carbondata issue #2345: [CARBONDATA-2557] [CARBONDATA-2472] [CARBONDATA-2570...

2018-06-04 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/2345
  
SDV Build Success , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/5216/



---


[GitHub] carbondata issue #2347: [CARBONDATA-2554] Added support for logical type

2018-06-04 Thread sounakr
Github user sounakr commented on the issue:

https://github.com/apache/carbondata/pull/2347
  
LGTM


---


[GitHub] carbondata pull request #2345: [CARBONDATA-2557] [CARBONDATA-2472] [CARBONDA...

2018-06-04 Thread ajantha-bhat
Github user ajantha-bhat commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2345#discussion_r192658697
  
--- Diff: core/src/main/java/org/apache/carbondata/core/metadata/schema/table/CarbonTable.java ---
@@ -226,7 +226,7 @@ public static CarbonTable buildFromTablePath(String tableName, String tablePath,
 } else {
   // Infer the schema from the Carbondata file.
   TableInfo tableInfoInfer =
-  SchemaReader.inferSchema(AbsoluteTableIdentifier.from(tablePath, "null", "null"), false);
+  CarbonUtil.inferDummySchema(tablePath, "null", "null");
--- End diff --

OK. Changed


---


[GitHub] carbondata issue #2335: [CARBONDATA-2573] integrate carbonstore mv branch

2018-06-04 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/2335
  
SDV Build Success , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/5215/



---


[GitHub] carbondata pull request #2345: [CARBONDATA-2557] [CARBONDATA-2472] [CARBONDA...

2018-06-04 Thread ravipesala
Github user ravipesala commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2345#discussion_r192653651
  
--- Diff: core/src/main/java/org/apache/carbondata/core/metadata/schema/table/CarbonTable.java ---
@@ -226,7 +226,7 @@ public static CarbonTable buildFromTablePath(String tableName, String tablePath,
 } else {
   // Infer the schema from the Carbondata file.
   TableInfo tableInfoInfer =
-  SchemaReader.inferSchema(AbsoluteTableIdentifier.from(tablePath, "null", "null"), false);
+  CarbonUtil.inferDummySchema(tablePath, "null", "null");
--- End diff --

Change the name of the method to `buildDummyTable(tableName, tablePath)`


---


[GitHub] carbondata issue #2335: [CARBONDATA-2573] integrate carbonstore mv branch

2018-06-04 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2335
  
Build Success with Spark 2.2.1, Please check CI 
http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5083/



---


[GitHub] carbondata issue #2335: [CARBONDATA-2573] integrate carbonstore mv branch

2018-06-04 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2335
  
Build Failed  with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6245/



---


[GitHub] carbondata issue #2335: [CARBONDATA-2573] integrate carbonstore mv branch

2018-06-04 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/2335
  
SDV Build Success , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/5214/



---


[jira] [Created] (CARBONDATA-2573) Merge carbonstore branch to master

2018-06-04 Thread Ravindra Pesala (JIRA)
Ravindra Pesala created CARBONDATA-2573:
---

 Summary: Merge carbonstore branch to master
 Key: CARBONDATA-2573
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2573
 Project: CarbonData
  Issue Type: Improvement
Reporter: Ravindra Pesala





