[jira] [Commented] (CARBONDATA-2537) MV Dataset - User queries with 'having' condition is not accessing the data from the MV datamap.

2018-06-26 Thread Prasanna Ravichandran (JIRA)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-2537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16524638#comment-16524638
 ] 

Prasanna Ravichandran commented on CARBONDATA-2537:
---

User queries are accessing the data from the created MV datamap. Users have to 
rebuild the datamap once after creating it. Closed.

!image-2018-06-27-11-54-31-158.png!
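
The create, rebuild, verify order described in this comment can be sketched as a small Scala helper that only builds the statement strings. The object and method names (MvRebuildSteps, statements) are hypothetical and are not part of CarbonData; the SQL text itself is taken from the statements quoted in this thread.

```scala
// Hypothetical helper: builds the statements in the order the thread describes.
object MvRebuildSteps {
  // name:  datamap name, e.g. "mv_hav"
  // query: the user query that the MV datamap should serve
  def statements(name: String, query: String): Seq[String] = Seq(
    s"create datamap $name using 'mv' as $query", // 1. define the MV datamap
    s"rebuild datamap $name",                     // 2. rebuild it once after creation
    s"explain $query"                             // 3. check which table the plan scans
  )
}
```

In a spark shell with a Carbon session bound to `carbon`, as in the issue description, the statements could then be run with something like `MvRebuildSteps.statements("mv_hav", "select empno from originTable having salary>1").foreach(s => carbon.sql(s).show(200, false))`.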

> MV Dataset - User queries with 'having' condition is not accessing the data 
> from the MV datamap.
> 
>
> Key: CARBONDATA-2537
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2537
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
> Environment: 3 Node Opensource ANT cluster.
>Reporter: Prasanna Ravichandran
>Assignee: xubo245
>Priority: Minor
>  Labels: Carbondata, MV, Materialistic_Views
> Attachments: data.csv, image-2018-05-25-15-50-23-903.png, 
> image-2018-06-27-11-53-49-587.png, image-2018-06-27-11-54-31-158.png
>
>
> User queries with a 'having' condition are not accessing the data from the MV 
> datamap. They are accessing the data from the main table instead.
> Test queries - spark shell:
> scala>carbon.sql("CREATE TABLE originTable (empno int, empname String, 
> designation String, doj Timestamp, workgroupcategory int, 
> workgroupcategoryname String, deptno int, deptname String, projectcode int, 
> projectjoindate Timestamp, projectenddate Timestamp,attendance int, 
> utilization int,salary int) STORED BY 'org.apache.carbondata.format'").show()
> ++
> ||
> ++
> ++
> scala>carbon.sql("LOAD DATA local inpath 
> 'hdfs://hacluster/user/prasanna/data.csv' INTO TABLE originTable 
> OPTIONS('DELIMITER'= ',', 'QUOTECHAR'= 
> '\"','timestampformat'='dd-MM-')").show()
> ++
> ||
> ++
> ++
> scala> carbon.sql("select empno from originTable having 
> salary>1").show(200,false)
> +-----+
> |empno|
> +-----+
> |14   |
> |15   |
> |20   |
> |19   |
> +-----+
> scala> carbon.sql("create datamap mv_hav using 'mv' as select empno from 
> originTable having salary>1").show(200,false)
> ++
> ||
> ++
> ++
> scala> carbon.sql("explain select empno from originTable having 
> salary>1").show(200,false)
> +---+
> |plan |
> +---+
> |== CarbonData Profiler ==
> Table Scan on origintable
>  - total blocklets: 1
>  - filter: (salary <> null and salary > 1)
>  - pruned by Main DataMap
>  - skipped blocklets: 0
>  |
> |== Physical Plan ==
> *Project [empno#1131]
> +- *BatchedScan CarbonDatasourceHadoopRelation [ Database name :default, 
> Table name :origintable, Schema 
> :Some(StructType(StructField(empno,IntegerType,true), 
> StructField(empname,StringType,true), 
> StructField(designation,StringType,true), 
> StructField(doj,TimestampType,true), 
> StructField(workgroupcategory,IntegerType,true), 
> StructField(workgroupcategoryname,StringType,true), 
> StructField(deptno,IntegerType,true), StructField(deptname,StringType,true), 
> StructField(projectcode,IntegerType,true), 
> StructField(projectjoindate,TimestampType,true), 
> StructField(projectenddate,TimestampTyp

[jira] [Commented] (CARBONDATA-2537) MV Dataset - User queries with 'having' condition is not accessing the data from the MV datamap.

2018-06-26 Thread xubo245 (JIRA)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-2537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16524463#comment-16524463
 ] 

xubo245 commented on CARBONDATA-2537:
-

It works fine now after a rebuild on the cluster:


{code:java}
0: jdbc:hive2://hadoop1:1> explain select empno from originTable having 
salary>1;
+--+--+
| plan |
+--+--+
| == CarbonData Profiler ==
Table Scan on origintable
 - total blocklets: 1
 - filter: (salary <> null and salary > 1)
 - pruned by Main DataMap
 - skipped blocklets: 0
 |
| == Physical Plan ==
*Project [empno#1853]
+- *BatchedScan CarbonDatasourceHadoopRelation [ Database name :default, Table 
name :origintable, Schema :Some(StructType(StructField(empno,IntegerType,true), 
StructField(empname,StringType,true), StructField(designation,StringType,true), 
StructField(doj,TimestampType,true), 
StructField(workgroupcategory,IntegerType,true), 
StructField(workgroupcategoryname,StringType,true), 
StructField(deptno,IntegerType,true), StructField(deptname,StringType,true), 
StructField(projectcode,IntegerType,true), 
StructField(projectjoindate,TimestampType,true), 
StructField(projectenddate,TimestampType,true), 
StructField(attendance,IntegerType,true), 
StructField(utilization,IntegerType,true), 
StructField(salary,IntegerType,true))) ] default.origintable[empno#1853] 
PushedFilters: [IsNotNull(salary), GreaterThan(salary,1)]  |
+-

[jira] [Commented] (CARBONDATA-2537) MV Dataset - User queries with 'having' condition is not accessing the data from the MV datamap.

2018-06-26 Thread xubo245 (JIRA)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-2537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16523570#comment-16523570
 ] 

xubo245 commented on CARBONDATA-2537:
-

The query will match the datamap once the datamap has been rebuilt:


{code:java}
0: jdbc:hive2://127.0.0.1:1> create datamap mv_hav using 'mv' as select 
empno from originTable having salary>1;
+-+--+
| Result  |
+-+--+
+-+--+
No rows selected (0.085 seconds)
0: jdbc:hive2://127.0.0.1:1> rebuild datamap mv_hav;
+-+--+
| Result  |
+-+--+
+-+--+
No rows selected (0.131 seconds)
0: jdbc:hive2://127.0.0.1:1> explain select empno from originTable having 
salary>1;
+--+--+
| plan |
+--+--+
| == CarbonData Profiler ==
 |
| == Physical Plan ==
*Project [origintable_empno#7961 AS empno#7985]
+- *BatchedScan CarbonDatasourceHadoopRelation [ Database name :default, Table 
name :mv_hav_table, Schema 
:Some(StructType(StructField(origintable_empno,IntegerType,true))) ] 
default.mv_hav_table[origintable_empno#7961]  |
+--+--+
2 rows selected (0.071 seconds)
0: jdbc:hive2://127.0.0.1:1>  select empno from originTable having 
salary>1;
++--+
| empno  |
++--+
++--+
No rows selected (0.066 seconds)

{code}
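
The difference between the two EXPLAIN outputs in this thread is the table named in the scan: origintable before the rebuild, mv_hav_table after it. A minimal Scala sketch of that check on the plan text follows. The names MvPlanCheck, scannedTables and usesMv are hypothetical, and the `<datamap>_table` naming is an assumption inferred only from the mv_hav / mv_hav_table pair shown above.

```scala
// Hypothetical helper: inspects EXPLAIN text, not a CarbonData API.
object MvPlanCheck {
  // Matches "Table name :<name>" even when the plan text wraps between words.
  private val TableName = """Table\s+name\s*:\s*(\w+)""".r

  // All table names that appear in the scan nodes of the plan text.
  def scannedTables(plan: String): Set[String] =
    TableName.findAllMatchIn(plan).map(_.group(1).toLowerCase).toSet

  // True if the plan scans the table backing the given MV datamap.
  // Assumption: the backing table is named "<datamap>_table".
  def usesMv(plan: String, datamapName: String): Boolean =
    scannedTables(plan).contains(datamapName.toLowerCase + "_table")
}
```

Applied to the plan above, `MvPlanCheck.usesMv(planText, "mv_hav")` would tell whether the query was rewritten to read from the datamap, where `planText` is the physical plan string returned by the explain statement.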

