[
https://issues.apache.org/jira/browse/CARBONDATA-2536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Prasanna Ravichandran updated CARBONDATA-2536:
----------------------------------------------
Attachment: data.csv
> MV Dataset - When user query has substring() of column under group by, which
> is same as the MV group by column, then the user query is not accessing the
> data from the MV datamap table.
> ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: CARBONDATA-2536
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2536
> Project: CarbonData
> Issue Type: Bug
> Components: data-query
> Environment: 3 node opensource ANT Cluster
> Reporter: Prasanna Ravichandran
> Priority: Minor
> Labels: Carbondata, MV, Materialistic_Views
> Attachments: data.csv
>
>
> MV Dataset - When a user query groups by substring() of a column that is also
> the MV's group by column, the query does not read from the MV datamap table.
> It reads from the main table only.
> Test query:
> carbon.sql("CREATE TABLE originTable (empno int, empname String, designation
> String, doj Timestamp, workgroupcategory int, workgroupcategoryname String,
> deptno int, deptname String, projectcode int, projectjoindate Timestamp,
> projectenddate Timestamp,attendance int, utilization int,salary int) STORED
> BY 'org.apache.carbondata.format'").show()
> ++
> ||
> ++
> ++
>
> carbon.sql("LOAD DATA local inpath 'hdfs://hacluster/user/prasanna/data.csv'
> INTO TABLE originTable OPTIONS('DELIMITER'= ',', 'QUOTECHAR'=
> '\"','timestampformat'='dd-MM-yyyy')").show()
> ++
> ||
> ++
> ++
>
> scala> carbon.sql("Create datamap m2 using 'mv' as select sum(salary) from
> originTable group by deptname").show(200,false)
> ++
> ||
> ++
> ++
> scala> carbon.sql("rebuild datamap m2").show(200,false)
> ++
> ||
> ++
> ++
>
> scala> carbon.sql("explain select sum(salary) from originTable group by
> substring(deptname,2,2)")
> res60: org.apache.spark.sql.DataFrame = [plan: string]
> scala> carbon.sql("explain select sum(salary) from originTable group by
> substring(deptname,2,2)").show(200,false)
>
> +-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
> |plan|
> +-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
> |== CarbonData Profiler ==
> Table Scan on origintable
> - total blocklets: 1
> - filter: none
> - pruned by Main DataMap
> - skipped blocklets: 0|
> |== Physical Plan ==
> *HashAggregate(keys=[substring(deptname#1138, 2, 2)#1255],
> functions=[sum(cast(salary#1144 as bigint))])
> +- Exchange hashpartitioning(substring(deptname#1138, 2, 2)#1255, 200)
> +- *HashAggregate(keys=[substring(deptname#1138, 2, 2) AS
> substring(deptname#1138, 2, 2)#1255],
> functions=[partial_sum(cast(salary#1144 as bigint))])
> +- *BatchedScan CarbonDatasourceHadoopRelation [ Database name :default,
> Table name :origintable, Schema
> :Some(StructType(StructField(empno,IntegerType,true),
> StructField(empname,StringType,true),
> StructField(designation,StringType,true),
> StructField(doj,TimestampType,true),
> StructField(workgroupcategory,IntegerType,true),
> StructField(workgroupcategoryname,StringType,true),
> StructField(deptno,IntegerType,true), StructField(deptname,StringType,true),
> StructField(projectcode,IntegerType,true),
> StructField(projectjoindate,TimestampType,true),
> StructField(projectenddate,TimestampType,true),
> StructField(attendance,IntegerType,true),
> StructField(utilization,IntegerType,true),
> StructField(salary,IntegerType,true))) ]
> default.origintable[deptname#1138,salary#1144]|
> +-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
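The plan above shows the scan hitting origintable. A minimal sketch of how that check could be automated is given here, assuming the plan text keeps the "Table name :<name>" fragment printed by CarbonDatasourceHadoopRelation. The object and method names (PlanTables, scannedTables, usesOnlyMainTable) are hypothetical helpers, not part of CarbonData or Spark.

```scala
object PlanTables {
  // Matches the "Table name :<name>" fragment seen in the explain output.
  // Assumption: the relation keeps printing the table name in this form.
  private val TableName = """Table name\s*:\s*(\w+)""".r

  // Distinct, lower-cased table names that appear as scan targets in the plan text.
  def scannedTables(plan: String): List[String] =
    TableName.findAllMatchIn(plan).map(_.group(1).toLowerCase).toList.distinct

  // True when at least one scan was found and every scan reads the main table,
  // i.e. the query was not redirected to any other table.
  def usesOnlyMainTable(plan: String, mainTable: String): Boolean = {
    val tables = scannedTables(plan)
    tables.nonEmpty && tables.forall(_ == mainTable.toLowerCase)
  }
}
```

In the spark-shell session from the report, the plan text could be gathered with something like carbon.sql("explain ...").collect().map(_.getString(0)).mkString("\n") and passed to PlanTables.usesOnlyMainTable(planText, "originTable"). A result of true would match the behaviour described in this issue.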
>
> Please check the attached document for reference.
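For context on why a rewrite looks feasible here: substring(deptname,2,2) is a function of deptname alone, so sums grouped by deptname can be re-aggregated into sums grouped by the substring. The sketch below illustrates that equivalence with plain Scala collections. The sample rows are hypothetical and are not taken from data.csv. It also assumes the MV keeps deptname alongside each sum, which the MV definition in this report does not select explicitly.

```scala
object MvRollupSketch {
  // Hypothetical (deptname, salary) rows, for illustration only.
  val rows: Seq[(String, Int)] = Seq(
    ("network", 100),
    ("network", 50),
    ("security", 70),
    ("recovery", 30),
    ("protocol", 20)
  )

  // Mirrors Spark SQL substring(deptname, 2, 2): 1-based start position, length 2.
  def substr22(s: String): String = s.slice(1, 3)

  // User query evaluated directly on the main table rows.
  def direct(data: Seq[(String, Int)]): Map[String, Long] =
    data.groupBy(r => substr22(r._1)).map { case (k, rs) => k -> rs.map(_._2.toLong).sum }

  // Assumed MV content: sum(salary) per deptname, with deptname kept as the key.
  def mv(data: Seq[(String, Int)]): Map[String, Long] =
    data.groupBy(_._1).map { case (k, rs) => k -> rs.map(_._2.toLong).sum }

  // Same user query answered by re-aggregating the per-deptname sums.
  def rollup(mvRows: Map[String, Long]): Map[String, Long] =
    mvRows.toSeq.groupBy { case (dept, _) => substr22(dept) }
      .map { case (k, xs) => k -> xs.map(_._2).sum }

  def main(args: Array[String]): Unit = {
    // Prints whether both evaluation paths agree on the sample rows.
    println(rollup(mv(rows)) == direct(rows))
  }
}
```

The two paths agree because sum is decomposable: each deptname falls into exactly one substring group, so adding the partial sums per group gives the same totals as scanning the raw rows.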