[ 
https://issues.apache.org/jira/browse/CARBONDATA-2536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanna Ravichandran updated CARBONDATA-2536:
----------------------------------------------
    Attachment: data.csv

> MV Dataset - When user query has substring() of column under group by, which 
> is same as the MV group by column, then the user query is not accessing the 
> data from the MV datamap table.
> ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: CARBONDATA-2536
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-2536
>             Project: CarbonData
>          Issue Type: Bug
>          Components: data-query
>         Environment: 3 node opensource ANT Cluster
>            Reporter: Prasanna Ravichandran
>            Priority: Minor
>              Labels: Carbondata, MV, Materialistic_Views
>         Attachments: data.csv
>
>
> MV Dataset - When user query has substring() of column under group by, which 
> is same as the  MV group by column, then the user query is not accessing the 
> data from the MV datamap table. It is accessing the data from the main table 
> only.
> Test query:
> carbon.sql("CREATE TABLE originTable (empno int, empname String, designation 
> String, doj Timestamp, workgroupcategory int, workgroupcategoryname String, 
> deptno int, deptname String, projectcode int, projectjoindate Timestamp, 
> projectenddate Timestamp,attendance int, utilization int,salary int) STORED 
> BY 'org.apache.carbondata.format'").show()
> ++
>  
> ++
>  ++
>  
> carbon.sql("LOAD DATA local inpath 'hdfs://hacluster/user/prasanna/data.csv' 
> INTO TABLE originTable OPTIONS('DELIMITER'= ',', 'QUOTECHAR'= 
> '\"','timestampformat'='dd-MM-yyyy')").show()
> ++
>  
> ++
>  ++
>  
> scala> carbon.sql("Create datamap m2 using 'mv' as select sum(salary) from 
> originTable group by deptname").show(200,false)
>  ++
>  
> ++
>  ++
> scala> carbon.sql("rebuild datamap m2").show(200,false)
>  ++
>  
> ++
>  ++
>  
> scala> carbon.sql("explain select sum(salary) from originTable group by 
> substring(deptname,2,2)")
>  res60: org.apache.spark.sql.DataFrame = [plan: string]
> scala> carbon.sql("explain select sum(salary) from originTable group by 
> substring(deptname,2,2)").show(200,false)
>  
> +-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
> |plan|
> +-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
> |== CarbonData Profiler ==
>  Table Scan on origintable
>  - total blocklets: 1
>  - filter: none
>  - pruned by Main DataMap
>  - skipped blocklets: 0|
> |== Physical Plan ==
>  *HashAggregate(keys=[substring(deptname#1138, 2, 2)#1255|#1138, 2, 2)#1255], 
> functions=[sum(cast(salary#1144 as bigint))|#1144 as bigint))])
>  +- Exchange hashpartitioning(substring(deptname#1138, 2, 2)#1255, 200)
>  +- *HashAggregate(keys=[substring(deptname#1138, 2, 2) AS 
> substring(deptname#1138, 2, 2)#1255|#1138, 2, 2) AS substring(deptname#1138, 
> 2, 2)#1255], functions=[partial_sum(cast(salary#1144 as bigint))|#1144 as 
> bigint))])
>  +- *BatchedScan CarbonDatasourceHadoopRelation [ Database name :default, 
> Table name :origintable, Schema 
> :Some(StructType(StructField(empno,IntegerType,true), 
> StructField(empname,StringType,true), 
> StructField(designation,StringType,true), 
> StructField(doj,TimestampType,true), 
> StructField(workgroupcategory,IntegerType,true), 
> StructField(workgroupcategoryname,StringType,true), 
> StructField(deptno,IntegerType,true), StructField(deptname,StringType,true), 
> StructField(projectcode,IntegerType,true), 
> StructField(projectjoindate,TimestampType,true), 
> StructField(projectenddate,TimestampType,true), 
> StructField(attendance,IntegerType,true), 
> StructField(utilization,IntegerType,true), 
> StructField(salary,IntegerType,true))) ] 
> default.origintable[deptname#1138,salary#1144|#1138,salary#1144]|
> +-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
>  
> Please check the attached document for reference.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to