[
https://issues.apache.org/jira/browse/CARBONDATA-2534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16565016#comment-16565016
]
Prasanna Ravichandran commented on CARBONDATA-2534:
---------------------------------------------------
The MV creation now succeeds with the substring function, without any error,
but when the user runs the matching query, it is not reading the data from the
MV datamap — the plan still scans the main table.
*Terminal:*
> create datamap mv_substr using 'mv' as select
> sum(salary),substring(empname,2,5),designation from originTable group by
> substring(empname,2,5),designation;
+---------+--+
| Result |
+---------+--+
+---------+--+
No rows selected (0.661 seconds)
> explain select sum(salary),substring(empname,2,5),designation from
> originTable group by substring(empname,2,5),designation;
+------+--+
| plan |
+------+--+
| == CarbonData Profiler ==
Table Scan on origintable
- total blocklets: 2
- filter: none
- pruned by Main DataMap
- skipped blocklets: 0
|
| == Physical Plan ==
*HashAggregate(keys=[substring(empname#18267, 2, 5)#18352, designation#18268],
functions=[sum(cast(salary#18279 as bigint))])
+- Exchange hashpartitioning(substring(empname#18267, 2, 5)#18352,
designation#18268, 200)
+- *HashAggregate(keys=[substring(empname#18267, 2, 5) AS
substring(empname#18267, 2, 5)#18352, designation#18268],
functions=[partial_sum(cast(salary#18279 as bigint))])
+- *FileScan carbondata
*b011.origintable*[empname#18267,designation#18268,salary#18279] |
+------+--+
2 rows selected (0.432 seconds)
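The plan above shows the rewrite did not happen: the FileScan still targets *b011.origintable*. If the optimizer had used the MV, the FileScan should instead name the MV's backing table (assumed here to be `default.mv_substr_table`, following the naming seen in the stack trace below — not verified for this build). A quick check from the same session, as a sketch:

```sql
-- If the rewrite works, the FileScan in this plan should reference
-- the MV's backing table (assumed: default.mv_substr_table),
-- not origintable.
explain select sum(salary), substring(empname,2,5), designation
from originTable
group by substring(empname,2,5), designation;

-- Compare against a direct scan of the assumed backing table,
-- to confirm the MV was populated at all:
select * from default.mv_substr_table limit 5;
```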
> MV Dataset - MV creation is not working with the substring()
> -------------------------------------------------------------
>
> Key: CARBONDATA-2534
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2534
> Project: CarbonData
> Issue Type: Bug
> Components: data-query
> Environment: 3 node opensource ANT cluster
> Reporter: Prasanna Ravichandran
> Priority: Minor
> Labels: CarbonData, MV, Materialistic_Views
> Fix For: 1.5.0, 1.4.1
>
> Attachments: MV_substring.docx, data.csv
>
> Time Spent: 3h 10m
> Remaining Estimate: 0h
>
> MV creation is not working with the substring function. We get an
> org.apache.spark.sql.AnalysisException when trying to create an MV that
> combines substring with an aggregate function.
> *Spark -shell test queries:*
> scala> carbon.sql("create datamap mv_substr using 'mv' as select
> sum(salary),substring(empname,2,5),designation from originTable group by
> substring(empname,2,5),designation").show(200,false)
> *org.apache.spark.sql.AnalysisException: Cannot create a table having a
> column whose name contains commas in Hive metastore. Table:
> `default`.`mv_substr_table`; Column: substring_empname,_2,_5;*
> *at*
> org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$org$apache$spark$sql$hive$HiveExternalCatalog$$verifyDataSchema$2.apply(HiveExternalCatalog.scala:150)
> at
> org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$org$apache$spark$sql$hive$HiveExternalCatalog$$verifyDataSchema$2.apply(HiveExternalCatalog.scala:148)
> at scala.collection.immutable.List.foreach(List.scala:381)
> at
> org.apache.spark.sql.hive.HiveExternalCatalog.org$apache$spark$sql$hive$HiveExternalCatalog$$verifyDataSchema(HiveExternalCatalog.scala:148)
> at
> org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$doCreateTable$1.apply$mcV$sp(HiveExternalCatalog.scala:222)
> at
> org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$doCreateTable$1.apply(HiveExternalCatalog.scala:216)
> at
> org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$doCreateTable$1.apply(HiveExternalCatalog.scala:216)
> at
> org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:97)
> at
> org.apache.spark.sql.hive.HiveExternalCatalog.doCreateTable(HiveExternalCatalog.scala:216)
> at
> org.apache.spark.sql.catalyst.catalog.ExternalCatalog.createTable(ExternalCatalog.scala:110)
> at
> org.apache.spark.sql.catalyst.catalog.SessionCatalog.createTable(SessionCatalog.scala:316)
> at
> org.apache.spark.sql.execution.command.CreateDataSourceTableCommand.run(createDataSourceTables.scala:119)
> at
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58)
> at
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56)
> at
> org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:67)
> at org.apache.spark.sql.Dataset.<init>(Dataset.scala:183)
> at
> org.apache.spark.sql.CarbonSession$$anonfun$sql$1.apply(CarbonSession.scala:108)
> at
> org.apache.spark.sql.CarbonSession$$anonfun$sql$1.apply(CarbonSession.scala:97)
> at org.apache.spark.sql.CarbonSession.withProfiler(CarbonSession.scala:155)
> at org.apache.spark.sql.CarbonSession.sql(CarbonSession.scala:95)
> at
> org.apache.spark.sql.execution.command.table.CarbonCreateTableCommand.processMetadata(CarbonCreateTableCommand.scala:126)
> at
> org.apache.spark.sql.execution.command.MetadataCommand.run(package.scala:68)
> at
> org.apache.carbondata.mv.datamap.MVHelper$.createMVDataMap(MVHelper.scala:103)
> at
> org.apache.carbondata.mv.datamap.MVDataMapProvider.initMeta(MVDataMapProvider.scala:53)
> at
> org.apache.spark.sql.execution.command.datamap.CarbonCreateDataMapCommand.processMetadata(CarbonCreateDataMapCommand.scala:118)
> at
> org.apache.spark.sql.execution.command.AtomicRunnableCommand.run(package.scala:90)
> at
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58)
> at
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56)
> at
> org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:67)
> at org.apache.spark.sql.Dataset.<init>(Dataset.scala:183)
> at
> org.apache.spark.sql.CarbonSession$$anonfun$sql$1.apply(CarbonSession.scala:108)
> at
> org.apache.spark.sql.CarbonSession$$anonfun$sql$1.apply(CarbonSession.scala:97)
> at org.apache.spark.sql.CarbonSession.withProfiler(CarbonSession.scala:155)
> at org.apache.spark.sql.CarbonSession.sql(CarbonSession.scala:95)
> ... 48 elided
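The rejected column name `substring_empname,_2,_5` suggests the MV layer derives the backing table's column names directly from the expression text, leaving commas that the Hive metastore forbids. A possible workaround — assuming this CarbonData version propagates user-supplied aliases into the generated MV schema, which is not confirmed here — is to alias the expressions so no commas reach the metastore:

```sql
-- Hypothetical workaround (aliases are an assumption, not verified):
-- give each projected expression a comma-free alias so the generated
-- MV column names are valid Hive identifiers.
create datamap mv_substr using 'mv' as
select sum(salary) as sum_salary,
       substring(empname,2,5) as emp_sub,
       designation
from originTable
group by substring(empname,2,5), designation;
```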
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)