[jira] [Commented] (CARBONDATA-2534) MV Dataset - MV creation is not working with the substring()
[ https://issues.apache.org/jira/browse/CARBONDATA-2534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16565578#comment-16565578 ]

Ravindra Pesala commented on CARBONDATA-2534:

After datamap creation you should rebuild the datamap before accessing it; otherwise it will be disabled.

> MV Dataset - MV creation is not working with the substring()
>
> Key: CARBONDATA-2534
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2534
> Project: CarbonData
> Issue Type: Bug
> Components: data-query
> Environment: 3 node opensource ANT cluster
> Reporter: Prasanna Ravichandran
> Priority: Minor
> Labels: CarbonData, MV, Materialistic_Views
> Fix For: 1.5.0, 1.4.1
> Attachments: MV_substring.docx, data.csv
> Time Spent: 3h 10m
> Remaining Estimate: 0h
>
> MV creation is not working with the substring function. We are getting a
> spark.sql.AnalysisException while trying to create an MV with the substring
> and aggregate functions.
>
> Spark-shell test queries:
>
> scala> carbon.sql("create datamap mv_substr using 'mv' as select sum(salary),substring(empname,2,5),designation from originTable group by substring(empname,2,5),designation").show(200,false)
>
> org.apache.spark.sql.AnalysisException: Cannot create a table having a column whose name contains commas in Hive metastore. Table: `default`.`mv_substr_table`; Column: substring_empname,_2,_5;
> at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$org$apache$spark$sql$hive$HiveExternalCatalog$$verifyDataSchema$2.apply(HiveExternalCatalog.scala:150)
> at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$org$apache$spark$sql$hive$HiveExternalCatalog$$verifyDataSchema$2.apply(HiveExternalCatalog.scala:148)
> at scala.collection.immutable.List.foreach(List.scala:381)
> at org.apache.spark.sql.hive.HiveExternalCatalog.org$apache$spark$sql$hive$HiveExternalCatalog$$verifyDataSchema(HiveExternalCatalog.scala:148)
> at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$doCreateTable$1.apply$mcV$sp(HiveExternalCatalog.scala:222)
> at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$doCreateTable$1.apply(HiveExternalCatalog.scala:216)
> at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$doCreateTable$1.apply(HiveExternalCatalog.scala:216)
> at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:97)
> at org.apache.spark.sql.hive.HiveExternalCatalog.doCreateTable(HiveExternalCatalog.scala:216)
> at org.apache.spark.sql.catalyst.catalog.ExternalCatalog.createTable(ExternalCatalog.scala:110)
> at org.apache.spark.sql.catalyst.catalog.SessionCatalog.createTable(SessionCatalog.scala:316)
> at org.apache.spark.sql.execution.command.CreateDataSourceTableCommand.run(createDataSourceTables.scala:119)
> at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58)
> at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56)
> at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:67)
> at org.apache.spark.sql.Dataset.<init>(Dataset.scala:183)
> at org.apache.spark.sql.CarbonSession$$anonfun$sql$1.apply(CarbonSession.scala:108)
> at org.apache.spark.sql.CarbonSession$$anonfun$sql$1.apply(CarbonSession.scala:97)
> at org.apache.spark.sql.CarbonSession.withProfiler(CarbonSession.scala:155)
> at org.apache.spark.sql.CarbonSession.sql(CarbonSession.scala:95)
> at org.apache.spark.sql.execution.command.table.CarbonCreateTableCommand.processMetadata(CarbonCreateTableCommand.scala:126)
> at org.apache.spark.sql.execution.command.MetadataCommand.run(package.scala:68)
> at org.apache.carbondata.mv.datamap.MVHelper$.createMVDataMap(MVHelper.scala:103)
> at org.apache.carbondata.mv.datamap.MVDataMapProvider.initMeta(MVDataMapProvider.scala:53)
> at org.apache.spark.sql.execution.command.datamap.CarbonCreateDataMapCommand.processMetadata(CarbonCreateDataMapCommand.scala:118)
> at org.apache.spark.sql.execution.command.AtomicRunnableCommand.run(package.scala:90)
> at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58)
> at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56)
> at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:67)
> at org.apache.spark.sql.Dataset.<init>(Dataset.scala:183)
> at org.apache.spark.sql.CarbonSession$$anonfun$sql$1.apply(CarbonSession.scala:108)
> at org.apache.spark.sql.CarbonSession$$anonfun$sql$1.apply(CarbonSession.scala:97)
> at
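A minimal sketch of the rebuild-then-query workflow the comment above describes, assuming the deferred-rebuild syntax (REBUILD DATAMAP) available in the releases this issue is fixed for (1.4.1/1.5.0); the table and datamap names are the ones used in this issue:

```sql
-- Sketch only: the datamap is disabled right after creation, so trigger
-- a rebuild before querying it.
REBUILD DATAMAP mv_substr;

-- After the rebuild, the EXPLAIN output should show the scan hitting the
-- MV table (mv_substr_table) instead of origintable.
EXPLAIN SELECT sum(salary), substring(empname, 2, 5), designation
FROM originTable
GROUP BY substring(empname, 2, 5), designation;
```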
[jira] [Commented] (CARBONDATA-2534) MV Dataset - MV creation is not working with the substring()

[ https://issues.apache.org/jira/browse/CARBONDATA-2534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16565167#comment-16565167 ]

Prasanna Ravichandran commented on CARBONDATA-2534:

When the user executes the MV datamap query, it should be accessed from the MV table.
[jira] [Commented] (CARBONDATA-2534) MV Dataset - MV creation is not working with the substring()

[ https://issues.apache.org/jira/browse/CARBONDATA-2534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16565016#comment-16565016 ]

Prasanna Ravichandran commented on CARBONDATA-2534:

Now the MV creation works with the substring function without any error, but when the user runs the MV query, it does not access the data from the MV datamap.

Terminal:

> create datamap mv_substr using 'mv' as select sum(salary),substring(empname,2,5),designation from originTable group by substring(empname,2,5),designation;
+---------+--+
| Result  |
+---------+--+
+---------+--+
No rows selected (0.661 seconds)

> explain select sum(salary),substring(empname,2,5),designation from originTable group by substring(empname,2,5),designation;
+------+--+
| plan |
+------+--+
| == CarbonData Profiler ==
| Table Scan on origintable
|  - total blocklets: 2
|  - filter: none
|  - pruned by Main DataMap
|  - skipped blocklets: 0
|
| == Physical Plan ==
| *HashAggregate(keys=[substring(empname#18267, 2, 5)#18352, designation#18268], functions=[sum(cast(salary#18279 as bigint))])
| +- Exchange hashpartitioning(substring(empname#18267, 2, 5)#18352, designation#18268, 200)
|    +- *HashAggregate(keys=[substring(empname#18267, 2, 5) AS substring(empname#18267, 2, 5)#18352, designation#18268], functions=[partial_sum(cast(salary#18279 as bigint))])
|       +- *FileScan carbondata b011.origintable[empname#18267,designation#18268,salary#18279]
+------+--+
2 rows selected (0.432 seconds)

Note that the FileScan reads b011.origintable, not the MV table.
[jira] [Commented] (CARBONDATA-2534) MV Dataset - MV creation is not working with the substring()

[ https://issues.apache.org/jira/browse/CARBONDATA-2534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16500032#comment-16500032 ]

Prasanna Ravichandran commented on CARBONDATA-2534:

Base table queries:

CREATE TABLE originTable (empno int, empname String, designation String, doj Timestamp, workgroupcategory int, workgroupcategoryname String, deptno int, deptname String, projectcode int, projectjoindate Timestamp, projectenddate Timestamp, attendance int, utilization int, salary int) STORED BY 'org.apache.carbondata.format';

LOAD DATA local inpath 'hdfs://hacluster/user/prasanna/data.csv' INTO TABLE originTable OPTIONS('DELIMITER'= ',', 'QUOTECHAR'= '"', 'timestampformat'='dd-MM-');
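The AnalysisException in this thread arises because the MV layer derives the MV column name from the expression text, so "substring(empname,2,5)" becomes "substring_empname,_2,_5", a name that still contains commas the Hive metastore rejects. A hypothetical sketch of the kind of name sanitization a fix would need (ColumnNameSanitizer and its rules are illustrative only, not CarbonData's actual code):

```scala
// Hypothetical sketch: derive a Hive-safe column name from an expression
// string by replacing the characters the Hive metastore rejects.
object ColumnNameSanitizer {
  def sanitize(expr: String): String =
    expr.toLowerCase
      .replaceAll("[(),]", "_") // commas/parens are illegal in Hive column names
      .replaceAll("\\s", "")    // drop whitespace left around arguments
      .replaceAll("_+", "_")    // collapse runs of underscores
      .stripSuffix("_")         // trim any trailing underscore

  def main(args: Array[String]): Unit = {
    // e.g. "substring(empname, 2, 5)" -> "substring_empname_2_5"
    println(sanitize("substring(empname, 2, 5)"))
  }
}
```

A name produced this way contains no commas, so it would pass the verifyDataSchema check in HiveExternalCatalog that throws in the stack trace above.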