[jira] [Created] (CARBONDATA-4278) Avoid refetching all indexes to get segment properties
Mahesh Raju Somalaraju created CARBONDATA-4278:
--
Summary: Avoid refetching all indexes to get segment properties
Key: CARBONDATA-4278
URL: https://issues.apache.org/jira/browse/CARBONDATA-4278
Project: CarbonData
Issue Type: Bug
Reporter: Mahesh Raju Somalaraju

h1. Avoid refetching all indexes to get segment properties
1) When a block index is already available, there is no need to prepare the block index again from the available segments and partition locations.
2) If the block index is available, get the segment properties from it directly.

-- This message was sent by Atlassian Jira (v8.3.4#803005)
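The two points above describe a reuse-before-rebuild lookup. The following is a minimal Java sketch of that idea only. The names `SegmentPropertiesLookup`, `BlockIndex`, `loadAllIndexes` and the string-valued properties are hypothetical and are not CarbonData's actual classes or API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class SegmentPropertiesLookup {
    /** Hypothetical stand-in for a block index that carries its segment properties. */
    static final class BlockIndex {
        private final String segmentProperties;

        BlockIndex(String segmentProperties) {
            this.segmentProperties = segmentProperties;
        }

        String getSegmentProperties() {
            return segmentProperties;
        }
    }

    // Block indexes that are already available, keyed by segment id.
    private final Map<String, List<BlockIndex>> available = new HashMap<>();
    // Counts how often the expensive rebuild path was taken.
    private int rebuildCount = 0;

    void put(String segmentId, List<BlockIndex> indexes) {
        available.put(segmentId, indexes);
    }

    int getRebuildCount() {
        return rebuildCount;
    }

    String getSegmentProperties(String segmentId) {
        List<BlockIndex> cached = available.get(segmentId);
        if (cached != null && !cached.isEmpty()) {
            // Point 2: a block index is available, so read the properties directly.
            return cached.get(0).getSegmentProperties();
        }
        // Otherwise fall back to preparing the indexes, then remember them.
        List<BlockIndex> loaded = loadAllIndexes(segmentId);
        available.put(segmentId, loaded);
        return loaded.get(0).getSegmentProperties();
    }

    /** Hypothetical placeholder for the costly step of preparing all indexes. */
    private List<BlockIndex> loadAllIndexes(String segmentId) {
        rebuildCount++;
        List<BlockIndex> indexes = new ArrayList<>();
        indexes.add(new BlockIndex("properties-of-" + segmentId));
        return indexes;
    }
}
```

In this sketch the rebuild counter only increases when no block index was available for the segment, which is the behaviour the issue asks for.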
[jira] [Issue Comment Deleted] (CARBONDATA-4273) Cannot create table with partitions in Spark in EMR
[ https://issues.apache.org/jira/browse/CARBONDATA-4273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Indhumathi updated CARBONDATA-4273:
---
Comment: was deleted

(was: Can you tell me in which file system environment you are facing this issue? Is it on a Hadoop FileSystem, or are you running this locally?)

> Cannot create table with partitions in Spark in EMR
> ---
>
> Key: CARBONDATA-4273
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4273
> Project: CarbonData
> Issue Type: Bug
> Components: spark-integration
> Affects Versions: 2.2.0
> Environment: Release label: emr-5.24.1
> Hadoop distribution: Amazon 2.8.5
> Applications:
> Hive 2.3.4, Pig 0.17.0, Hue 4.4.0, Flink 1.8.0, Spark 2.4.2, Presto 0.219, JupyterHub 0.9.6
> Jar compiled with:
> apache-carbondata:2.2.0
> spark:2.4.5
> hadoop:2.8.3
> Reporter: Bigicecream
> Priority: Critical
> Labels: EMR, spark
>
> When trying to create a table like this:
> {code:sql}
> CREATE TABLE IF NOT EXISTS will_not_work(
> timestamp string,
> name string
> )
> PARTITIONED BY (dt string, hr string)
> STORED AS carbondata
> LOCATION 's3a://my-bucket/CarbonDataTests/will_not_work'
> {code}
> I get the following error:
> {noformat}
> org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException: Partition is not supported for external table
> at org.apache.spark.sql.parser.CarbonSparkSqlParserUtil$.buildTableInfoFromCatalogTable(CarbonSparkSqlParserUtil.scala:219)
> at org.apache.spark.sql.CarbonSource$.createTableInfo(CarbonSource.scala:235)
> at org.apache.spark.sql.CarbonSource$.createTableMeta(CarbonSource.scala:394)
> at org.apache.spark.sql.execution.command.table.CarbonCreateDataSourceTableCommand.processMetadata(CarbonCreateDataSourceTableCommand.scala:69)
> at org.apache.spark.sql.execution.command.MetadataCommand$$anonfun$run$1.apply(package.scala:137)
> at org.apache.spark.sql.execution.command.MetadataCommand$$anonfun$run$1.apply(package.scala:137)
> at org.apache.spark.sql.execution.command.Auditable$class.runWithAudit(package.scala:118)
> at org.apache.spark.sql.execution.command.MetadataCommand.runWithAudit(package.scala:134)
> at org.apache.spark.sql.execution.command.MetadataCommand.run(package.scala:137)
> at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
> at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
> at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79)
> at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:194)
> at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:194)
> at org.apache.spark.sql.Dataset$$anonfun$53.apply(Dataset.scala:3364)
> at org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:78)
> at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:125)
> at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:73)
> at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3363)
> at org.apache.spark.sql.Dataset.<init>(Dataset.scala:194)
> at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:79)
> at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:643)
> ... 64 elided
> {noformat}
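One possible reading of the error message and the top stack frame above, not confirmed anywhere in this thread, is that the statement was treated as an external table (perhaps because of the `LOCATION` clause) and that partition columns are rejected for such tables. The Java sketch below only illustrates that kind of guard. `ExternalTableCheck`, `MalformedCommandException` and the `hasExplicitLocation` flag are hypothetical names and are not taken from the CarbonData source.

```java
import java.util.List;

final class ExternalTableCheck {
    /** Hypothetical stand-in for the exception type seen in the trace. */
    static final class MalformedCommandException extends RuntimeException {
        MalformedCommandException(String message) {
            super(message);
        }
    }

    /**
     * Rejects the combination reported in the issue: a table that is treated
     * as external (assumed here to follow from an explicit location) together
     * with one or more partition columns.
     */
    static void validate(boolean hasExplicitLocation, List<String> partitionColumns) {
        if (hasExplicitLocation && !partitionColumns.isEmpty()) {
            throw new MalformedCommandException("Partition is not supported for external table");
        }
    }
}
```

Under this assumption, `validate(true, Arrays.asList("dt", "hr"))` fails with the same message as in the report, while a partitioned table without an explicit location, or an unpartitioned table with one, passes the check.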
[jira] [Commented] (CARBONDATA-4273) Cannot create table with partitions in Spark in EMR
[ https://issues.apache.org/jira/browse/CARBONDATA-4273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17405273#comment-17405273 ]

Indhumathi commented on CARBONDATA-4273:

Can you tell me in which file system environment you are facing this issue? Is it on a Hadoop FileSystem, or are you running this locally?

> Cannot create table with partitions in Spark in EMR
> ---
>
> Key: CARBONDATA-4273
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4273
[jira] [Closed] (CARBONDATA-4180) create maintable and do insert before creation of si on maintable, then query on si column from presto does not hit SI
[ https://issues.apache.org/jira/browse/CARBONDATA-4180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mahesh Raju Somalaraju closed CARBONDATA-4180.
--
The reported scenario now works fine, so closing the JIRA.

> create maintable and do insert before creation of si on maintable, then query on si column from presto does not hit SI
> --
>
> Key: CARBONDATA-4180
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4180
> Project: CarbonData
> Issue Type: Bug
> Reporter: Mahesh Raju Somalaraju
> Priority: Minor
> Time Spent: 1h 40m
> Remaining Estimate: 0h
>
> Create the main table and insert data before creating the SI (secondary index) on the main table; a query on the SI column from Presto then does not hit the SI.
>
> Steps:
> 1) Create the main table
> 2) Insert the data
> 3) Create the SI
> 4) Query on the SI column from Presto
> Expectation:
> The query should hit the SI table and fetch the results.
[jira] [Commented] (CARBONDATA-4273) Cannot create table with partitions in Spark in EMR
[ https://issues.apache.org/jira/browse/CARBONDATA-4273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17405241#comment-17405241 ]

Brijoo Bopanna commented on CARBONDATA-4273:

Thanks for sharing this issue. We will check and reply.

> Cannot create table with partitions in Spark in EMR
> ---
>
> Key: CARBONDATA-4273
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4273
[jira] [Updated] (CARBONDATA-4277) Compatibility Issue of GeoSpatial table of CarbonData 2.1.0 in CarbonData 2.2.0 (Spark 2.4.5 and Spark 3.1.1)
[ https://issues.apache.org/jira/browse/CARBONDATA-4277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

PURUJIT CHAUGULE updated CARBONDATA-4277:
-
Priority: Major (was: Minor)

> Compatibility Issue of GeoSpatial table of CarbonData 2.1.0 in CarbonData 2.2.0 (Spark 2.4.5 and Spark 3.1.1)
> -
>
> Key: CARBONDATA-4277
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4277
> Project: CarbonData
> Issue Type: Bug
> Affects Versions: 2.2.0
> Environment: Spark 2.4.5
> Spark 3.1.1
> Reporter: PURUJIT CHAUGULE
> Priority: Major
[jira] [Updated] (CARBONDATA-4277) Compatibility Issue of GeoSpatial table of CarbonData 2.1.0 in CarbonData 2.2.0 (Spark 2.4.5 and Spark 3.1.1)
[ https://issues.apache.org/jira/browse/CARBONDATA-4277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

PURUJIT CHAUGULE updated CARBONDATA-4277:
-
Description:
*Issue 1 : Load on geospatial table from 2.1.0 table in 2.2.0 (Spark 2.4.5 and 3.1.1) is failing*

*STEPS:-*
# create table in CarbonData 2.1.0 : create table source_index_2_1_0(TIMEVALUE BIGINT,LONGITUDE long,LATITUDE long) STORED AS carbondata TBLPROPERTIES ('SPATIAL_INDEX.mygeohash.type'='geohash','SPATIAL_INDEX.mygeohash.sourcecolumns'='longitude, latitude','SPATIAL_INDEX.mygeohash.originLatitude'='39.930753','SPATIAL_INDEX.mygeohash.gridSize'='50','SPATIAL_INDEX.mygeohash.minLongitude'='116.176090','SPATIAL_INDEX.mygeohash.maxLongitude'='116.736367','SPATIAL_INDEX.mygeohash.minLatitude'='39.930753','SPATIAL_INDEX.mygeohash.maxLatitude'='40.179415','SPATIAL_INDEX'='mygeohash','SPATIAL_INDEX.mygeohash.conversionRatio'='100');
# LOAD DATA INPATH 'hdfs://hacluster/chetan/f_lcov_50basic_data.csv' INTO TABLE source_index_2_1_0 OPTIONS('HEADER'='true','DELIMITER'='|', 'QUOTECHAR'='|');
# Take the store of the table and place it in HDFS of the CarbonData 2.2.0 (Spark 2.4.5 and Spark 3.1.1) clusters
# refresh table source_index_2_1_0;
# 0: jdbc:hive2://10.21.19.14:23040/default> LOAD DATA INPATH 'hdfs://hacluster/chetan/f_lcov_50basic_data.csv' INTO TABLE source_index_2_1_0 OPTIONS('HEADER'='true','DELIMITER'='|', 'QUOTECHAR'='|');
Error: org.apache.hive.service.cli.HiveSQLException: Error running query: java.lang.Exception: DataLoad failure: Data Loading failed for table source_index_2_1_0
 at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:361)
 at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.$anonfun$run$2(SparkExecuteStatementOperation.scala:263)
 at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
 at org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties(SparkOperation.scala:78)
 at org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties$(SparkOperation.scala:62)
 at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.withLocalProperties(SparkExecuteStatementOperation.scala:43)
 at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:263)
 at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:258)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:422)
 at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1746)
 at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2.run(SparkExecuteStatementOperation.scala:272)
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
 at java.util.concurrent.FutureTask.run(FutureTask.java:266)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
 at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.Exception: DataLoad failure: Data Loading failed for table source_index_2_1_0
 at org.apache.carbondata.spark.rdd.CarbonDataRDDFactory$.loadCarbonData(CarbonDataRDDFactory.scala:460)
 at org.apache.spark.sql.execution.command.management.CarbonLoadDataCommand.loadData(CarbonLoadDataCommand.scala:226)
 at org.apache.spark.sql.execution.command.management.CarbonLoadDataCommand.processData(CarbonLoadDataCommand.scala:163)
 at org.apache.spark.sql.execution.command.AtomicRunnableCommand.$anonfun$run$3(package.scala:162)
 at org.apache.spark.sql.execution.command.Auditable.runWithAudit(package.scala:118)
 at org.apache.spark.sql.execution.command.Auditable.runWithAudit$(package.scala:114)
 at org.apache.spark.sql.execution.command.AtomicRunnableCommand.runWithAudit(package.scala:155)
 at org.apache.spark.sql.execution.command.AtomicRunnableCommand.run(package.scala:168)
 at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
 at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
 at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79)
 at org.apache.spark.sql.Dataset.$anonfun$logicalPlan$1(Dataset.scala:228)
 at org.apache.spark.sql.Dataset.$anonfun$withAction$1(Dataset.scala:3687)
 at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$5(SQLExecution.scala:103)
 at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:163)
 at org.apache.spark.sql.execution.SQLExecution$.
[jira] [Updated] (CARBONDATA-4277) Compatibility Issue of GeoSpatial table of CarbonData 2.1.0 in CarbonData 2.2.0 (Spark 2.4.5 and Spark 3.1.1)
[ https://issues.apache.org/jira/browse/CARBONDATA-4277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

PURUJIT CHAUGULE updated CARBONDATA-4277:
-
Summary: Compatibility Issue of GeoSpatial table of CarbonData 2.1.0 in CarbonData 2.2.0 (Spark 2.4.5 and Spark 3.1.1) (was: Compatibility Issue of GeoSpatial table of CarbonData 2.1.0 in CarbonData 2.2.0 (Spark 2.4.5 and Spark 3.1.1)))

> Compatibility Issue of GeoSpatial table of CarbonData 2.1.0 in CarbonData 2.2.0 (Spark 2.4.5 and Spark 3.1.1)
> -
>
> Key: CARBONDATA-4277
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4277
> Project: CarbonData
> Issue Type: Bug
> Affects Versions: 2.2.0
> Environment: Spark 2.4.5
> Spark 3.1.1
> Reporter: PURUJIT CHAUGULE
> Priority: Minor
[jira] [Created] (CARBONDATA-4277) Compatibility Issue of GeoSpatial table of CarbonData 2.1.0 in CarbonData 2.2.0 (Spark 2.4.5 and Spark 3.1.1))
PURUJIT CHAUGULE created CARBONDATA-4277:

Summary: Compatibility Issue of GeoSpatial table of CarbonData 2.1.0 in CarbonData 2.2.0 (Spark 2.4.5 and Spark 3.1.1))
Key: CARBONDATA-4277
URL: https://issues.apache.org/jira/browse/CARBONDATA-4277
Project: CarbonData
Issue Type: Bug
Affects Versions: 2.2.0
Environment: Spark 2.4.5
Spark 3.1.1
Reporter: PURUJIT CHAUGULE

*Issue 1 : Load on geo table from 2.1.0 table in 2.2.0 (Spark 2.4.5 and 3.1.1) is failing*

*STEPS:-*
# create table in CarbonData 2.1.0 : create table source_index_2_1_0(TIMEVALUE BIGINT,LONGITUDE long,LATITUDE long) STORED AS carbondata TBLPROPERTIES ('SPATIAL_INDEX.mygeohash.type'='geohash','SPATIAL_INDEX.mygeohash.sourcecolumns'='longitude, latitude','SPATIAL_INDEX.mygeohash.originLatitude'='39.930753','SPATIAL_INDEX.mygeohash.gridSize'='50','SPATIAL_INDEX.mygeohash.minLongitude'='116.176090','SPATIAL_INDEX.mygeohash.maxLongitude'='116.736367','SPATIAL_INDEX.mygeohash.minLatitude'='39.930753','SPATIAL_INDEX.mygeohash.maxLatitude'='40.179415','SPATIAL_INDEX'='mygeohash','SPATIAL_INDEX.mygeohash.conversionRatio'='100');
# LOAD DATA INPATH 'hdfs://hacluster/chetan/f_lcov_50basic_data.csv' INTO TABLE source_index_2_1_0 OPTIONS('HEADER'='true','DELIMITER'='|', 'QUOTECHAR'='|');
# Take the store of the table and place it in HDFS of the CarbonData 2.2.0 (Spark 2.4.5 and Spark 3.1.1) clusters
# refresh table source_index_2_1_0;
# 0: jdbc:hive2://10.21.19.14:23040/default> LOAD DATA INPATH 'hdfs://hacluster/chetan/f_lcov_50basic_data.csv' INTO TABLE source_index_2_1_0 OPTIONS('HEADER'='true','DELIMITER'='|', 'QUOTECHAR'='|');
Error: org.apache.hive.service.cli.HiveSQLException: Error running query: java.lang.Exception: DataLoad failure: Data Loading failed for table source_index_2_1_0
 at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:361)
 at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.$anonfun$run$2(SparkExecuteStatementOperation.scala:263)
 at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
 at org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties(SparkOperation.scala:78)
 at org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties$(SparkOperation.scala:62)
 at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.withLocalProperties(SparkExecuteStatementOperation.scala:43)
 at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:263)
 at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:258)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:422)
 at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1746)
 at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2.run(SparkExecuteStatementOperation.scala:272)
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
 at java.util.concurrent.FutureTask.run(FutureTask.java:266)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
 at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.Exception: DataLoad failure: Data Loading failed for table source_index_2_1_0
 at org.apache.carbondata.spark.rdd.CarbonDataRDDFactory$.loadCarbonData(CarbonDataRDDFactory.scala:460)
 at org.apache.spark.sql.execution.command.management.CarbonLoadDataCommand.loadData(CarbonLoadDataCommand.scala:226)
 at org.apache.spark.sql.execution.command.management.CarbonLoadDataCommand.processData(CarbonLoadDataCommand.scala:163)
 at org.apache.spark.sql.execution.command.AtomicRunnableCommand.$anonfun$run$3(package.scala:162)
 at org.apache.spark.sql.execution.command.Auditable.runWithAudit(package.scala:118)
 at org.apache.spark.sql.execution.command.Auditable.runWithAudit$(package.scala:114)
 at org.apache.spark.sql.execution.command.AtomicRunnableCommand.runWithAudit(package.scala:155)
 at org.apache.spark.sql.execution.command.AtomicRunnableCommand.run(package.scala:168)
 at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
 at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
 at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79)
 at org.apache.spark.sql.Dataset.$anonfun$logicalPlan$1(Dataset.scala:228)
 at org.apache.spark.sql.Dataset.$anonfun$withAction$1(Data
[jira] [Resolved] (CARBONDATA-4234) Alter change datatype at nested levels
[ https://issues.apache.org/jira/browse/CARBONDATA-4234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Indhumathi resolved CARBONDATA-4234.

Fix Version/s: 2.3.0
Resolution: Fixed

> Alter change datatype at nested levels
> --
>
> Key: CARBONDATA-4234
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4234
> Project: CarbonData
> Issue Type: Sub-task
> Reporter: Akshay
> Priority: Major
> Fix For: 2.3.0
[jira] [Resolved] (CARBONDATA-4199) Support renaming of map columns including nested levels
[ https://issues.apache.org/jira/browse/CARBONDATA-4199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Indhumathi resolved CARBONDATA-4199.

Fix Version/s: 2.3.0
Resolution: Fixed

> Support renaming of map columns including nested levels
> ---
>
> Key: CARBONDATA-4199
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4199
> Project: CarbonData
> Issue Type: Sub-task
> Reporter: Akshay
> Priority: Minor
> Fix For: 2.3.0
[jira] [Resolved] (CARBONDATA-4198) Support adding of single-level and multi-level map columns
[ https://issues.apache.org/jira/browse/CARBONDATA-4198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Indhumathi resolved CARBONDATA-4198.

Fix Version/s: 2.3.0
Resolution: Fixed

> Support adding of single-level and multi-level map columns
> --
>
> Key: CARBONDATA-4198
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4198
> Project: CarbonData
> Issue Type: Sub-task
> Reporter: Akshay
> Priority: Minor
> Fix For: 2.3.0
[jira] [Resolved] (CARBONDATA-4164) Support adding of multi-level complex columns(array/struct)
[ https://issues.apache.org/jira/browse/CARBONDATA-4164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Indhumathi resolved CARBONDATA-4164.

Fix Version/s: 2.3.0
Resolution: Fixed

> Support adding of multi-level complex columns(array/struct)
> ---
>
> Key: CARBONDATA-4164
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4164
> Project: CarbonData
> Issue Type: Sub-task
> Components: spark-integration
> Reporter: Akshay
> Priority: Major
> Fix For: 2.3.0
>
> Time Spent: 8h
> Remaining Estimate: 0h
>
> Add multi-level (up to 3 nested levels) complex columns (array and struct only) to a carbon table. For example:
> Command:
> ALTER TABLE ADD COLUMNS(arr array >)