RussellSpitzer commented on issue #1533:
URL: https://github.com/apache/iceberg/issues/1533#issuecomment-702198282
I believe this is caused by an incompatibility between the Hive client being used and the Hive metastore being used. The inner error is:

    Caused by: InvalidObjectException(message:db)
        at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$create_table_with_environment_context_result$create_table_with_environment_context_resultStandardScheme.read(ThriftHiveMetastore.java:42216)

which is thrown when the Thrift API makes a call and is unable to work with the result. My guess would be that the Hive client included at runtime within Spark is not compatible with the metastore you are running. I know there are several incompatibilities between certain metastore versions and certain versions of Spark. I couldn't find any tickets for a 2.7.x metastore in the Spark JIRA, but I did see support for 2.3, 2.4, 3.0, and 3.1, so it may be related to that. What version of the metastore are you using? (Two sketches for checking and aligning versions follow the quoted report below.)

On Wed, Sep 30, 2020 at 9:17 AM Vijay Akkineni <[email protected]> wrote:

> Hi,
>
> I am new to Iceberg and trying to follow the getting-started section, but I am running into an error while creating a table using Iceberg. Below is my environment setup; any help is appreciated.
>
> Hive: 2.7.3, using Postgres as the metastore
> Spark: 3.0.1 with Hadoop 2.7.4 and S3A
> spark.sql.warehouse.dir: 's3a://testbucket'
>
>     ./spark-sql \
>       --conf spark.sql.catalog.spark_catalog=org.apache.iceberg.spark.SparkSessionCatalog \
>       --conf spark.sql.catalog.spark_catalog.type=hive \
>       --conf spark.sql.catalog.local=org.apache.iceberg.spark.SparkCatalog \
>       --conf spark.sql.catalog.local.type=hive \
>       --conf spark.sql.catalog.local.uri=thrift://192.168.86.240:9083
>
> SQL command:
>
>     CREATE TABLE local.db.sample (id bigint COMMENT 'unique id', data string, category string)
>     USING iceberg
>     PARTITIONED BY (category);
>
> I do see a directory getting created in the S3 bucket (testbucket/db.db/sample/metadata), but the command fails with the error below.
> The exception while creating the table is:
>
>     20/09/30 10:01:00 ERROR SparkSQLDriver: Failed in [CREATE TABLE local.db.sample (id bigint COMMENT 'unique id', data string, category string) USING iceberg PARTITIONED BY (category)]
>     java.lang.RuntimeException: Metastore operation failed for db.sample
>         at org.apache.iceberg.hive.HiveTableOperations.doCommit(HiveTableOperations.java:200)
>         at org.apache.iceberg.BaseMetastoreTableOperations.commit(BaseMetastoreTableOperations.java:103)
>         at org.apache.iceberg.BaseMetastoreCatalog.createTable(BaseMetastoreCatalog.java:71)
>         at org.apache.iceberg.CachingCatalog.lambda$createTable$0(CachingCatalog.java:75)
>         at org.apache.iceberg.shaded.com.github.benmanes.caffeine.cache.BoundedLocalCache.lambda$doComputeIfAbsent$14(BoundedLocalCache.java:2337)
>         at java.base/java.util.concurrent.ConcurrentHashMap.compute(ConcurrentHashMap.java:1908)
>         at org.apache.iceberg.shaded.com.github.benmanes.caffeine.cache.BoundedLocalCache.doComputeIfAbsent(BoundedLocalCache.java:2335)
>         at org.apache.iceberg.shaded.com.github.benmanes.caffeine.cache.BoundedLocalCache.computeIfAbsent(BoundedLocalCache.java:2318)
>         at org.apache.iceberg.shaded.com.github.benmanes.caffeine.cache.LocalCache.computeIfAbsent(LocalCache.java:111)
>         at org.apache.iceberg.shaded.com.github.benmanes.caffeine.cache.LocalManualCache.get(LocalManualCache.java:54)
>         at org.apache.iceberg.CachingCatalog.createTable(CachingCatalog.java:73)
>         at org.apache.iceberg.spark.SparkCatalog.createTable(SparkCatalog.java:136)
>         at org.apache.iceberg.spark.SparkCatalog.createTable(SparkCatalog.java:78)
>         at org.apache.spark.sql.execution.datasources.v2.CreateTableExec.run(CreateTableExec.scala:41)
>         at org.apache.spark.sql.execution.datasources.v2.V2CommandExec.result$lzycompute(V2CommandExec.scala:39)
>         at org.apache.spark.sql.execution.datasources.v2.V2CommandExec.result(V2CommandExec.scala:39)
>         at org.apache.spark.sql.execution.datasources.v2.V2CommandExec.executeCollect(V2CommandExec.scala:45)
>         at org.apache.spark.sql.Dataset.$anonfun$logicalPlan$1(Dataset.scala:229)
>         at org.apache.spark.sql.Dataset.$anonfun$withAction$1(Dataset.scala:3618)
>         at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$5(SQLExecution.scala:100)
>         at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:160)
>         at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:87)
>         at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:764)
>         at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:64)
>         at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3616)
>         at org.apache.spark.sql.Dataset.<init>(Dataset.scala:229)
>         at org.apache.spark.sql.Dataset$.$anonfun$ofRows$2(Dataset.scala:100)
>         at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:764)
>         at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:97)
>         at org.apache.spark.sql.SparkSession.$anonfun$sql$1(SparkSession.scala:607)
>         at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:764)
>         at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:602)
>         at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:650)
>         at org.apache.spark.sql.hive.thriftserver.SparkSQLDriver.run(SparkSQLDriver.scala:63)
>         at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.processCmd(SparkSQLCLIDriver.scala:377)
>         at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.$anonfun$processLine$1(SparkSQLCLIDriver.scala:496)
>         at scala.collection.Iterator.foreach(Iterator.scala:941)
>         at scala.collection.Iterator.foreach$(Iterator.scala:941)
>         at scala.collection.AbstractIterator.foreach(Iterator.scala:1429)
>         at scala.collection.IterableLike.foreach(IterableLike.scala:74)
>         at scala.collection.IterableLike.foreach$(IterableLike.scala:73)
>         at scala.collection.AbstractIterable.foreach(Iterable.scala:56)
>         at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.processLine(SparkSQLCLIDriver.scala:490)
>         at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:282)
>         at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.main(SparkSQLCLIDriver.scala)
>         at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>         at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>         at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
>         at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:928)
>         at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
>         at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
>         at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
>         at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1007)
>         at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1016)
>         at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>     Caused by: InvalidObjectException(message:db)
>         at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$create_table_with_environment_context_result$create_table_with_environment_context_resultStandardScheme.read(ThriftHiveMetastore.java:42216)
>         at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$create_table_with_environment_context_result$create_table_with_environment_context_resultStandardScheme.read(ThriftHiveMetastore.java:42193)
>         at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$create_table_with_environment_context_result.read(ThriftHiveMetastore.java:42119)
>         at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:88)
>         at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_create_table_with_environment_context(ThriftHiveMetastore.java:1203)
>         at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.create_table_with_environment_context(ThriftHiveMetastore.java:1189)
>         at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.create_table_with_environment_context(HiveMetaStoreClient.java:2405)
>         at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:752)
>         at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:740)
>         at org.apache.iceberg.hive.HiveTableOperations.lambda$doCommit$3(HiveTableOperations.java:185)
>         at org.apache.iceberg.hive.ClientPool.run(ClientPool.java:54)
>         at org.apache.iceberg.hive.HiveTableOperations.doCommit(HiveTableOperations.java:184)
>         ... 56 more
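A quick way to answer the version question from the metastore side; this is only a sketch, and the Postgres database name `metastore` is an assumption (use whatever your javax.jdo.option.ConnectionURL points at):

    # Option 1: run on the metastore host; schematool reports the schema
    # version the metastore was initialized with (-dbType matches the RDBMS).
    $HIVE_HOME/bin/schematool -dbType postgres -info

    # Option 2: query the metastore's VERSION table directly in Postgres.
    # ("metastore" is an assumed database name.)
    psql -d metastore -c 'SELECT "SCHEMA_VERSION", "VERSION_COMMENT" FROM "VERSION";'

Either should print something like 2.3.0 or 3.1.0 for the schema version, which you can compare against the Hive client jars on Spark's classpath.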
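If the versions do turn out to be mismatched, one thing worth trying is telling Spark to load Hive client jars that match the metastore server. This is only a sketch under the assumption that a mismatch is the cause; 2.3.7 below is a placeholder version, and these settings control the client used by Spark's built-in session catalog, so whether the Iceberg catalogs end up with compatible classes still depends on what is on the classpath:

    # Placeholder version: substitute whatever the metastore check reported.
    # jars=maven tells Spark to download matching client jars at startup.
    ./spark-sql \
      --conf spark.sql.hive.metastore.version=2.3.7 \
      --conf spark.sql.hive.metastore.jars=maven \
      ...   # plus the catalog --conf flags from your original command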
