adnanhb opened a new issue #2176:
URL: https://github.com/apache/iceberg/issues/2176
Hi,

I am running into an exception when writing to an Iceberg table using Spark 3 in local mode. The code is roughly:
```
// Running Spark 3 in local mode
SparkSession spark = SparkSession.builder()
    .master("local[*]")
    .config("spark.sql.catalog.spark_catalog", "org.apache.iceberg.spark.SparkSessionCatalog")
    .config("spark.sql.catalog.spark_catalog.type", "hive")
    .config("spark.sql.catalog.local", "org.apache.iceberg.spark.SparkCatalog")
    .config("spark.sql.catalog.local.type", "hadoop")
    .config("spark.sql.catalog.local.warehouse", "/warehouse")
    .getOrCreate();

// Create the table through a Hadoop catalog pointing at the same location
HadoopCatalog catalog = new HadoopCatalog(new Configuration(), location);
Table table = catalog.createTable(tableId, schema, spec);
table.updateProperties()
    .set(TableProperties.WRITE_NEW_DATA_LOCATION, location)
    .commit();

Dataset<Row> ds = <some dataset>;
ds.writeTo(tableName).append();
```
The above code results in the following exception:
```
Exception in thread "main" org.apache.iceberg.hive.RuntimeMetaException: Failed to connect to Hive Metastore
	at org.apache.iceberg.hive.HiveClientPool.newClient(HiveClientPool.java:63)
	at org.apache.iceberg.hive.HiveClientPool.newClient(HiveClientPool.java:30)
	at org.apache.iceberg.hive.ClientPool.get(ClientPool.java:117)
	at org.apache.iceberg.hive.ClientPool.run(ClientPool.java:52)
	at org.apache.iceberg.hive.HiveTableOperations.doRefresh(HiveTableOperations.java:121)
	at org.apache.iceberg.BaseMetastoreTableOperations.refresh(BaseMetastoreTableOperations.java:86)
	at org.apache.iceberg.BaseMetastoreTableOperations.current(BaseMetastoreTableOperations.java:69)
	at org.apache.iceberg.BaseMetastoreCatalog.loadTable(BaseMetastoreCatalog.java:102)
	at com.github.benmanes.caffeine.cache.BoundedLocalCache.lambda$doComputeIfAbsent$14(BoundedLocalCache.java:2344)
	at java.util.concurrent.ConcurrentHashMap.compute(ConcurrentHashMap.java:1853)
	at com.github.benmanes.caffeine.cache.BoundedLocalCache.doComputeIfAbsent(BoundedLocalCache.java:2342)
	at com.github.benmanes.caffeine.cache.BoundedLocalCache.computeIfAbsent(BoundedLocalCache.java:2325)
	at com.github.benmanes.caffeine.cache.LocalCache.computeIfAbsent(LocalCache.java:108)
	at com.github.benmanes.caffeine.cache.LocalManualCache.get(LocalManualCache.java:62)
	at org.apache.iceberg.CachingCatalog.loadTable(CachingCatalog.java:94)
	at org.apache.iceberg.spark.SparkCatalog.loadTable(SparkCatalog.java:125)
	at org.apache.iceberg.spark.SparkCatalog.loadTable(SparkCatalog.java:78)
	at org.apache.iceberg.spark.SparkSessionCatalog.loadTable(SparkSessionCatalog.java:118)
	at org.apache.spark.sql.connector.catalog.CatalogV2Util$.loadTable(CatalogV2Util.scala:283)
	at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.loaded$lzycompute$1(Analyzer.scala:1010)
	at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.loaded$1(Analyzer.scala:1010)
	at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.$anonfun$lookupRelation$3(Analyzer.scala:1022)
Caused by: MetaException(message:Version information not found in metastore. )
	at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:83)
	at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:92)
	at org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:6902)
	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:164)
	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:129)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
	at org.apache.iceberg.common.DynConstructors$Ctor.newInstanceChecked(DynConstructors.java:60)
	at org.apache.iceberg.common.DynConstructors$Ctor.newInstance(DynConstructors.java:73)
	at org.apache.iceberg.hive.HiveClientPool.newClient(HiveClientPool.java:54)
```
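In case it is relevant: my (possibly incorrect) understanding is that the identifier passed to `writeTo` determines which catalog Spark resolves the table against, and an unqualified name goes through `spark_catalog`, i.e. the Hive-backed `SparkSessionCatalog` configured above, which is what seems to trigger the metastore connection (as in the `SparkSessionCatalog.loadTable` frame in the trace). A rough sketch of what I would expect to go purely through the Hadoop catalog instead; the `db.events` namespace and table name are just placeholders, and `spark` is the session built above:

```
// Sketch only -- not what I ran above, just my understanding of catalog resolution.
// Build a small placeholder dataset from the session configured earlier.
Dataset<Row> ds = spark.range(10)
    .withColumn("value", org.apache.spark.sql.functions.lit("x"));

// Qualifying the identifier with the catalog name ("local") should route the write
// through org.apache.iceberg.spark.SparkCatalog (hadoop) rather than spark_catalog (hive).
ds.writeTo("local.db.events").createOrReplace();
ds.writeTo("local.db.events").append();
```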
Is it not possible to use Iceberg with Spark running in local mode?
Thanks in advance.