jungho.choi created SPARK-47881:
-----------------------------------
Summary: Not working HDFS path for hive.metastore.jars.path
Key: SPARK-47881
URL: https://issues.apache.org/jira/browse/SPARK-47881
Project: Spark
Issue Type: Question
Components: SQL
Affects Versions: 3.4.2
Reporter: jungho.choi
I am trying to use Hive Metastore version 3.1.3 with Spark version 3.4.2, but I
am encountering an error when specifying the path to the metastore JARs on HDFS.
Following the official documentation, I specified the path using an HDFS URI:
{code:java}
spark.sql.hive.metastore.version 3.1.3
spark.sql.hive.metastore.jars path
spark.sql.hive.metastore.jars.path hdfs://namespace/spark/hive3_lib/* {code}
However, when I tested it, I encountered an error stating that the URI scheme
checked in HiveClientImpl.scala is not "file".
{code:java}
Caused by: java.lang.ExceptionInInitializerError: java.lang.IllegalArgumentException: URI scheme is not "file"
	at org.apache.spark.sql.hive.client.HiveClientImpl$.newHiveConf(HiveClientImpl.scala:1296)
	at org.apache.spark.sql.hive.client.HiveClientImpl.newState(HiveClientImpl.scala:174)
	at org.apache.spark.sql.hive.client.HiveClientImpl.<init>(HiveClientImpl.scala:139)
	at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
	at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:490)
	at org.apache.spark.sql.hive.client.IsolatedClientLoader.createClient(IsolatedClientLoader.scala:315)
	at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:517)
	at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:377)
	at org.apache.spark.sql.hive.HiveExternalCatalog.client$lzycompute(HiveExternalCatalog.scala:70)
	at org.apache.spark.sql.hive.HiveExternalCatalog.client(HiveExternalCatalog.scala:69)
	at org.apache.spark.sql.hive.HiveExternalCatalog.$anonfun$databaseExists$1(HiveExternalCatalog.scala:223)
	at scala.runtime.java8.JFunction0$mcZ$sp.apply(JFunction0$mcZ$sp.java:23)
	at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:101)
	... 143 more {code}
To work around this, I changed spark.sql.hive.metastore.jars.path to a local
file path instead of an HDFS path, and it worked fine. I think I followed the
documentation correctly, so are there any additional configurations required to
use HDFS paths?
--
This message was sent by Atlassian Jira
(v8.20.10#820010)