[
https://issues.apache.org/jira/browse/SPARK-40736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17616399#comment-17616399
]
Pratik Malani commented on SPARK-40736:
---------------------------------------
Hi All,
I removed the hive-service jar from the classpath.
The Spark Thriftserver now starts, but querying the database through the
Thriftserver fails with another error:
{noformat}
java.lang.NullPointerException
at javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:809)
at javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:702)
at org.apache.hadoop.hive.metastore.ObjectStore.getPMF(ObjectStore.java:650)
at org.apache.hadoop.hive.metastore.ObjectStore.unCacheDataNucleusClassLoaders(ObjectStore.java:9708)
at org.apache.hadoop.hive.ql.session.SessionState.unCacheDataNucleusClassLoaders(SessionState.java:1802)
at org.apache.hadoop.hive.ql.session.SessionState.close(SessionState.java:1777)
at org.apache.hive.service.cli.session.HiveSessionImpl.close(HiveSessionImpl.java:669)
at org.apache.hive.service.cli.session.SessionManager.closeSession(SessionManager.java:295)
at org.apache.spark.sql.hive.thriftserver.SparkSQLSessionManager.closeSession(SparkSQLSessionManager.scala:91)
at org.apache.hive.service.cli.CLIService.closeSession(CLIService.java:238)
at org.apache.hive.service.cli.thrift.ThriftCLIService$1.deleteContext(ThriftCLIService.java:107)
at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:325)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
Exception in thread "HiveServer2-Handler-Pool: Thread-68" java.lang.NoSuchMethodError: org.apache.hadoop.hive.ql.QueryState.<init>(Lorg/apache/hadoop/hive/conf/HiveConf;Ljava/util/Map;Z)V
at org.apache.hive.service.cli.operation.Operation.<init>(Operation.java:89)
at org.apache.hive.service.cli.operation.ExecuteStatementOperation.<init>(ExecuteStatementOperation.java:34)
at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.<init>(SparkExecuteStatementOperation.scala:50)
at org.apache.spark.sql.hive.thriftserver.server.SparkSQLOperationManager.newExecuteStatementOperation(SparkSQLOperationManager.scala:55)
at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:481)
at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:472)
at org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:310)
at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:455)
at org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1557)
at org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1542)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:38)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:38)
at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:313)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
{noformat}
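A NoSuchMethodError like the one in the trace above usually means the calling class and the called class come from jars built against different versions. One way to see which jar supplies each class is to ask the JVM for its code source. This is only a minimal sketch: the class name `JarLocator` is hypothetical, and it assumes it is run with the same classpath as the Thriftserver.

```java
import java.security.CodeSource;

final class JarLocator {
    /** Returns the location a class was loaded from, or a placeholder string. */
    static String locate(String className) {
        try {
            // initialize=false: load the class without running its static initializers
            Class<?> cls = Class.forName(className, false, JarLocator.class.getClassLoader());
            CodeSource src = cls.getProtectionDomain().getCodeSource();
            // JDK (bootstrap) classes typically have no code source
            if (src == null || src.getLocation() == null) {
                return "(bootstrap or unknown)";
            }
            return src.getLocation().toString();
        } catch (ClassNotFoundException | LinkageError e) {
            return "(not loadable: " + e + ")";
        }
    }

    public static void main(String[] args) {
        // Class names taken from the stack trace; adjust as needed.
        String[] names = {
            "org.apache.hadoop.hive.ql.QueryState",
            "org.apache.hive.service.cli.operation.Operation",
        };
        for (String name : names) {
            System.out.println(name + " -> " + locate(name));
        }
    }
}
```

If the two classes resolve to jars from different Hive versions, that would be consistent with the error above.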
After investigating, I found that Spark 3.3.0 creates the QueryState
using a constructor:
[https://jar-download.com/artifacts/org.apache.spark/spark-hive-thriftserver_2.12/3.3.0/source-code/org/apache/hive/service/cli/operation/Operation.java]
!image-2022-10-12-18-19-24-455.png|width=581,height=183!
This constructor has been removed from the latest Hive
QueryState.java file:
[https://jar-download.com/artifacts/org.apache.hive/hive-exec/3.1.2/source-code/org/apache/hadoop/hive/ql/QueryState.java]
Hive now uses the Builder pattern to create the QueryState object.
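The missing constructor can also be checked directly with reflection. The sketch below is an assumption-laden illustration, not Spark or Hive code: `ConstructorProbe` is a hypothetical helper, and the parameter type names are read off the descriptor in the NoSuchMethodError (`HiveConf`, `Map`, `boolean`). Types are compared by name so that HiveConf does not have to be referenced at compile time.

```java
import java.lang.reflect.Constructor;

final class ConstructorProbe {
    /** True if className declares a constructor with exactly these parameter type names. */
    static boolean hasConstructor(String className, String... paramTypeNames) {
        try {
            Class<?> cls = Class.forName(className, false, ConstructorProbe.class.getClassLoader());
            for (Constructor<?> ctor : cls.getDeclaredConstructors()) {
                Class<?>[] params = ctor.getParameterTypes();
                if (params.length != paramTypeNames.length) {
                    continue;
                }
                boolean match = true;
                for (int i = 0; i < params.length; i++) {
                    if (!params[i].getName().equals(paramTypeNames[i])) {
                        match = false;
                        break;
                    }
                }
                if (match) {
                    return true;
                }
            }
            return false;
        } catch (ClassNotFoundException | LinkageError e) {
            // Class (or one of its dependencies) is not available on this classpath
            return false;
        }
    }

    public static void main(String[] args) {
        // Signature from the error: QueryState.<init>(HiveConf, Map, boolean)
        boolean present = hasConstructor(
            "org.apache.hadoop.hive.ql.QueryState",
            "org.apache.hadoop.hive.conf.HiveConf", "java.util.Map", "boolean");
        System.out.println("QueryState(HiveConf, Map, boolean) present: " + present);
    }
}
```

Note that a `false` result also covers the case where QueryState is not on the classpath at all, so it is worth confirming the class loads before drawing conclusions.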
Can Spark update this code to be compatible with Hive 3.1.2?
> Spark 3.3.0 doesn't work with Hive 3.1.2
> ----------------------------------------
>
> Key: SPARK-40736
> URL: https://issues.apache.org/jira/browse/SPARK-40736
> Project: Spark
> Issue Type: Bug
> Components: Spark Core, SQL
> Affects Versions: 3.3.0
> Reporter: Pratik Malani
> Priority: Major
> Labels: Hive, spark
> Attachments: image-2022-10-12-18-19-24-455.png
>
>
> Hive 2.3.9 is affected by CVE-2021-34538, so we are trying to use Hive 3.1.2.
> Using Spark 3.3.0 with Hadoop 3.3.4 and Hive 3.1.2, we get the error below when
> starting the Thriftserver:
>
> {noformat}
> Exception in thread "main" java.lang.IllegalAccessError: tried to access class org.apache.hive.service.server.HiveServer2$ServerOptionsProcessor from class org.apache.spark.sql.hive.thriftserver.HiveThriftServer2$
> at org.apache.spark.sql.hive.thriftserver.HiveThriftServer2$.main(HiveThriftServer2.scala:92)
> at org.apache.spark.sql.hive.thriftserver.HiveThriftServer2.main(HiveThriftServer2.scala)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
> at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:958)
> at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
> at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
> at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
> at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1046)
> at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1055)
> at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> {noformat}
> We use the command below to start the Thriftserver:
>
> *spark-class org.apache.spark.deploy.SparkSubmit --class
> org.apache.spark.sql.hive.thriftserver.HiveThriftServer2 spark-internal*
>
> SPARK_HOME is set correctly.
>
> The same works well with Hive 2.3.9, but fails when we upgrade to Hive 3.1.2.