Cheng Lian created SPARK-11785:
----------------------------------

             Summary: When deployed against remote Hive metastore with lower versions, JDBC metadata calls throws exception
                 Key: SPARK-11785
                 URL: https://issues.apache.org/jira/browse/SPARK-11785
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 1.5.1, 1.6.0, 1.7.0
            Reporter: Cheng Lian

To reproduce this issue with 1.7-SNAPSHOT:

# Start the Hive 0.13.1 metastore service using {{$HIVE_HOME/bin/hive --service metastore}}
# Configure the remote Hive metastore in {{conf/hive-site.xml}} by pointing {{hive.metastore.uris}} to the metastore endpoint (e.g. {{thrift://localhost:9083}})
# Set {{spark.sql.hive.metastore.version}} to {{0.13.1}} and {{spark.sql.hive.metastore.jars}} to {{maven}} in {{conf/spark-defaults.conf}}
# Start the Thrift server using {{$SPARK_HOME/sbin/start-thriftserver.sh}}
# Run the testing JDBC client program attached at the end

Exception thrown from client side:

{noformat}
java.sql.SQLException: Could not create ResultSet: Required field 'operationHandle' is unset! Struct:TGetResultSetMetadataReq(operationHandle:null)
java.sql.SQLException: Could not create ResultSet: Required field 'operationHandle' is unset! Struct:TGetResultSetMetadataReq(operationHandle:null)
    at org.apache.hive.jdbc.HiveQueryResultSet.retrieveSchema(HiveQueryResultSet.java:273)
    at org.apache.hive.jdbc.HiveQueryResultSet.<init>(HiveQueryResultSet.java:188)
    at org.apache.hive.jdbc.HiveQueryResultSet$Builder.build(HiveQueryResultSet.java:170)
    at org.apache.hive.jdbc.HiveDatabaseMetaData.getColumns(HiveDatabaseMetaData.java:222)
    at JDBCExperiments$.main(JDBCExperiments.scala:28)
    at JDBCExperiments.main(JDBCExperiments.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
Caused by: org.apache.thrift.protocol.TProtocolException: Required field 'operationHandle' is unset! Struct:TGetResultSetMetadataReq(operationHandle:null)
    at org.apache.hive.service.cli.thrift.TGetResultSetMetadataReq.validate(TGetResultSetMetadataReq.java:290)
    at org.apache.hive.service.cli.thrift.TCLIService$GetResultSetMetadata_args.validate(TCLIService.java:12041)
    at org.apache.hive.service.cli.thrift.TCLIService$GetResultSetMetadata_args$GetResultSetMetadata_argsStandardScheme.write(TCLIService.java:12098)
    at org.apache.hive.service.cli.thrift.TCLIService$GetResultSetMetadata_args$GetResultSetMetadata_argsStandardScheme.write(TCLIService.java:12067)
    at org.apache.hive.service.cli.thrift.TCLIService$GetResultSetMetadata_args.write(TCLIService.java:12018)
    at org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:63)
    at org.apache.hive.service.cli.thrift.TCLIService$Client.send_GetResultSetMetadata(TCLIService.java:472)
    at org.apache.hive.service.cli.thrift.TCLIService$Client.GetResultSetMetadata(TCLIService.java:464)
    at org.apache.hive.jdbc.HiveQueryResultSet.retrieveSchema(HiveQueryResultSet.java:242)
    at org.apache.hive.jdbc.HiveQueryResultSet.<init>(HiveQueryResultSet.java:188)
    at org.apache.hive.jdbc.HiveQueryResultSet$Builder.build(HiveQueryResultSet.java:170)
    at org.apache.hive.jdbc.HiveDatabaseMetaData.getColumns(HiveDatabaseMetaData.java:222)
    at JDBCExperiments$.main(JDBCExperiments.scala:28)
    at JDBCExperiments.main(JDBCExperiments.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
{noformat}

Exception thrown from server side:

{noformat}
15/11/18 02:27:01 WARN RetryingMetaStoreClient: MetaStoreClient lost connection. Attempting to reconnect.
org.apache.thrift.TApplicationException: Invalid method name: 'get_schema_with_environment_context'
    at org.apache.thrift.TApplicationException.read(TApplicationException.java:111)
    at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:71)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_schema_with_environment_context(ThriftHiveMetastore.java:1010)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_schema_with_environment_context(ThriftHiveMetastore.java:995)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getSchema(HiveMetaStoreClient.java:1499)
    at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.getSchema(SessionHiveMetaStoreClient.java:239)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:156)
    at com.sun.proxy.$Proxy13.getSchema(Unknown Source)
    at org.apache.hive.service.cli.operation.GetColumnsOperation.runInternal(GetColumnsOperation.java:160)
    at org.apache.hive.service.cli.operation.Operation.run(Operation.java:257)
    at org.apache.hive.service.cli.session.HiveSessionImpl.getColumns(HiveSessionImpl.java:519)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78)
    at org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36)
    at org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:63)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
    at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59)
    at com.sun.proxy.$Proxy18.getColumns(Unknown Source)
    at org.apache.hive.service.cli.CLIService.getColumns(CLIService.java:350)
    at org.apache.hive.service.cli.thrift.ThriftCLIService.GetColumns(ThriftCLIService.java:575)
    at org.apache.hive.service.cli.thrift.TCLIService$Processor$GetColumns.getResult(TCLIService.java:1433)
    at org.apache.hive.service.cli.thrift.TCLIService$Processor$GetColumns.getResult(TCLIService.java:1418)
    at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
    at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
    at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56)
    at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
{noformat}

I suspect it's related to SPARK-9686 and SPARK-11783. My guess is that:

# When deployed against a remote Hive metastore, the execution Hive client points to the actual Hive metastore rather than the local execution Derby metastore, while still using the Hive 1.2.1 libraries delivered together with Spark (SPARK-11783).
# JDBC calls are not properly dispatched to the metastore Hive client in the Thrift server, but are handled by the execution Hive client instead (SPARK-9686).
# When a JDBC metadata call such as {{getColumns()}} arrives (the call that fails in the traces above), the execution Hive client, which uses a higher version (1.2.1), is used to talk to a lower version Hive metastore (0.13.1). Because of incompatible changes made between these two versions, the Thrift RPC call fails and exceptions are thrown.

(This assumption hasn't been fully verified yet.)

The testing JDBC program:

{code}
import java.sql.DriverManager

object JDBCExperiments {
  def main(args: Array[String]): Unit = {
    val url = "jdbc:hive2://localhost:10000/default"
    val username = "lian"
    val password = ""

    // Register the Hive JDBC driver and connect to the Thrift server.
    Class.forName("org.apache.hive.jdbc.HiveDriver")
    val connection = DriverManager.getConnection(url, username, password)

    try {
      val metadata = connection.getMetaData

      // List all schemas.
      val schema = metadata.getSchemas()
      while (schema.next()) {
        val (key, value) = (schema.getString(1), schema.getString(2))
        println(s"$key: $value")
      }

      // List all tables, printing the first five columns of each row.
      val tables = metadata.getTables(null, null, null, null)
      while (tables.next()) {
        val fields = Array.tabulate(5) { i => tables.getString(i + 1) }
        println(fields.mkString(", "))
      }

      // List all columns; this is the call that throws in the traces above.
      val columns = metadata.getColumns(null, null, null, null)
      while (columns.next()) {
        println((columns.getString(3), columns.getString(4), columns.getString(6)))
      }
    } finally {
      // Release the connection even when a metadata call fails.
      connection.close()
    }
  }
}
{code}
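
To narrow down which metadata calls are affected, it may help to run each call separately, so that one failure does not hide the results of the others. The following is only a sketch of that idea, not part of the original report: the object name {{JDBCMetadataProbe}} and the helper {{probe}} are hypothetical, and the URL and user name are simply the ones used in the test program above.

```scala
import java.sql.{DriverManager, ResultSet}
import scala.util.{Failure, Success, Try}

// Hypothetical diagnostic helper: names are illustrative, not from Spark or Hive.
object JDBCMetadataProbe {
  // Runs one metadata call, drains its ResultSet, and summarizes the outcome
  // as a single line instead of letting the exception propagate.
  def probe(name: String)(call: => ResultSet): String =
    Try {
      val rs = call
      try {
        var rows = 0
        while (rs.next()) rows += 1
        rows
      } finally rs.close()
    } match {
      case Success(rows) => s"$name: OK ($rows rows)"
      case Failure(e)    => s"$name: FAILED (${e.getClass.getSimpleName}: ${e.getMessage})"
    }

  def main(args: Array[String]): Unit = {
    // Assumed endpoint and user, copied from the test program in this report.
    val url = "jdbc:hive2://localhost:10000/default"
    Try(DriverManager.getConnection(url, "lian", "")) match {
      case Failure(e) =>
        println(s"connect: FAILED (${e.getMessage})")
      case Success(connection) =>
        try {
          val metadata = connection.getMetaData
          // Each call is probed independently of the others.
          Seq(
            probe("getSchemas")(metadata.getSchemas()),
            probe("getTables")(metadata.getTables(null, null, null, null)),
            probe("getColumns")(metadata.getColumns(null, null, null, null))
          ).foreach(println)
        } finally connection.close()
    }
  }
}
```

Because {{probe}} takes the call by name, an exception thrown while issuing the call is caught in the same place as one thrown while iterating the result, and both are reported in the same one-line form.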