Cheng Lian created SPARK-11785:
----------------------------------

             Summary: When deployed against a remote Hive metastore with a lower version, JDBC metadata calls throw exceptions
                 Key: SPARK-11785
                 URL: https://issues.apache.org/jira/browse/SPARK-11785
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 1.5.1, 1.6.0, 1.7.0
            Reporter: Cheng Lian


To reproduce this issue with 1.7-SNAPSHOT:
# Start a Hive 0.13.1 metastore service using {{$HIVE_HOME/bin/hive --service metastore}}
# Configure a remote Hive metastore in {{conf/hive-site.xml}} by pointing {{hive.metastore.uris}} to the metastore endpoint (e.g. {{thrift://localhost:9083}})
# Set {{spark.sql.hive.metastore.version}} to {{0.13.1}} and {{spark.sql.hive.metastore.jars}} to {{maven}} in {{conf/spark-defaults.conf}} (see the configuration sketch below)
# Start the Thrift server using {{$SPARK_HOME/sbin/start-thriftserver.sh}}
# Run the testing JDBC client program attached at the end
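
For concreteness, the configuration in steps 2 and 3 may look like the following sketch; the metastore endpoint is just the example value from step 2 and should be adjusted for the actual deployment:

{noformat}
<!-- conf/hive-site.xml -->
<configuration>
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://localhost:9083</value>
  </property>
</configuration>

# conf/spark-defaults.conf
spark.sql.hive.metastore.version   0.13.1
spark.sql.hive.metastore.jars      maven
{noformat}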

Exception thrown on the client side:
{noformat}
java.sql.SQLException: Could not create ResultSet: Required field 'operationHandle' is unset! Struct:TGetResultSetMetadataReq(operationHandle:null)
java.sql.SQLException: Could not create ResultSet: Required field 'operationHandle' is unset! Struct:TGetResultSetMetadataReq(operationHandle:null)
        at org.apache.hive.jdbc.HiveQueryResultSet.retrieveSchema(HiveQueryResultSet.java:273)
        at org.apache.hive.jdbc.HiveQueryResultSet.<init>(HiveQueryResultSet.java:188)
        at org.apache.hive.jdbc.HiveQueryResultSet$Builder.build(HiveQueryResultSet.java:170)
        at org.apache.hive.jdbc.HiveDatabaseMetaData.getColumns(HiveDatabaseMetaData.java:222)
        at JDBCExperiments$.main(JDBCExperiments.scala:28)
        at JDBCExperiments.main(JDBCExperiments.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
Caused by: org.apache.thrift.protocol.TProtocolException: Required field 'operationHandle' is unset! Struct:TGetResultSetMetadataReq(operationHandle:null)
        at org.apache.hive.service.cli.thrift.TGetResultSetMetadataReq.validate(TGetResultSetMetadataReq.java:290)
        at org.apache.hive.service.cli.thrift.TCLIService$GetResultSetMetadata_args.validate(TCLIService.java:12041)
        at org.apache.hive.service.cli.thrift.TCLIService$GetResultSetMetadata_args$GetResultSetMetadata_argsStandardScheme.write(TCLIService.java:12098)
        at org.apache.hive.service.cli.thrift.TCLIService$GetResultSetMetadata_args$GetResultSetMetadata_argsStandardScheme.write(TCLIService.java:12067)
        at org.apache.hive.service.cli.thrift.TCLIService$GetResultSetMetadata_args.write(TCLIService.java:12018)
        at org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:63)
        at org.apache.hive.service.cli.thrift.TCLIService$Client.send_GetResultSetMetadata(TCLIService.java:472)
        at org.apache.hive.service.cli.thrift.TCLIService$Client.GetResultSetMetadata(TCLIService.java:464)
        at org.apache.hive.jdbc.HiveQueryResultSet.retrieveSchema(HiveQueryResultSet.java:242)
        at org.apache.hive.jdbc.HiveQueryResultSet.<init>(HiveQueryResultSet.java:188)
        at org.apache.hive.jdbc.HiveQueryResultSet$Builder.build(HiveQueryResultSet.java:170)
        at org.apache.hive.jdbc.HiveDatabaseMetaData.getColumns(HiveDatabaseMetaData.java:222)
        at JDBCExperiments$.main(JDBCExperiments.scala:28)
        at JDBCExperiments.main(JDBCExperiments.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
{noformat}

Exception thrown on the server side:
{noformat}
15/11/18 02:27:01 WARN RetryingMetaStoreClient: MetaStoreClient lost connection. Attempting to reconnect.
org.apache.thrift.TApplicationException: Invalid method name: 'get_schema_with_environment_context'
        at org.apache.thrift.TApplicationException.read(TApplicationException.java:111)
        at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:71)
        at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_schema_with_environment_context(ThriftHiveMetastore.java:1010)
        at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_schema_with_environment_context(ThriftHiveMetastore.java:995)
        at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getSchema(HiveMetaStoreClient.java:1499)
        at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.getSchema(SessionHiveMetaStoreClient.java:239)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:156)
        at com.sun.proxy.$Proxy13.getSchema(Unknown Source)
        at org.apache.hive.service.cli.operation.GetColumnsOperation.runInternal(GetColumnsOperation.java:160)
        at org.apache.hive.service.cli.operation.Operation.run(Operation.java:257)
        at org.apache.hive.service.cli.session.HiveSessionImpl.getColumns(HiveSessionImpl.java:519)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78)
        at org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36)
        at org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:63)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
        at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59)
        at com.sun.proxy.$Proxy18.getColumns(Unknown Source)
        at org.apache.hive.service.cli.CLIService.getColumns(CLIService.java:350)
        at org.apache.hive.service.cli.thrift.ThriftCLIService.GetColumns(ThriftCLIService.java:575)
        at org.apache.hive.service.cli.thrift.TCLIService$Processor$GetColumns.getResult(TCLIService.java:1433)
        at org.apache.hive.service.cli.thrift.TCLIService$Processor$GetColumns.getResult(TCLIService.java:1418)
        at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
        at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
        at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56)
        at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
{noformat}

I suspect it's related to SPARK-9686 and SPARK-11783. My guess is that:
# When deployed against a remote Hive metastore, the execution Hive client points to the actual Hive metastore rather than the local Derby execution metastore, using the Hive 1.2.1 libraries shipped with Spark (SPARK-11783).
# JDBC metadata calls are not properly dispatched to the metastore Hive client in the Thrift server, but are handled by the execution Hive client instead (SPARK-9686).
# When a JDBC metadata call like {{getSchemas()}} comes in, the execution Hive client, which is of a higher version (1.2.1), talks to the lower-version Hive metastore (0.13.1). Because of incompatible changes between these two versions, the Thrift RPC call fails and the exceptions above are thrown. The server-side {{Invalid method name: 'get_schema_with_environment_context'}} error is consistent with this, since it suggests the 0.13.1 metastore doesn't implement that Thrift method.

(This assumption hasn't been fully verified yet.)

The testing JDBC program:

{code}
import java.sql.DriverManager

object JDBCExperiments {
  def main(args: Array[String]) {
    val url = "jdbc:hive2://localhost:10000/default"
    val username = "lian"
    val password = ""

    Class.forName("org.apache.hive.jdbc.HiveDriver")
    val connection = DriverManager.getConnection(url, username, password)

    try {
      val metadata = connection.getMetaData

      // List all schemas (columns: TABLE_SCHEM, TABLE_CATALOG)
      val schemas = metadata.getSchemas()
      while (schemas.next()) {
        val (key, value) = (schemas.getString(1), schemas.getString(2))
        println(s"$key: $value")
      }

      // List all tables
      val tables = metadata.getTables(null, null, null, null)
      while (tables.next()) {
        val fields = Array.tabulate(5) { i =>
          tables.getString(i + 1)
        }
        println(fields.mkString(", "))
      }

      // List all columns -- this is the call that triggers the exceptions above
      val columns = metadata.getColumns(null, null, null, null)
      while (columns.next()) {
        println((columns.getString(3), columns.getString(4), columns.getString(6)))
      }
    } finally {
      connection.close()
    }
  }
}
{code}
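
For reference, one way to build and run the client above is with sbt; the {{build.sbt}} below is only a sketch, and the Scala and {{hive-jdbc}} versions are assumptions (they should roughly match the environment used for the test):

{code}
// Hypothetical build.sbt for the JDBCExperiments client above.
// The Scala and hive-jdbc versions are assumptions, not part of this report.
scalaVersion := "2.10.4"

libraryDependencies += "org.apache.hive" % "hive-jdbc" % "1.2.1"
{code}

With this in place, running {{sbt run}} against the Thrift server started in step 4 should reproduce the client-side exception above.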



