[
https://issues.apache.org/jira/browse/HIVE-29213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18033798#comment-18033798
]
Butao Zhang commented on HIVE-29213:
------------------------------------
Let me continue with some additional points. I just noticed that it's not only
Hive that uses the Beeline client; other components like Impala and HBase also
utilize the Beeline component. Therefore, the relevant primary key code in
Beeline—specifically `DatabaseMetaData.getPrimaryKeys`—though never used by
Hive prior to Hive 4.1.0, might have been consistently used by HBase. This
could very well be the source of the issue reported in HIVE-19996.
This makes it quite interesting: the Beeline codes related to primary key in
Hive that has never been utilized by Hive itself but is actively used by other
components.
Nevertheless, I still recommend removing the usage of
`DatabaseMetaData.getPrimaryKeys` in Beeline. First, fixing this exception on
the Hive side is neither easy nor meaningful; Second, as mentioned above, the
purpose of retrieving primary keys in Beeline is solely for coloring the output
results. Removing this code would not cause any compatibility issues and would
actually improve Beeline's performance, as it avoids the heavyweight Thrift RPC
call to the server for fetching primary keys.
> HS2 report 'Failed to get primary keys' if using old beeline client
> -------------------------------------------------------------------
>
> Key: HIVE-29213
> URL: https://issues.apache.org/jira/browse/HIVE-29213
> Project: Hive
> Issue Type: Bug
> Components: Beeline
> Affects Versions: 4.1.0
> Reporter: Butao Zhang
> Priority: Critical
>
> HIVE-19996 fixed a performance issue related to beeline. However, this fix
> causes a warning exception "Failed to get primary keys" to appear in the HS2
> logs when using a beeline client that does not include this patch (such as
> Hive 4.1.0) to submit SQL queries to Hive 4.2.0.
>
> Here is a simple test procedure:
> # Use the Hive 4.1.0 beeline client to connect to the latest version of Hive
> 4.2.0 (which includes HIVE-19996).
> // Create a test table
> create table testdb.test12(id int);
> {color:#de350b}*// Query this table, and you will notice the following
> exception: Failed to get primary keys ..*{color}
> select * from testdb.test12;
> {code:java}
> 2025-09-18T16:22:34,121 ERROR [HiveServer2-Handler-Pool: Thread-166]
> thrift.ThriftCLIService: Failed to get primary keys [request:
> TGetPrimaryKeysReq(sessionHandle:TSessionHandle(sessionId:THandleIdentifier(guid:56
> 22 37 10 00 2D 4B BB B6 FE 42 49 E2 25 75 1F, secret:C7 E6 BF 96 D1 82 4E F5
> 82 36 AC A7 A1 3E 7A A6)), catalogName:, tableName:test12)]
> org.apache.hive.service.cli.HiveSQLException:
> org.apache.thrift.protocol.TProtocolException: Required field 'db_name' is
> unset! Struct:PrimaryKeysRequest(db_name:null, tbl_name:test12, catName:hive)
> at
> org.apache.hive.service.cli.operation.GetPrimaryKeysOperation.runInternal(GetPrimaryKeysOperation.java:120)
> ~[hive-service-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at
> org.apache.hive.service.cli.operation.Operation.run(Operation.java:286)
> ~[hive-service-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at
> org.apache.hive.service.cli.session.HiveSessionImpl.getPrimaryKeys(HiveSessionImpl.java:998)
> ~[hive-service-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at
> java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
> ~[?:?]
> at java.base/java.lang.reflect.Method.invoke(Method.java:580) ~[?:?]
> at
> org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78)
> ~[hive-service-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at
> org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:63)
> ~[hive-service-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at
> java.base/java.security.AccessController.doPrivileged(AccessController.java:714)
> ~[?:?]
> at java.base/javax.security.auth.Subject.doAs(Subject.java:525) ~[?:?]
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1953)
> ~[hadoop-common-3.4.1.jar:?]
> at
> org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59)
> ~[hive-service-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at jdk.proxy2/jdk.proxy2.$Proxy41.getPrimaryKeys(Unknown Source)
> ~[?:?]
> at
> org.apache.hive.service.cli.CLIService.getPrimaryKeys(CLIService.java:416)
> ~[hive-service-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at
> org.apache.hive.service.cli.thrift.ThriftCLIService.GetPrimaryKeys(ThriftCLIService.java:919)
> ~[hive-service-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at
> org.apache.hive.service.rpc.thrift.TCLIService$Processor$GetPrimaryKeys.getResult(TCLIService.java:1870)
> ~[hive-exec-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at
> org.apache.hive.service.rpc.thrift.TCLIService$Processor$GetPrimaryKeys.getResult(TCLIService.java:1850)
> ~[hive-exec-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:38)
> ~[hive-exec-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:38)
> ~[hive-exec-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at
> org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56)
> ~[hive-service-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:250)
> ~[hive-exec-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
> ~[?:?]
> at
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
> ~[?:?]
> at java.base/java.lang.Thread.run(Thread.java:1583) [?:?]
> Caused by: org.apache.thrift.protocol.TProtocolException: Required field
> 'db_name' is unset! Struct:PrimaryKeysRequest(db_name:null, tbl_name:test12,
> catName:hive)
> at
> org.apache.hadoop.hive.metastore.api.PrimaryKeysRequest.validate(PrimaryKeysRequest.java:591)
> ~[hive-exec-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_primary_keys_args.validate(ThriftHiveMetastore.java)
> ~[hive-exec-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_primary_keys_args$get_primary_keys_argsStandardScheme.write(ThriftHiveMetastore.java)
> ~[hive-exec-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_primary_keys_args$get_primary_keys_argsStandardScheme.write(ThriftHiveMetastore.java)
> ~[hive-exec-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_primary_keys_args.write(ThriftHiveMetastore.java)
> ~[hive-exec-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:71)
> ~[hive-exec-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:62)
> ~[hive-exec-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.send_get_primary_keys(ThriftHiveMetastore.java:4926)
> ~[hive-exec-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_primary_keys(ThriftHiveMetastore.java:4918)
> ~[hive-exec-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at
> org.apache.hadoop.hive.metastore.client.ThriftHiveMetaStoreClient.getPrimaryKeys(ThriftHiveMetaStoreClient.java:2393)
> ~[hive-exec-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at
> org.apache.hadoop.hive.metastore.client.MetaStoreClientWrapper.getPrimaryKeys(MetaStoreClientWrapper.java:1081)
> ~[hive-exec-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT] {code}
> *{color:#de350b}// show tables, then you will find new
> exceptions:client.ThriftHiveMetaStoreClient: Got error flushing the cache
> org.apache.thrift.TApplicationException: Unrecognized type -128{color}*
> show tables;
> {code:java}
> 2025-09-18T16:22:39,771 INFO [56223710-002d-4bbb-b6fe-4249e225751f
> HiveServer2-Handler-Pool: Thread-166] ql.Driver: Compiling
> command(queryId=hive_20250918162239_69ae0c6e-309f-4b87-a2c4-8093ae4700e8):
> show tables
> 2025-09-18T16:22:39,771 WARN [56223710-002d-4bbb-b6fe-4249e225751f
> HiveServer2-Handler-Pool: Thread-166] client.ThriftHiveMetaStoreClient: Got
> error flushing the cache
> org.apache.thrift.TApplicationException: Unrecognized type -128
> at
> org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:79)
> ~[hive-exec-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_flushCache(ThriftHiveMetastore.java:7493)
> ~[hive-exec-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.flushCache(ThriftHiveMetastore.java:7481)
> ~[hive-exec-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at
> org.apache.hadoop.hive.metastore.client.ThriftHiveMetaStoreClient.flushCache(ThriftHiveMetaStoreClient.java:2541)
> ~[hive-exec-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at
> org.apache.hadoop.hive.metastore.client.MetaStoreClientWrapper.flushCache(MetaStoreClientWrapper.java:1043)
> ~[hive-exec-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at
> org.apache.hadoop.hive.metastore.client.MetaStoreClientWrapper.flushCache(MetaStoreClientWrapper.java:1043)
> ~[hive-exec-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at
> org.apache.hadoop.hive.metastore.client.MetaStoreClientWrapper.flushCache(MetaStoreClientWrapper.java:1043)
> ~[hive-exec-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at
> java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
> ~[?:?]
> at java.base/java.lang.reflect.Method.invoke(Method.java:580) ~[?:?]
> at
> org.apache.hadoop.hive.metastore.client.SynchronizedMetaStoreClient$SynchronizedHandler.invoke(SynchronizedMetaStoreClient.java:69)
> ~[hive-exec-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at jdk.proxy2/jdk.proxy2.$Proxy32.flushCache(Unknown Source) ~[?:?]
> at
> org.apache.hadoop.hive.metastore.client.MetaStoreClientWrapper.flushCache(MetaStoreClientWrapper.java:1043)
> ~[hive-exec-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at
> java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
> ~[?:?]
> at java.base/java.lang.reflect.Method.invoke(Method.java:580) ~[?:?]
> at
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:232)
> ~[hive-exec-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at jdk.proxy2/jdk.proxy2.$Proxy32.flushCache(Unknown Source) ~[?:?]
> at org.apache.hadoop.hive.ql.Compiler.analyze(Compiler.java:198)
> ~[hive-exec-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.Compiler.compile(Compiler.java:109)
> ~[hive-exec-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:498)
> ~[hive-exec-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:450)
> ~[hive-exec-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at
> org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:414)
> ~[hive-exec-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at
> org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:408)
> ~[hive-exec-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:126)
> ~[hive-exec-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at
> org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:205)
> ~[hive-service-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at
> org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:268)
> ~[hive-service-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at
> org.apache.hive.service.cli.operation.Operation.run(Operation.java:286)
> ~[hive-service-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at
> org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:558)
> ~[hive-service-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at
> org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:543)
> ~[hive-service-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at
> java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
> ~[?:?]
> at java.base/java.lang.reflect.Method.invoke(Method.java:580) ~[?:?]
> at
> org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78)
> ~[hive-service-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at
> org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:63)
> ~[hive-service-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at
> java.base/java.security.AccessController.doPrivileged(AccessController.java:714)
> ~[?:?]
> at java.base/javax.security.auth.Subject.doAs(Subject.java:525) ~[?:?]
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1953)
> ~[hadoop-common-3.4.1.jar:?]
> at
> org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59)
> ~[hive-service-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at jdk.proxy2/jdk.proxy2.$Proxy41.executeStatementAsync(Unknown
> Source) ~[?:?]
> at
> org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:315)
> ~[hive-service-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at
> org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:652)
> ~[hive-service-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at
> org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1670)
> ~[hive-exec-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at
> org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1650)
> ~[hive-exec-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:38)
> ~[hive-exec-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:38)
> ~[hive-exec-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at
> org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56)
> ~[hive-service-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:250)
> ~[hive-exec-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
> ~[?:?]
> at
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
> ~[?:?]
> at java.base/java.lang.Thread.run(Thread.java:1583) [?:?]
> 2025-09-18T16:22:39,773 WARN [56223710-002d-4bbb-b6fe-4249e225751f
> HiveServer2-Handler-Pool: Thread-166] metastore.RetryingMetaStoreClient:
> MetaStoreClient lost connection. Attempting to reconnect (1 of 1) after 1s.
> getDatabase
> org.apache.thrift.transport.TTransportException: Socket is closed by peer.
> at
> org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:184)
> ~[hive-exec-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at
> org.apache.thrift.transport.TTransport.readAll(TTransport.java:109)
> ~[hive-exec-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at
> org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:464)
> ~[hive-exec-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at
> org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:362)
> ~[hive-exec-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at
> org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:245)
> ~[hive-exec-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at
> org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:77)
> ~[hive-exec-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_database_req(ThriftHiveMetastore.java:1497)
> ~[hive-exec-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_database_req(ThriftHiveMetastore.java:1484)
> ~[hive-exec-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at
> org.apache.hadoop.hive.metastore.client.ThriftHiveMetaStoreClient.getDatabase(ThriftHiveMetaStoreClient.java:1979)
> ~[hive-exec-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at
> org.apache.hadoop.hive.metastore.client.MetaStoreClientWrapper.getDatabase(MetaStoreClientWrapper.java:211)
> ~[hive-exec-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at
> org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.getDatabase(SessionHiveMetaStoreClient.java:2281)
> ~[hive-exec-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at
> org.apache.hadoop.hive.metastore.client.MetaStoreClientWrapper.getDatabase(MetaStoreClientWrapper.java:211)
> ~[hive-exec-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT]
> at
> java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
> ~[?:?]
> {code}
>
>
> {color:#172b4d}HIVE-19996 may have broken the backward compatibility of
> beeline. Accessing Hive 4.2 through the Hive 4.1 beeline client should
> theoretically not present compatibility issues. We need to ensure backward
> compatibility as much as possible to provide users with a smooth
> experience.{color}
--
This message was sent by Atlassian Jira
(v8.20.10#820010)