[GitHub] [spark] wangyum opened a new pull request #30478: [SPARK-33525][SQL] Update hive-service-rpc to 3.1.2

GitBox Mon, 23 Nov 2020 19:19:38 -0800


wangyum opened a new pull request #30478:
URL: https://github.com/apache/spark/pull/30478



   ### What changes were proposed in this pull request?
   
   Spark supports Hive metastore versions 0.12.0 through 3.1.2, but it
   supports hive-jdbc only for versions 0.12.0 through 2.3.7. Connecting with
   hive-jdbc 3.x fails with a `TProtocolException`:
   
   ```
   [root@spark-3267648 apache-hive-3.1.2-bin]# bin/beeline -u jdbc:hive2://localhost:10000/default
   Connecting to jdbc:hive2://localhost:10000/default
   Connected to: Spark SQL (version 3.1.0-SNAPSHOT)
   Driver: Hive JDBC (version 3.1.2)
   Transaction isolation: TRANSACTION_REPEATABLE_READ
   Beeline version 3.1.2 by Apache Hive
   0: jdbc:hive2://localhost:10000/default> create table t1(id int) using parquet;
   Unexpected end of file when reading from HS2 server. The root cause might be too many concurrent connections. Please ask the administrator to check the number of active connections, and adjust hive.server2.thrift.max.worker.threads if applicable.
   Error: org.apache.thrift.transport.TTransportException (state=08S01,code=0)
   ```
   ```
   org.apache.thrift.protocol.TProtocolException: Missing version in readMessageBegin, old client?
        at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:234)
        at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:27)
        at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:53)
        at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:310)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630)
        at java.base/java.lang.Thread.run(Thread.java:832)
   ```
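
The message in the trace above comes from the strict-read check that Thrift's binary protocol applies to the first four bytes of every incoming message. The following is a minimal Python sketch of that check, not the Thrift source: `ProtocolError`, `read_message_begin`, and the two header builders are hypothetical names, and `OpenSession` is used only as an illustrative method name. It shows what the server rejects; it does not explain why hive-jdbc 3.x ends up sending bytes that fail the check.

```python
import struct

# Version constants used by Thrift's binary protocol.
VERSION_MASK = 0xFFFF0000
VERSION_1 = 0x80010000


class ProtocolError(Exception):
    """Hypothetical stand-in for Thrift's TProtocolException."""


def read_message_begin(data: bytes, strict_read: bool = True):
    """Parse a binary-protocol message header into (name, type, seqid)."""
    (word,) = struct.unpack_from(">I", data, 0)
    if word & 0x80000000:
        # Versioned header: high 16 bits hold the version, low 8 bits the type.
        if (word & VERSION_MASK) != VERSION_1:
            raise ProtocolError("Bad version in readMessageBegin")
        (name_len,) = struct.unpack_from(">i", data, 4)
        name = data[8:8 + name_len].decode("utf-8")
        (seqid,) = struct.unpack_from(">i", data, 8 + name_len)
        return name, word & 0xFF, seqid
    if strict_read:
        # A strict server refuses any message that lacks the version word.
        raise ProtocolError("Missing version in readMessageBegin, old client?")
    # Unversioned header: the first word is the length of the method name.
    name = data[4:4 + word].decode("utf-8")
    msg_type = data[4 + word]
    (seqid,) = struct.unpack_from(">i", data, 5 + word)
    return name, msg_type, seqid


def versioned_header(name: str, msg_type: int, seqid: int) -> bytes:
    """Build a header that starts with the version word."""
    raw = name.encode("utf-8")
    return (struct.pack(">I", VERSION_1 | msg_type)
            + struct.pack(">i", len(raw)) + raw
            + struct.pack(">i", seqid))


def unversioned_header(name: str, msg_type: int, seqid: int) -> bytes:
    """Build a header without the version word."""
    raw = name.encode("utf-8")
    return (struct.pack(">i", len(raw)) + raw
            + bytes([msg_type])
            + struct.pack(">i", seqid))


if __name__ == "__main__":
    print(read_message_begin(versioned_header("OpenSession", 1, 0)))
    try:
        read_message_begin(unversioned_header("OpenSession", 1, 0))
    except ProtocolError as err:
        print("rejected:", err)
```

Any first word whose top bit is clear is treated as an unversioned message, so bytes the server does not expect at the start of a request can surface as this error even when the client is not actually an old one.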
   
   This PR upgrades hive-service-rpc to 3.1.2 to fix this issue.
   
   ### Why are the changes needed?
   
   To support hive-jdbc 3.x.
   
   ### Does this PR introduce _any_ user-facing change?
   
   No.
   
   ### How was this patch tested?
   
   
   Manual test:
   ```
   [root@spark-3267648 apache-hive-3.1.2-bin]# bin/beeline -u jdbc:hive2://localhost:10000/default
   Connecting to jdbc:hive2://localhost:10000/default
   Connected to: Spark SQL (version 3.1.0-SNAPSHOT)
   Driver: Hive JDBC (version 3.1.2)
   Transaction isolation: TRANSACTION_REPEATABLE_READ
   Beeline version 3.1.2 by Apache Hive
   0: jdbc:hive2://localhost:10000/default> create table t1(id int) using parquet;
   +---------+
   | Result  |
   +---------+
   +---------+
   No rows selected (1.051 seconds)
   0: jdbc:hive2://localhost:10000/default> insert into t1 values(1);
   +---------+
   | Result  |
   +---------+
   +---------+
   No rows selected (2.08 seconds)
   0: jdbc:hive2://localhost:10000/default> select * from t1;
   +-----+
   | id  |
   +-----+
   | 1   |
   +-----+
   1 row selected (0.605 seconds)
   ```
   

