[
https://issues.apache.org/jira/browse/HIVE-15893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15874676#comment-15874676
]
Rui Li commented on HIVE-15893:
-------------------------------
Hi [~xuefuz], our RPC channel has a handler that monitors the channel inactive
event:
https://github.com/apache/hive/blob/master/spark-client/src/main/java/org/apache/hive/spark/client/rpc/Rpc.java#L240
When the channel is closed abnormally, this handler closes the RPC and prints a
warning:
https://github.com/apache/hive/blob/master/spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java#L131
My understanding is that the time needed to detect the broken connection is up
to netty. When I worked on HIVE-15860, it was detected immediately.
So I think you can check your log for the warning message. If the message is
printed, it means the error was detected and Hive is probably hanging
somewhere else.
Besides, you may want to check whether you increased the property
{{hive.spark.client.future.timeout}}. A large value is one possible reason for
the client to keep waiting.
> Followup on HIVE-15671
> ----------------------
>
> Key: HIVE-15893
> URL: https://issues.apache.org/jira/browse/HIVE-15893
> Project: Hive
> Issue Type: Improvement
> Components: Spark
> Affects Versions: 2.2.0
> Reporter: Xuefu Zhang
> Assignee: Xuefu Zhang
>
> In HIVE-15671, we fixed a typo where server.connect.timeout was used in
> place of client.connect.timeout. This might solve some potential problems,
> but the original problem reported in HIVE-15671 might still exist. (Not sure
> if HIVE-15860 helps). Here is the proposal suggested by Marcelo:
> {quote}
> bq: server detecting a driver problem after it has connected back to the
> server.
> Hmm. That is definitely not any of the "connect" timeouts, which probably
> means it isn't configured and is just using netty's default (which is
> probably no timeout?). Would probably need something using
> io.netty.handler.timeout.IdleStateHandler, and also some periodic "ping" so
> that the connection isn't torn down without reason.
> {quote}
> We will use this JIRA to track the issue.
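To make the quoted proposal concrete, here is a minimal sketch of idle detection combined with a periodic ping. It uses only the Java standard library and is not Hive's Rpc code or Netty's IdleStateHandler; the class IdleWatchdog, its Action enum, and both thresholds are hypothetical names chosen for illustration. In a Netty pipeline the same decision would more likely be driven by IdleStateHandler events than by explicit timestamps.

```java
/**
 * Sketch of read-idle detection with a keep-alive ping.
 * All names are hypothetical; times are plain millisecond timestamps
 * supplied by the caller so the logic can be exercised without a real clock.
 */
final class IdleWatchdog {
    enum Action { NONE, SEND_PING, CLOSE }

    private final long pingAfterMs;   // idle time after which we ping the peer
    private final long closeAfterMs;  // idle time after which we give up
    private long lastReadMs;          // when we last heard anything from the peer
    private boolean pingOutstanding;  // true once a ping was sent and not answered

    IdleWatchdog(long pingAfterMs, long closeAfterMs, long nowMs) {
        if (pingAfterMs <= 0 || closeAfterMs <= pingAfterMs) {
            throw new IllegalArgumentException("need 0 < pingAfterMs < closeAfterMs");
        }
        this.pingAfterMs = pingAfterMs;
        this.closeAfterMs = closeAfterMs;
        this.lastReadMs = nowMs;
    }

    /** Call whenever any data, including a ping reply, arrives from the peer. */
    void onRead(long nowMs) {
        lastReadMs = nowMs;
        pingOutstanding = false;
    }

    /** Call periodically (for example from a scheduled task) to decide what to do. */
    Action check(long nowMs) {
        long idleMs = nowMs - lastReadMs;
        if (idleMs >= closeAfterMs) {
            return Action.CLOSE;          // peer stayed silent too long: treat as dead
        }
        if (idleMs >= pingAfterMs && !pingOutstanding) {
            pingOutstanding = true;
            return Action.SEND_PING;      // probe once per idle period
        }
        return Action.NONE;
    }
}
```

The ping is what keeps a healthy but quiet connection from being torn down: any reply resets the idle clock through onRead, so only a peer that stays silent past the close threshold is treated as dead.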