[GitHub] [iceberg] pvary commented on a change in pull request #2119: Hive: Fix connection pool fails to connect to the Hive Metastore #1994

GitBox Thu, 21 Jan 2021 00:38:32 -0800


pvary commented on a change in pull request #2119:
URL: https://github.com/apache/iceberg/pull/2119#discussion_r561690101




##########
File path: 
hive-metastore/src/main/java/org/apache/iceberg/hive/HiveClientPool.java
##########
@@ -83,6 +83,20 @@ protected HiveMetaStoreClient reconnect(HiveMetaStoreClient 
client) {
     return client;
   }
 
+  @Override
+  protected boolean failureDetection(HiveMetaStoreClient client, Exception ex) 
{

Review comment:
       MetaException does not have an embedded Exception (no `ex.getCause()` 
because the MetaException is declared on the Thrift interface and only has a 
message field) 😢. Also it is used for signaling connection problems and input 
data issues as well, and the only difference between those is the exception 
message text. So we can only check the message text to decide on the cause 
which is too brittle for my taste.
   
   So I would prefer a solution where we do not try to differentiate between 
the MetaExceptions parsing the text message.
   
   Since we can not have a definite answer when the error is caused by a 
connection problem or a bad parametrization of the HMS call, I think 
reconnection every time when we have a MetaException should be avoided if 
possible. I would use a cheap probe to check the liveliness of the connection 
(for example `client.getServerVersion()`)
   
   I see the following possibilities:
   - Periodic check for the connection - Fix number of extra HMS calls, no 
extra delay on other HMS calls
   - Check when there is a MetaException - Double HMS calls for every call 
where the response is MetaException 
   - Some combination of the above
   
   The decision should depend on the use-cases we would like to cover.
   - Who will use the `ClientPool`?
   - How likely that we issue calls which are badly parametrized?
   - How likely that we will have connection issues?
   
   To answer the questions from the Hive side:
   - I do not see places where we can have MetaException because of the wrong 
parametrization of the calls (if so we have to fix it 😄)
   - If there is a MetaException then that is because of the connection issues
   
   OTOH I consider ClientPool as an API which could/should be used by other 
components, and such an API should handle errors better. I would accept the 
extra complexity to avoid extra `reconnect` attempts, and even maybe extra HMS 
calls too, but can be convinced otherwise.
   
   That's my 2 cents.
   Peter




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[GitHub] [iceberg] pvary commented on a change in pull request #2119: Hive: Fix connection pool fails to connect to the Hive Metastore #1994

Reply via email to