snazy commented on code in PR #3566:
URL: https://github.com/apache/polaris/pull/3566#discussion_r2754353469


##########
persistence/relational-jdbc/src/main/java/org/apache/polaris/persistence/relational/jdbc/DatasourceOperations.java:
##########
@@ -312,7 +312,8 @@ private boolean isRetryable(SQLException e) {
 
     // Additionally, one might check for specific error messages or other 
conditions
     return e.getMessage().toLowerCase(Locale.ROOT).contains("connection 
refused")
-        || e.getMessage().toLowerCase(Locale.ROOT).contains("connection 
reset");
+        || e.getMessage().toLowerCase(Locale.ROOT).contains("connection reset")
+        || e.getMessage().toLowerCase(Locale.ROOT).contains("acquisition 
timeout");

Review Comment:
   Agree, that retrying after on an 'acquisition timeout' leads to unfair 
request execution, which may or may not be desirable.
   
   I wonder whether it is a good strategy to further increase the acquisition 
timeout.
   
   In such situations the connection pool is already exhausted or, in other 
retry cases, the database is in a "bad shape".
   
   Retrying or waiting longer lets requests pile up in the Polaris service, 
leading to additional resource usage by waiting threads holding resources.
   
   User/client requests might have already been aborted, but the threads in the 
service sill occupy resources, meaning that newer client requests are stalled, 
because earlier requests haven't finished.
   This can easily lead to a situation where the Polaris service unnecessarily 
appears unresponsive for clients.
   
   The situation gets worse with pieces that add significant additional load 
against the database, like Polaris events or the idempotency-keys proposal.
   
   Instead of increasing the acquisition timeout, operators should consider 
adjusting the connection pool size to the load. Or shed load by limiting the 
number of concurrent requests via the ingress.
   
   Polaris' `RateLimiterFilter` does not help, it actually makes the situation 
worse, because rate-limited events are persisted via JDBC, requiring a _usable_ 
JDBC connection. This leads to blocked servlet request filter instances, which 
I am not sure is a good idea.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to