snazy commented on code in PR #1517:
URL: https://github.com/apache/polaris/pull/1517#discussion_r2078333260


##########
extension/persistence/relational-jdbc/src/main/java/org/apache/polaris/extension/persistence/relational/jdbc/RelationalJdbcConfiguration.java:
##########
@@ -0,0 +1,34 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.polaris.extension.persistence.relational.jdbc;
+
+import io.smallrye.config.ConfigMapping;
+import java.util.Optional;
+
+@ConfigMapping(prefix = "polaris.persistence.relational.jdbc")

Review Comment:
   Having the `@ConfigMapping` annotation here via the smallrye-config 
dependency is fine IMO



##########
extension/persistence/relational-jdbc/src/main/java/org/apache/polaris/extension/persistence/relational/jdbc/DatasourceOperations.java:
##########
@@ -173,23 +190,86 @@ public int executeUpdate(String query) throws 
SQLException {
    * @throws SQLException : Exception caught during transaction execution.
    */
   public void runWithinTransaction(TransactionCallback callback) throws 
SQLException {
-    try (Connection connection = borrowConnection()) {
-      boolean autoCommit = connection.getAutoCommit();
-      connection.setAutoCommit(false);
-      boolean success = false;
+    withRetries(
+        () -> {
+          try (Connection connection = borrowConnection()) {
+            boolean autoCommit = connection.getAutoCommit();
+            boolean success = false;
+            connection.setAutoCommit(false);
+            try {
+              try (Statement statement = connection.createStatement()) {
+                success = callback.execute(statement);
+              }
+            } finally {
+              if (success) {
+                connection.commit();
+              } else {
+                connection.rollback();
+              }
+              connection.setAutoCommit(autoCommit);
+            }
+          }
+          return null;
+        });
+  }
+
+  private boolean isRetryable(SQLException e) {
+    String sqlState = e.getSQLState();
+
+    if (sqlState != null) {
+      return sqlState.equals(DEADLOCK_SQL_CODE)
+          || // Deadlock detected
+          sqlState.equals(SERIALIZATION_FAILURE_SQL_CODE); // Serialization 
failure
+    }
+
+    // Additionally, one might check for specific error messages or other 
conditions
+    return e.getMessage().contains("connection refused")
+        || e.getMessage().contains("connection reset");
+  }
+
+  public <T> T withRetries(Operation<T> operation) throws SQLException {
+    int attempts = 0;
+    // maximum number of retries.
+    int maxAttempts = relationalJdbcConfiguration.maxRetries().orElse(1);
+    // How long we should try, since the first attempt.
+    long maxDuration = 
relationalJdbcConfiguration.maxDurationInMs().orElse(5000L);
+    // How long to wait before first failure.
+    long delay = relationalJdbcConfiguration.initialDelayInMs().orElse(100L);
+
+    // maximum time we will retry till.
+    long maxRetryTime = Clock.systemUTC().millis() + maxDuration;
+
+    while (attempts < maxAttempts) {
       try {
-        try (Statement statement = connection.createStatement()) {
-          success = callback.execute(statement);
+        return operation.execute();
+      } catch (SQLException e) {
+        attempts++;
+        long timeLeft = Math.max((maxRetryTime - Clock.systemUTC().millis()), 
0L);
+        if (attempts >= maxAttempts || !isRetryable(e) || timeLeft == 0) {
+          String exceptionMessage =
+              String.format(
+                  "Failed due to %s, after , %s attempts and %s milliseconds",
+                  e.getMessage(), attempts, maxDuration);
+          throw new SQLException(exceptionMessage, e.getSQLState(), 
e.getErrorCode());
         }
-      } finally {
-        if (success) {
-          connection.commit();
-        } else {
-          connection.rollback();
+        // Add jitter
+        long timeToSleep = Math.min(timeLeft, delay + (long) 
(random.nextFloat() * 0.2 * delay));

Review Comment:
   Noting: there's `org.apache.polaris.persistence.commits.retry` package in 
#1189 for a _tested_ retry-loop. There's a _LOT_ of nuance here.



##########
extension/persistence/relational-jdbc/src/main/java/org/apache/polaris/extension/persistence/relational/jdbc/DatasourceOperations.java:
##########
@@ -173,23 +190,82 @@ public int executeUpdate(String query) throws 
SQLException {
    * @throws SQLException : Exception caught during transaction execution.
    */
   public void runWithinTransaction(TransactionCallback callback) throws 
SQLException {
-    try (Connection connection = borrowConnection()) {
-      boolean autoCommit = connection.getAutoCommit();
-      connection.setAutoCommit(false);
-      boolean success = false;
+    withRetries(
+        () -> {
+          try (Connection connection = borrowConnection()) {
+            boolean autoCommit = connection.getAutoCommit();
+            boolean success = false;
+            connection.setAutoCommit(false);
+            try {
+              try (Statement statement = connection.createStatement()) {
+                success = callback.execute(statement);
+              }
+            } finally {
+              if (success) {
+                connection.commit();
+              } else {
+                connection.rollback();
+              }
+              connection.setAutoCommit(autoCommit);
+            }
+          }
+          return null;
+        });
+  }
+
+  private boolean isRetryable(SQLException e) {
+    String sqlState = e.getSQLState();
+
+    if (sqlState != null) {
+      return sqlState.equals(DEADLOCK_SQL_CODE)
+          || // Deadlock detected
+          sqlState.equals(SERIALIZATION_FAILURE_SQL_CODE); // Serialization 
failure
+    }
+
+    // Additionally, one might check for specific error messages or other 
conditions
+    return e.getMessage().contains("connection refused")
+        || e.getMessage().contains("connection reset");
+  }
+
+  public <T> T withRetries(Operation<T> operation) throws SQLException {
+    int attempts = 0;
+    // maximum number of retries.
+    int maxAttempts = relationalJdbcConfiguration.maxRetries().orElse(1);

Review Comment:
   I doubt it's a good "middle ground". Time is most important here, which is 
not being controlled here - it can run for hours and the client's already hung 
up.



##########
extension/persistence/relational-jdbc/src/test/java/org/apache/polaris/extension/persistence/impl/relational/jdbc/AtomicMetastoreManagerWithJdbcBasePersistenceImplTest.java:
##########
@@ -67,4 +70,22 @@ protected PolarisTestMetaStoreManager 
createPolarisTestMetaStoreManager() {
             new PolarisConfigurationStore() {},
             timeSource.withZone(ZoneId.systemDefault())));
   }
+
+  private static class H2JdbcConfiguration implements 
RelationalJdbcConfiguration {

Review Comment:
   Why is this a constant thing?



##########
extension/persistence/relational-jdbc/src/main/java/org/apache/polaris/extension/persistence/relational/jdbc/DatasourceOperations.java:
##########
@@ -28,23 +28,38 @@
 import java.sql.ResultSet;
 import java.sql.SQLException;
 import java.sql.Statement;
+import java.time.Clock;
 import java.util.ArrayList;
 import java.util.List;
 import java.util.Objects;
+import java.util.Random;
 import java.util.function.Consumer;
 import java.util.function.Function;
 import java.util.stream.Stream;
 import javax.sql.DataSource;
 import 
org.apache.polaris.extension.persistence.relational.jdbc.models.Converter;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
 
 public class DatasourceOperations {
 
+  private static final Logger LOGGER = 
LoggerFactory.getLogger(DatasourceOperations.class);
+
   private static final String CONSTRAINT_VIOLATION_SQL_CODE = "23505";
 
+  // POSTGRES RETRYABLE EXCEPTIONS
+  private static final String DEADLOCK_SQL_CODE = "40P01";

Review Comment:
   Not sure whether retrying on a deadlock is a good idea.
   Deadlocks should IMHO not happen at all.



##########
extension/persistence/relational-jdbc/src/main/java/org/apache/polaris/extension/persistence/relational/jdbc/DatasourceOperations.java:
##########
@@ -128,22 +143,21 @@ public <T> void executeSelectOverStream(
       @Nonnull Converter<T> converterInstance,
       @Nonnull Consumer<Stream<T>> consumer)
       throws SQLException {
-    try (Connection connection = borrowConnection();
-        Statement statement = connection.createStatement();
-        ResultSet resultSet = statement.executeQuery(query)) {
-      ResultSetIterator<T> iterator = new ResultSetIterator<>(resultSet, 
converterInstance);
-      consumer.accept(iterator.toStream());
-    } catch (SQLException e) {
-      throw e;
-    } catch (RuntimeException e) {
-      if (e.getCause() instanceof SQLException) {
-        throw (SQLException) e.getCause();
-      } else {
-        throw e;
-      }
-    } catch (Exception e) {
-      throw new RuntimeException(e);
-    }
+    withRetries(
+        () -> {
+          try (Connection connection = borrowConnection();
+              Statement statement = connection.createStatement();
+              ResultSet resultSet = statement.executeQuery(query)) {
+            ResultSetIterator<T> iterator = new ResultSetIterator<>(resultSet, 
converterInstance);
+            consumer.accept(iterator.toStream());
+            return null;
+          } catch (RuntimeException e) {
+            if (e.getCause() instanceof SQLException ex) {
+              throw ex;

Review Comment:
   Agree. This can easily lose important contextual information



##########
extension/persistence/relational-jdbc/src/main/java/org/apache/polaris/extension/persistence/relational/jdbc/DatasourceOperations.java:
##########
@@ -173,23 +190,86 @@ public int executeUpdate(String query) throws 
SQLException {
    * @throws SQLException : Exception caught during transaction execution.
    */
   public void runWithinTransaction(TransactionCallback callback) throws 
SQLException {
-    try (Connection connection = borrowConnection()) {
-      boolean autoCommit = connection.getAutoCommit();
-      connection.setAutoCommit(false);
-      boolean success = false;
+    withRetries(
+        () -> {
+          try (Connection connection = borrowConnection()) {
+            boolean autoCommit = connection.getAutoCommit();
+            boolean success = false;
+            connection.setAutoCommit(false);
+            try {
+              try (Statement statement = connection.createStatement()) {
+                success = callback.execute(statement);
+              }
+            } finally {
+              if (success) {
+                connection.commit();
+              } else {
+                connection.rollback();
+              }
+              connection.setAutoCommit(autoCommit);
+            }
+          }
+          return null;
+        });
+  }
+
+  private boolean isRetryable(SQLException e) {
+    String sqlState = e.getSQLState();
+
+    if (sqlState != null) {
+      return sqlState.equals(DEADLOCK_SQL_CODE)
+          || // Deadlock detected
+          sqlState.equals(SERIALIZATION_FAILURE_SQL_CODE); // Serialization 
failure
+    }
+
+    // Additionally, one might check for specific error messages or other 
conditions
+    return e.getMessage().contains("connection refused")
+        || e.getMessage().contains("connection reset");
+  }
+
+  public <T> T withRetries(Operation<T> operation) throws SQLException {

Review Comment:
   Still `public`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@polaris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to