Copilot commented on code in PR #6383:
URL: https://github.com/apache/hive/pull/6383#discussion_r2987115503


##########
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/tools/schematool/HiveSchemaHelper.java:
##########
@@ -253,6 +265,56 @@ public boolean needsQuotedIdentifier() {
       return false;
     }
 
+    @Override
+    public List<String> getExecutableCommands(String scriptDir, String scriptFile)
+        throws IllegalFormatException, IOException {
+      return getExecutableCommands(scriptDir, scriptFile, false);
+    }
+
+    @Override
+    public List<String> getExecutableCommands(String scriptDir, String scriptFile, boolean fixQuotes)
+        throws IllegalFormatException, IOException {
+      List<String> commands = new java.util.ArrayList<>();
+
+      try (BufferedReader bfReader = 
+               new BufferedReader(new FileReader(scriptDir + File.separatorChar + scriptFile))) {
+        String currLine;
+        String currentCommand = null;
+
+        while ((currLine = bfReader.readLine()) != null) {
+          currLine = fixQuotesFromCurrentLine(fixQuotes, currLine.trim());
+
+          if (currLine.isEmpty()) {
+            continue;
+          }
+
+          currentCommand = currentCommand == null ? currLine : currentCommand + " " + currLine;
+
+          if (!isPartialCommand(currLine)) {
+            if (!isNonExecCommand(currentCommand)) {
+              currentCommand = cleanseCommand(currentCommand);
+              if (isNestedScript(currentCommand)) {
+                String currScript = getScriptName(currentCommand);
+                commands.addAll(getExecutableCommands(scriptDir, currScript, fixQuotes));
+              } else {
+                commands.add(currentCommand.trim());
+              }
+            }
+            currentCommand = null;
+          }
+
+        }

Review Comment:
   getExecutableCommands() drops the final SQL statement if the file ends 
without a trailing delimiter (currentCommand remains non-null when the reader 
hits EOF, but it’s never flushed into commands). This changes behavior vs 
executing the script directly and can cause the last statement in a script to 
be silently skipped. Consider handling any remaining currentCommand after the 
loop (including nested-script expansion and cleansing) so EOF also terminates a 
statement.
   ```suggestion
           }
   
          // Handle any remaining command when EOF is reached without a terminating delimiter.
           if (currentCommand != null && !isNonExecCommand(currentCommand)) {
             currentCommand = cleanseCommand(currentCommand);
             if (isNestedScript(currentCommand)) {
               String currScript = getScriptName(currentCommand);
            commands.addAll(getExecutableCommands(scriptDir, currScript, fixQuotes));
             } else {
               commands.add(currentCommand.trim());
             }
           }
   ```



##########
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/tools/schematool/MetastoreSchemaTool.java:
##########
@@ -305,33 +299,22 @@ protected void execSql(String scriptDir, String scriptFile) throws IOException,
     execSql(scriptDir + File.separatorChar + scriptFile);
   }
 
-  // Generate the beeline args per hive conf and execute the given script
+  /**
+   * Executes the given SQL script file against the metastore database via {@link IdempotentDDLExecutor}.
+   * Each statement in the script is executed individually over a direct JDBC connection with
+   * auto-commit enabled. Errors that indicate an object already exists or is already gone are
+   * silently ignored according to the per-database {@link DbErrorCodes}; all other SQL errors are
+   * rethrown as {@link IOException}.
+   */
   protected void execSql(String sqlScriptFile) throws IOException {
-    CommandBuilder builder =
-        new CommandBuilder(conf, url, driver, userName, passWord, sqlScriptFile)
-            .setVerbose(verbose);
-
-    // run the script using SqlLine
-    SqlLine sqlLine = new SqlLine();
-    ByteArrayOutputStream outputForLog = null;
-    if (!verbose) {
-      OutputStream out;
-      if (LOG.isDebugEnabled()) {
-        out = outputForLog = new ByteArrayOutputStream();
-      } else {
-        out = new NullOutputStream();
-      }
-      sqlLine.setOutputStream(new PrintStream(out));
-      System.setProperty("sqlline.silent", "true");
-    }
-    LOG.info("Going to run command <" + builder.buildToLog() + ">");
-    SqlLine.Status status = sqlLine.begin(builder.buildToRun(), null, false);
-    if (LOG.isDebugEnabled() && outputForLog != null) {
-      LOG.debug("Received following output from Sqlline:");
-      LOG.debug(outputForLog.toString("UTF-8"));
-    }
-    if (status != SqlLine.Status.OK) {
-      throw new IOException("Schema script failed, errorcode " + status);
+    LOG.info("Going to run script <{}> via Idempotent JDBC Executor", sqlScriptFile);
+    try (Connection conn = getConnectionToMetastore(true)) {
+      NestedScriptParser parser = getDbCommandParser(dbType, metaDbType);
+      IdempotentDDLExecutor idempotentExecutor = new IdempotentDDLExecutor(conn, dbType, parser, verbose);
+      idempotentExecutor.executeScript(sqlScriptFile);
+      LOG.info("Script executed successfully.");

Review Comment:
   execSql() previously forced TRANSACTION_READ_COMMITTED via SqlLine 
arguments, but the new direct-JDBC execution path does not set the connection’s 
transaction isolation level. To keep schema upgrades consistent across 
DBs/drivers, consider explicitly setting 
conn.setTransactionIsolation(Connection.TRANSACTION_READ_COMMITTED) (and 
potentially restoring the prior value if you ever reuse the connection).
   ```suggestion
         int previousIsolation = Connection.TRANSACTION_NONE;
         boolean isolationChanged = false;
         try {
           previousIsolation = conn.getTransactionIsolation();
          if (previousIsolation != Connection.TRANSACTION_READ_COMMITTED) {
            conn.setTransactionIsolation(Connection.TRANSACTION_READ_COMMITTED);
             isolationChanged = true;
           }
           NestedScriptParser parser = getDbCommandParser(dbType, metaDbType);
           IdempotentDDLExecutor idempotentExecutor =
               new IdempotentDDLExecutor(conn, dbType, parser, verbose);
           idempotentExecutor.executeScript(sqlScriptFile);
           LOG.info("Script executed successfully.");
         } finally {
           if (isolationChanged) {
             try {
               conn.setTransactionIsolation(previousIsolation);
             } catch (SQLException e) {
              LOG.warn("Failed to restore previous transaction isolation level", e);
             }
           }
         }
   ```



##########
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/tools/schematool/IdempotentDDLExecutor.java:
##########
@@ -0,0 +1,89 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hive.metastore.tools.schematool;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.File;
+import java.io.IOException;
+import java.sql.Connection;
+import java.sql.SQLException;
+import java.sql.Statement;
+import java.util.List;
+
+public class IdempotentDDLExecutor {
+  private static final Logger LOG = LoggerFactory.getLogger(IdempotentDDLExecutor.class);
+
+  private final Connection conn;
+  private final DbErrorCodes errorCodes;
+  private final HiveSchemaHelper.NestedScriptParser parser;
+  private final boolean verbose;
+
+  public IdempotentDDLExecutor(
+      Connection conn, String dbType, HiveSchemaHelper.NestedScriptParser parser, boolean verbose) {
+    this.conn = conn;
+    this.errorCodes = DbErrorCodes.forDbType(dbType);
+    this.parser = parser;
+    this.verbose = verbose;
+  }
+
+  public void executeScript(String scriptFile) throws SQLException, IOException {
+    LOG.info("Executing script line-by-line via IdempotentDDLExecutor: {}", scriptFile);
+    conn.setAutoCommit(true);
+
+    File file = new File(scriptFile);
+    List<String> commands = parser.getExecutableCommands(file.getParent(), file.getName());
+
+    for (String sqlStmt : commands) {
+      if (!sqlStmt.isEmpty()) {
+        executeStatement(sqlStmt);
+      }
+    }
+    if (verbose) {
+      System.out.println("Completed successfully.");
+    }
+  }
+
+  private void executeStatement(String sqlStmt) throws SQLException {
+    if (verbose) {
+      System.out.println("Executing: " + sqlStmt);
+    }
+    if (LOG.isDebugEnabled()) {
+      LOG.debug("Executing: {}", sqlStmt);
+    }
+
+    try (Statement stmt = conn.createStatement()) {
+      stmt.execute(sqlStmt);
+    } catch (SQLException e) {
+      if (errorCodes.isIgnorable(e)) {
+        String msg = String.format("Object already exists or was already dropped. " +
+            "Statement: %s, ErrorCode: %d, SQLState: %s", sqlStmt, e.getErrorCode(), e.getSQLState());
+        if (verbose) {
+          System.out.println(msg);
+        }
+        if (LOG.isDebugEnabled()) {
+          LOG.debug(msg);
+        }
+      } else {
+        LOG.error("SQL Error executing: {}. Error Code: {}, SQLState: {}", sqlStmt, e.getErrorCode(), e.getSQLState());
+        throw e;

Review Comment:
   IdempotentDDLExecutor ignores errors purely based on SQLState/errorCode, 
regardless of statement type. This can accidentally swallow failures from 
non-DDL statements (e.g., UPDATE/SELECT hitting an undefined table/column) 
because those can share the same SQLStates as DDL “missing object” errors, 
potentially letting an upgrade appear successful while skipping required work. 
Consider restricting the ignorable-error behavior to DDL statements 
(CREATE/DROP/ALTER/RENAME/INDEX/CONSTRAINT, etc.) or otherwise validating the 
statement category before ignoring the exception.
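
   As a rough illustration of the "validate the statement category" idea, one option is to gate the ignorable-error path on the statement's leading keyword. This is only a sketch, not code from the PR: the `DdlStatementCheck` class, `DDL_PREFIXES` set, and `isDdlStatement` helper are hypothetical names, and a real implementation would need to account for leading comments and vendor-specific DDL forms.
   ```java
   import java.util.Locale;
   import java.util.Set;

   public class DdlStatementCheck {
     // Hypothetical keyword list; extend as needed for the target databases.
     private static final Set<String> DDL_PREFIXES =
         Set.of("CREATE", "DROP", "ALTER", "RENAME", "TRUNCATE", "COMMENT", "GRANT", "REVOKE");

     /** Returns true only if the statement's first token marks it as DDL. */
     public static boolean isDdlStatement(String sql) {
       String trimmed = sql.trim();
       if (trimmed.isEmpty()) {
         return false;
       }
       // Compare the first whitespace-delimited token case-insensitively.
       String firstToken = trimmed.split("\\s+", 2)[0].toUpperCase(Locale.ROOT);
       return DDL_PREFIXES.contains(firstToken);
     }

     public static void main(String[] args) {
       System.out.println(isDdlStatement("CREATE TABLE T (ID INT)")); // true
       System.out.println(isDdlStatement("UPDATE T SET ID = 1"));     // false
     }
   }
   ```
   The executeStatement catch block could then require `errorCodes.isIgnorable(e) && isDdlStatement(sqlStmt)` before swallowing the exception, so a failed UPDATE or SELECT still aborts the upgrade.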



##########
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/tools/schematool/MetastoreSchemaTool.java:
##########
@@ -305,33 +299,22 @@ protected void execSql(String scriptDir, String scriptFile) throws IOException,
     execSql(scriptDir + File.separatorChar + scriptFile);
   }
 
-  // Generate the beeline args per hive conf and execute the given script
+  /**
+   * Executes the given SQL script file against the metastore database via {@link IdempotentDDLExecutor}.
+   * Each statement in the script is executed individually over a direct JDBC connection with
+   * auto-commit enabled. Errors that indicate an object already exists or is already gone are
+   * silently ignored according to the per-database {@link DbErrorCodes}; all other SQL errors are
+   * rethrown as {@link IOException}.
+   */
   protected void execSql(String sqlScriptFile) throws IOException {
-    CommandBuilder builder =
-        new CommandBuilder(conf, url, driver, userName, passWord, sqlScriptFile)
-            .setVerbose(verbose);
-
-    // run the script using SqlLine
-    SqlLine sqlLine = new SqlLine();
-    ByteArrayOutputStream outputForLog = null;
-    if (!verbose) {
-      OutputStream out;
-      if (LOG.isDebugEnabled()) {
-        out = outputForLog = new ByteArrayOutputStream();
-      } else {
-        out = new NullOutputStream();
-      }
-      sqlLine.setOutputStream(new PrintStream(out));
-      System.setProperty("sqlline.silent", "true");
-    }
-    LOG.info("Going to run command <" + builder.buildToLog() + ">");
-    SqlLine.Status status = sqlLine.begin(builder.buildToRun(), null, false);
-    if (LOG.isDebugEnabled() && outputForLog != null) {
-      LOG.debug("Received following output from Sqlline:");
-      LOG.debug(outputForLog.toString("UTF-8"));
-    }
-    if (status != SqlLine.Status.OK) {
-      throw new IOException("Schema script failed, errorcode " + status);
+    LOG.info("Going to run script <{}> via Idempotent JDBC Executor", sqlScriptFile);
+    try (Connection conn = getConnectionToMetastore(true)) {
+      NestedScriptParser parser = getDbCommandParser(dbType, metaDbType);

Review Comment:
   execSql() calls getDbCommandParser(dbType, metaDbType), but 
MetastoreSchemaTool#getDbCommandParser currently ignores the metaDbType 
parameter and passes null into HiveSchemaHelper.getDbCommandParser(...). With 
dbType=hive (which the CLI supports), this will break nested script parsing 
because HiveCommandParser requires metaDbType. Suggest wiring metaDbType 
through to HiveSchemaHelper.getDbCommandParser and using a usingSqlLine value 
that matches the current executor (JDBC, not SqlLine).
   ```suggestion
         NestedScriptParser parser =
             org.apache.hadoop.hive.metastore.tools.schematool.HiveSchemaHelper
                 .getDbCommandParser(dbType, metaDbType, false);
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

