sumeetgajjar opened a new pull request, #4942: URL: https://github.com/apache/iceberg/pull/4942
This PR removes the unwanted `AlreadyExistsException` stack traces from the spark-iceberg test logs. The logs for the Spark tests are filled with `AlreadyExistsException(message:Database default already exists)` stack traces even when all the tests pass.

These `AlreadyExistsException` exceptions do not come from Spark itself but from HMS. When the SQL command `CREATE NAMESPACE IF NOT EXISTS default` is executed in Spark, Spark invokes `hive.createDatabase`:
https://github.com/apache/spark/blob/89fdb8a6fb6a669c458891b3abeba236e64b1e89/sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveShim.scala#L574

The Hive client internally invokes the HMS API to create the database. If the database already exists, HMS throws `AlreadyExistsException`. When the `ifNotExist` flag is set to true, the Hive client simply ignores the exception:
https://github.com/apache/hive/blob/63326ff775206e59547b6b1332e25279e90ef5ee/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L608-L619

HMS logs this exception to STDERR, and because the Iceberg tests run a standalone HMS in the same JVM as the tests, these logs become part of the info output of the tests. This generates a lot of noise in the logs and might overshadow an actual exception.
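
To make the source of the noise concrete, here is a minimal Python sketch of the flow described above. It is only an illustration, not the Spark, Hive, or Iceberg code: `ToyMetastore`, `ToyClient`, `ListHandler`, and `run_demo` are hypothetical names, and the metastore is an in-memory set. It shows that the server side logs the error before the client gets a chance to swallow it, so the log entry appears even though the caller sees no failure.

```python
import logging


class AlreadyExistsException(Exception):
    """Stand-in for the metastore's AlreadyExistsException (hypothetical)."""


class ToyMetastore:
    """Hypothetical in-process stand-in for HMS: logs the error, then raises it."""

    def __init__(self, logger):
        self._databases = set()
        self._log = logger

    def create_database(self, name):
        if name in self._databases:
            err = AlreadyExistsException(f"Database {name} already exists")
            # The server logs the failure before the client ever sees it.
            self._log.error("%s", err)
            raise err
        self._databases.add(name)


class ToyClient:
    """Hypothetical client that ignores the error when if_not_exists is set."""

    def __init__(self, metastore):
        self._metastore = metastore

    def create_database(self, name, if_not_exists):
        try:
            self._metastore.create_database(name)
        except AlreadyExistsException:
            if not if_not_exists:
                raise
            # Swallowed here, but the server-side log entry already exists.


class ListHandler(logging.Handler):
    """Collects formatted log messages so they can be inspected."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


def run_demo():
    logger = logging.getLogger("toy_hms")
    logger.setLevel(logging.ERROR)
    logger.propagate = False
    handler = ListHandler()
    logger.addHandler(handler)
    try:
        client = ToyClient(ToyMetastore(logger))
        client.create_database("default", if_not_exists=True)
        # Second call succeeds from the caller's point of view...
        client.create_database("default", if_not_exists=True)
    finally:
        logger.removeHandler(handler)
    # ...yet the metastore has still logged the error once.
    return handler.messages
```

Calling `run_demo()` returns the messages the toy metastore logged. In the real tests the equivalent entries come with full stack traces, which is why they crowd out other output.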
