sumeetgajjar opened a new pull request, #4942:
URL: https://github.com/apache/iceberg/pull/4942

   This PR aims to remove the unwanted `AlreadyExistsException` stack traces 
from the spark-iceberg test logs.
   
   The logs for the Spark tests are filled with 
`AlreadyExistsException(message:Database default already exists)` stack traces 
even when all the tests pass.
   
   These `AlreadyExistsException` exceptions do not come from Spark itself but 
from the Hive Metastore (HMS).
   
   When the `CREATE NAMESPACE IF NOT EXISTS default` SQL command is executed in 
Spark, Spark invokes `hive.createDatabase`.
   
https://github.com/apache/spark/blob/89fdb8a6fb6a669c458891b3abeba236e64b1e89/sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveShim.scala#L574
   
   The Hive client internally invokes the HMS API to create the database. If 
the database already exists, HMS throws `AlreadyExistsException`. When the 
`ifNotExist` flag is set to true, the Hive client simply ignores the exception.
   
https://github.com/apache/hive/blob/63326ff775206e59547b6b1332e25279e90ef5ee/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L608-L619
   
   HMS logs this exception to STDERR before the client ignores it. Because the 
Iceberg tests run a standalone HMS in the same JVM as the tests themselves, 
these logs become part of the info output of the tests.
   
   This generates a lot of noise in the logs and might overshadow an actual 
exception.
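   
   The flow described above can be sketched roughly as follows. This is a 
minimal, self-contained illustration and not the actual Spark, Hive, or HMS 
code: `FakeMetastore`, `FakeHiveClient`, `errorLog`, and the unchecked 
`AlreadyExistsException` defined here are hypothetical stand-ins. It only shows 
why the server-side log entry survives even though the client swallows the 
exception.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class CreateDatabaseSketch {

  // Hypothetical stand-in for the metastore's AlreadyExistsException.
  // Declared unchecked here only to keep the sketch short.
  static class AlreadyExistsException extends RuntimeException {
    AlreadyExistsException(String message) {
      super(message);
    }
  }

  // Hypothetical in-memory stand-in for HMS.
  static class FakeMetastore {
    private final Set<String> databases = new HashSet<>();
    // Collects what a real metastore would write to its log (STDERR in the tests).
    final List<String> errorLog = new ArrayList<>();

    void createDatabase(String name) {
      if (!databases.add(name)) {
        AlreadyExistsException e =
            new AlreadyExistsException("Database " + name + " already exists");
        // The server side records the failure before the client ever sees it.
        errorLog.add(e.toString());
        throw e;
      }
    }
  }

  // Hypothetical stand-in for the Hive client.
  static class FakeHiveClient {
    private final FakeMetastore metastore;

    FakeHiveClient(FakeMetastore metastore) {
      this.metastore = metastore;
    }

    void createDatabase(String name, boolean ifNotExist) {
      try {
        metastore.createDatabase(name);
      } catch (AlreadyExistsException e) {
        // With ifNotExist set, the client ignores the exception,
        // but the metastore has already logged it.
        if (!ifNotExist) {
          throw e;
        }
      }
    }
  }

  public static void main(String[] args) {
    FakeMetastore hms = new FakeMetastore();
    FakeHiveClient client = new FakeHiveClient(hms);

    client.createDatabase("default", true); // creates the database
    client.createDatabase("default", true); // swallowed by the client, logged by HMS

    System.out.println("Logged errors: " + hms.errorLog.size()); // prints "Logged errors: 1"
  }
}
```

   In this sketch every repeated `IF NOT EXISTS` call succeeds from the 
caller's point of view yet still adds one entry to the metastore log, which is 
the noise this PR targets.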


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to