userzhy opened a new pull request, #6910: URL: https://github.com/apache/paimon/pull/6910
### Purpose

Linked issue: close #6328

This PR adds support for the `compact_database` procedure in Spark SQL, mirroring Flink's existing implementation. The new procedure supports the following parameters:

- `including_databases`: databases to include; supports regular expressions.
- `including_tables`: tables to include; supports regular expressions.
- `excluding_tables`: tables to exclude; supports regular expressions.
- `options`: compaction options in the format `"key1=value1,key2=value2"`.

**Usage examples:**

```sql
-- Compact all tables in all databases
CALL sys.compact_database()

-- Compact tables in specific databases
CALL sys.compact_database(including_databases => 'db1|db2')

-- Compact specific tables with options
CALL sys.compact_database(including_tables => '.*_fact', options => 'write-only=true')
```

### Tests

- `CompactDatabaseProcedureTest.testCompactDatabase`: basic functionality test.
- `CompactDatabaseProcedureTest.testCompactDatabaseWithDatabaseFilter`: database filtering test.
- `CompactDatabaseProcedureTest.testCompactDatabaseWithTableFilter`: table filtering test.
- `CompactDatabaseProcedureTest.testCompactDatabaseWithExcludingTables`: table exclusion test.

### API and Format

This change adds a new stored procedure, `compact_database`, to Spark SQL. No storage format changes.

### Documentation

This is a new feature; the Spark procedures documentation page may need an update.
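To make the parameter semantics concrete, here is a minimal Python sketch of how regex-based database/table selection and `options` parsing could work. The helper names `should_compact` and `parse_options` are hypothetical, chosen for illustration; they are not Paimon's actual implementation, and the exact matching rules (full-name matching, exclusion taking precedence over inclusion) are assumptions based on the parameter descriptions above.

```python
import re


def should_compact(database, table,
                   including_databases=".*",
                   including_tables=".*",
                   excluding_tables=None):
    """Hypothetical filter mirroring the procedure's parameters.

    Assumptions (not verified against Paimon's code): patterns must
    match the full name, and exclusion wins over inclusion.
    """
    if not re.fullmatch(including_databases, database):
        return False
    if not re.fullmatch(including_tables, table):
        return False
    if excluding_tables and re.fullmatch(excluding_tables, table):
        return False
    return True


def parse_options(options):
    """Parse a "key1=value1,key2=value2" options string into a dict."""
    if not options:
        return {}
    # Split on the first '=' only, so values may themselves contain '='.
    return dict(kv.split("=", 1) for kv in options.split(","))


# Mirrors: CALL sys.compact_database(including_databases => 'db1|db2',
#                                    including_tables => '.*_fact')
print(should_compact("db1", "orders_fact", "db1|db2", ".*_fact"))  # True
print(should_compact("db3", "orders_fact", "db1|db2", ".*_fact"))  # False
print(parse_options("write-only=true"))  # {'write-only': 'true'}
```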
