Hi Gabry,

Thanks for bringing up this discussion. Usually, when we want to discuss an idea and make a decision, we do not start a single thread with both [DISCUSS] and [VOTE]. Instead, we first start a [DISCUSS] thread with all options collected, and during the discussion the pros and cons of each option are listed and compared. Ideally, everyone involved in the discussion eventually reaches a consensus. If not, we choose the most supported option as the candidate and start a [VOTE] with:

+1 adopt
+0 does not care
-1 reject because …

Back to the topic itself, there are actually 3 options:

Option 1: new syntax COMPACT TABLE <table_name> [INTO <target_size>] [CLEANUP | RETAIN | LIST]
Option 2: CALL compact_table(args …)
Option 3: VACUUM <table_name> [OTHER ARGS]

I prefer option 2, then 3.

Given Delta and Iceberg's dominance in the lakehouse market, I suggest following either Delta's VACUUM or Iceberg's CALL syntax. Plus, the Kyuubi Spark extension has already adopted the Delta ZORDER syntax, and Spark 4.0 adopted the Iceberg CALL syntax, see SPARK-48781.

Thanks,
Cheng Pan

> On Sep 19, 2024, at 19:02, gabrywu <gabr...@apache.org> wrote:
>
> Hi, folks,
> I'm creating a PR #6695 <https://github.com/apache/kyuubi/pull/6695> to
> create a new extended Spark SQL command to merge small files. And a few of
> PMCs and committers propose that it's better to create a new Call Procedure
> instead.
> So, I'm posting an email to vote on which one should be the best way to
> extend Spark SQL. No matter what's the result, we can consider it as a final
> decision to create a new spark extension in the upcoming PRs
>
> The VOTE will remain open for at least 2 weeks
> [ ] +1 Spark SQL Command
> [ ] +0 Both is OK
> [ ] -1 Spark Call Procedure
>