Re: [VOTE][DISCUSS] A Spark SQL command or Call procedure

Fei Wang Tue, 24 Sep 2024 20:53:31 -0700

Prefer option 2 as well.

BTW, it is necessary to support compacting a single partition of a partitioned table.


On 2024/09/24 07:19:27 Cheng Pan wrote:
> Hi Gabry, thanks for bringing up this discussion. Usually, when we want to 
> discuss an idea and make a decision, instead of starting a thread with both 
> [DISCUSS] and [VOTE], we first start a [DISCUSS] thread with all options 
> collected. During the discussion, the pros and cons of each option are 
> listed and compared. Ideally, everyone involved in the discussion eventually 
> reaches a consensus; if not, we choose the most supported option as the 
> candidate to start a [VOTE], with 
> 
> +1 adopt
> +0 does not care
> -1 reject because …
> 
> Back to the topic itself, there are actually 3 options:
> 
> Option 1: new syntax COMPACT TABLE <table_name> [INTO <target_size>] 
> [CLEANUP | RETAIN | LIST]
> Option 2: CALL compact_table(args …)
> Option 3: VACUUM <table_name> [OTHER ARGS]
> 
> I prefer option 2, then 3. Given Delta and Iceberg's dominance in the 
> lakehouse market, I suggest following either Delta's VACUUM or Iceberg's CALL 
> syntax. Plus, the Kyuubi Spark extension has already adopted Delta's ZORDER 
> syntax, and Spark 4.0 adopted the Iceberg CALL syntax, see SPARK-48781.
> 
> Thanks,
> Cheng Pan
> 
> 
> 
> > On Sep 19, 2024, at 19:02, gabrywu <[email protected]> wrote:
> > 
> > Hi, folks, 
> > I'm creating PR #6695 <https://github.com/apache/kyuubi/pull/6695> to 
> > add a new extended Spark SQL command to merge small files. A few 
> > PMC members and committers have proposed creating a new Call Procedure 
> > instead. 
> > So, I'm posting this email to vote on which is the better way to 
> > extend Spark SQL. Whatever the result, we can treat it as the 
> > final decision for new Spark extensions in upcoming PRs.
> > 
> > The VOTE will remain open for at least 2 weeks.
> > [ ] +1 Spark SQL Command
> > [ ] +0 Both are OK
> > [ ] -1 Spark Call Procedure
> > 
> 
>
