+1 for option 2. I'm excited to learn about SPARK-48781, and it's great that Spark will support stored procedures. In this PR I will use the `CALL` syntax, which only looks like a stored procedure, and adapt it to real stored procedures in the future.
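For concreteness, an invocation along these lines is what I have in mind (a rough sketch only: the procedure name `compact_table` comes from option 2 in the quoted thread, while the parameter names here are placeholders, not the signature actually implemented in the PR):

    -- sketch only: named arguments are assumed, not the final signature
    CALL compact_table(table => 'db.sample_table', target_size => '128MB');

The point is that the statement parses like a procedure call today, so it can later be backed by a real Spark 4.0 stored procedure without changing what users type.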
XiDuo You <ulyssesyo...@gmail.com> wrote on Wed, Sep 25, 2024 at 13:12:
> +1 for option 2
> thank you
>
> Fei Wang <feiw...@apache.org> wrote on Wed, Sep 25, 2024 at 11:53:
> >
> > Prefer option 2 as well.
> >
> > BTW, it is necessary to support compacting a single partition of a partitioned table.
> >
> > On 2024/09/24 07:19:27 Cheng Pan wrote:
> > > Hi Gabry, thanks for bringing up this discussion. Usually, when we want to discuss an idea and make a decision, instead of starting a thread with both [DISCUSS] and [VOTE], we first start a [DISCUSS] thread with all the options collected. During the discussion, the pros and cons of each option are listed and compared, and ideally everyone involved reaches a consensus. If not, we choose the most supported option as the candidate to start a [VOTE], with
> > >
> > > +1 adopt
> > > +0 does not care
> > > -1 reject because …
> > >
> > > Back to the topic itself, there are actually 3 options:
> > >
> > > Option 1: new syntax COMPACT TABLE <table_name> [INTO <target_size>] [CLEANUP | RETAIN | LIST]
> > > Option 2: CALL compact_table(args …)
> > > Option 3: VACUUM <table_name> [OTHER ARGS]
> > >
> > > I prefer option 2, then 3. Given Delta's and Iceberg's dominance in the lakehouse market, I suggest following either Delta's VACUUM or Iceberg's CALL syntax. Plus, the Kyuubi Spark extension already adopted Delta's ZORDER syntax, and Spark 4.0 adopted the Iceberg CALL syntax, see SPARK-48781.
> > >
> > > Thanks,
> > > Cheng Pan
> > >
> > >
> > > > On Sep 19, 2024, at 19:02, gabrywu <gabr...@apache.org> wrote:
> > > >
> > > > Hi, folks,
> > > > I'm creating a PR #6695 <https://github.com/apache/kyuubi/pull/6695> to add a new extended Spark SQL command to merge small files, and a few PMC members and committers propose that it would be better to create a new Call Procedure instead.
> > > > So, I'm posting an email to vote on which one is the best way to extend Spark SQL. Whatever the result, we can consider it the final decision for creating new Spark extensions in the upcoming PRs.
> > > >
> > > > The VOTE will remain open for at least 2 weeks:
> > > > [ ] +1 Spark SQL Command
> > > > [ ] +0 Both are OK
> > > > [ ] -1 Spark Call Procedure