Re: [DISCUSS] Support to group rows by column ordinals

Sergey Nuyanzin Thu, 20 Jun 2024 04:27:22 -0700

Hey Jeyhun,

Thanks for starting the discussion.
Could you please elaborate more on use cases of this feature?


The one that I see in FLINK-34366[1] is to simplify referencing to
aliases in SELECT from GROUP BY
(also potentially ORDER BY and HAVING). I wonder whether there is some
other use cases where
support of addressing by ordinals is required?

I'm asking since SqlConformance in Calcite and as a result
FlinkSqlConformance in Flink give ability
to reference to aliases from SELECT by enabling it e.g. [2] where javadoc says
>   * Whether to allow aliases from the {@code SELECT} clause to be used as
>   * column names in the {@code GROUP BY} clause.

Giving the fact that was already mentioned by Timo in the PR
> In some DBMSs it is common to write GROUP BY 1 or ORDER BY 1 for global 
> aggregation/sorting.
and IMHO referencing by ordinals might be error prone if someone adds
more columns in SELECT and forgets about ordinals.

Would it make sense to consider enabling reference by aliases as
another option here?
Or did I miss anything?

[1] https://issues.apache.org/jira/browse/FLINK-34366
[2] 
https://github.com/apache/calcite/blob/c0a53f6b17daaca9d057e70d7fae0a0e9c2cd02a/core/src/main/java/org/apache/calcite/sql/validate/SqlConformance.java#L92-L103

On Thu, Jun 20, 2024 at 12:12 PM Muhammet Orazov
<mor+fl...@morazow.com.invalid> wrote:
>
> Hey Jeyhun,
>
> Thanks for bringing it up! +-1 from my side.
>
> Personally, I find this feature confusing, it feels always natural to
> use
> column names. SQL power users will ask for it, I have seen it used in
> automated complex queries also.
>
> But it seems counterintuitive to enable flag for this feature. Enabling
> it, should not disable grouping/ordering by the column names?
>
> Best,
> Muhammet
>
>
> On 2024-06-17 20:30, Jeyhun Karimov wrote:
> > Hi devs,
> >
> > I am moving our discussion on the PR thread [1] to the dev mailing list
> > to
> > close the loop on the related issue [2]. The end goal of the PR is to
> > support grouping/ordering by via column ordinals. The target
> > implementation
> > (currently there is no flag) should support a flag, so that a user can
> > also
> > use the old behavior as suggested by @Timo.
> >
> > Some vendors such as Postgres [3], SQLite [4], MySQL/MariaDB [5],
> > Oracle
> > [6], Spark [7], and BigQuery[8] support group/order by clauses with
> > column
> > ordinals.
> >
> > Obviously, supporting this clause might lead to less readable and
> > maintainable SQL code. This might also cause a bit of complications
> > both on
> > the codebase and on the user-experience side. On the other hand, we
> > already
> > see that numerous vendors support this feature out of the box, because
> > there was/is a need for this feature.
> >
> > That is why, I would like to discuss and hear your opinions about
> > introducing/abandoning this feature.
> >
> > Regards,
> > Jeyhun
> >
> > [1] https://github.com/apache/flink/pull/24270
> > [2] https://issues.apache.org/jira/browse/FLINK-34366
> > [3] https://www.postgresql.org/docs/6.5/sql-select.htm
> > [4] https://www.sqlite.org/lang_select.html
> > [5] https://www.db-fiddle.com/f/uTrfRrNs4uXLr4Q9j2piCk/1
> > [6]
> > https://oracle-base.com/articles/23/group-by-and-having-clause-using-column-alias-or-column-position-23
> > [7]
> > https://github.com/apache/spark/commit/90613df652d45e121ab2b3a5bbb3b63cb15d297a
> > [8]
> > https://cloud.google.com/bigquery/docs/reference/standard-sql/query-syntax#group_by_col_ordinals



-- 
Best regards,
Sergey

Re: [DISCUSS] Support to group rows by column ordinals

Reply via email to