[
https://issues.apache.org/jira/browse/SPARK-55718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Helios He updated SPARK-55718:
------------------------------
Description:
See SPARK-55501.
We can extend this logic to more than just CASTs.
For `select listagg(distinct col1) within group (order by col2) from T`, the
general rule is that the query is safe as long as, among rows with the same
group-by key, each value of col1 maps to exactly one value of col2.
e.g. if col2 = col1 % 10, then there is no ambiguity (every row with col1 = x
has col2 = x % 10).
was:
See SPARK-55501.
We can extend this logic to more than just CASTs.
For `select listagg(col1) within group (order by col2) from T`, the general
rule is as long as for all col1 with the same key, they map to only 1 unique
value in col2, then this is safe.
e.g. if col2 = col1 % 10, then there is no ambiguity (all col1 with value x
will map to x%10)
> Extend ListAgg distinct + order by ambiguity check to other cases
> -----------------------------------------------------------------
>
> Key: SPARK-55718
> URL: https://issues.apache.org/jira/browse/SPARK-55718
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 4.0.2
> Reporter: Helios He
> Priority: Minor
>
> See SPARK-55501.
>
> We can extend this logic to more than just CASTs.
> For `select listagg(distinct col1) within group (order by col2) from T`, the
> general rule is that the query is safe as long as, among rows with the same
> group-by key, each value of col1 maps to exactly one value of col2.
>
> e.g. if col2 = col1 % 10, then there is no ambiguity (every row with col1 = x
> has col2 = x % 10).
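
The rule in the description can be illustrated on concrete data. The following
is a minimal Python sketch, not Spark code: the function name
`order_is_unambiguous` and the `(group_key, col1, col2)` row layout are
assumptions made for illustration only. It checks rows after the fact, whereas
an analyzer check would presumably have to reason about the expressions
themselves.

```python
def order_is_unambiguous(rows):
    """Check the rule on data (hypothetical helper, not a Spark API).

    rows: iterable of (group_key, col1, col2) tuples.
    Returns True if, within each group, every col1 value is paired with
    exactly one col2 value, so ordering the distinct col1 values by col2
    is well defined.
    """
    seen = {}  # (group_key, col1) -> first col2 observed for that pair
    for key, col1, col2 in rows:
        first = seen.setdefault((key, col1), col2)
        if first != col2:
            return False  # same col1 in the same group, two different col2
    return True


# col2 is derived from col1 (col2 = col1 % 10): never ambiguous.
derived = [("g", x, x % 10) for x in [3, 13, 13, 27]]

# col2 is unrelated to col1: col1 = 13 appears with col2 = 1 and col2 = 2.
unrelated = [("g", 13, 1), ("g", 13, 2)]

# The same col1 with different col2 values is fine across different groups.
across_groups = [("a", 13, 1), ("b", 13, 2)]
```

Here `derived` and `across_groups` satisfy the rule, while `unrelated` does
not, because the single distinct value 13 would have two candidate sort
positions.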