[ 
https://issues.apache.org/jira/browse/SPARK-55718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Helios He updated SPARK-55718:
------------------------------
    Description: 
See SPARK-55501.

 

We can extend this logic to more than just CASTs. 

For `select listagg(col1) within group (order by col2) from T`, the general 
rule is that this is safe as long as, within each group by key, every col1 
value maps to exactly 1 unique value in col2.

 

e.g. if col2 = col1 % 10, then there is no ambiguity (every row with col1 = x 
has col2 = x % 10)

  was:
See SPARK-55501.

 

We can extend this logic to more than just CASTs. 

For `select listagg(col1) within group (order by col2) from T`, the general 
rule is as long as for all col1 with the same key, they map to only 1 unique 
value in col2, then this is safe.

 

e.g. if col2 = col1 % 10, then there is no ambiguity (all col1 with value x 
will map to x%10)


> Extend ListAgg distinct + order by ambiguity check to other cases
> -----------------------------------------------------------------
>
>                 Key: SPARK-55718
>                 URL: https://issues.apache.org/jira/browse/SPARK-55718
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 4.0.2
>            Reporter: Helios He
>            Priority: Minor
>
> See SPARK-55501.
>  
> We can extend this logic to more than just CASTs. 
> For `select listagg(col1) within group (order by col2) from T`, the general 
> rule is that this is safe as long as, within each group by key, every col1 
> value maps to exactly 1 unique value in col2.
>  
> e.g. if col2 = col1 % 10, then there is no ambiguity (every row with col1 = x 
> has col2 = x % 10)
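
The rule described above can be illustrated with a small data-level sketch. It is not how Spark implements the check (an analyzer rule would presumably reason about expressions rather than scan rows); the function name `order_is_unambiguous` and the `(group_key, col1, col2)` row layout are assumptions made only for this example.

```python
def order_is_unambiguous(rows):
    """Return True if, within each group key, every col1 value is paired
    with exactly one col2 value.

    `rows` is an iterable of (group_key, col1, col2) tuples. This layout
    is hypothetical and chosen only for illustration.
    """
    seen = {}  # (group_key, col1) -> first col2 value observed
    for group_key, col1, col2 in rows:
        first = seen.setdefault((group_key, col1), col2)
        if first != col2:
            # The same col1 appears with two different ordering values,
            # so its position in the aggregated list would be ambiguous.
            return False
    return True


# col2 is derived from col1 (col2 = col1 % 10), so each col1 has one col2.
derived = [("g", x, x % 10) for x in (3, 13, 3, 27)]

# The same col1 (3) appears with two different col2 values in one group.
conflicting = [("g", 3, 1), ("g", 3, 2)]

# The same col1 in different groups does not conflict.
separate_groups = [("a", 3, 1), ("b", 3, 2)]
```

Here `derived` and `separate_groups` satisfy the rule, while `conflicting` does not, because the value 3 could sort under either ordering key.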


