[ https://issues.apache.org/jira/browse/SPARK-47483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Max Gekk reassigned SPARK-47483:
--------------------------------

    Assignee: Nikola Mandic

> Add support for aggregation and join operations on arrays of collated strings
> -----------------------------------------------------------------------------
>
>                 Key: SPARK-47483
>                 URL: https://issues.apache.org/jira/browse/SPARK-47483
>             Project: Spark
>          Issue Type: Task
>          Components: SQL
>    Affects Versions: 4.0.0
>            Reporter: Nikola Mandic
>            Assignee: Nikola Mandic
>            Priority: Major
>              Labels: pull-request-available
>
> Example of aggregation sequence:
> {code:sql}
> create table t(a array<string collate utf8_binary_lcase>) using parquet;
> insert into t(a) values(array('a' collate utf8_binary_lcase));
> insert into t(a) values(array('A' collate utf8_binary_lcase));
> select distinct a from t; {code}
> Example of join sequence:
> {code:sql}
> create table l(a array<string collate utf8_binary_lcase>) using parquet;
> create table r(a array<string collate utf8_binary_lcase>) using parquet;
> insert into l(a) values(array('a' collate utf8_binary_lcase));
> insert into r(a) values(array('A' collate utf8_binary_lcase));
> select * from l join r where l.a = r.a; {code}
> Both queries should return one row, because under the utf8_binary_lcase collation the arrays array('a') and array('A') compare as equal.
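
The expected behaviour can be modelled outside Spark with a small Python sketch. It is not Spark's implementation: it assumes that utf8_binary_lcase compares strings by their lowercased form, and the helper names (collation_key, distinct, join) are hypothetical. Which of the equal arrays a real DISTINCT returns is not specified in the issue; this sketch simply keeps the first one seen.

```python
def collation_key(arr):
    # Assumption: utf8_binary_lcase equality is modelled as equality
    # of the lowercased elements, compared position by position.
    return tuple(s.lower() for s in arr)


def distinct(rows):
    # Keep the first row seen for each collation key (models SELECT DISTINCT a).
    seen = {}
    for row in rows:
        seen.setdefault(collation_key(row), row)
    return list(seen.values())


def join(left, right):
    # Nested-loop equi-join on the collation key (models l.a = r.a).
    return [(l, r) for l in left for r in right
            if collation_key(l) == collation_key(r)]


t = [["a"], ["A"]]
distinct_rows = distinct(t)
joined_rows = join([["a"]], [["A"]])

print(len(distinct_rows))  # one row: the two arrays collapse into one group
print(len(joined_rows))    # one row: the arrays match under the collation
```

Grouping and joining on a normalized key rather than on the raw values is the essence of both examples: aggregation and join only need a consistent equality (and hash) for arrays whose elements are collated strings.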



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
