[ https://issues.apache.org/jira/browse/SPARK-47483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Nikola Mandic updated SPARK-47483:
----------------------------------
    Epic Link: SPARK-46830

> Add support for aggregation and join operations on arrays of collated strings
> -----------------------------------------------------------------------------
>
>                 Key: SPARK-47483
>                 URL: https://issues.apache.org/jira/browse/SPARK-47483
>             Project: Spark
>          Issue Type: Task
>          Components: SQL
>    Affects Versions: 4.0.0
>            Reporter: Nikola Mandic
>            Priority: Major
>
> Example of aggregation sequence:
> {code:java}
> create table t(a array<string collate utf8_binary_lcase>) using parquet;
> insert into t(a) values(array('a' collate utf8_binary_lcase));
> insert into t(a) values(array('A' collate utf8_binary_lcase));
> select distinct a from t; {code}
> Example of join sequence:
> {code:java}
> create table l(a array<string collate utf8_binary_lcase>) using parquet;
> create table r(a array<string collate utf8_binary_lcase>) using parquet;
> insert into l(a) values(array('a' collate utf8_binary_lcase));
> insert into r(a) values(array('A' collate utf8_binary_lcase));
> select * from l join r where l.a = r.a; {code}
> Both runs should yield one row since the arrays are considered equal.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
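
The expected behaviour in the issue can be modelled outside Spark with a small Python sketch. This is only an illustration of the semantics the issue asks for, not Spark's implementation: the helpers `collation_key`, `array_key`, `distinct` and `join` are hypothetical names, and `str.lower()` is an assumed stand-in for how utf8_binary_lcase compares strings. The idea is that each array is reduced to a key built from the per-element collation keys, and both the aggregation and the join compare those keys instead of the raw strings.

```python
# Hypothetical model of the expected semantics; not Spark's implementation.

def collation_key(s):
    """Assumed stand-in for utf8_binary_lcase: compare by lowercase form."""
    return s.lower()

def array_key(arr):
    """Key for an array of collated strings: the tuple of element keys."""
    return tuple(collation_key(s) for s in arr)

def distinct(rows):
    """Models `select distinct a from t`.

    Keeps the first array seen for each key; which representative a real
    engine returns is not specified by the issue.
    """
    seen = {}
    for arr in rows:
        seen.setdefault(array_key(arr), arr)
    return list(seen.values())

def join(left, right):
    """Models `select * from l join r where l.a = r.a` as a hash join."""
    index = {}
    for arr in right:
        index.setdefault(array_key(arr), []).append(arr)
    return [(l, r) for l in left for r in index.get(array_key(l), [])]

# Same data as the two SQL sequences in the issue.
distinct_rows = distinct([["a"], ["A"]])
joined_rows = join([["a"]], [["A"]])

print(len(distinct_rows))  # one row
print(len(joined_rows))    # one row
```

Under these assumptions both calls return a single row, matching the outcome the issue describes for arrays that differ only in letter case.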