Lim Qing Wei created FLINK-32296:
------------------------------------

             Summary: Flink SQL handle array of row incorrectly
                 Key: FLINK-32296
                 URL: https://issues.apache.org/jira/browse/FLINK-32296
             Project: Flink
          Issue Type: Bug
          Components: Table SQL / API
    Affects Versions: 1.16.2, 1.15.3
            Reporter: Lim Qing Wei

Flink SQL produces incorrect results when the data involves the type ARRAY<ROW>. Here is a reproduction:

{code:java}
-- Five distinct arrays combined with UNION, so five distinct rows are expected.
CREATE TEMPORARY VIEW bug_data AS (
    SELECT CAST(ARRAY[
        (10, '2020-01-10'),
        (101, '244ddf'),
        (1011, '2asdfaf'),
        (1110, '200'),
        (2210, '20-01-10'),
        (4410, '21111')
    ] AS ARRAY<ROW<A INT, B STRING>>)
    UNION
    SELECT CAST(ARRAY[
        (10, '2020-01-10'),
        (121, '244ddf'),
        (2222, '2asdfaf'),
        (32243, '200'),
        (2210, '33333-01-10'),
        (4410, '23243243')
    ] AS ARRAY<ROW<A INT, B STRING>>)
    UNION
    SELECT CAST(ARRAY[
        (10, '2020-01-10'),
        (222, '244ddf'),
        (1011, '2asdfaf'),
        (1110, '200'),
        (24367, '20-01-10'),
        (4410, '21111')
    ] AS ARRAY<ROW<A INT, B STRING>>)
    UNION
    SELECT CAST(ARRAY[
        (10, '2020-01-10'),
        (5666, '244ddf'),
        (435243, '2asdfaf'),
        (56567, '200'),
        (2210, '20-01-10'),
        (4410, '21111')
    ] AS ARRAY<ROW<A INT, B STRING>>)
    UNION
    SELECT CAST(ARRAY[
        (10, '2020-01-10'),
        (43543, '244ddf'),
        (1011, '2asdfaf'),
        (1110, '200'),
        (8967564, '20-01-10'),
        (4410, '21111')
    ] AS ARRAY<ROW<A INT, B STRING>>)
);

CREATE TABLE sink (
    r ARRAY<ROW<A INT, B STRING>>
) WITH ('connector' = 'print');
{code}

In both 1.15 and 1.16, this produces the following output:

{noformat}
[+I[4410, 21111], +I[4410, 21111], +I[4410, 21111], +I[4410, 21111], +I[4410, 21111], +I[4410, 21111]]
[+I[4410, 23243243], +I[4410, 23243243], +I[4410, 23243243], +I[4410, 23243243], +I[4410, 23243243], +I[4410, 23243243]]
{noformat}

I think this is unexpected/wrong because:
# The query should produce 5 rows, not 2.
# The data is also wrong: every row in each output array is identical, although the input rows are not.
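
To make the two points above easier to check, here is a small Python sketch that models the data outside of Flink. It is only an illustration under stated assumptions, not a description of Flink internals: `union_distinct` is a hypothetical stand-in for the set semantics of SQL `UNION`, and `collapse_to_last` is a hypothetical model of the symptom (each array element replaced by the array's last element). The input tuples are copied from the reproduction.

```python
# Input arrays copied from the reproduction; each element is an (A, B) row.
inputs = [
    [(10, "2020-01-10"), (101, "244ddf"), (1011, "2asdfaf"),
     (1110, "200"), (2210, "20-01-10"), (4410, "21111")],
    [(10, "2020-01-10"), (121, "244ddf"), (2222, "2asdfaf"),
     (32243, "200"), (2210, "33333-01-10"), (4410, "23243243")],
    [(10, "2020-01-10"), (222, "244ddf"), (1011, "2asdfaf"),
     (1110, "200"), (24367, "20-01-10"), (4410, "21111")],
    [(10, "2020-01-10"), (5666, "244ddf"), (435243, "2asdfaf"),
     (56567, "200"), (2210, "20-01-10"), (4410, "21111")],
    [(10, "2020-01-10"), (43543, "244ddf"), (1011, "2asdfaf"),
     (1110, "200"), (8967564, "20-01-10"), (4410, "21111")],
]


def union_distinct(arrays):
    """Hypothetical model of SQL UNION: drop duplicate rows, keep first-seen order."""
    seen, result = set(), []
    for arr in arrays:
        key = tuple(arr)
        if key not in seen:
            seen.add(key)
            result.append(list(arr))
    return result


def collapse_to_last(arr):
    """Hypothetical model of the symptom: every element replaced by the last one."""
    return [arr[-1]] * len(arr)


# What UNION should return for five distinct input arrays.
expected = union_distinct(inputs)

# What UNION returns if each array is first collapsed to its last element.
observed_model = union_distinct([collapse_to_last(a) for a in inputs])

print(len(expected))        # number of rows the query should produce
print(len(observed_model))  # number of rows under the collapse model
for row in observed_model:
    print(row)
```

Under this model the five inputs are all distinct, so five rows are expected. Collapsing each array to its last element leaves only two distinct arrays, `(4410, '21111')` repeated six times and `(4410, '23243243')` repeated six times. That is consistent with the reported output, although it does not by itself identify where in Flink the values are overwritten.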