benj created DRILL-7515: --------------------------- Summary: ORDER BY clause produce error on GROUP BY with array field manager with any_value Key: DRILL-7515 URL: https://issues.apache.org/jira/browse/DRILL-7515 Project: Apache Drill Issue Type: Bug Components: Execution - Data Types Affects Versions: 1.17.0 Reporter: benj
With a parquet containing an array field, for example: {code:sql} apache drill 1.17> CREATE TABLE dfs.TEST.`example_any_pqt` AS (SELECT 'foo' AS a, 'bar' b, split('foo,bar',',') as c); apache drill 1.17> SELECT *, typeof(c) AS type, sqltypeof(c) AS sql_type FROM dfs.TEST.`example_any_pqt`; +-----+-----+---------------+---------+----------+ | a | b | c | type | sql_type | +-----+-----+---------------+---------+----------+ | foo | bar | ["foo","bar"] | VARCHAR | ARRAY | +-----+-----+---------------+---------+----------+ {code} The next request work well {code:sql} apache drill 1.17> SELECT * FROM (SELECT a, any_value(c) FROM dfs.TEST.`example_any_pqt` GROUP BY a) ORDER BY a; +-----+---------------+ | a | EXPR$1 | +-----+---------------+ | foo | ["foo","bar"] | +-----+---------------+ {code} But the next request (with the same struct as the previous request) failed {code:sql} apache drill 1.17> SELECT * FROM (SELECT a, b, any_value(c) FROM dfs.TEST.`example_any_pqt` GROUP BY a, b) ORDER BY a; Error: UNSUPPORTED_OPERATION ERROR: Schema changes not supported in External Sort. Please enable Union type. Previous schema BatchSchema [fields=[[`a` (VARCHAR:OPTIONAL)], [`b` (VARCHAR:OPTIONAL)], [`EXPR$2` (NULL:OPTIONAL)]], selectionVector=NONE] Incoming schema BatchSchema [fields=[[`a` (VARCHAR:OPTIONAL)], [`b` (VARCHAR:OPTIONAL)], [`EXPR$2` (VARCHAR:REPEATED), children=([`$data$` (VARCHAR:REQUIRED)])]], selectionVector=NONE] Fragment 0:0 {code} Note that the same request +without the order by+ works well. It's also possible to use intermediate table and apply the ORDER BY in a second time. {code:sql} apache drill 1.17> SELECT * FROM (SELECT a, b, any_value(c) FROM dfs.TEST.`example_any_pqt` GROUP BY a, b); +-----+-----+---------------+ | a | b | EXPR$2 | +-----+-----+---------------+ | foo | bar | ["foo","bar"] | +-----+-----+---------------+ apache drill 1.17> CREATE TABLE dfs.TEST.`ok_pqt` AS (SELECT * FROM (SELECT a, b, any_value(c) FROM dfs.TEST.`example_any_pqt` GROUP BY a, b)); +----------+---------------------------+ | Fragment | Number of records written | +----------+---------------------------+ | 0_0 | 1 | +----------+---------------------------+ apache drill 1.17> SELECT * FROM dfs.TEST.`ok_pqt` ORDER BY a; +-----+-----+---------------+ | a | b | EXPR$2 | +-----+-----+---------------+ | foo | bar | ["foo","bar"] | +-----+-----+---------------+ {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)