[ 
https://issues.apache.org/jira/browse/DRILL-7499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Rogers reassigned DRILL-7499:
----------------------------------

    Assignee: Paul Rogers

> sqltypeof() function with an array returns "ARRAY", not type
> ------------------------------------------------------------
>
>                 Key: DRILL-7499
>                 URL: https://issues.apache.org/jira/browse/DRILL-7499
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 1.16.0
>            Reporter: Paul Rogers
>            Assignee: Paul Rogers
>            Priority: Minor
>
> The {{sqltypeof()}} function was introduced in Drill 1.14 to work around 
> limitations of the original {{typeof()}} function. The function is mentioned 
> in _Learning Apache Drill_, Chapter 8, page 152:
> {noformat}
> ELECT sqlTypeOf(columns) AS cols_type,
>        modeOf(columns) AS cols_mode
> FROM `csv/cust.csv` LIMIT 1;
> +--------------------+------------+
> |     cols_type      | cols_mode  |
> +--------------------+------------+
> | CHARACTER VARYING  | ARRAY      |
> +--------------------+------------+
> {noformat}
> When the same query is run against the just-released Drill 1.17, we get the 
> *wrong* results:
> {noformat}
> +-----------+-----------+
> | cols_type | cols_mode |
> +-----------+-----------+
> | ARRAY     | ARRAY     |
> +-----------+-----------+
> {noformat}
> The definition of {{sqlTypeOf()}} is that it should return the type portion 
> of the columns (type, mode) major type. Clearly, it is no longer doing so for 
> arrays. As a result, there is no function to obtain the data type for arrays.
> The problem also shows up in the query from page 158:
> {code:sql}
> SELECT a, b,
>        sqlTypeOf(b) AS b_type, modeof(b) AS b_mode
> FROM `gen/70kmissing.json`
> WHERE mod(a, 70000) = 1;
> {code}
> Expected (table from the book with Drill 1.14 results):
> {noformat}
> +--------+-------+----------+-----------+
> |   a    |   b   |  b_type  |  b_mode   |
> +--------+-------+----------+-----------+
> | 1      | null  | INTEGER  | NULLABLE  |
> +--------+-------+----------+-----------+
> {noformat}
> Actual Drill 1.17 results:
> {noformat}
> +-------+-----------+-------------------+----------+
> |   a   |     b     |      b_type       |  b_mode  |
> +-------+-----------+-------------------+----------+
> | 1     | null      | NULL              | NULLABLE |
> +-------+-----------+-------------------+----------+
> {noformat}
> (Second line of table is omitted because something else changed, not relevant 
> to this ticket.)
> The above might not actually be a bug, however if someone has changed the 
> type of missing columns from the old {{INT}} to a newer (untyped) {{NULL}}. 
> But, an indirect test suggests that the column is still `INT` and the 
> function is wrong:
> {code:sql}
> SELECT a, b
> FROM `gen/70kdouble.json`
> WHERE b IS NOT NULL ORDER BY a;
> {code}
> Data:
> {noformat}
> {a: 1}
> ...
> {a: 69999}
> {a: 70001, b: 10.5}
> {noformat}
> Error:
> {noformat}
> Error: UNSUPPORTED_OPERATION ERROR: Schema changes not supported in External 
> Sort. Please enable Union type.
> Previous schema BatchSchema [fields=[[`a` (BIGINT:OPTIONAL)], [`b` 
> (INT:OPTIONAL)]], selectionVector=NONE]
> Incoming schema BatchSchema [fields=[[`a` (BIGINT:OPTIONAL)], [`b` 
> (FLOAT8:OPTIONAL)]], selectionVector=NONE]
> {noformat}
> Oddly, however, the query on page 160 works as expected:
> {code:sql}
> SELECT sqlTypeOf(a) AS a_type, modeOf(a) AS a_mode 
> FROM `json/all-null.json` LIMIT 1;
> {code}
> {noformat}
> +---------+----------+
> | a_type  |  a_mode  |
> +---------+----------+
> | INTEGER | NULLABLE |
> +---------+----------+
> {noformat}
>  Someone will have to do some investigating to understand the current 
> behaviour.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to