[ 
https://issues.apache.org/jira/browse/DRILL-7426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16961257#comment-16961257
 ] 

Paul Rogers commented on DRILL-7426:
------------------------------------

As it turns out, this is a known limitation of Drill. Drill is a relational 
engine, designed to serve relational clients such as JDBC and ODBC. Although 
Drill has a Union data type, that type remains experimental and not fully 
supported.

At present, it seems that the Union type can be passed through the scan 
operator to a SqlLine client, where it is converted to a string for display, as 
shown in your example. However, it is not supported by most other operators, 
resulting in the failure you reported.

The fundamental problem is that it is not clear how the Union type should work 
with clients (JDBC, ODBC) that require a traditional relational schema. Drill 
does not support extended SQL syntax (such as SQL++), just traditional 
relational SQL.

We have seen cases in which JSON authors use arrays as a compact representation 
of a tuple:

{noformat}
[ 10, "fred", "flintstone", "male", 12.34 ]
{noformat}

Is this the case with your example, in which the inner list appears to contain 
both a string and an array?

At present, Drill has no way to map such a tuple into a relational structure. 
One could imagine converting the array into, say, a Map with field names 
defined somehow.
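
To make that idea concrete, here is a minimal sketch in Python (not Drill code) 
of pairing a positional array with field names. The names in FIELD_NAMES and 
the function tuple_to_map are hypothetical, invented for illustration only; 
Drill offers no such mapping today, and the JSON file itself carries no names 
for the positions.

```python
# Hypothetical field names: a positional JSON tuple carries none of its own,
# so they would have to be supplied from somewhere outside the data.
FIELD_NAMES = ["id", "first_name", "last_name", "gender", "amount"]


def tuple_to_map(values, field_names=FIELD_NAMES):
    """Pair each positional value with a name, producing a map (dict)."""
    if len(values) != len(field_names):
        raise ValueError(
            f"expected {len(field_names)} values, got {len(values)}"
        )
    return dict(zip(field_names, values))


row = tuple_to_map([10, "fred", "flintstone", "male", 12.34])
print(row)
```

Once each position has a name, every field has a single type, which is the 
shape a relational client expects. The open question remains where the names 
would come from.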

Here, "all text mode" (the store.json.all_text_mode option) will not help, 
because that mode resolves only string/number conflicts, not array/string 
conflicts.
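
The distinction can be illustrated with a simplified model in Python. This is 
an assumption-laden sketch, not Drill's implementation: the functions 
read_as_text and element_kinds are hypothetical, and nulls and nested objects 
are ignored. Reading every scalar as a string removes a string/number conflict, 
but a list is still a list afterwards.

```python
def read_as_text(value):
    """Simplified model of an all-text read: scalars become strings,
    lists keep their structure (their scalar members become strings)."""
    if isinstance(value, list):
        return [read_as_text(v) for v in value]
    return str(value)


def element_kinds(values):
    """Return the set of element kinds found directly inside a list."""
    return {"LIST" if isinstance(v, list) else "VARCHAR" for v in values}


# String/number conflict: every element ends up as a string.
print(sorted(element_kinds(read_as_text([10, "fred", 12.34]))))

# Array/string conflict: the list still mixes a string and a list.
print(sorted(element_kinds(read_as_text(["LOAD", []]))))
```

In this model the first list ends up with a single element kind, while the 
second still has two, which is the situation the error message describes.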

> Json support lists of different types
> -------------------------------------
>
>                 Key: DRILL-7426
>                 URL: https://issues.apache.org/jira/browse/DRILL-7426
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Documentation
>    Affects Versions: 1.16.0
>            Reporter: benj
>            Priority: Trivial
>
> With a file.json like
> {code:json}
> {
>     "name": "toto",
>     "info": [["LOAD", []]],
>     "response": 1
> }
> {code}
> A simple SELECT gives an error
> {code:sql}
> apache drill> SELECT * FROM dfs.test.`file.json`;
> Error: UNSUPPORTED_OPERATION ERROR: In a list of type VARCHAR, encountered a 
> value of type LIST. Drill does not support lists of different types.
> {code}
> But there is an option _exec.enable_union_type_ that allows this request
> {code:sql}
> apache drill> ALTER SESSION SET `exec.enable_union_type` = true;
> apache drill> SELECT * FROM dfs.test.`file.json`;
> +------+---------------+----------+
> | name |     info      | response |
> +------+---------------+----------+
> | toto | [["LOAD",[]]] | 1        |
> +------+---------------+----------+
> 1 row selected (0.283 seconds)
> {code}
> This option is not easy to discover, so it would be useful for the error 
> message to mention that it can be set.
> {noformat}
> Error: UNSUPPORTED_OPERATION ERROR: In a list of type VARCHAR, encountered a 
> value of type LIST. Drill does not support lists of different types. .... SET 
> the option 'exec.enable_union_type' to true and try again;
> {noformat}
> Other error messages already follow this pattern, for example:
> {noformat}
> ...
> Error: UNSUPPORTED_OPERATION ERROR: This query cannot be planned possibly due 
> to either a cartesian join or an inequality join. 
> If a cartesian or inequality join is used intentionally, set the option 
> 'planner.enable_nljoin_for_scalar_only' to false and try again.
> {noformat}



