[
https://issues.apache.org/jira/browse/DRILL-7426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16961802#comment-16961802
]
benj commented on DRILL-7426:
-----------------------------
In my particular case the schema is difficult to predict, but as [~cgivre] says, an
option to get/force values as strings would be great.
But what is particularly surprising in this case is the following:
{noformat}
apache drill> ALTER SESSION SET `store.json.all_text_mode` = true;
apache drill> ALTER SESSION SET `exec.enable_union_type` = true;
/* I) doesn't work with a simple array */
{ "name": "toto",
"info": ["LOAD", 5, [] ],
"response": 1 }
apache drill> SELECT * FROM dfs.test.`file.json` LIMIT 1;
Error: SYSTEM ERROR: SchemaChangeRuntimeException: Inner vector type mismatch.
Requested type: [minor_type: VARCHAR
mode: OPTIONAL
], actual type: [minor_type: UNION
mode: OPTIONAL
sub_type: VARCHAR
sub_type: LIST
]
/* II) but works with an array of arrays */
{ "name": "toto",
"info": [ ["LOAD", 5, [] ] ],
"response": 1 }
apache drill> SELECT * FROM dfs.test.`file.json` LIMIT 1;
+------+-------------------+----------+
| name | info              | response |
+------+-------------------+----------+
| toto | [["LOAD","5",[]]] | 1        |
+------+-------------------+----------+
1 row selected (0.133 seconds)
/* III) and it also works when accessing the first element of the array of arrays
(info[0]), which seems to be the same as the array in case (I) */
apache drill> SELECT *, info[0], info[0][0], info[0][1], info[0][2] FROM dfs.test.`file.json` LIMIT 1;
+------+----------------------+----------+-----------------+--------+--------+--------+
| name | info                 | response | EXPR$1          | EXPR$2 | EXPR$3 | EXPR$4 |
+------+----------------------+----------+-----------------+--------+--------+--------+
| toto | [["LOAD","5",[]],[]] | 1        | ["LOAD","5",[]] | LOAD   | 5      | []     |
+------+----------------------+----------+-----------------+--------+--------+--------+
1 row selected (0.185 seconds)
{noformat}
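Until an option to force values to strings exists, the scalars could be converted before Drill reads the file. The following is only a minimal sketch in Python, assuming one JSON object per record; the helper name `stringify_scalars` is hypothetical and not part of Drill. It only normalizes scalars: the array still mixes strings and a nested list, so it does not by itself remove the need for `exec.enable_union_type`.

```python
import json


def stringify_scalars(value):
    """Recursively turn every non-null scalar into a string.

    Objects and arrays keep their structure; null stays null.
    Hypothetical helper for illustration, not a Drill API.
    """
    if isinstance(value, dict):
        return {key: stringify_scalars(item) for key, item in value.items()}
    if isinstance(value, list):
        return [stringify_scalars(item) for item in value]
    if value is None:
        return None
    if isinstance(value, bool):
        # Keep the JSON spelling of booleans rather than Python's "True"/"False".
        return "true" if value else "false"
    return str(value)


# The record from case (I) above.
record = json.loads('{"name": "toto", "info": ["LOAD", 5, []], "response": 1}')
converted = stringify_scalars(record)
print(json.dumps(converted))
# prints {"name": "toto", "info": ["LOAD", "5", []], "response": "1"}
```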
> Json support lists of different types
> -------------------------------------
>
> Key: DRILL-7426
> URL: https://issues.apache.org/jira/browse/DRILL-7426
> Project: Apache Drill
> Issue Type: Improvement
> Components: Documentation
> Affects Versions: 1.16.0
> Reporter: benj
> Priority: Trivial
>
> With a file.json like
> {code:json}
> {
> "name": "toto",
> "info": [["LOAD", []]],
> "response": 1
> }
> {code}
> A simple SELECT gives an error
> {code:sql}
> apache drill> SELECT * FROM dfs.test.`file.json`;
> Error: UNSUPPORTED_OPERATION ERROR: In a list of type VARCHAR, encountered a
> value of type LIST. Drill does not support lists of different types.
> {code}
> But there is an option _exec.enable_union_type_ that allows this request
> {code:sql}
> apache drill> ALTER SESSION SET `exec.enable_union_type` = true;
> apache drill> SELECT * FROM dfs.test.`file.json`;
> +------+---------------+----------+
> | name | info          | response |
> +------+---------------+----------+
> | toto | [["LOAD",[]]] | 1        |
> +------+---------------+----------+
> 1 row selected (0.283 seconds)
> {code}
> The existence of this option is not obvious, so it would be useful for the
> error message to mention the possibility of setting it.
> {noformat}
> Error: UNSUPPORTED_OPERATION ERROR: In a list of type VARCHAR, encountered a
> value of type LIST. Drill does not support lists of different types. .... SET
> the option 'exec.enable_union_type' to true and try again;
> {noformat}
> Other errors already do this, for example:
> {noformat}
> ...
> Error: UNSUPPORTED_OPERATION ERROR: This query cannot be planned possibly due
> to either a cartesian join or an inequality join.
> If a cartesian or inequality join is used intentionally, set the option
> 'planner.enable_nljoin_for_scalar_only' to false and try again.
> {noformat}