[
https://issues.apache.org/jira/browse/DRILL-7322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16882609#comment-16882609
]
Paul Rogers commented on DRILL-7322:
------------------------------------
[~arina], agree we should unify conversion. I would suggest creating a common
{{stringToBoolean()}} conversion routine in {{Types}}, then calling it from
both the cast function and the conversion class.
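A minimal sketch of what such a shared routine might look like, not the actual Drill code. The class name {{BooleanConversionSketch}} is hypothetical, the accepted literals are the twelve that convert as expected in the issue's test file, and trimming plus case-insensitive matching are assumptions. Rejecting unknown strings (as the cast does today) rather than mapping them to false (as the schema conversion does today) is also an assumption; which behavior should win is still to be decided.

```java
import java.util.Locale;

// Hypothetical holder class; the comment above proposes putting the routine in Types.
final class BooleanConversionSketch {

  private BooleanConversionSketch() { }

  // Converts a string to a boolean using one shared set of literals.
  // Assumption: input is trimmed and matched case-insensitively.
  static boolean stringToBoolean(String value) {
    if (value == null) {
      throw new IllegalArgumentException("Invalid value for boolean: null");
    }
    String normalized = value.trim().toLowerCase(Locale.ROOT);
    switch (normalized) {
      case "true":
      case "1":
      case "t":
      case "y":
      case "yes":
      case "on":
        return true;
      case "false":
      case "0":
      case "f":
      case "n":
      case "no":
      case "off":
        return false;
      default:
        // Assumption: fail loudly, mirroring the message the cast reports in this issue.
        throw new IllegalArgumentException("Invalid value for boolean: " + value);
    }
  }
}
```

Both the cast function and the conversion class would then delegate to {{stringToBoolean()}}, so any change to the literal list happens in one place.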
More broadly, I do worry that the conversion classes don't implement exactly
the same rules as the cast functions. Didn't have time to solve that earlier.
Might want to use the same solution for the other conversions, such as
string-to-number. Especially for float/double, we need to ensure we handle NaN
and Inf values.
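To illustrate the float/double concern, here is a hedged sketch of a shared string-to-double routine. The class and method names are hypothetical, and the accepted spellings of the special values ("NaN", "Inf", "Infinity", with optional sign, case-insensitive) are assumptions, not rules Drill is known to follow. Plain {{Double.parseDouble()}} accepts only the exact spellings "NaN" and "Infinity", which is why the special cases are handled first.

```java
import java.util.Locale;

// Hypothetical holder class for a shared string-to-double conversion.
final class DoubleConversionSketch {

  private DoubleConversionSketch() { }

  // Converts a string to a double, recognizing NaN and infinity spellings
  // before delegating ordinary numbers to Double.parseDouble().
  static double stringToDouble(String value) {
    if (value == null) {
      throw new IllegalArgumentException("Invalid value for double: null");
    }
    String trimmed = value.trim();
    switch (trimmed.toLowerCase(Locale.ROOT)) {
      case "nan":
        return Double.NaN;
      case "inf":
      case "+inf":
      case "infinity":
      case "+infinity":
        return Double.POSITIVE_INFINITY;
      case "-inf":
      case "-infinity":
        return Double.NEGATIVE_INFINITY;
      default:
        // Throws NumberFormatException (an IllegalArgumentException) on malformed input.
        return Double.parseDouble(trimmed);
    }
  }
}
```

As with the Boolean case, the point is that the cast function and the conversion class would share one routine, so the special values are accepted or rejected the same way in both paths.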
Further, we should make sure that the date/time/period formats used by the
schema provisioning feature are the same as those used in the {{TO_DATE()}}
function.
Finally, we should document the conversion rules in detail. Just checked the
docs: they don't describe, say, all the Boolean formats we support.
> Align cast boolean and schema boolean conversion
> ------------------------------------------------
>
> Key: DRILL-7322
> URL: https://issues.apache.org/jira/browse/DRILL-7322
> Project: Apache Drill
> Issue Type: Bug
> Affects Versions: 1.16.0
> Reporter: Denys Ordynskiy
> Priority: Major
>
> The information schema file allows converting any string to the boolean data type,
> but a "cast(.. as boolean)" statement throws an error:
> {color:#d04437}UserRemoteException : SYSTEM ERROR: IllegalArgumentException:
> Invalid value for boolean: a
> org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR:
> IllegalArgumentException: Invalid value for boolean: a{color}
> *The Information Schema file should allow the same range of boolean
> literals as the cast statement.*
> *Steps to reproduce:*
> Upload text file all_types.csvh to the DFS /tmp/ischema/all_types:
> {noformat}
> boolean_col,boolean_col_for_cast
> true,true
> 1,1
> t,t
> y,y
> yes,yes
> on,on
> false,false
> 0,0
> f,f
> n,n
> no,no
> off,off
> a,a
> -,-
> !,!
> `,`
> 7,7
> @,@
> ^,^
> *,*
> {noformat}
> *Create schema:*
> {noformat}
> create schema (boolean_col boolean, boolean_col_for_cast varchar) for table
> dfs.tmp.`ischema/all_types`
> {noformat}
> *Run the query without cast:*
> {noformat}
> select boolean_col, sqlTypeOf(boolean_col) boolean_col_type,
> boolean_col_for_cast, sqlTypeOf(boolean_col_for_cast)
> boolean_col_for_cast_type from dfs.tmp.`ischema/all_types`
> {noformat}
> |boolean_col|boolean_col_type|boolean_col_for_cast|boolean_col_for_cast_type|
> |true|BOOLEAN|true|CHARACTER VARYING|
> |true|BOOLEAN|1|CHARACTER VARYING|
> |true|BOOLEAN|t|CHARACTER VARYING|
> |true|BOOLEAN|y|CHARACTER VARYING|
> |true|BOOLEAN|yes|CHARACTER VARYING|
> |true|BOOLEAN|on|CHARACTER VARYING|
> |false|BOOLEAN|false|CHARACTER VARYING|
> |false|BOOLEAN|0|CHARACTER VARYING|
> |false|BOOLEAN|f|CHARACTER VARYING|
> |false|BOOLEAN|n|CHARACTER VARYING|
> |false|BOOLEAN|no|CHARACTER VARYING|
> |false|BOOLEAN|off|CHARACTER VARYING|
> |false|BOOLEAN|a|CHARACTER VARYING|
> |false|BOOLEAN|-|CHARACTER VARYING|
> |false|BOOLEAN|!|CHARACTER VARYING|
> |false|BOOLEAN|`|CHARACTER VARYING|
> |false|BOOLEAN|7|CHARACTER VARYING|
> |false|BOOLEAN|@|CHARACTER VARYING|
> |false|BOOLEAN|^|CHARACTER VARYING|
> |false|BOOLEAN|*|CHARACTER VARYING|
> *Run the query with cast:*
> {noformat}
> select boolean_col, sqlTypeOf(boolean_col) boolean_col_type,
> cast(boolean_col_for_cast as boolean) boolean_col_for_cast,
> sqlTypeOf(cast(boolean_col_for_cast as boolean)) boolean_col_for_cast_type
> from dfs.tmp.`ischema/all_types`
> {noformat}
> {color:#d04437}UserRemoteException : SYSTEM ERROR: IllegalArgumentException:
> Invalid value for boolean: a
>
> org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR:
> IllegalArgumentException: Invalid value for boolean: *a*
> Fragment 0:0
> Please, refer to logs for more information.
> [Error Id: b9deab6f-7fd4-40c0-acdf-b2e31747e16f on cv1:31010]{color}