[
https://issues.apache.org/jira/browse/DRILL-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Adam Gilmore updated DRILL-3150:
--------------------------------
Attachment: DRILL-3150.1.patch.txt
Here's a proposed patch for the issue. The problem is that materializing the
LogicalExpression adds implicit casts to parts of it. For the query in the
issue description, it tries to cast the string "test" to an int (INT OPTIONAL
seems to be the default type when a field is not found in the schema of the
table). The cast fails, hence the exception.
I've added generated "tryCast" functions for certain types. These return a
NullableTypeHolder with isSet = 0 when the cast fails, which gets around the
issue.
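To illustrate the idea, here is a minimal sketch of a try-cast from a string to
a nullable int. It is not the generated code from the patch: the class names
NullableIntSketch and TryCastSketch and the method tryCastToInt are hypothetical
stand-ins, and only the "isSet = 0 on failure" behaviour is taken from the
description above.

```java
// Hypothetical stand-in for a nullable INT holder (not Drill's actual class).
final class NullableIntSketch {
    int isSet; // 1 when a value is present, 0 when the value is null
    int value; // only meaningful when isSet == 1
}

// Hypothetical helper showing the "catch the failure, return null" approach.
final class TryCastSketch {
    // Returns a holder with isSet = 0 instead of throwing when the cast fails.
    static NullableIntSketch tryCastToInt(String text) {
        NullableIntSketch out = new NullableIntSketch();
        if (text == null) {
            return out; // null input stays null (isSet defaults to 0)
        }
        try {
            out.value = Integer.parseInt(text.trim());
            out.isSet = 1;
        } catch (NumberFormatException e) {
            out.isSet = 0; // cast failed: report null rather than an error
        }
        return out;
    }
}
```

With this shape, tryCastToInt("test") yields a holder with isSet = 0, so a
comparison against it can treat the value as null and reject the row instead of
failing the query.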
This also affected an old bug, DRILL-2590, in which a cast failure was expected
to occur. That failure no longer occurs because I've implemented casting a BIT
to INT or BIGINT (true = 1, false = 0).
Finally, tryCast is currently as simple as catching the exception. We would
probably want to optimize that so it doesn't actually throw and catch
exceptions.
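One possible direction for that optimization, sketched under assumptions: parse
the digits by hand and signal failure through the holder, so no exception is
ever created. The names NullableIntResult, NoThrowCastSketch, and tryParseInt
are hypothetical and are not part of the patch or of Drill.

```java
// Hypothetical nullable INT result (not Drill's actual holder class).
final class NullableIntResult {
    int isSet; // 1 when a value is present, 0 when the cast failed
    int value; // only meaningful when isSet == 1
}

// Hypothetical exception-free variant of a string-to-int try-cast.
final class NoThrowCastSketch {
    static NullableIntResult tryParseInt(CharSequence text) {
        NullableIntResult out = new NullableIntResult(); // isSet defaults to 0
        if (text == null || text.length() == 0) {
            return out;
        }
        int i = 0;
        boolean negative = false;
        char first = text.charAt(0);
        if (first == '-' || first == '+') {
            negative = (first == '-');
            i = 1;
            if (text.length() == 1) {
                return out; // a lone sign is not a number
            }
        }
        long acc = 0;
        for (; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c < '0' || c > '9') {
                return out; // non-digit: fail without throwing
            }
            acc = acc * 10 + (c - '0');
            if (acc > (long) Integer.MAX_VALUE + 1) {
                return out; // magnitude too large for any int
            }
        }
        long signed = negative ? -acc : acc;
        if (signed > Integer.MAX_VALUE) {
            return out; // positive overflow
        }
        out.value = (int) signed;
        out.isSet = 1;
        return out;
    }
}
```

This sketch accepts only an optional sign followed by ASCII digits; whitespace
handling and other numeric formats are left out deliberately.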
These are quite significant changes, so I welcome feedback/thoughts.
The only other idea I had was to set the default missing field type to, say,
VARCHAR OPTIONAL, but then this would still cause failures when there is a
valid integer field (and you try to compare it to a non-numeric string).
> Error when filtering non-existent field with a string
> -----------------------------------------------------
>
> Key: DRILL-3150
> URL: https://issues.apache.org/jira/browse/DRILL-3150
> Project: Apache Drill
> Issue Type: Bug
> Components: Execution - Relational Operators
> Affects Versions: 1.0.0
> Reporter: Adam Gilmore
> Assignee: Adam Gilmore
> Priority: Critical
> Fix For: 1.1.0
>
> Attachments: DRILL-3150.1.patch.txt
>
>
> The following query throws an exception:
> {code}
> select count(*) from cp.`employee.json` where `blah` = 'test'
> {code}
> "blah" does not exist as a field in the JSON. The expected behaviour would
> be to filter out all rows as that field is not present (thus cannot equal the
> string 'test').
> Instead, the following exception occurs:
> {code}
> org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: test
> Fragment 0:0
> [Error Id: 5d6c9a82-8f87-41b2-a496-67b360302b76 on
> ip-10-1-50-208.ec2.internal:31010]
> {code}
> Apart from the fact that the real error message is hidden, the issue is that we're
> trying to cast the varchar to int ('test' to an int). This seems to be
> because the projection out of the scan when a field is not found becomes
> INT:OPTIONAL.
> The filter should not fail on this - if the varchar fails to convert to an
> int, the filter should simply not allow any records through.