[
https://issues.apache.org/jira/browse/DERBY-6566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Knut Anders Hatlen updated DERBY-6566:
--------------------------------------
Attachment: d6566-1a.diff
Attaching [^d6566-1a.diff], which simplifies the handling of untyped NULLs in
CASE and NULLIF expressions. The patch removes 245 lines more from the engine
code than it adds to it. It is primarily ConditionalNode that's reduced in size.
First I'd like to point out that there is one behaviour change introduced by
the patch. In Derby versions from 10.3 up to current, CASE expressions where
one branch has a typed NULL of a type not compatible with the expression in the
other branch, such as {{CASE WHEN 1=1 THEN 'abc' ELSE CAST(NULL AS SMALLINT)
END}}, are silently rewritten during the bind phase so that {{CAST(NULL AS
SMALLINT)}} becomes {{CAST(NULL AS CHAR(3))}}. The THEN branch and the ELSE
branch are believed to have compatible types because of this rewrite, and the
expression is successfully evaluated.
With the patch, this rewrite no longer happens, and the behaviour is reverted
to what it was in 10.2 and earlier. Such an expression will now cause the
following error to be raised:
{noformat}
ERROR 42X89: Types 'CHAR' and 'SMALLINT' are not type compatible. Neither type
is assignable to the other type.
{noformat}
I believe 10.3 and higher accept such expressions because of an unintended
fallout caused by the fix for DERBY-1620. That fix was supposed to make untyped
NULLs get their type from the context, but ended up changing the type of
already typed NULLs as well.
Here's a description of what the patch does:
1) It makes CASE expression represent untyped NULLs as
UntypedNullConstantNodes, which is the same as NULLIF already does. This allows
ConditionalNode to handle CASE and NULLIF exactly the same way.
2) It changes how the parser builds the AST so that a single CASE expression is
now represented by a single ConditionalNode.
Previously, an expression such as {{CASE WHEN a THEN b WHEN c THEN d ELSE e
END}} would be represented by two nested ConditionalNodes, as if the expression
had actually been {{CASE WHEN a THEN b ELSE (CASE WHEN c THEN d ELSE e END)
END}}. Those two expressions aren't always completely equivalent.
Take for example {{CASE WHEN a THEN 1 WHEN b THEN NULL ELSE NULL END}}, which
would be represented as {{CASE WHEN a THEN 1 ELSE (CASE WHEN b THEN NULL ELSE
NULL END) END}}. The inner conditional node in the rewritten expression is
{{CASE WHEN b THEN NULL ELSE NULL}}, which has no information about what the
return type is. Because of this, the current code needs some extra complexity
to transfer type information from the outer ConditionalNode to the inner
ConditionalNode. The patch is able to remove much of this complexity because it
has information about all the branches of the CASE expression in one
ConditionalNode.
Another benefit from this change, is that the AST is not so deeply nested,
which reduces the risk of getting a StackOverflowError during compilation of
CASE expressions with lots of WHEN clauses.
3) It changes some of the CASE expressions in the ODBC metadata queries. Some
CASTs to the incorrect type had sneaked into the queries, and the previously
described behaviour change revealed those bugs.
4) Add a test case to verify that some edge cases still work as before.
All the regression tests passed with the patch.
> Simplify handling of untyped nulls in CASE and NULLIF expressions
> -----------------------------------------------------------------
>
> Key: DERBY-6566
> URL: https://issues.apache.org/jira/browse/DERBY-6566
> Project: Derby
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 11.0.0.0
> Reporter: Knut Anders Hatlen
> Assignee: Knut Anders Hatlen
> Priority: Minor
> Attachments: d6566-1a.diff
>
>
> The parser translates both CASE and NULLIF expressions into ConditionalNodes,
> but it represents untyped NULLs differently in the two cases.
> In a CASE expression, any branch that is an untyped NULL, is translated into
> an UntypedNullConstantNode that's wrapped in a CastNode that casts the value
> to CHAR(1). The CastNode is replaced with a cast to the correct type during
> the bind phase.
> A NULLIF expression is turned into a CASE expression that has a THEN NULL
> clause. The parser simply creates an UntypedNullConstantNode for that clause,
> without wrapping it in a CastNode. A CastNode is instead added during the
> bind phase.
> This slight difference in how NULLs are represented by the parser in the two
> cases, means that ConditionalNode needs to handle the two cases differently
> during the bind phase. It would be better if the parser generated NULLs in
> the same way for the two cases, so that ConditionalNode didn't need to know
> if it was generated for a CASE expression or a NULLIF expression.
--
This message was sent by Atlassian JIRA
(v6.2#6252)