[ 
https://issues.apache.org/jira/browse/DERBY-2256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12484514
 ] 

A B commented on DERBY-2256:
----------------------------

> Did you consider having this query return an error? If the user provides a
> floating point value for an integer column, wouldn't they prefer to have an
> error message, rather than have the database quietly return zero rows?

Well first I should point out that "quietly return[ing] zero rows" is not the 
correct behavior in the example you gave.  That's what Derby currently does 
(before the patch for this issue) but that is not correct.  The correct 
behavior is to return a single row matching the "4".

Given that, the question becomes "Did you consider having this query return an 
error?"  And this is a great question--thank you for bringing it up.

The short answer is No, I didn't consider throwing an error--but that was just 
because it didn't occur to me ;)  The longer and more relevant answer is that, 
after looking at the SQL 2003 spec for IN lists, my own reading is that we 
should not throw an error in the example that you mentioned.  Here's 
why...(feel free to skip if you're not interested):

----

>From SQL 2003 spec, 8.4 <in predicate> grammar shows:

  <in predicate> ::= <row value predicand> <in predicate part 2>
  <in predicate part 2> ::= [ NOT ] IN <in predicate value>
  <in value list> ::= <row value expression> [ { <comma> <row value expression> 
}... ]
  <in predicate value> ::=
       <table subquery>
       | <left paren> <in value list> <right paren>

Note that <in value list> in this grammar does not include parentheses. Now if 
we look at syntax Rule #2 in the same section, we see:

<begin quote>

  Let IVL be an <in value list>.

    ( IVL )

  is equivalent to the <table value constructor>:

    ( VALUES IVL )

<end quote>

So if our example query is:

  select * from t1 where i in (4.23, 4);

then <in value list> is simply "4.23, 4", and thus

    ( IVL )  ==> ( 4.23, 4 )

  is equivalent to the <table value constructor>:

    ( VALUES 4.23, 4 )

This particular VALUES clause is legal in Derby--i.e. we do not throw an error:

ij> values 4.23, 4;
1
---------------
4.23
4.00

2 rows selected

The next question is whether or not Derby *should* throw an error here.  The 
specs for a VALUES clause are given in section 7.3 as "<table value 
constructor>" and include the following:

 Section 7.3, Syntax Rule #4:

  The row type of TVC is determined by applying Subclause 9.3, "Data types of 
results
  of aggregations", to the row types [of the values in the list].

If we go on to look at subclause 9.3 we see the following rule:

 Section 9.3, Syntax Rule #2:

  All of the data types in DTS shall be comparable.

In our example we have an integer column and a decimal value, and those two 
types are indeed comparable with each other.  So we satisfy rule #2 and 
therefore should not throw an error here.

That said, we now go back to section 8.4 (IN predicate) and look at syntax 
rules #3 and #5, where we see the following:

<begin quote>

 3) Let RVC be the <row value predicand> and let IPV be the <in predicate 
value>.

 5) The expression

     RVC IN IPV

    is equivalent to

     RVC = ANY IPV

<end quote>

In our example RVC is the column "i" and IPV is "(4.23, 4)".  So then we have:

    i IN (4.23, 4)

  is equivalent to

    i = ANY (4.23, 4)

Then if we use syntax rule #2 again, we end up with:

    i = ANY (VALUES 4.23, 4)

Putting that back into our original query we now have:

   select * from t1 where i = any (values 4.23, 4);

This latter query executes without error in Derby and returns 1 row (even 
before the patch for this issue is applied):

ij> select * from t1 where i = any (values 4.23, 4);
I
-----------
4

1 row selected

Since the type of the VALUES clause is decimal (section 9.3, rule #3), we are 
doing a comparison of an integer with a decimal.  Such a comparison should not 
throw an error and should only return true if the values are algebraically 
equal.  Thus the above query is correctly returning a single row--and that's 
what the equivalent IN list query should be doing, too.

----

Putting all of that together, my conclusion is that we should not throw an 
error if if the user specifies a decimal or floating point value for an integer 
column.  Whether or not the user would "prefer" an error I can't say, but 
standards-wise I think the correct thing is to compare using the dominant type 
and only return the rows that match.

Feel free to comment if I've overlooked something or otherwise misread the 
spec.  Spec-reading is *not* one of my gifts...

> Wrong Results: Use of decimal values in an IN-list with INTEGER left operand 
> can lead to extra rows.
> ----------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-2256
>                 URL: https://issues.apache.org/jira/browse/DERBY-2256
>             Project: Derby
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 10.0.2.0, 10.0.2.1, 10.0.2.2, 10.1.1.0, 10.1.2.1, 
> 10.1.3.1, 10.1.3.2, 10.1.4.0, 10.2.1.6, 10.2.2.0, 10.2.2.1, 10.2.3.0, 10.3.0.0
>            Reporter: A B
>         Assigned To: A B
>         Attachments: d2256_v1.patch, d2256_v1.stat
>
>
> While trying out some code changes for DERBY-47 I was running a few test 
> cases and happened to notice that there are a couple of cases in which Derby 
> behaves incorrectly (that or my understanding of what should be happening 
> here is way off).
> First and most simply: the following query should return zero rows (unless 
> I'm missing something?), but it returns one:
> ij> create table t1 (i int);
> 0 rows inserted/updated/deleted
> ij> insert into t1 values 1, 2, 3, 4, 5;
> 5 rows inserted/updated/deleted
> -- Correct returns zero rows.
> ij> select * from t1 where i in (4.23);
> I
> -----------
> 0 rows selected
> -- But this one returns 1 row...
> ij> select * from t1 where i in (2.8, 4.23);
> I
> -----------
> 4
> 1 row selected
> Secondly, if the IN-list contains a non-constant value then Derby can 
> incorrectly return rows that do not match the IN predicate.  I think this is 
> because some internal casting is happening when it shouldn't?
> ij> create table t1 (i int);
> 0 rows inserted/updated/deleted
> ij> insert into t1 values 1, 2, 3, 4, 5;
> 5 rows inserted/updated/deleted
> -- Following values clause returns "2.80", as expected.
> ij> values cast (2.8 as decimal(4, 2));
> 1
> -------
> 2.80
> 1 row selected
> -- But if we use it in an IN-list it gets cast to "2" and thus returns a 
> match.
> -- We get 2 rows when we should get NONE.
> ij> select * from t1 where i in (cast (2.8 as decimal(4, 2)), 4.23);
> I
> -----------
> 2
> 4
> 2 rows selected
> I confirmed that we see these results on trunk, 10.2, 10.1, and even as far 
> back as svn #201660 for 10.0.  I also ran the above statements on DB2 v8 as a 
> sanity check to confirm that NO results were returned there.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply via email to