I tried to reproduce this. I ran the query

with t (id, v1, v2) as (
select * from (values
(0, cast('null value' as varchar(11)), cast(null as varchar(4))),
(1, 'a', 'aa'),
(2, 'noBlank', ''),
(3, 'oneBlank', ' '),
(4, 'ltrOneBlank', 'aa ')))
SELECT * FROM t WHERE v2 = ''

and Calcite and Postgres both return

 id |   v1    | v2
----+---------+----
  2 | noBlank |
(1 row)

The single space literal ' ' is of type char(1) and then becomes a
varchar(4) value consisting of one space. It does not match the WHERE
clause condition.

More evidence from Postgres that ' ' does not get trimmed when it is
converted to varchar:

foodmart=> values (length(cast(' ' as varchar(4))));
 column1
---------
       1
(1 row)

foodmart=> values (length(' '));
 column1
---------
       1
(1 row)

foodmart=> values (length(cast(' ' as char(4))));
 column1
---------
       0
(1 row)

I think Postgres' behavior is correct, and Calcite seems to be in agreement
with Postgres.

Julian



On Thu, Jun 18, 2015 at 11:04 PM, Yuri Au Yong <[email protected]>
wrote:

> Hi Julian,
>
> Query through JdbcAdapter is fine as the query is pushed down and handled
> by databases.
>
> However, the issue is when the queries are resolved in calcite. Attached
> is a reflective schema test case which prints the results of queries below
> where both V1 and V2 columns are VARCHARs.
>
> SQL: select * from test.t1
> | ID(INTEGER) | V1(VARCHAR) | V2(VARCHAR) |
> |_|0|_|null value|_|null|_|
> |_|1|_|a|_|aa|_|
> |_|2|_|noBlank|_||_|
> |_|3|_|oneBlank|_| |_|
> |_|4|_|ltrOneBlank|_|aa |_|
>
> ##These queries should return rows ID=2 and ID=3
> SQL: select * from test.t1 where v2 = ''
> | ID(INTEGER) | V1(VARCHAR) | V2(VARCHAR) |
> |_|2|_|noBlank|_||_|
> SQL: select * from test.t1 where v2 = ' '
> | ID(INTEGER) | V1(VARCHAR) | V2(VARCHAR) |
> |_|2|_|noBlank|_||_|
>
> ## These queries should return rows ID=1 and ID=4
> SQL: select * from test.t1 where v2 = 'aa'
> | ID(INTEGER) | V1(VARCHAR) | V2(VARCHAR) |
> |_|1|_|a|_|aa|_|
> SQL: select * from test.t1 where v2 = 'aa '
> | ID(INTEGER) | V1(VARCHAR) | V2(VARCHAR) |
> |_|1|_|a|_|aa|_|
>
>
> Thanks & Regards,
>
> Yuri Au Yong
> Software Engineer - NPM | [email protected] | Desk: +60 3 7663
> 8372
> Persistent Systems Ltd. | Partners in Innovation | www.persistentsys.com
>
>
> -----Original Message-----
> From: Julian Hyde [mailto:[email protected]]
> Sent: Wednesday, June 17, 2015 4:14 AM
> To: [email protected]
> Subject: Re: Comparing untrimmed Identifier to string literal with trimmed
> trailing space
>
> It sounds as if we are good, then.
>
> A related issue was fixed in
> https://github.com/apache/incubator-calcite/commit/8d20f71.
>
> And if you do
>
>   SELECT CAST(x AS VARCHAR(10)) FROM VALUES 'xyz', 'abcde' AS t(x)
>
> Calcite has to convert the CHAR(3) and CHAR(5) literals to VARCHAR(3) and
> VARCHAR(5) before it converts them to VARCHAR(10). If it converted them
> both to CHAR(5), 'xyz' would end up with two trailing spaces.
>
> Julian
>
>
>
> On Jun 12, 2015, at 12:40 AM, Yuri Au Yong <[email protected]>
> wrote:
>
> > With resultSet.getMetaData().getColumnTypeName(...) I get that CITY is a
> VARCHAR datatype:
> >
> > | EMPNO(INTEGER) | NAME(VARCHAR) | DEPTNO(INTEGER) | GENDER(VARCHAR) |
> > | CITY(VARCHAR) | EMPID(INTEGER) | AGE(INTEGER) | SLACKER(BOOLEAN) |
> > | MANAGER(BOOLEAN) | JOINEDAT(DATE) |
> >
> > where CITY =' ' or CITY ='  ' or CITY=CAST(' ' AS VARCHAR(1)) or
> CITY=CAST('' AS VARCHAR(1)) gives the same results.
> >
> > I have not tried with postgres but I have tested with hsqldb and mysql
> (with JDBC adapter), the queries were pushed down to the dbs and it seems
> identifier and literal are both trimmed prior comparison:
> >
> > SQL: SELECT * from "t1"
> > | I1(INTEGER) | V1(VARCHAR(30)) | V2(VARCHAR(30)) | _|0|_|null
> > |value|_|null|_| _|1|_|a|_|aa|_| _|2|_|noBlank|_||_| _|3|_|oneBlank|_|
> > ||_| _|4|_|ltrOneBlank|_|aa |_|
> > SQL: SELECT * from "t1" where v2 = ''
> > | I1(INTEGER) | V1(VARCHAR(30)) | V2(VARCHAR(30)) |
> > |_|2|_|noBlank|_||_| _|3|_|oneBlank|_| |_|
> >
> > I also observed that the blank string literal is trimmed after
> converting SqlNode to RelNode:
> >
> > Jun 12, 2015 3:32:39 PM
> > org.apache.calcite.sql.validate.SqlValidatorImpl
> > validateScopedExpression
> > FINER: After validation: SELECT `EMPS`.`NAME`, `EMPS`.`CITY` FROM
> > `SALES`.`EMPS` AS `EMPS` WHERE `EMPS`.`CITY` = ' '
> > Jun 12, 2015 3:32:39 PM org.apache.calcite.sql2rel.SqlToRelConverter
> > convertQuery
> > FINE: Plan after converting SqlNode to RelNode
> > LogicalProject(NAME=[$1], CITY=[$4])
> >  LogicalFilter(condition=[=(CAST($4):VARCHAR(1) CHARACTER SET
> "ISO-8859-1" COLLATE "ISO-8859-1$en_US$primary", '')])
> >    LogicalTableScan(table=[[SALES, EMPS]])
> >
> >
> > Jun 12, 2015 3:26:42 PM
> > org.apache.calcite.sql.validate.SqlValidatorImpl
> > validateScopedExpression
> > FINER: After validation: SELECT `EMPS`.`NAME`, `EMPS`.`CITY` FROM
> > `SALES`.`EMPS` AS `EMPS` WHERE `EMPS`.`CITY` = CAST(' ' AS VARCHAR(1))
> > Jun 12, 2015 3:26:42 PM org.apache.calcite.sql2rel.SqlToRelConverter
> > convertQuery
> > FINE: Plan after converting SqlNode to RelNode
> > LogicalProject(NAME=[$1], CITY=[$4])
> >  LogicalFilter(condition=[=(CAST($4):VARCHAR(1) CHARACTER SET
> "ISO-8859-1" COLLATE "ISO-8859-1$en_US$primary", '')])
> >    LogicalTableScan(table=[[SALES, EMPS]])
> >
> > Jun 12, 2015 3:21:49 PM
> > org.apache.calcite.sql.validate.SqlValidatorImpl
> > validateScopedExpression
> > FINER: After validation: SELECT `T1`.`I1`, `T1`.`V1`, `T1`.`V2` FROM
> > `BASEJDBC`.`T1` AS `T1` WHERE `T1`.`V2` = ' '
> > Jun 12, 2015 3:21:49 PM org.apache.calcite.sql2rel.SqlToRelConverter
> > convertQuery
> > FINE: Plan after converting SqlNode to RelNode LogicalProject(I1=[$0],
> > V1=[$1], V2=[$2])  LogicalFilter(condition=[=($2, '')])
> >    JdbcTableScan(table=[[BASEJDBC, T1]])
> >
> >
> >
> > Thanks & Regards,
> >
> > Yuri Au Yong
> > Software Engineer - NPM | [email protected] | Desk: +60 3 7663
> > 8372 Persistent Systems Ltd. | Partners in Innovation |
> > www.persistentsys.com
> >
> >
> > -----Original Message-----
> > From: Julian Hyde [mailto:[email protected]]
> > Sent: Thursday, June 11, 2015 1:37 AM
> > To: [email protected]
> > Subject: Re: Comparing untrimmed Identifier to string literal with
> > trimmed trailing space
> >
> > I think it may be due to the fact that character literals are of type
> CHAR, not VARCHAR, in compliance with the SQL standard. When CHAR values
> are compared trailing spaces are ignored.
> >
> > Can you try some experiments? Try
> >
> >  WHERE city = CAST(' ' AS VARCHAR(1))
> >
> > etc. Also see whether it is caused by the CSV adapter by running t he
> query against 'VALUES (100, 'Fred', ...), ..., (1, 'oneblank', 1, ...)'.
> >
> > Does Postgres have the same behavior?
> >
> > Julian
> >
> >
> >> On Jun 10, 2015, at 10:24 AM, Yuri Au Yong <[email protected]>
> wrote:
> >>
> >> When querying with a comparison operator predicate that compares an
> identifier to a string literal in sqlline , I notice that the trailing
> space of the string literal is trimmed prior comparison. The trailing space
> of the  identifier however was not trimmed prior comparison which causes
> inaccurate results (see example below). Any pointers on where can I add the
> trailing space trim for the identifier before comparison?
> >>
> >> Example:
> >> I have edited the EMPS.cvs with additional rows, "noblank" and
> "oneblank":
> >>
> >> EMPNO:int,NAME:string,DEPTNO:int,GENDER:string,CITY:string,EMPID:int,
> >> A GE:int,SLACKER:boolean,MANAGER:boolean,JOINEDAT:date
> >> 100,"Fred",10,,,30,25,true,false,"1996-08-03"
> >> 110,"Eric",20,"M","San Francisco",3,80,,false,"2001-01-01"
> >> 110,"John",40,"M","Vancouver",2,,false,true,"2002-05-03"
> >> 120,"Wilma",20,"F",,1,5,,true,"2005-09-07"
> >> 130,"Alice",40,"F","Vancouver",2,,false,true,"2007-01-01"
> >> 0,"noblank",0,"","",0,0,false,false,"2015-06-09"
> >> 1,"oneblank",1," "," ",1,1,false,false,"2015-06-09"
> >>
> >> #where CITY ='' or CITY =' '
> >> 0: jdbc:calcite:model=target/test-classes/mod> select NAME, CITY from
> >> EMPS where CITY ='';
> >> +------+------+
> >> | NAME | CITY |
> >> +------+------+
> >> | Fred |      |
> >> | Wilma |      |
> >> | noblank |      |
> >> +------+------+
> >> 3 rows selected (3.002 seconds)
> >>
> >> #where CITY <>' ' or CITY <>'  '
> >> 0: jdbc:calcite:model=target/test-classes/mod> select NAME, CITY from
> >> EMPS where CITY <>' ';
> >> +------+------+
> >> | NAME | CITY |
> >> +------+------+
> >> | Eric | San Francisco |
> >> | John | Vancouver |
> >> | Alice | Vancouver |
> >> | oneblank |      |
> >> +------+------+
> >> 4 rows selected (0.304 seconds)
> >>
> >>
> >> Thanks & Regards,
> >>
> >> Yuri Au Yong
> >> Software Engineer - NPM | [email protected]
> >> <mailto:[email protected]> | Desk: +60 3 7663 8372 Persistent
> >> Systems Ltd. | Partners in Innovation | www.persistentsys.com
> >> <http://www.persistentsys.com/>
> >> <https://tnpmsupport.persistentsys.com/training>
> >>
> >> DISCLAIMER ========== This e-mail may contain privileged and
> confidential information which is the property of Persistent Systems Ltd.
> It is intended only for the use of the individual or entity to which it is
> addressed. If you are not the intended recipient, you are not authorized to
> read, retain, copy, print, distribute or use this message. If you have
> received this communication in error, please notify the sender and delete
> all copies of this message. Persistent Systems Ltd. does not accept any
> liability for virus infected mails.
> >>
> >
> >
> > DISCLAIMER
> > ==========
> > This e-mail may contain privileged and confidential information which is
> the property of Persistent Systems Ltd. It is intended only for the use of
> the individual or entity to which it is addressed. If you are not the
> intended recipient, you are not authorized to read, retain, copy, print,
> distribute or use this message. If you have received this communication in
> error, please notify the sender and delete all copies of this message.
> Persistent Systems Ltd. does not accept any liability for virus infected
> mails.
>
>
> DISCLAIMER
> ==========
> This e-mail may contain privileged and confidential information which is
> the property of Persistent Systems Ltd. It is intended only for the use of
> the individual or entity to which it is addressed. If you are not the
> intended recipient, you are not authorized to read, retain, copy, print,
> distribute or use this message. If you have received this communication in
> error, please notify the sender and delete all copies of this message.
> Persistent Systems Ltd. does not accept any liability for virus infected
> mails.
>
>

Reply via email to