I have installed Postgresql-9.4.4 and ran the following queries, calcite does not behave like postgresql either; since postgresql does not trim the literal nor the identifier prior comparison while calcite trims the literal but not the identifier prior comparison.
Jira created: https://issues.apache.org/jira/browse/CALCITE-762 testdb=# create table "T1" ("I1" integer, "V1" varchar(20), "V2" varchar(20)); CREATE TABLE testdb=# insert into "T1" values (0, 'null val', NULL); INSERT 0 1 testdb=# insert into "T1" values (1, 'a', 'aa'); INSERT 0 1 testdb=# INSERT INTO "T1" VALUES (2, 'noBlank', ''); INSERT 0 1 testdb=# INSERT INTO "T1" VALUES (3, 'oneBlank', ' '); INSERT 0 1 testdb=# INSERT INTO "T1" VALUES (4, 'ltrOneBlank', 'aa '); INSERT 0 1 [22/06/2015 3:05 PM] Yuri Au Yong: testdb=# select * from "T1" testdb-# ; I1 | V1 | V2 ----+-------------+----- 0 | null val | 1 | a | aa 2 | noBlank | 3 | oneBlank | 4 | ltrOneBlank | aa (5 rows) testdb=# select * from "T1" where "V2" = ''; I1 | V1 | V2 ----+---------+---- 2 | noBlank | (1 row) testdb=# select * from "T1" where "V2" = ' '; I1 | V1 | V2 ----+----------+---- 3 | oneBlank | (1 row) testdb=# select * from "T1" where "V2" = 'aa'; I1 | V1 | V2 ----+----+---- 1 | a | aa (1 row) testdb=# select * from "T1" where "V2" = 'aa '; I1 | V1 | V2 ----+-------------+----- 4 | ltrOneBlank | aa (1 row) Thanks & Regards, Yuri Au Yong Software Engineer - NPM | [email protected] | Desk: +60 3 7663 8372 Persistent Systems Ltd. | Partners in Innovation | www.persistentsys.com -----Original Message----- From: Julian Hyde [mailto:[email protected]] Sent: Saturday, June 20, 2015 12:14 PM To: [email protected] Subject: Re: Comparing untrimmed Identifier to string literal with trimmed trailing space I tried to reproduce this. I ran the query with t (id, v1, v2) as ( select * from (values (0, cast('null value' as varchar(11)), cast(null as varchar(4))), (1, 'a', 'aa'), (2, 'noBlank', ''), (3, 'oneBlank', ' '), (4, 'ltrOneBlank', 'aa '))) SELECT * FROM t WHERE v2 = '' and Calcite and Postgres both return id | v1 | v2 ----+---------+---- 2 | noBlank | (1 row) The single space literal ' ' is of type char(1) and then becomes a varchar(4) value consisting of one space. It does not match the WHERE clause condition. More evidence from Postgres that ' ' does not get trimmed when it is converted to varchar: foodmart=> values (length(cast(' ' as varchar(4)))); column1 --------- 1 (1 row) foodmart=> values (length(' ')); column1 --------- 1 (1 row) foodmart=> values (length(cast(' ' as char(4)))); column1 --------- 0 (1 row) I think Postgres' behavior is correct, and Calcite seems to be in agreement with Postgres. Julian On Thu, Jun 18, 2015 at 11:04 PM, Yuri Au Yong <[email protected]> wrote: > Hi Julian, > > Query through JdbcAdapter is fine as the query is pushed down and > handled by databases. > > However, the issue is when the queries are resolved in calcite. > Attached is a reflective schema test case which prints the results of > queries below where both V1 and V2 columns are VARCHARs. > > SQL: select * from test.t1 > | ID(INTEGER) | V1(VARCHAR) | V2(VARCHAR) | _|0|_|null value|_|null|_| > |_|1|_|a|_|aa|_| _|2|_|noBlank|_||_| _|3|_|oneBlank|_| |_| > |_|4|_|ltrOneBlank|_|aa |_| > > ##These queries should return rows ID=2 and ID=3 > SQL: select * from test.t1 where v2 = '' > | ID(INTEGER) | V1(VARCHAR) | V2(VARCHAR) | _|2|_|noBlank|_||_| > SQL: select * from test.t1 where v2 = ' ' > | ID(INTEGER) | V1(VARCHAR) | V2(VARCHAR) | _|2|_|noBlank|_||_| > > ## These queries should return rows ID=1 and ID=4 > SQL: select * from test.t1 where v2 = 'aa' > | ID(INTEGER) | V1(VARCHAR) | V2(VARCHAR) | _|1|_|a|_|aa|_| > SQL: select * from test.t1 where v2 = 'aa ' > | ID(INTEGER) | V1(VARCHAR) | V2(VARCHAR) | _|1|_|a|_|aa|_| > > > Thanks & Regards, > > Yuri Au Yong > Software Engineer - NPM | [email protected] | Desk: +60 3 7663 > 8372 > Persistent Systems Ltd. | Partners in Innovation | > www.persistentsys.com > > > -----Original Message----- > From: Julian Hyde [mailto:[email protected]] > Sent: Wednesday, June 17, 2015 4:14 AM > To: [email protected] > Subject: Re: Comparing untrimmed Identifier to string literal with > trimmed trailing space > > It sounds as if we are good, then. > > A related issue was fixed in > https://github.com/apache/incubator-calcite/commit/8d20f71. > > And if you do > > SELECT CAST(x AS VARCHAR(10)) FROM VALUES 'xyz', 'abcde' AS t(x) > > Calcite has to convert the CHAR(3) and CHAR(5) literals to VARCHAR(3) > and > VARCHAR(5) before it converts them to VARCHAR(10). If it converted > them both to CHAR(5), 'xyz' would end up with two trailing spaces. > > Julian > > > > On Jun 12, 2015, at 12:40 AM, Yuri Au Yong > <[email protected]> > wrote: > > > With resultSet.getMetaData().getColumnTypeName(...) I get that CITY > > is a > VARCHAR datatype: > > > > | EMPNO(INTEGER) | NAME(VARCHAR) | DEPTNO(INTEGER) | GENDER(VARCHAR) > > | | > > | CITY(VARCHAR) | EMPID(INTEGER) | AGE(INTEGER) | SLACKER(BOOLEAN) | > > | MANAGER(BOOLEAN) | JOINEDAT(DATE) | > > > > where CITY =' ' or CITY =' ' or CITY=CAST(' ' AS VARCHAR(1)) or > CITY=CAST('' AS VARCHAR(1)) gives the same results. > > > > I have not tried with postgres but I have tested with hsqldb and > > mysql > (with JDBC adapter), the queries were pushed down to the dbs and it > seems identifier and literal are both trimmed prior comparison: > > > > SQL: SELECT * from "t1" > > | I1(INTEGER) | V1(VARCHAR(30)) | V2(VARCHAR(30)) | _|0|_|null > > |value|_|null|_| _|1|_|a|_|aa|_| _|2|_|noBlank|_||_| > > |value|_|3|_|oneBlank|_| > > ||_| _|4|_|ltrOneBlank|_|aa |_| > > SQL: SELECT * from "t1" where v2 = '' > > | I1(INTEGER) | V1(VARCHAR(30)) | V2(VARCHAR(30)) | > > |_|2|_|noBlank|_||_| _|3|_|oneBlank|_| |_| > > > > I also observed that the blank string literal is trimmed after > converting SqlNode to RelNode: > > > > Jun 12, 2015 3:32:39 PM > > org.apache.calcite.sql.validate.SqlValidatorImpl > > validateScopedExpression > > FINER: After validation: SELECT `EMPS`.`NAME`, `EMPS`.`CITY` FROM > > `SALES`.`EMPS` AS `EMPS` WHERE `EMPS`.`CITY` = ' ' > > Jun 12, 2015 3:32:39 PM org.apache.calcite.sql2rel.SqlToRelConverter > > convertQuery > > FINE: Plan after converting SqlNode to RelNode > > LogicalProject(NAME=[$1], CITY=[$4]) > > LogicalFilter(condition=[=(CAST($4):VARCHAR(1) CHARACTER SET > "ISO-8859-1" COLLATE "ISO-8859-1$en_US$primary", '')]) > > LogicalTableScan(table=[[SALES, EMPS]]) > > > > > > Jun 12, 2015 3:26:42 PM > > org.apache.calcite.sql.validate.SqlValidatorImpl > > validateScopedExpression > > FINER: After validation: SELECT `EMPS`.`NAME`, `EMPS`.`CITY` FROM > > `SALES`.`EMPS` AS `EMPS` WHERE `EMPS`.`CITY` = CAST(' ' AS > > VARCHAR(1)) Jun 12, 2015 3:26:42 PM > > org.apache.calcite.sql2rel.SqlToRelConverter > > convertQuery > > FINE: Plan after converting SqlNode to RelNode > > LogicalProject(NAME=[$1], CITY=[$4]) > > LogicalFilter(condition=[=(CAST($4):VARCHAR(1) CHARACTER SET > "ISO-8859-1" COLLATE "ISO-8859-1$en_US$primary", '')]) > > LogicalTableScan(table=[[SALES, EMPS]]) > > > > Jun 12, 2015 3:21:49 PM > > org.apache.calcite.sql.validate.SqlValidatorImpl > > validateScopedExpression > > FINER: After validation: SELECT `T1`.`I1`, `T1`.`V1`, `T1`.`V2` FROM > > `BASEJDBC`.`T1` AS `T1` WHERE `T1`.`V2` = ' ' > > Jun 12, 2015 3:21:49 PM org.apache.calcite.sql2rel.SqlToRelConverter > > convertQuery > > FINE: Plan after converting SqlNode to RelNode > > LogicalProject(I1=[$0], V1=[$1], V2=[$2]) LogicalFilter(condition=[=($2, > > '')]) > > JdbcTableScan(table=[[BASEJDBC, T1]]) > > > > > > > > Thanks & Regards, > > > > Yuri Au Yong > > Software Engineer - NPM | [email protected] | Desk: +60 3 > > 7663 > > 8372 Persistent Systems Ltd. | Partners in Innovation | > > www.persistentsys.com > > > > > > -----Original Message----- > > From: Julian Hyde [mailto:[email protected]] > > Sent: Thursday, June 11, 2015 1:37 AM > > To: [email protected] > > Subject: Re: Comparing untrimmed Identifier to string literal with > > trimmed trailing space > > > > I think it may be due to the fact that character literals are of > > type > CHAR, not VARCHAR, in compliance with the SQL standard. When CHAR > values are compared trailing spaces are ignored. > > > > Can you try some experiments? Try > > > > WHERE city = CAST(' ' AS VARCHAR(1)) > > > > etc. Also see whether it is caused by the CSV adapter by running t > > he > query against 'VALUES (100, 'Fred', ...), ..., (1, 'oneblank', 1, ...)'. > > > > Does Postgres have the same behavior? > > > > Julian > > > > > >> On Jun 10, 2015, at 10:24 AM, Yuri Au Yong > >> <[email protected]> > wrote: > >> > >> When querying with a comparison operator predicate that compares an > identifier to a string literal in sqlline , I notice that the trailing > space of the string literal is trimmed prior comparison. The trailing > space of the identifier however was not trimmed prior comparison > which causes inaccurate results (see example below). Any pointers on > where can I add the trailing space trim for the identifier before comparison? > >> > >> Example: > >> I have edited the EMPS.cvs with additional rows, "noblank" and > "oneblank": > >> > >> EMPNO:int,NAME:string,DEPTNO:int,GENDER:string,CITY:string,EMPID:in > >> t, A GE:int,SLACKER:boolean,MANAGER:boolean,JOINEDAT:date > >> 100,"Fred",10,,,30,25,true,false,"1996-08-03" > >> 110,"Eric",20,"M","San Francisco",3,80,,false,"2001-01-01" > >> 110,"John",40,"M","Vancouver",2,,false,true,"2002-05-03" > >> 120,"Wilma",20,"F",,1,5,,true,"2005-09-07" > >> 130,"Alice",40,"F","Vancouver",2,,false,true,"2007-01-01" > >> 0,"noblank",0,"","",0,0,false,false,"2015-06-09" > >> 1,"oneblank",1," "," ",1,1,false,false,"2015-06-09" > >> > >> #where CITY ='' or CITY =' ' > >> 0: jdbc:calcite:model=target/test-classes/mod> select NAME, CITY > >> from EMPS where CITY =''; > >> +------+------+ > >> | NAME | CITY | > >> +------+------+ > >> | Fred | | > >> | Wilma | | > >> | noblank | | > >> +------+------+ > >> 3 rows selected (3.002 seconds) > >> > >> #where CITY <>' ' or CITY <>' ' > >> 0: jdbc:calcite:model=target/test-classes/mod> select NAME, CITY > >> from EMPS where CITY <>' '; > >> +------+------+ > >> | NAME | CITY | > >> +------+------+ > >> | Eric | San Francisco | > >> | John | Vancouver | > >> | Alice | Vancouver | > >> | oneblank | | > >> +------+------+ > >> 4 rows selected (0.304 seconds) > >> > >> > >> Thanks & Regards, > >> > >> Yuri Au Yong > >> Software Engineer - NPM | [email protected] > >> <mailto:[email protected]> | Desk: +60 3 7663 8372 > >> Persistent Systems Ltd. | Partners in Innovation | > >> www.persistentsys.com <http://www.persistentsys.com/> > >> <https://tnpmsupport.persistentsys.com/training> > >> > >> DISCLAIMER ========== This e-mail may contain privileged and > confidential information which is the property of Persistent Systems Ltd. > It is intended only for the use of the individual or entity to which > it is addressed. If you are not the intended recipient, you are not > authorized to read, retain, copy, print, distribute or use this > message. If you have received this communication in error, please > notify the sender and delete all copies of this message. Persistent > Systems Ltd. does not accept any liability for virus infected mails. > >> > > > > > > DISCLAIMER > > ========== > > This e-mail may contain privileged and confidential information > > which is > the property of Persistent Systems Ltd. It is intended only for the > use of the individual or entity to which it is addressed. If you are > not the intended recipient, you are not authorized to read, retain, > copy, print, distribute or use this message. If you have received this > communication in error, please notify the sender and delete all copies of > this message. > Persistent Systems Ltd. does not accept any liability for virus > infected mails. > > > DISCLAIMER > ========== > This e-mail may contain privileged and confidential information which > is the property of Persistent Systems Ltd. It is intended only for the > use of the individual or entity to which it is addressed. If you are > not the intended recipient, you are not authorized to read, retain, > copy, print, distribute or use this message. If you have received this > communication in error, please notify the sender and delete all copies of > this message. > Persistent Systems Ltd. does not accept any liability for virus > infected mails. > > DISCLAIMER ========== This e-mail may contain privileged and confidential information which is the property of Persistent Systems Ltd. It is intended only for the use of the individual or entity to which it is addressed. If you are not the intended recipient, you are not authorized to read, retain, copy, print, distribute or use this message. If you have received this communication in error, please notify the sender and delete all copies of this message. Persistent Systems Ltd. does not accept any liability for virus infected mails.
