[ 
https://issues.apache.org/jira/browse/PHOENIX-5065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16713453#comment-16713453
 ] 

William Shen edited comment on PHOENIX-5065 at 12/8/18 1:15 AM:
----------------------------------------------------------------

With the explain plan, it seems like when there are multiple values involved, 
the IN operator translates IS NULL instead of = NULL for the empty string?
{noformat}
0: jdbc:phoenix:labs-darth-journalnode-lv-101> explain SELECT COUNT(*) FROM 
SYSTEM.CATALOG WHERE TENANT_ID = '';
+--------------------------------------+-----------------+----------------+--------------+
|                 PLAN                 | EST_BYTES_READ  | EST_ROWS_READ  | 
EST_INFO_TS  |
+--------------------------------------+-----------------+----------------+--------------+
| DEGENERATE SCAN OVER SYSTEM.CATALOG  | null            | null           | 
null         |
+--------------------------------------+-----------------+----------------+--------------+
1 row selected (0.022 seconds)
0: jdbc:phoenix:labs-darth-journalnode-lv-101> explain SELECT COUNT(*) FROM 
SYSTEM.CATALOG WHERE TENANT_ID = null;
+--------------------------------------+-----------------+----------------+--------------+
|                 PLAN                 | EST_BYTES_READ  | EST_ROWS_READ  | 
EST_INFO_TS  |
+--------------------------------------+-----------------+----------------+--------------+
| DEGENERATE SCAN OVER SYSTEM.CATALOG  | null            | null           | 
null         |
+--------------------------------------+-----------------+----------------+--------------+
1 row selected (0.027 seconds)
0: jdbc:phoenix:labs-darth-journalnode-lv-101> explain SELECT COUNT(*) FROM 
SYSTEM.CATALOG WHERE TENANT_ID is null;
+----------------------------------------------------------------------+-----------------+----------------+--------------+
|                                 PLAN                                 | 
EST_BYTES_READ  | EST_ROWS_READ  | EST_INFO_TS  |
+----------------------------------------------------------------------+-----------------+----------------+--------------+
| CLIENT 1-CHUNK PARALLEL 1-WAY RANGE SCAN OVER SYSTEM.CATALOG [null]  | null   
         | null           | null         |
|     SERVER FILTER BY FIRST KEY ONLY                                  | null   
         | null           | null         |
|     SERVER AGGREGATE INTO SINGLE ROW                                 | null   
         | null           | null         |
+----------------------------------------------------------------------+-----------------+----------------+--------------+
3 rows selected (0.02 seconds)
0: jdbc:phoenix:labs-darth-journalnode-lv-101> explain SELECT COUNT(*) FROM 
SYSTEM.CATALOG WHERE TENANT_ID in (null);
+--------------------------------------+-----------------+----------------+--------------+
|                 PLAN                 | EST_BYTES_READ  | EST_ROWS_READ  | 
EST_INFO_TS  |
+--------------------------------------+-----------------+----------------+--------------+
| DEGENERATE SCAN OVER SYSTEM.CATALOG  | null            | null           | 
null         |
+--------------------------------------+-----------------+----------------+--------------+
1 row selected (0.04 seconds)
0: jdbc:phoenix:labs-darth-journalnode-lv-101> explain SELECT COUNT(*) FROM 
SYSTEM.CATALOG WHERE TENANT_ID in ('');
+--------------------------------------+-----------------+----------------+--------------+
|                 PLAN                 | EST_BYTES_READ  | EST_ROWS_READ  | 
EST_INFO_TS  |
+--------------------------------------+-----------------+----------------+--------------+
| DEGENERATE SCAN OVER SYSTEM.CATALOG  | null            | null           | 
null         |
+--------------------------------------+-----------------+----------------+--------------+
1 row selected (0.02 seconds)
0: jdbc:phoenix:labs-darth-journalnode-lv-101> explain SELECT COUNT(*) FROM 
SYSTEM.CATALOG WHERE TENANT_ID in ('', 'FOO');
+-----------------------------------------------------------------------------------------+-----------------+----------------+--------------+
|                                          PLAN                                 
          | EST_BYTES_READ  | EST_ROWS_READ  | EST_INFO_TS  |
+-----------------------------------------------------------------------------------------+-----------------+----------------+--------------+
| CLIENT 1-CHUNK PARALLEL 1-WAY SKIP SCAN ON 2 KEYS OVER SYSTEM.CATALOG [null] 
- ['FOO']  | null            | null           | null         |
|     SERVER FILTER BY FIRST KEY ONLY                                           
          | null            | null           | null         |
|     SERVER AGGREGATE INTO SINGLE ROW                                          
          | null            | null           | null         |
+-----------------------------------------------------------------------------------------+-----------------+----------------+--------------+
{noformat}


Instead of evaluating to 
{noformat}
0: jdbc:phoenix:labs-darth-journalnode-lv-101> explain SELECT COUNT(*) FROM 
SYSTEM.CATALOG WHERE TENANT_ID = '' OR TENANT_ID = 'FOO';
+------------------------------------------------------------------+-----------------+----------------+--------------+
|                               PLAN                               | 
EST_BYTES_READ  | EST_ROWS_READ  | EST_INFO_TS  |
+------------------------------------------------------------------+-----------------+----------------+--------------+
| CLIENT 1-CHUNK PARALLEL 1-WAY FULL SCAN OVER SYSTEM.CATALOG      | null       
     | null           | null         |
|     SERVER FILTER BY FIRST KEY ONLY AND ( OR TENANT_ID = 'FOO')  | null       
     | null           | null         |
|     SERVER AGGREGATE INTO SINGLE ROW                             | null       
     | null           | null         |
+------------------------------------------------------------------+-----------------+----------------+--------------+
3 rows selected (0.025 seconds)
{noformat}

It evaluated to
{noformat}
0: jdbc:phoenix:labs-darth-journalnode-lv-101> explain SELECT COUNT(*) FROM 
SYSTEM.CATALOG WHERE TENANT_ID IS NULL OR TENANT_ID = 'FOO';
+-----------------------------------------------------------------------------------------+-----------------+----------------+--------------+
|                                          PLAN                                 
          | EST_BYTES_READ  | EST_ROWS_READ  | EST_INFO_TS  |
+-----------------------------------------------------------------------------------------+-----------------+----------------+--------------+
| CLIENT 1-CHUNK PARALLEL 1-WAY SKIP SCAN ON 2 KEYS OVER SYSTEM.CATALOG [null] 
- ['FOO']  | null            | null           | null         |
|     SERVER FILTER BY FIRST KEY ONLY                                           
          | null            | null           | null         |
|     SERVER AGGREGATE INTO SINGLE ROW                                          
          | null            | null           | null         |
+-----------------------------------------------------------------------------------------+-----------------+----------------+--------------+
3 rows selected (0.021 seconds)

{noformat}


was (Author: willshen):
With the explain plan, it seems like when there are multiple values involved, 
the IN operator translates IS NULL instead of = NULL for the empty string?
{noformat}
0: jdbc:phoenix:labs-darth-journalnode-lv-101> explain SELECT COUNT(*) FROM 
SYSTEM.CATALOG WHERE TENANT_ID = '';
+--------------------------------------+-----------------+----------------+--------------+
|                 PLAN                 | EST_BYTES_READ  | EST_ROWS_READ  | 
EST_INFO_TS  |
+--------------------------------------+-----------------+----------------+--------------+
| DEGENERATE SCAN OVER SYSTEM.CATALOG  | null            | null           | 
null         |
+--------------------------------------+-----------------+----------------+--------------+
1 row selected (0.022 seconds)
0: jdbc:phoenix:labs-darth-journalnode-lv-101> explain SELECT COUNT(*) FROM 
SYSTEM.CATALOG WHERE TENANT_ID = null;
+--------------------------------------+-----------------+----------------+--------------+
|                 PLAN                 | EST_BYTES_READ  | EST_ROWS_READ  | 
EST_INFO_TS  |
+--------------------------------------+-----------------+----------------+--------------+
| DEGENERATE SCAN OVER SYSTEM.CATALOG  | null            | null           | 
null         |
+--------------------------------------+-----------------+----------------+--------------+
1 row selected (0.027 seconds)
0: jdbc:phoenix:labs-darth-journalnode-lv-101> explain SELECT COUNT(*) FROM 
SYSTEM.CATALOG WHERE TENANT_ID is null;
+----------------------------------------------------------------------+-----------------+----------------+--------------+
|                                 PLAN                                 | 
EST_BYTES_READ  | EST_ROWS_READ  | EST_INFO_TS  |
+----------------------------------------------------------------------+-----------------+----------------+--------------+
| CLIENT 1-CHUNK PARALLEL 1-WAY RANGE SCAN OVER SYSTEM.CATALOG [null]  | null   
         | null           | null         |
|     SERVER FILTER BY FIRST KEY ONLY                                  | null   
         | null           | null         |
|     SERVER AGGREGATE INTO SINGLE ROW                                 | null   
         | null           | null         |
+----------------------------------------------------------------------+-----------------+----------------+--------------+
3 rows selected (0.02 seconds)
0: jdbc:phoenix:labs-darth-journalnode-lv-101> explain SELECT COUNT(*) FROM 
SYSTEM.CATALOG WHERE TENANT_ID in (null);
+--------------------------------------+-----------------+----------------+--------------+
|                 PLAN                 | EST_BYTES_READ  | EST_ROWS_READ  | 
EST_INFO_TS  |
+--------------------------------------+-----------------+----------------+--------------+
| DEGENERATE SCAN OVER SYSTEM.CATALOG  | null            | null           | 
null         |
+--------------------------------------+-----------------+----------------+--------------+
1 row selected (0.04 seconds)
0: jdbc:phoenix:labs-darth-journalnode-lv-101> explain SELECT COUNT(*) FROM 
SYSTEM.CATALOG WHERE TENANT_ID in ('');
+--------------------------------------+-----------------+----------------+--------------+
|                 PLAN                 | EST_BYTES_READ  | EST_ROWS_READ  | 
EST_INFO_TS  |
+--------------------------------------+-----------------+----------------+--------------+
| DEGENERATE SCAN OVER SYSTEM.CATALOG  | null            | null           | 
null         |
+--------------------------------------+-----------------+----------------+--------------+
1 row selected (0.02 seconds)
0: jdbc:phoenix:labs-darth-journalnode-lv-101> explain SELECT COUNT(*) FROM 
SYSTEM.CATALOG WHERE TENANT_ID in ('', 'FOO');
+-----------------------------------------------------------------------------------------+-----------------+----------------+--------------+
|                                          PLAN                                 
          | EST_BYTES_READ  | EST_ROWS_READ  | EST_INFO_TS  |
+-----------------------------------------------------------------------------------------+-----------------+----------------+--------------+
| CLIENT 1-CHUNK PARALLEL 1-WAY SKIP SCAN ON 2 KEYS OVER SYSTEM.CATALOG [null] 
- ['FOO']  | null            | null           | null         |
|     SERVER FILTER BY FIRST KEY ONLY                                           
          | null            | null           | null         |
|     SERVER AGGREGATE INTO SINGLE ROW                                          
          | null            | null           | null         |
+-----------------------------------------------------------------------------------------+-----------------+----------------+--------------+
{noformat}

> Inconsistent treatment of NULL and empty string
> -----------------------------------------------
>
>                 Key: PHOENIX-5065
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-5065
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.14.1
>            Reporter: Geoffrey Jacoby
>            Priority: Major
>
> Phoenix doesn't handle NULLs consistently with other SQL dialects, and it 
> doesn't handle them consistently internally either. 
> In PHOENIX-2422, [~jamestaylor] mentioned that Phoenix's intended behavior is 
> for empty string and NULL to be equivalent. That's inconsistent with other 
> SQL dialects (in which NULL is never equal to anything, including itself), 
> but if that's our documented behavior, then that's fine unless PHOENIX-2422 
> to change it is ever worked. 
> But consider the following queries:
> {code:java}
> SELECT COUNT(*) FROM SYSTEM.CATALOG WHERE TENANT_ID = '';
> -- Returns 0 rows
> SELECT COUNT(*) FROM SYSTEM.CATALOG WHERE TENANT_ID IS NULL;
> -- Returns some number of rows. Call it N
> SELECT COUNT(*) FROM SYSTEM.CATALOG WHERE TENANT_ID IN ('');
> -- Returns 0 rows
> SELECT COUNT(*) FROM SYSTEM.CATALOG WHERE TENANT_ID IN ('', 'FOO');
> -- Returns N rows. Note that FOO does not exist, and is just a nonsense string
> SELECT COUNT(*) FROM SYSTEM.CATALOG WHERE TENANT_ID = '' OR TENANT_ID = 'FOO'
> --Returns 0 rows, but slowly
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to