[ https://issues.apache.org/jira/browse/PHOENIX-3046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15478953#comment-15478953 ]

ASF GitHub Bot commented on PHOENIX-3046:
-----------------------------------------

Github user kliewkliew commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/208#discussion_r78269568
  
    --- Diff: phoenix-core/src/main/java/org/apache/phoenix/compile/ExpressionCompiler.java ---
    @@ -523,7 +523,12 @@ public Expression visitLeave(LikeParseNode node, List<Expression> children) thro
                     byte[] wildcard = {StringUtil.MULTI_CHAR_LIKE};
                     StringUtil.fill(nullExpressionString, 0, pattern.length(), wildcard, 0, 1, false);
                     if (pattern.equals(new String (nullExpressionString))) {
    -                    return IsNullExpression.create(lhs, true, context.getTempPtr());
    +                    if (node.isNegate()) {
    +                        return LiteralExpression.newConstant(false, Determinism.ALWAYS);
    --- End diff --
    
    The specification is:
    ```
    5) Case:
    
                a) If M and P are character strings whose lengths are variable
                  and if the lengths of both M and P are 0, then
    
                     M LIKE P
    
                  is true.
    
                b) The <predicate>
    
                     M LIKE P
    
                  is true if there exists a partitioning of M into substrings
                  such that:
    
                  i) A substring of M is a sequence of 0 or more contiguous
                     <character representation>s of M and each <character repre-
                     sentation> of M is part of exactly one substring.
    
                 ii) If the i-th substring specifier of P is an arbitrary char-
                     acter specifier, the i-th substring of M is any single
                     <character representation>.
    
                iii) If the i-th substring specifier of P is an arbitrary string
                     specifier, then the i-th substring of M is any sequence of
                     0 or more <character representation>s.
    
                 iv) If the i-th substring specifier of P is neither an arbi-
                     trary character specifier nor an arbitrary string speci-
                     fier, then the i-th substring of M is equal to that sub-
                     string specifier according to the collating sequence of
                     the <like predicate>, without the appending of <space>
                     characters to M, and has the same length as that substring
                     specifier.
    
                  v) The number of substrings of M is equal to the number of
                     substring specifiers of P.
    
                c) Otherwise,
    
                     M LIKE P
    
                  is false.
    ```
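
A minimal sketch of rule 5 for non-null `M` and `P`, assuming `%` is the arbitrary string specifier, `_` is the arbitrary character specifier, no ESCAPE clause is given, and plain `char` equality stands in for the collating sequence. `LikeSketch` and its methods are hypothetical names for illustration, not Phoenix code:

```java
public class LikeSketch {
    // True if m matches pattern p under rule 5 (non-null inputs, no ESCAPE clause).
    static boolean like(String m, String p) {
        return match(m, 0, p, 0);
    }

    // Tries to partition m[i..] into substrings matching the specifiers p[j..].
    private static boolean match(String m, int i, String p, int j) {
        if (j == p.length()) {
            // No specifiers left: the partitioning is valid only if m is used up.
            // This also covers case a), where both strings are empty.
            return i == m.length();
        }
        char spec = p.charAt(j);
        if (spec == '%') {
            // Rule iii: the substring may be any sequence of 0 or more characters.
            for (int k = i; k <= m.length(); k++) {
                if (match(m, k, p, j + 1)) {
                    return true;
                }
            }
            return false;
        }
        if (i == m.length()) {
            return false; // a character is required but m is exhausted
        }
        // Rule ii ('_' matches any single character) or rule iv (literal match).
        if (spec == '_' || spec == m.charAt(i)) {
            return match(m, i + 1, p, j + 1);
        }
        return false; // case c)
    }

    public static void main(String[] args) {
        System.out.println(like("", "%"));    // true
        System.out.println(like("abc", "a_")); // false
    }
}
```

The sketch deliberately does not handle a NULL `M`; how that input should be treated is the open question here.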
    
    Given that `LEN(NULL)` is `NULL`, `WHERE col NOT LIKE '%'` fails cases `a`, `b.i`, and `b.iii`, so it defaults to case `c` and always returns false.
    
    However, I looked through the docs again and noticed the following:
    
    ```
             3) "M NOT LIKE P" is equivalent to "NOT (M LIKE P)".
    ```
    
    in which case `WHERE col NOT LIKE '%'` should return the inverse result set of `WHERE col LIKE '%'` (and should compile to `WHERE col IS NULL`).
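
The two readings can be compared side by side in a small sketch. It assumes a column modelled as a list of nullable strings and treats a NULL value as simply not matching `'%'`; the class and method names are hypothetical, and neither method is meant to describe what Phoenix actually does:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class NotLikeReadings {
    // Assumption for this sketch: every non-null value matches '%', NULL does not.
    static boolean likePercent(String value) {
        return value != null;
    }

    // Reading 1: the predicate falls through to case c), so no row qualifies.
    static List<String> alwaysFalse(List<String> rows) {
        return rows.stream().filter(v -> false).collect(Collectors.toList());
    }

    // Reading 2: "M NOT LIKE P" is "NOT (M LIKE P)", so the result set is the
    // inverse of LIKE '%', which here leaves only the NULL rows.
    static List<String> invertedLike(List<String> rows) {
        return rows.stream().filter(v -> !likePercent(v)).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> firstNames = Arrays.asList("Ann", null, "");
        System.out.println(alwaysFalse(firstNames).size());  // 0
        System.out.println(invertedLike(firstNames).size()); // 1
    }
}
```

The two methods only agree on a column with no NULL values, which is where the apparent contradiction comes from.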
    
    I might have misinterpreted something, but the specification seems to contradict itself.


> `NOT LIKE '%'` unexpectedly returns results
> -------------------------------------------
>
>                 Key: PHOENIX-3046
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3046
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.7.0
>            Reporter: Kevin Liew
>            Assignee: Kevin Liew
>            Priority: Minor
>              Labels: like, like-predicate, phoenix, regex, wildcard, wildcards
>             Fix For: 4.9.0, 4.8.1
>
>
> The following returns all rows in the table when it should return no rows:
> {code}select * from emp where first_name not like '%'{code}
> The following returns no rows as expected:
> {code}select * from emp where first_name not like '%%'{code}
> first_name is a VARCHAR column



