XIANG,CHAO-ZHU created PHOENIX-6808:
---------------------------------------

             Summary: query filter cannot get correct result
                 Key: PHOENIX-6808
                 URL: https://issues.apache.org/jira/browse/PHOENIX-6808
             Project: Phoenix
          Issue Type: Bug
          Components: 4.x, core
    Affects Versions: 4.14.1
         Environment: hbase-1.4

phoenix-4.14.1-HBase-1.4
            Reporter: XIANG,CHAO-ZHU
         Attachments: image-2022-10-10-18-41-09-059.png, 
image-2022-10-10-22-41-51-879.png, image-2022-10-10-23-10-01-059.png

create table && upsert values

 
{code:java}
CREATE TABLE IF NOT EXISTS "TEST_TABLE" (
"K" VARCHAR PRIMARY KEY,
"f"."JR1" UNSIGNED_INT DEFAULT 0,
"f"."JR2" UNSIGNED_INT DEFAULT 0,
"f"."ZN" UNSIGNED_INT DEFAULT 0,
"f"."AA" UNSIGNED_INT,
"f"."AB" UNSIGNED_INT DEFAULT 0,
"f"."AC" INTEGER DEFAULT 0,
"f"."AD" INTEGER
) column_encoded_bytes=0 SPLIT ON (0, 1, 2, 3, 4, 5, 6, 7, 8, 9);

UPSERT INTO TEST_TABLE(K, JR1, JR2, ZN, AA, AB, AC, AD) 
VALUES('5:xiang:346358075:370562729', 270, null, 10846, null, null, null, null);
UPSERT INTO TEST_TABLE(K, JR1, JR2, ZN, AA, AB, AC, AD) 
VALUES('0:chaozhu:123123120:321321321' ,  123, null,  10846, 23303, 123, 321, 
111);
 {code}
then execute query:

 
{code:java}
select * from "TEST_TABLE" where ( "ZN" in (10846) and ("JR1" in (2303) or 
"JR2" in (10846) ));{code}
result:

!image-2022-10-10-18-41-09-059.png!

Above,  neither `JR1`  nor `JR2` matches query statement but result contains 2 
rows.

And execute query (change the order): 

 
{code:java}
select * from "TEST_TABLE" where (("JR1" in (2303) or "JR2" in (10846) ) and  
"ZN" in (10846) ); {code}
result:

 

!image-2022-10-10-22-24-52-310.png!

Or execute query ( "ZN" in (10846) -> "AB" in ( 123 )):

 
{code:java}
select * from "TEST_TABLE" where ( "AB" in (123) and ("JR1" in (2303) or "JR2" 
in (10846) ));
{code}
result:

 

!image-2022-10-10-23-10-01-059.png!

 

I try to debug it by docker and remote debug. Maybe I know the cause of this 
problem.

 

In 
{*}ComparisonExpression.java{*}(org/apache/phoenix/expression/ComparisonExpression.java)

 
{code:java}
@Override
public boolean evaluate(Tuple tuple, ImmutableBytesWritable ptr) {
    if (!children.get(0).evaluate(tuple, ptr)) {
        return false;
    }
    if (ptr.getLength() == 0) { // null comparison evals to null
       return true;
    } 
....
}{code}
because `JR2` value is null, so `ptr.getLength() == 0` is true then return true 
directly.

 

In *AndOrExpression.java* (org/apache/phoenix/expression/AndOrExpression.java)
{code:java}
public boolean evaluate(Tuple tuple, ImmutableBytesWritable ptr) {
    boolean isNull = false;
    for (int i = 0; i < children.size(); i++) {
        Expression child = children.get(i);
        // If partial state is available, then use that to know we've already 
evaluated this
        // child expression and do not need to do so again.
        if (partialEvalState == null || !partialEvalState.get(i)) {
            // Call through to child evaluate method matching parent call to 
allow child to optimize
            // evaluate versus getValue code path.
            if (child.evaluate(tuple, ptr)) {
                // Short circuit if we see our stop value
                if (isStopValue((Boolean) PBoolean.INSTANCE.toObject(ptr, 
child.getDataType()))) {
                    return true;
                } else if (partialEvalState != null) {
                    partialEvalState.set(i);
                }
            } else {
                isNull = true;
            }
        }
    }
    if (isNull) {
        return false;
    }
    return true;
} {code}
 

because evaluate `JR2` return true so `("JR1" in (2303) or "JR2" in (10846))` 
is true.

 

In *BooleanExpressionFilter.Java*   
(org/apache/phoenix/filter/BooleanExpressionFilter.java)

 
{code:java}
    @Override
    public String toString() {
        return expression.toString();
    }    @edu.umd.cs.findbugs.annotations.SuppressWarnings(
            value="NP_BOOLEAN_RETURN_NULL",
            justification="Returns null by design.")
    protected Boolean evaluate(Tuple input) {
        try {
            if (!expression.evaluate(input, tempPtr)) {
                return null;
            }
        } catch (IllegalDataException e) {
            return Boolean.FALSE;
        }
        return (Boolean)expression.getDataType().toObject(tempPtr);
    }{code}
now  `(Boolean)expression.getDataType().toObject(tempPtr) = true` ( `"ZN" in 
(10846) is true` cause ) ** 

 

In *MultiKeyValueComparisonFilter.java* 
(org/apache/phoenix/filter/MultiKeyValueComparisonFilter.java)

 
{code:java}
public ReturnCode filterKeyValue(Cell cell) {

    ...
    // We found a new column, so we can re-evaluate
    // TODO: if we have row key columns in our expression, should
    // we always evaluate or just wait until the end?
    this.matchedColumn = this.evaluate(inputTuple);
    if (this.matchedColumn == null) {
        if (inputTuple.isImmutable()) {
            this.matchedColumn = Boolean.FALSE;
        } else {
            return ReturnCode.INCLUDE_AND_NEXT_COL;
        }
    } 
...
}

public boolean filterRow() {
    if (this.matchedColumn == null && !inputTuple.isImmutable() && 
expression.requiresFinalEvaluation()) {
        inputTuple.setImmutable();
        this.matchedColumn = this.evaluate(inputTuple);
    }
    
    return ! (Boolean.TRUE.equals(this.matchedColumn));
}

{code}
Above, filterKeyValue -> this.matchedColumn = true -> filterRow -> return false 
-> not filter row

 

 

Thus, the reason of query:

 
{code:java}
select * from "TEST_TABLE" where (("JR1" in (2303) or "JR2" in (10846) ) and  
"ZN" in (10846) );  {code}
`("JR1" in (2303) or "JR2" in (10846)` is true and `isStopValue((Boolean) 
PBoolean.INSTANCE.toObject(ptr, child.getDataType())) = true` -> 
`(Boolean)expression.getDataType().toObject(tempPtr) = null` ->  

 this.matchedColumn = false -> filterRow -> return true -> filter row

 

And, the reason of query:

 
{code:java}
select * from "TEST_TABLE" where ( "AB" in (123) and ("JR1" in (2303) or "JR2" 
in (10846) )); {code}
hbase read cell and filter it by alphabetical order.  

first read "AB" and `"AB" in (123)` is true then `partialEvalState.set(0)`,

then read "JR1" and set subfilter `partialEvalState.set(0)`

final read "JR2" -> skip index 0 since partialEvalState.get(0)  ->  ("JR1" in 
(2303) or "JR2" in (10846) ) is true and PBoolean.INSTANCE.toObject(ptr, 
child.getDataType())) = true` -> 
`(Boolean)expression.getDataType().toObject(tempPtr) = null`

->   this.matchedColumn = false -> filterRow -> return true -> filter row

 

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to