ankitsultana commented on issue #16821:
URL: https://github.com/apache/pinot/issues/16821#issuecomment-3303697369

   > I don't follow, why should the NOT IS NULL get transformed to <> 0?
   > ...
   > Because that is the ill-defined semantics when using basic null handling 
(aka null handling is disabled)
   
   I don't think that's true and the existing semantics made sense to me. 
Nowhere in SSE do we rewrite the predicate `IS NOT NULL` to `<> 0` or anything 
like that. The SSE semantics were:
   
   1. If you use `IS NOT NULL / IS NULL`, the null vectors will be used. If 
null vectors don't exist in the segment, this is simplified to a match all or 
empty (i.e. we assume all columns as not nullable if null vectors are disabled).
   2. The default null value is used for storage **only**, and you can use 
comparisons against that value if you don't want to use null predicates. This 
is independent of whether you have null vectors enabled.
   
   These semantics make perfect sense to me and seem quite reasonable.
   
   **Note:** This discussion seems like a digression though, because I don't 
think you are transforming `IS NOT NULL` to `<> 0` or vice versa, right? At 
least I couldn't reproduce it with the latest Docker image.
   
   You can see the relevant FilterPlanNode code below and also verify this with 
the Realtime Quickstart.
   
   <img width="753" height="307" alt="Image" 
src="https://github.com/user-attachments/assets/a0a77ae0-8998-4c4b-ac08-af2ddb29f52c";
 />
   
   <img width="1884" height="820" alt="Image" 
src="https://github.com/user-attachments/assets/cd5d7b0f-845a-4f7b-9784-ab2fd6305d68";
 />
   
   > Yes, because previously it was returning incorrect results (again, as 
defined as the strange Pinot semantics).
   
   The results weren't incorrect. They were what the semantics were. The 
semantics didn't make sense I agree, but if the semantics were defined that way 
I don't agree that we can call the results incorrect.
   
   > Not exactly, given null handling is disabled, what Pinot does is literally 
to compare the value with the default null value for the column (which is 0 or 
MIN_INT, depending on whether the column is a metric or a dimension).
   
   Same as the first inline comment: I don't think Pinot does this. i.e. Pinot 
will not use the `IS NULL / IS NOT NULL` filter and automatically convert it to 
a default null value exact match filter, regardless of any config.
   
   Maybe I am mistaken, could you share a repro?
   
   > This is not a breaking change, it is a bugfix.
   
   Again, the new semantics make sense to me, but whether this is a breaking 
change or not is not up for debate. Query results for existing customers are 
going to change. You can call it a bugfix too, but this is a breaking change.
   
   As a platform at Uber we will have to assess whether this has impact to our 
customers or not. These null handling semantics and oddities were known to us 
and we had documented it in our internal Runbooks too to help users understand 
them.
   
   In fact I was trying to find a way to make this work with Lite Mode without 
triggering a b/w incompatible change like the one introduced in the related PR, 
when I realized that it's already been addressed.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to