gortiz commented on issue #16821:
URL: https://github.com/apache/pinot/issues/16821#issuecomment-3301964304

   > I don't follow, why should the NOT IS NULL get transformed to <> 0?
   
   Because that is the ill-defined semantics when using basic null handling 
(aka null handling is disabled), as documented in 
https://docs.pinot.apache.org/developers/advanced/null-value-support. I think 
this basic null handling is very confusing and incorrect, but it is too late to 
change that decision. 
   
   > But the concern I have is that this is changing results for existing 
queries completely. 
   
   Yes, because previously it was returning incorrect results (again, as 
defined as the strange Pinot semantics).
   
   > Now, the filter would actually be evaluated using null vectors at segment 
level (if they are enabled)
   
   Not exactly, given null handling is disabled, what Pinot does is literally 
to compare the value with the default null value for the column (which is 0 or 
MIN_INT, depending on whether the column is a metric or a dimension). When null 
handling is enabled, Calcite will simplify the expression as false, producing 
the same plan as previously.
   
   > I'll need to check what the impact of this is at Uber, but if we do end up 
going with this approach we need to call this out as a breaking change.
   
   This is not a breaking change, it is a bugfix. Now MSE returns the same 
result as SSE in this situation. If you want actual SQL semantics, you need to 
enable null handling (which can be enabled by default).
   
   To be clear, I'm the one who applied the change, but I strongly suggest we 
start changing MSE to use null handling by default, prioritizing correctness 
over incorrect but slightly more efficient queries. Basic null handling is 
error prone and difficult to explain and understand. But having 2 different 
null handling modes is already too complex, making MSE behave differently than 
SSE when null handling is disabled would mean adding a 3 null handling mode, 
which makes everything even worse.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to