gortiz commented on issue #16821: URL: https://github.com/apache/pinot/issues/16821#issuecomment-3301964304
> I don't follow, why should the NOT IS NULL get transformed to <> 0? Because that is the ill-defined semantics when using basic null handling (aka null handling is disabled), as documented in https://docs.pinot.apache.org/developers/advanced/null-value-support. I think this basic null handling is very confusing and incorrect, but it is too late to change that decision. > But the concern I have is that this is changing results for existing queries completely. Yes, because previously it was returning incorrect results (again, as defined as the strange Pinot semantics). > Now, the filter would actually be evaluated using null vectors at segment level (if they are enabled) Not exactly, given null handling is disabled, what Pinot does is literally to compare the value with the default null value for the column (which is 0 or MIN_INT, depending on whether the column is a metric or a dimension). When null handling is enabled, Calcite will simplify the expression as false, producing the same plan as previously. > I'll need to check what the impact of this is at Uber, but if we do end up going with this approach we need to call this out as a breaking change. This is not a breaking change, it is a bugfix. Now MSE returns the same result as SSE in this situation. If you want actual SQL semantics, you need to enable null handling (which can be enabled by default). To be clear, I'm the one who applied the change, but I strongly suggest we start changing MSE to use null handling by default, prioritizing correctness over incorrect but slightly more efficient queries. Basic null handling is error prone and difficult to explain and understand. But having 2 different null handling modes is already too complex, making MSE behave differently than SSE when null handling is disabled would mean adding a 3 null handling mode, which makes everything even worse. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
