HeartSaVioR commented on issue #27834: revert [SPARK-24640][SQL] Return `NULL` from `size(NULL)` by default
URL: https://github.com/apache/spark/pull/27834#issuecomment-596154932

This doesn't seem to be just a cosmetic change, given that we are hearing about an actual use case being affected by it. It would be OK to keep -1 for non-aggregate queries, since -1 can still be distinguished from valid values. In aggregations, however, -1 is silently treated as a "valid" value, which causes a correctness issue. Avoiding that requires either pre-processing (keeping a column for the result of `size(col)`, but changing it to NULL when the value is -1) or a hacky workaround.

I feel we should have a clear reason for returning -1: what benefit do we get from it? At the very least there should be some kind of trade-off if we are going to choose one behavior over the other. If it isn't even a trade-off, it's clearly a correctness issue we should fix.

@ssimeonov Could you please share the details of the workaround you took?
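To make the aggregation concern concrete, here is a minimal sketch in plain Python. It does not use Spark; it only mimics the two `size(NULL)` behaviors under discussion and an average that skips NULLs the way SQL aggregates do. The helper names (`legacy_size`, `null_safe_size`, `sql_avg`, `minus_one_to_null`) and the sample rows are hypothetical, chosen for illustration only.

```python
def legacy_size(arr):
    """Mimics size() returning -1 for a NULL input (assumed legacy behavior)."""
    return -1 if arr is None else len(arr)


def null_safe_size(arr):
    """Mimics size() returning NULL (None here) for a NULL input."""
    return None if arr is None else len(arr)


def sql_avg(values):
    """Average that ignores NULLs, as SQL aggregate functions do."""
    non_null = [v for v in values if v is not None]
    return sum(non_null) / len(non_null) if non_null else None


def minus_one_to_null(size_value):
    """The pre-processing workaround: turn -1 back into NULL before aggregating."""
    return None if size_value == -1 else size_value


# Three rows of an array column; the middle row is NULL.
rows = [[1, 2, 3], None, [4, 5, 6, 7, 8]]

# With -1, the NULL row is silently counted as a valid size.
legacy_avg = sql_avg(legacy_size(r) for r in rows)

# With NULL, the aggregate skips the row.
null_avg = sql_avg(null_safe_size(r) for r in rows)

# Pre-processing the -1 values gives the same result as the NULL behavior.
workaround_avg = sql_avg(minus_one_to_null(legacy_size(r)) for r in rows)

print(legacy_avg, null_avg, workaround_avg)
```

In this sketch the legacy average is pulled down by the -1 entry, while the NULL-returning variant and the pre-processed variant agree with each other. Outside an aggregate the -1 is still recognizable as a sentinel, which is why the comment treats the two cases differently.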
