HeartSaVioR commented on issue #27834: revert [SPARK-24640][SQL] Return `NULL` 
from `size(NULL)` by default
URL: https://github.com/apache/spark/pull/27834#issuecomment-596154932
 
 
   This doesn't seem to be just a cosmetic change, given that we have heard of 
an actual use case being affected. It would be OK to leave it as -1 for 
non-aggregate queries, given that -1 can still be distinguished from valid 
values, but in aggregations the -1 is silently treated as a valid value, which 
causes a correctness issue. Avoiding that requires pre-processing (adding a 
column for the result of `size(col)`, changed to NULL where the value is -1) or 
a hacky workaround.
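   
   To illustrate the aggregation concern, here is a minimal plain-Python sketch 
(not Spark code). The helper names `size_legacy`, `size_null`, and 
`avg_ignoring_nulls` are hypothetical and only mimic the semantics under 
discussion, assuming an SQL-style average that skips NULLs:

```python
from statistics import mean

def size_legacy(arr):
    """Hypothetical stand-in for the behaviour where size(NULL) yields -1."""
    return -1 if arr is None else len(arr)

def size_null(arr):
    """Hypothetical stand-in for the behaviour where size(NULL) yields NULL."""
    return None if arr is None else len(arr)

def avg_ignoring_nulls(values):
    """SQL-style AVG: NULL inputs are skipped; an all-NULL input gives NULL."""
    present = [v for v in values if v is not None]
    return mean(present) if present else None

# Three rows of an array column; the middle row is NULL.
rows = [[1, 2, 3], None, [4, 5, 6, 7, 8]]

# With -1, the NULL row is counted as a real size: (3 + -1 + 5) / 3.
legacy_avg = avg_ignoring_nulls([size_legacy(r) for r in rows])

# With NULL, the row is skipped by the aggregate: (3 + 5) / 2.
null_avg = avg_ignoring_nulls([size_null(r) for r in rows])

# The "pre-process" workaround: map -1 back to NULL before aggregating.
patched = [None if s == -1 else s for s in (size_legacy(r) for r in rows)]
patched_avg = avg_ignoring_nulls(patched)
```

   In this sketch the -1 variant produces a lower average without any error or 
warning, while the pre-processed column matches the NULL variant.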
   
   I feel we should have a clear reason for returning -1: what benefit do we 
get from it? At the least there should be some kind of trade-off to weigh if we 
are going to choose one; if it isn't even a trade-off, it's clearly a 
correctness issue we should fix.
   
   @ssimeonov 
   Could you please share the details of the workaround you used?

