viirya commented on PR #2570:
URL: https://github.com/apache/arrow-rs/pull/2570#issuecomment-1226022584

   I'm open to the feature flag naming. `sql_compliant` is used is because this 
is to follow SQL semantics regarding NaN handling. Maybe it sounds too broad as 
currently it only deals with NaN specially.
   
   > Have we considered just making this the default behaviour? 
   
   I'm not sure that this would make sense for other usecases other than SQL.
   
   > Note there is no SQL standard for NaN, and different engines may have 
different behaviors. For instance, in PostgresSQL NaN is only considered equal 
to NaN in sort, but not other cases. Therefore, I think we should clearly 
document the behavior introduced here.
   
   I think that PostgresSQL also treats NaN equal to NaN, as Spark does.
   
   Quote from https://www.postgresql.org/docs/current/datatype-numeric.html:
   
   > In order to allow numeric values to be sorted and used in tree-based 
indexes, PostgreSQL treats NaN values as equal, and greater than all non-NaN 
values.
   
   Just did a test `SELECT double precision 'NaN' = double precision 'NaN';` in 
PostgresSQL, it returns true.
   
   I agree that we should document it clearly. I will update the document.
   
   > Similar question as @tustvold : should we aim to make all the compute 
kernels SQL compliant? in that case we should no longer need a flag like this.
   
   As I answered above for @tustvold's question, I think we need both behaviors 
of compute kernels. For non-SQL usecases, current behavior is correct. But for 
SQL semantics, NaN not equal to NaN will cause practical issue when processing 
data, so we need different behavior with it. That's said, I think that we 
cannot just change all compute kernels to follow SQL semantics.
   
   > Also, could we have some tests for this too?
   
   I have some tests for this already.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to