realno commented on a change in pull request #1525: URL: https://github.com/apache/arrow-datafusion/pull/1525#discussion_r780589214
##########
File path: datafusion/src/scalar.rs
##########
@@ -526,6 +526,282 @@ macro_rules! eq_array_primitive {
 }
 
 impl ScalarValue {
+    /// Return true if the value is numeric
+    pub fn is_numeric(&self) -> bool {
+        matches!(self,
+            ScalarValue::Float32(_)
+            | ScalarValue::Float64(_)
+            | ScalarValue::Decimal128(_, _, _)
+            | ScalarValue::Int8(_)
+            | ScalarValue::Int16(_)
+            | ScalarValue::Int32(_)
+            | ScalarValue::Int64(_)
+            | ScalarValue::UInt8(_)
+            | ScalarValue::UInt16(_)
+            | ScalarValue::UInt32(_)
+            | ScalarValue::UInt64(_)
+        )
+    }
+
+    /// Add two numeric ScalarValues
+    pub fn add(lhs: &ScalarValue, rhs: &ScalarValue) -> Result<ScalarValue> {

Review comment:
The value differs from Postgres and Google because stddev comes in two versions: 1. population and 2. sample. The one in this PR is population; the default in Postgres and Google appears to be sample. The two calculations differ only slightly, so I can include the sample version in this PR as well. Good catch!

So my proposal is to add two functions, stddev_pop and stddev_samp (following the Postgres standard), and have stddev default to stddev_samp. Does this look reasonable? @alamb

I will also look into the inconsistency between query runs. Thanks!

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
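
For reference, the two stddev variants mentioned in the review comment differ only in the divisor applied to the sum of squared deviations: n for population, n - 1 for sample. The sketch below illustrates that difference on a plain slice. It is not the PR's accumulator code: the names stddev_pop and stddev_samp follow the proposal, but the helper stddev_with_ddof, the &[f64] input, and the Option<f64> return type are assumptions made only for this illustration.

```rust
/// Standard deviation with a configurable divisor `n - ddof`.
/// `stddev_with_ddof` and `ddof` are hypothetical names for this sketch,
/// not part of DataFusion.
fn stddev_with_ddof(values: &[f64], ddof: usize) -> Option<f64> {
    let n = values.len();
    // Not enough values for the requested divisor (assumed to map to NULL).
    if n <= ddof {
        return None;
    }
    let mean = values.iter().sum::<f64>() / n as f64;
    // Sum of squared deviations from the mean.
    let sum_sq: f64 = values.iter().map(|v| (v - mean).powi(2)).sum();
    Some((sum_sq / (n - ddof) as f64).sqrt())
}

/// Population standard deviation: divide by n.
fn stddev_pop(values: &[f64]) -> Option<f64> {
    stddev_with_ddof(values, 0)
}

/// Sample standard deviation: divide by n - 1.
fn stddev_samp(values: &[f64]) -> Option<f64> {
    stddev_with_ddof(values, 1)
}
```

With the values 2, 4, 4, 4, 5, 5, 7, 9 the mean is 5 and the sum of squared deviations is 32, so stddev_pop gives sqrt(32 / 8) = 2 while stddev_samp gives sqrt(32 / 7), roughly 2.138. A gap of that kind would be consistent with the mismatch against Postgres described above.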