[GitHub] [arrow-datafusion] realno commented on a change in pull request #1525: Add stddev operator

GitBox Fri, 07 Jan 2022 15:52:07 -0800


realno commented on a change in pull request #1525:
URL: https://github.com/apache/arrow-datafusion/pull/1525#discussion_r780589214




##########
File path: datafusion/src/scalar.rs
##########
@@ -526,6 +526,282 @@ macro_rules! eq_array_primitive {
 }
 
 impl ScalarValue {
+    /// Return true if the value is numeric
+    pub fn is_numeric(&self) -> bool {
+        matches!(self,
+            ScalarValue::Float32(_)
+            | ScalarValue::Float64(_)
+            | ScalarValue::Decimal128(_, _, _)
+            | ScalarValue::Int8(_)
+            | ScalarValue::Int16(_)
+            | ScalarValue::Int32(_)
+            | ScalarValue::Int64(_)
+            | ScalarValue::UInt8(_)
+            | ScalarValue::UInt16(_)
+            | ScalarValue::UInt32(_)
+            | ScalarValue::UInt64(_)
+        )
+    }
+
+    /// Add two numeric ScalarValues
+    pub fn add(lhs: &ScalarValue, rhs: &ScalarValue) -> Result<ScalarValue> {

Review comment:
       The result differs from Postgres and Google because stddev comes in two 
versions: 1. population and 2. sample. The one in this PR is population, and it 
looks like the default in Postgres and Google is sample. The two calculations 
differ only in the divisor (n for population, n - 1 for sample), so I can 
include the sample version in this PR as well. Good catch! 
   
   So my proposal is to add two functions, stddev_pop and stddev_samp 
(following the Postgres naming), and have stddev default to stddev_samp. Does 
this look reasonable? @alamb 
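
   To make the difference concrete, here is a minimal sketch of the two variants over a plain slice of f64. It is not the code in this PR: the free functions and the helper `sum_sq_dev` are hypothetical and only borrow the proposed names. Returning `None` from the sample version for fewer than two values is an assumption of the sketch.

```rust
/// Hypothetical helper: sum of squared deviations from the mean, plus the count.
fn sum_sq_dev(values: &[f64]) -> (f64, usize) {
    let n = values.len();
    if n == 0 {
        return (0.0, 0);
    }
    let mean = values.iter().sum::<f64>() / n as f64;
    let ss = values.iter().map(|v| (v - mean).powi(2)).sum::<f64>();
    (ss, n)
}

/// Population standard deviation: divide the squared deviations by n.
pub fn stddev_pop(values: &[f64]) -> Option<f64> {
    let (ss, n) = sum_sq_dev(values);
    if n == 0 {
        None
    } else {
        Some((ss / n as f64).sqrt())
    }
}

/// Sample standard deviation: divide by n - 1 (Bessel's correction).
/// Assumption of this sketch: fewer than two values yields None.
pub fn stddev_samp(values: &[f64]) -> Option<f64> {
    let (ss, n) = sum_sq_dev(values);
    if n < 2 {
        None
    } else {
        Some((ss / (n - 1) as f64).sqrt())
    }
}
```

   For the values 2, 4, 4, 4, 5, 5, 7, 9 the squared deviations sum to 32, so the population version gives sqrt(32 / 8) = 2 while the sample version gives sqrt(32 / 7), which is slightly larger. A mismatch of that size against another engine would be consistent with the two defaults differing.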
   
   And I will look into the inconsistency between query runs. Thanks! 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
