alamb commented on code in PR #8662:
URL: https://github.com/apache/arrow-datafusion/pull/8662#discussion_r1437114322


##########
datafusion/physical-expr/src/aggregate/first_last.rs:
##########
@@ -459,10 +493,52 @@ impl LastValueAccumulator {
     }
 
     // Updates state with the values in the given row.
-    fn update_with_new_row(&mut self, row: &[ScalarValue]) {
-        self.last = row[0].clone();
-        self.orderings = row[1..].to_vec();
-        self.is_set = true;
+    fn update_with_new_row(&mut self, row: &[ScalarValue]) -> Result<()> {
+        let value = &row[0];
+        let orderings = &row[1..];
+        // Update when
+        // - no value in the state
+        // - There is no specific requirement, but a new value (most recent 
entry in terms of execution)
+        // - There is a more recent entry in terms of requirement
+        if !self.is_set
+            || self.orderings.is_empty()
+            || compare_rows(

Review Comment:
   ~I agree that since it is a single column max comparison this is probably 
fine (and no worse than the current implementation). If  we need to optimize 
performance we could probably implement specialized implementations (like 
`FirstValue<ArrowPrimitiveType>` and skip the copying entirely.~
   
   ~That is likely a premature optimization at this point~
   
   Update: Row format may well be a good idea (not for this PR). I will wait 
until I have reviewed this code to offer a more informed opinion



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to