rtpsw commented on code in PR #34392:
URL: https://github.com/apache/arrow/pull/34392#discussion_r1195522221
##########
cpp/src/arrow/acero/asof_join_node.cc:
##########
@@ -704,21 +846,27 @@ class InputState {
Rehash();
}
memo_.Store(ts, rb, latest_ref_row_, latest_time, GetLatestKey());
+ // negative tolerance means a last-known entry was stored - set `updated` to `true`
updated = memo_.no_future_;
ARROW_ASSIGN_OR_RAISE(advanced, Advance());
} while (advanced);
- if (!memo_.no_future_) { // "updated" was not modified in the loop; set it here
+ if (!memo_.no_future_ && latest_time >= ts) {
Review Comment:
This is where the exposed `latest_time` is used. If `latest_time` is less
than `ts`, we know up front that `updated =
memo_.RemoveEntriesWithLesserTime(ts);` would have no effect, since every entry
has a time no greater than `latest_time` and hence less than `ts`. This is
a small optimization.