Re: [PR] perf: optimize `NthValue` when `ignore_nulls` is true [datafusion]
Jefffrey commented on PR #19496: URL: https://github.com/apache/datafusion/pull/19496#issuecomment-3731540211 Thanks @mzabaluev -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] perf: optimize `NthValue` when `ignore_nulls` is true [datafusion]
Jefffrey merged PR #19496: URL: https://github.com/apache/datafusion/pull/19496 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] perf: optimize `NthValue` when `ignore_nulls` is true [datafusion]
mzabaluev commented on code in PR #19496:
URL: https://github.com/apache/datafusion/pull/19496#discussion_r2668785046
##
datafusion/functions-window/src/nth_value.rs:
##
@@ -370,6 +371,33 @@ impl PartitionEvaluator for NthValueEvaluator {
fn memoize(&mut self, state: &mut WindowAggState) -> Result<()> {
let out = &state.out_col;
let size = out.len();
+if self.ignore_nulls {
+match self.state.kind {
+// Prune on first non-null output in case of FIRST_VALUE
+NthValueKind::First => {
+if let Some(nulls) = out.nulls() {
+if self.state.finalized_result.is_none() {
Review Comment:
I felt that since this path has gone this far to handle nulls, it can handle
this case as well without further complicating the logic: it's a do-nothing
`else` clause vs. having to escape the `if`/`match` block somehow. The _respect
nulls_ path also needs to match the function kind all over again.
##
datafusion/functions-window/src/nth_value.rs:
##
@@ -370,6 +371,33 @@ impl PartitionEvaluator for NthValueEvaluator {
fn memoize(&mut self, state: &mut WindowAggState) -> Result<()> {
let out = &state.out_col;
let size = out.len();
+if self.ignore_nulls {
+match self.state.kind {
+// Prune on first non-null output in case of FIRST_VALUE
+NthValueKind::First => {
+if let Some(nulls) = out.nulls() {
+if self.state.finalized_result.is_none() {
Review Comment:
I felt that since this path has gone this far to handle nulls, it can handle
this case as well without further complicating the logic: it's a do-nothing
`else` clause vs. having to escape the `if`/`match` blocks somehow. The
_respect nulls_ path also needs to match the function kind all over again.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] perf: optimize `NthValue` when `ignore_nulls` is true [datafusion]
mzabaluev commented on code in PR #19496:
URL: https://github.com/apache/datafusion/pull/19496#discussion_r2668785046
##
datafusion/functions-window/src/nth_value.rs:
##
@@ -370,6 +371,33 @@ impl PartitionEvaluator for NthValueEvaluator {
fn memoize(&mut self, state: &mut WindowAggState) -> Result<()> {
let out = &state.out_col;
let size = out.len();
+if self.ignore_nulls {
+match self.state.kind {
+// Prune on first non-null output in case of FIRST_VALUE
+NthValueKind::First => {
+if let Some(nulls) = out.nulls() {
+if self.state.finalized_result.is_none() {
Review Comment:
I felt that since this path gone this far to handle nulls, it can handle
this case as well without further complicating the logic: it's a do-nothing
`else` clause vs. having to escape the `if`/`match` block somehow. The _respect
nulls_ path also needs to match the function kind all over again.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] perf: optimize `NthValue` when `ignore_nulls` is true [datafusion]
mzabaluev commented on code in PR #19496:
URL: https://github.com/apache/datafusion/pull/19496#discussion_r2668050132
##
datafusion/functions-window/src/nth_value.rs:
##
@@ -370,6 +371,33 @@ impl PartitionEvaluator for NthValueEvaluator {
fn memoize(&mut self, state: &mut WindowAggState) -> Result<()> {
let out = &state.out_col;
let size = out.len();
+if self.ignore_nulls {
+match self.state.kind {
+// Prune on first non-null output in case of FIRST_VALUE
+NthValueKind::First => {
+if let Some(nulls) = out.nulls() {
+if self.state.finalized_result.is_none() {
+if let Some(valid_index) =
nulls.valid_indices().next() {
+let result =
+ScalarValue::try_from_array(out,
valid_index)?;
+self.state.finalized_result = Some(result);
Review Comment:
As I understand, the code for the _respect nulls_ case relies on serial
evaluation, where `finalized_result` is set with the first row and reused ever
since. So the last value with be the same as the first.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] perf: optimize `NthValue` when `ignore_nulls` is true [datafusion]
mzabaluev commented on code in PR #19496:
URL: https://github.com/apache/datafusion/pull/19496#discussion_r2668015954
##
datafusion/functions-window/src/nth_value.rs:
##
@@ -370,6 +371,33 @@ impl PartitionEvaluator for NthValueEvaluator {
fn memoize(&mut self, state: &mut WindowAggState) -> Result<()> {
let out = &state.out_col;
let size = out.len();
+if self.ignore_nulls {
+match self.state.kind {
+// Prune on first non-null output in case of FIRST_VALUE
+NthValueKind::First => {
+if let Some(nulls) = out.nulls() {
+if self.state.finalized_result.is_none() {
+if let Some(valid_index) =
nulls.valid_indices().next() {
+let result =
+ScalarValue::try_from_array(out,
valid_index)?;
+self.state.finalized_result = Some(result);
+} else {
+// The output is empty or all nulls, ignore
+}
+}
+if state.window_frame_range.start <
state.window_frame_range.end {
+state.window_frame_range.start =
+state.window_frame_range.end - 1;
+}
Review Comment:
Yes, the intent is to preserve this optimization. In this case, we need one
preceding value for ordering checks in a `RANGE` frame, otherwise the updated
range is not used.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] perf: optimize `NthValue` when `ignore_nulls` is true [datafusion]
Jefffrey commented on code in PR #19496:
URL: https://github.com/apache/datafusion/pull/19496#discussion_r2665102087
##
datafusion/functions-window/src/nth_value.rs:
##
@@ -370,6 +371,33 @@ impl PartitionEvaluator for NthValueEvaluator {
fn memoize(&mut self, state: &mut WindowAggState) -> Result<()> {
let out = &state.out_col;
let size = out.len();
+if self.ignore_nulls {
+match self.state.kind {
+// Prune on first non-null output in case of FIRST_VALUE
+NthValueKind::First => {
+if let Some(nulls) = out.nulls() {
+if self.state.finalized_result.is_none() {
+if let Some(valid_index) =
nulls.valid_indices().next() {
+let result =
+ScalarValue::try_from_array(out,
valid_index)?;
+self.state.finalized_result = Some(result);
Review Comment:
I notice for below we memoize the last value in the `out` column whereas
here we do the first; does it make a difference?
##
datafusion/functions-window/src/nth_value.rs:
##
@@ -370,6 +371,33 @@ impl PartitionEvaluator for NthValueEvaluator {
fn memoize(&mut self, state: &mut WindowAggState) -> Result<()> {
let out = &state.out_col;
let size = out.len();
+if self.ignore_nulls {
+match self.state.kind {
+// Prune on first non-null output in case of FIRST_VALUE
+NthValueKind::First => {
+if let Some(nulls) = out.nulls() {
+if self.state.finalized_result.is_none() {
+if let Some(valid_index) =
nulls.valid_indices().next() {
+let result =
+ScalarValue::try_from_array(out,
valid_index)?;
+self.state.finalized_result = Some(result);
+} else {
+// The output is empty or all nulls, ignore
+}
+}
+if state.window_frame_range.start <
state.window_frame_range.end {
+state.window_frame_range.start =
+state.window_frame_range.end - 1;
+}
Review Comment:
Is this equivalent to the `is_prunable` check (for `NthValueKind::First`)
where it ensures `n_range` > 0 and then if pruning it does
`state.window_frame_range.start =
state.window_frame_range.end.saturating_sub(buffer_size)`?
##
datafusion/functions-window/src/nth_value.rs:
##
@@ -370,6 +371,33 @@ impl PartitionEvaluator for NthValueEvaluator {
fn memoize(&mut self, state: &mut WindowAggState) -> Result<()> {
let out = &state.out_col;
let size = out.len();
+if self.ignore_nulls {
+match self.state.kind {
+// Prune on first non-null output in case of FIRST_VALUE
+NthValueKind::First => {
+if let Some(nulls) = out.nulls() {
+if self.state.finalized_result.is_none() {
Review Comment:
There's a small pedantic case here; so if we have no null buffer (no nulls)
then we fallback to existing behaviour below. If we have a null buffer, we try
memoize if we haven't already but we'll still return from this point onwards
(we won't fallback). But if the null buffer has no nulls then we don't fall
through like before, we just handle the return here.
I guess it is the same result and I don't think it's too much of a concern
at runtime, but it's just a potential path I find inconsistent when reading
this code flow 🤔
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] perf: optimize `NthValue` when `ignore_nulls` is true [datafusion]
Jefffrey commented on code in PR #19496:
URL: https://github.com/apache/datafusion/pull/19496#discussion_r2663263379
##
datafusion/functions-window/src/nth_value.rs:
##
@@ -519,6 +467,87 @@ impl PartitionEvaluator for NthValueEvaluator {
}
}
+impl NthValueEvaluator {
+fn valid_index(&self, array: &ArrayRef, range: &Range) ->
Option {
+let n_range = range.end - range.start;
+if self.ignore_nulls {
+// Calculate valid indices, inside the window frame boundaries.
+let slice = array.slice(range.start, n_range);
+if let Some(nulls) = slice.nulls() {
Review Comment:
It would indeed be a weird/pedantic case; however here it might be
appropriate as we are slicing the original array so it could be likely we have
an array with nulls but slice into a region with no nulls 🤔
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] perf: optimize `NthValue` when `ignore_nulls` is true [datafusion]
mzabaluev commented on code in PR #19496:
URL: https://github.com/apache/datafusion/pull/19496#discussion_r2661996145
##
datafusion/functions-window/src/nth_value.rs:
##
@@ -519,6 +467,87 @@ impl PartitionEvaluator for NthValueEvaluator {
}
}
+impl NthValueEvaluator {
+fn valid_index(&self, array: &ArrayRef, range: &Range) ->
Option {
+let n_range = range.end - range.start;
+if self.ignore_nulls {
+// Calculate valid indices, inside the window frame boundaries.
+let slice = array.slice(range.start, n_range);
+if let Some(nulls) = slice.nulls() {
Review Comment:
Ah yes, I somehow confused the null count check with valid_indices
retrieving no values, while it's the opposite. Also I'm a bit new to
Arrow/Datafusion, so it's not clear to me if there are non-pathological
scenarios that could produce an array with the null buffer present, but
containing no nulls.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] perf: optimize `NthValue` when `ignore_nulls` is true [datafusion]
mzabaluev commented on code in PR #19496:
URL: https://github.com/apache/datafusion/pull/19496#discussion_r2661969517
##
datafusion/functions-window/src/nth_value.rs:
##
@@ -519,6 +467,87 @@ impl PartitionEvaluator for NthValueEvaluator {
}
}
+impl NthValueEvaluator {
+fn valid_index(&self, array: &ArrayRef, range: &Range) ->
Option {
+let n_range = range.end - range.start;
+if self.ignore_nulls {
+// Calculate valid indices, inside the window frame boundaries.
+let slice = array.slice(range.start, n_range);
+if let Some(nulls) = slice.nulls() {
+return match self.state.kind {
+NthValueKind::First => {
+nulls.valid_indices().next().map(|idx| idx +
range.start)
+}
+NthValueKind::Last => {
+nulls.valid_indices().last().map(|idx| idx +
range.start)
+}
+NthValueKind::Nth => {
+match self.n.cmp(&0) {
+Ordering::Greater => {
+// SQL indices are not 0-based.
+let index = (self.n as usize) - 1;
+nulls
+.valid_indices()
+.nth(index)
+.map(|idx| idx + range.start)
+}
+Ordering::Less => {
+let reverse_index = (-self.n) as usize;
+if n_range < reverse_index {
+// Outside the range, return NULL to avoid
allocating
+// for the sliding window that will be
discarded in the end.
+return None;
+}
+let mut window =
VecDeque::with_capacity(reverse_index);
Review Comment:
Thanks for the suggestion! With it, the implementation is faster across the
board accordingly to the added benchmark.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] perf: optimize `NthValue` when `ignore_nulls` is true [datafusion]
mzabaluev commented on PR #19496: URL: https://github.com/apache/datafusion/pull/19496#issuecomment-3710992619 Benchmark results against the branch base ``` nth_value_ignore_nulls/first_value_expanding/0%_nulls time: [229.32 µs 229.97 µs 230.68 µs] change: [−4.1052% −3.6770% −3.2233%] (p = 0.00 < 0.05) Performance has improved. Found 11 outliers among 100 measurements (11.00%) 3 (3.00%) low mild 7 (7.00%) high mild 1 (1.00%) high severe nth_value_ignore_nulls/last_value_expanding/0%_nulls time: [229.63 µs 230.18 µs 230.78 µs] change: [−4.1494% −3.7747% −3.3808%] (p = 0.00 < 0.05) Performance has improved. Found 10 outliers among 100 measurements (10.00%) 2 (2.00%) low mild 8 (8.00%) high mild nth_value_ignore_nulls/nth_value_10_expanding/0%_nulls time: [231.01 µs 231.38 µs 231.78 µs] change: [−3.6131% −3.3527% −3.0648%] (p = 0.00 < 0.05) Performance has improved. Found 5 outliers among 100 measurements (5.00%) 3 (3.00%) high mild 2 (2.00%) high severe nth_value_ignore_nulls/nth_value_neg10_expanding/0%_nulls time: [231.72 µs 232.23 µs 232.83 µs] change: [−3.2292% −2.9837% −2.7387%] (p = 0.00 < 0.05) Performance has improved. Found 7 outliers among 100 measurements (7.00%) 1 (1.00%) low mild 3 (3.00%) high mild 3 (3.00%) high severe nth_value_ignore_nulls/first_value_sliding_100/0%_nulls time: [231.63 µs 231.99 µs 232.41 µs] change: [−3.1696% −2.9230% −2.6831%] (p = 0.00 < 0.05) Performance has improved. Found 6 outliers among 100 measurements (6.00%) 2 (2.00%) low mild 2 (2.00%) high mild 2 (2.00%) high severe nth_value_ignore_nulls/last_value_sliding_100/0%_nulls time: [231.89 µs 232.27 µs 232.71 µs] change: [−3.3218% −3.0671% −2.8237%] (p = 0.00 < 0.05) Performance has improved. Found 7 outliers among 100 measurements (7.00%) 1 (1.00%) low severe 1 (1.00%) low mild 5 (5.00%) high mild nth_value_ignore_nulls/first_value_expanding/30%_nulls time: [519.53 µs 521.02 µs 522.69 µs] change: [−98.032% −98.025% −98.018%] (p = 0.00 < 0.05) Performance has improved. Found 5 outliers among 100 measurements (5.00%) 1 (1.00%) low mild 3 (3.00%) high mild 1 (1.00%) high severe nth_value_ignore_nulls/last_value_expanding/30%_nulls time: [21.887 ms 21.925 ms 21.970 ms] change: [−15.961% −15.752% −15.531%] (p = 0.00 < 0.05) Performance has improved. Found 7 outliers among 100 measurements (7.00%) 2 (2.00%) high mild 5 (5.00%) high severe nth_value_ignore_nulls/nth_value_10_expanding/30%_nulls time: [605.66 µs 606.99 µs 608.56 µs] change: [−97.648% −97.640% −97.632%] (p = 0.00 < 0.05) Performance has improved. Found 7 outliers among 100 measurements (7.00%) 1 (1.00%) low severe 1 (1.00%) low mild 5 (5.00%) high mild nth_value_ignore_nulls/nth_value_neg10_expanding/30%_nulls time: [12.734 ms 12.765 ms 12.804 ms] change: [−50.084% −49.902% −49.690%] (p = 0.00 < 0.05) Performance has improved. Found 15 outliers among 100 measurements (15.00%) 10 (10.00%) high mild 5 (5.00%) high severe nth_value_ignore_nulls/first_value_sliding_100/30%_nulls time: [459.39 µs 460.95 µs 462.60 µs] change: [−73.979% −73.883% −73.779%] (p = 0.00 < 0.05) Performance has improved. Found 2 outliers among 100 measurements (2.00%) 2 (2.00%) high mild Benchmarking nth_value_ignore_nulls/last_value_sliding_100/30%_nulls: Warming up for 3. s Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 5.5s, enable flat sampling, or reduce sample count to 60. nth_value_ignore_nulls/last_value_sliding_100/30%_nulls time: [1.0836 ms 1.0861 ms 1.0889 ms] change: [−37.896% −37.683% −37.418%] (p = 0.00 < 0.05) Performance has improved. Found 3 outliers among 100 measurements (3.00%) 1 (1.00%) high mild 2 (2.00%) high severe nth_value_ignore_nulls/first_value_expanding/50%_nulls
Re: [PR] perf: optimize `NthValue` when `ignore_nulls` is true [datafusion]
Jefffrey commented on code in PR #19496:
URL: https://github.com/apache/datafusion/pull/19496#discussion_r2659448260
##
datafusion/functions-window/src/nth_value.rs:
##
@@ -519,6 +467,87 @@ impl PartitionEvaluator for NthValueEvaluator {
}
}
+impl NthValueEvaluator {
+fn valid_index(&self, array: &ArrayRef, range: &Range) ->
Option {
+let n_range = range.end - range.start;
+if self.ignore_nulls {
+// Calculate valid indices, inside the window frame boundaries.
+let slice = array.slice(range.start, n_range);
+if let Some(nulls) = slice.nulls() {
+return match self.state.kind {
+NthValueKind::First => {
+nulls.valid_indices().next().map(|idx| idx +
range.start)
+}
+NthValueKind::Last => {
+nulls.valid_indices().last().map(|idx| idx +
range.start)
+}
+NthValueKind::Nth => {
+match self.n.cmp(&0) {
+Ordering::Greater => {
+// SQL indices are not 0-based.
+let index = (self.n as usize) - 1;
+nulls
+.valid_indices()
+.nth(index)
+.map(|idx| idx + range.start)
+}
+Ordering::Less => {
+let reverse_index = (-self.n) as usize;
+if n_range < reverse_index {
+// Outside the range, return NULL to avoid
allocating
+// for the sliding window that will be
discarded in the end.
+return None;
+}
+let mut window =
VecDeque::with_capacity(reverse_index);
Review Comment:
I think if we do something like this it can work without needing a separate
container?
```rust
Ordering::Less => {
let reverse_index = (-self.n) as usize;
let total_len = nulls.len();
let null_count = nulls.null_count();
let valid_indices_len = total_len - null_count;
if reverse_index > valid_indices_len {
return None;
}
nulls
.valid_indices()
.nth(valid_indices_len - reverse_index)
.map(|idx| idx + offset)
}
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] perf: optimize `NthValue` when `ignore_nulls` is true [datafusion]
Jefffrey commented on PR #19496: URL: https://github.com/apache/datafusion/pull/19496#issuecomment-3707779972 > > Would be good if there were some benchmarks to check this is indeed a performance boost > > I have added a benchmark. It shows significant improvement in many cases, but it also shows that the `VecDeque` is problematic. I will try to replace it with a ring buffer. Would you be able to post the benchmark results for us to see? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] perf: optimize `NthValue` when `ignore_nulls` is true [datafusion]
Jefffrey commented on code in PR #19496:
URL: https://github.com/apache/datafusion/pull/19496#discussion_r2659440568
##
datafusion/functions-window/src/nth_value.rs:
##
@@ -519,6 +467,87 @@ impl PartitionEvaluator for NthValueEvaluator {
}
}
+impl NthValueEvaluator {
+fn valid_index(&self, array: &ArrayRef, range: &Range) ->
Option {
+let n_range = range.end - range.start;
+if self.ignore_nulls {
+// Calculate valid indices, inside the window frame boundaries.
+let slice = array.slice(range.start, n_range);
+if let Some(nulls) = slice.nulls() {
Review Comment:
We can chain it like
```rust
if let Some(nulls) = slice.nulls()
&& nulls.null_count() > 0
{
```
> and the iterator returned by `nulls.valid_indices()` being empty will do
the right thing anyway?
But I think it would still iterate through the whole null buffer anyway? So
if we're looking via a performance lens perhaps this approach is worth
considering.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] perf: optimize `NthValue` when `ignore_nulls` is true [datafusion]
Jefffrey commented on code in PR #19496:
URL: https://github.com/apache/datafusion/pull/19496#discussion_r2659436621
##
datafusion/functions-window/src/nth_value.rs:
##
@@ -519,6 +467,87 @@ impl PartitionEvaluator for NthValueEvaluator {
}
}
+impl NthValueEvaluator {
+fn valid_index(&self, array: &ArrayRef, range: &Range) ->
Option {
+let n_range = range.end - range.start;
+if self.ignore_nulls {
+// Calculate valid indices, inside the window frame boundaries.
+let slice = array.slice(range.start, n_range);
+if let Some(nulls) = slice.nulls() {
+return match self.state.kind {
+NthValueKind::First => {
+nulls.valid_indices().next().map(|idx| idx +
range.start)
+}
+NthValueKind::Last => {
+nulls.valid_indices().last().map(|idx| idx +
range.start)
+}
+NthValueKind::Nth => {
+match self.n.cmp(&0) {
+Ordering::Greater => {
+// SQL indices are not 0-based.
+let index = (self.n as usize) - 1;
+nulls
+.valid_indices()
+.nth(index)
+.map(|idx| idx + range.start)
+}
+Ordering::Less => {
+let reverse_index = (-self.n) as usize;
+if n_range < reverse_index {
+// Outside the range, return NULL to avoid
allocating
+// for the sliding window that will be
discarded in the end.
+return None;
+}
+let mut window =
VecDeque::with_capacity(reverse_index);
Review Comment:
> Unfortunately, `BitIndexIterator` is not bidirectional.
Ah you're right, I was mistakenly thinking of
[`BitIterator`](https://docs.rs/arrow/latest/arrow/util/bit_iterator/struct.BitIterator.html#impl-DoubleEndedIterator-for-BitIterator%3C'_%3E)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] perf: optimize `NthValue` when `ignore_nulls` is true [datafusion]
mzabaluev commented on code in PR #19496:
URL: https://github.com/apache/datafusion/pull/19496#discussion_r2659217039
##
datafusion/functions-window/src/nth_value.rs:
##
@@ -519,6 +467,87 @@ impl PartitionEvaluator for NthValueEvaluator {
}
}
+impl NthValueEvaluator {
+fn valid_index(&self, array: &ArrayRef, range: &Range) ->
Option {
+let n_range = range.end - range.start;
+if self.ignore_nulls {
+// Calculate valid indices, inside the window frame boundaries.
+let slice = array.slice(range.start, n_range);
+if let Some(nulls) = slice.nulls() {
+return match self.state.kind {
+NthValueKind::First => {
+nulls.valid_indices().next().map(|idx| idx +
range.start)
+}
+NthValueKind::Last => {
+nulls.valid_indices().last().map(|idx| idx +
range.start)
+}
+NthValueKind::Nth => {
+match self.n.cmp(&0) {
+Ordering::Greater => {
+// SQL indices are not 0-based.
+let index = (self.n as usize) - 1;
+nulls
+.valid_indices()
+.nth(index)
+.map(|idx| idx + range.start)
+}
+Ordering::Less => {
+let reverse_index = (-self.n) as usize;
+if n_range < reverse_index {
+// Outside the range, return NULL to avoid
allocating
+// for the sliding window that will be
discarded in the end.
+return None;
+}
+let mut window =
VecDeque::with_capacity(reverse_index);
Review Comment:
The queue is a bad solution, indeed. I think a simple ring buffer will do
much better here.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] perf: optimize `NthValue` when `ignore_nulls` is true [datafusion]
mzabaluev commented on PR #19496: URL: https://github.com/apache/datafusion/pull/19496#issuecomment-3707453071 > Would be good if there were some benchmarks to check this is indeed a performance boost I have added a benchmark. It shows significant improvement in many cases, but it also shows that the `VecDeque` is problematic. I will try to replace it with a ring buffer. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] perf: optimize `NthValue` when `ignore_nulls` is true [datafusion]
mzabaluev commented on code in PR #19496:
URL: https://github.com/apache/datafusion/pull/19496#discussion_r2659184727
##
datafusion/functions-window/src/nth_value.rs:
##
@@ -519,6 +467,87 @@ impl PartitionEvaluator for NthValueEvaluator {
}
}
+impl NthValueEvaluator {
+fn valid_index(&self, array: &ArrayRef, range: &Range) ->
Option {
+let n_range = range.end - range.start;
+if self.ignore_nulls {
+// Calculate valid indices, inside the window frame boundaries.
+let slice = array.slice(range.start, n_range);
+if let Some(nulls) = slice.nulls() {
Review Comment:
That will be another branch, and the iterator returned by
`nulls.valid_indices()` being empty will do the right thing anyway?
##
datafusion/functions-window/src/nth_value.rs:
##
@@ -519,6 +467,87 @@ impl PartitionEvaluator for NthValueEvaluator {
}
}
+impl NthValueEvaluator {
+fn valid_index(&self, array: &ArrayRef, range: &Range) ->
Option {
+let n_range = range.end - range.start;
+if self.ignore_nulls {
+// Calculate valid indices, inside the window frame boundaries.
+let slice = array.slice(range.start, n_range);
+if let Some(nulls) = slice.nulls() {
+return match self.state.kind {
+NthValueKind::First => {
+nulls.valid_indices().next().map(|idx| idx +
range.start)
+}
+NthValueKind::Last => {
+nulls.valid_indices().last().map(|idx| idx +
range.start)
+}
+NthValueKind::Nth => {
+match self.n.cmp(&0) {
+Ordering::Greater => {
+// SQL indices are not 0-based.
+let index = (self.n as usize) - 1;
+nulls
+.valid_indices()
+.nth(index)
+.map(|idx| idx + range.start)
+}
Review Comment:
The nulls buffer is in a slice limited by the range, so any valid index
should be within the range (adjusted with the offset).
##
datafusion/functions-window/src/nth_value.rs:
##
@@ -519,6 +467,87 @@ impl PartitionEvaluator for NthValueEvaluator {
}
}
+impl NthValueEvaluator {
+fn valid_index(&self, array: &ArrayRef, range: &Range) ->
Option {
+let n_range = range.end - range.start;
+if self.ignore_nulls {
+// Calculate valid indices, inside the window frame boundaries.
+let slice = array.slice(range.start, n_range);
+if let Some(nulls) = slice.nulls() {
+return match self.state.kind {
+NthValueKind::First => {
+nulls.valid_indices().next().map(|idx| idx +
range.start)
+}
+NthValueKind::Last => {
+nulls.valid_indices().last().map(|idx| idx +
range.start)
+}
+NthValueKind::Nth => {
+match self.n.cmp(&0) {
+Ordering::Greater => {
+// SQL indices are not 0-based.
+let index = (self.n as usize) - 1;
+nulls
+.valid_indices()
+.nth(index)
+.map(|idx| idx + range.start)
+}
+Ordering::Less => {
+let reverse_index = (-self.n) as usize;
+if n_range < reverse_index {
+// Outside the range, return NULL to avoid
allocating
+// for the sliding window that will be
discarded in the end.
+return None;
+}
+let mut window =
VecDeque::with_capacity(reverse_index);
Review Comment:
Unfortunately, `BitIndexIterator` is not bidirectional.
##
datafusion/functions-window/src/nth_value.rs:
##
@@ -370,6 +371,33 @@ impl PartitionEvaluator for NthValueEvaluator {
fn memoize(&mut self, state: &mut WindowAggState) -> Result<()> {
let out = &state.out_col;
let size = out.len();
+if self.ignore_nulls {
+match self.state.kind {
+// Prune on first non-null output in case of FIRST_VALUE
+NthValueKind::First => {
+if let Some(nulls) = out.nulls() {
+if self.state.finalized_result.is_none() {
+if let Some(valid_index) =
nulls.valid_indices()
Re: [PR] perf: optimize `NthValue` when `ignore_nulls` is true [datafusion]
Jefffrey commented on code in PR #19496:
URL: https://github.com/apache/datafusion/pull/19496#discussion_r2658779933
##
datafusion/functions-window/src/nth_value.rs:
##
@@ -370,6 +371,33 @@ impl PartitionEvaluator for NthValueEvaluator {
fn memoize(&mut self, state: &mut WindowAggState) -> Result<()> {
let out = &state.out_col;
let size = out.len();
+if self.ignore_nulls {
+match self.state.kind {
+// Prune on first non-null output in case of FIRST_VALUE
+NthValueKind::First => {
+if let Some(nulls) = out.nulls() {
+if self.state.finalized_result.is_none() {
+if let Some(valid_index) =
nulls.valid_indices().next() {
+let result =
+ScalarValue::try_from_array(out,
valid_index)?;
+self.state.finalized_result = Some(result);
+} else {
+// The output is empty or all nulls, ignore
+}
+}
+if state.window_frame_range.start <
state.window_frame_range.end {
+state.window_frame_range.start =
+state.window_frame_range.end - 1;
+}
+return Ok(());
+} else {
+// Fall through to the main case because there are no
nulls
+}
+}
+// Do not memoize for other kinds when nulls are ignored
Review Comment:
The logic here is really hard to follow with all the nesting present
##
datafusion/functions-window/src/nth_value.rs:
##
@@ -519,6 +467,87 @@ impl PartitionEvaluator for NthValueEvaluator {
}
}
+impl NthValueEvaluator {
+fn valid_index(&self, array: &ArrayRef, range: &Range) ->
Option {
+let n_range = range.end - range.start;
+if self.ignore_nulls {
+// Calculate valid indices, inside the window frame boundaries.
+let slice = array.slice(range.start, n_range);
+if let Some(nulls) = slice.nulls() {
+return match self.state.kind {
+NthValueKind::First => {
+nulls.valid_indices().next().map(|idx| idx +
range.start)
+}
+NthValueKind::Last => {
+nulls.valid_indices().last().map(|idx| idx +
range.start)
+}
+NthValueKind::Nth => {
+match self.n.cmp(&0) {
+Ordering::Greater => {
+// SQL indices are not 0-based.
+let index = (self.n as usize) - 1;
+nulls
+.valid_indices()
+.nth(index)
+.map(|idx| idx + range.start)
+}
Review Comment:
I notice we now omit the early check of being out of range?
##
datafusion/functions-window/src/nth_value.rs:
##
@@ -519,6 +467,87 @@ impl PartitionEvaluator for NthValueEvaluator {
}
}
+impl NthValueEvaluator {
+fn valid_index(&self, array: &ArrayRef, range: &Range) ->
Option {
+let n_range = range.end - range.start;
+if self.ignore_nulls {
Review Comment:
This whole null logic block is quite indented; would be good to see if we
can refactor it
##
datafusion/functions-window/src/nth_value.rs:
##
@@ -519,6 +467,87 @@ impl PartitionEvaluator for NthValueEvaluator {
}
}
+impl NthValueEvaluator {
+fn valid_index(&self, array: &ArrayRef, range: &Range) ->
Option {
+let n_range = range.end - range.start;
+if self.ignore_nulls {
+// Calculate valid indices, inside the window frame boundaries.
+let slice = array.slice(range.start, n_range);
+if let Some(nulls) = slice.nulls() {
+return match self.state.kind {
+NthValueKind::First => {
+nulls.valid_indices().next().map(|idx| idx +
range.start)
+}
+NthValueKind::Last => {
+nulls.valid_indices().last().map(|idx| idx +
range.start)
+}
+NthValueKind::Nth => {
+match self.n.cmp(&0) {
+Ordering::Greater => {
+// SQL indices are not 0-based.
+let index = (self.n as usize) - 1;
+nulls
+.valid_indices()
+.nth(index)
+
