Re: [PR] Undo run end filter performance regression [arrow-rs]

2024-11-10 Thread via GitHub


Dandandan merged PR #6691:
URL: https://github.com/apache/arrow-rs/pull/6691





Re: [PR] Undo run end filter performance regression [arrow-rs]

2024-11-10 Thread via GitHub


Dandandan commented on PR #6691:
URL: https://github.com/apache/arrow-rs/pull/6691#issuecomment-2466834081

   Thanks @delamarch3 





Re: [PR] Undo run end filter performance regression [arrow-rs]

2024-11-10 Thread via GitHub


Dandandan commented on PR #6691:
URL: https://github.com/apache/arrow-rs/pull/6691#issuecomment-2466802543

   > I'm not sure calling `take` on each iteration will work because it takes ownership of the `Iterator`, so it needs to be reassigned on each `run_ends` iteration, but then it will always take the first n elements, unless you had something else in mind? I did try `BitIterator::new(filter_values.values(), start, end - start)` to achieve something similar, but it turned out to be slower.
   
   Of course, you're right.





Re: [PR] Undo run end filter performance regression [arrow-rs]

2024-11-10 Thread via GitHub


Dandandan commented on PR #6691:
URL: https://github.com/apache/arrow-rs/pull/6691#issuecomment-2466802092

   Looks good to me - I feel there is probably still more performance on the table.
   
   I believe the reason is that `BitIterator` currently does the same work as `filter_values.value_unchecked`, so using the iterator only adds the extra cost of the bounds check.
   
   I think it should be possible to improve `BitIterator` (use a bit shift rather than the current index) and use `unwrap_unchecked()` to make it slightly faster than the current solution.
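   As a purely illustrative, self-contained sketch of that idea (this is not arrow-rs's actual `BitIterator`; the type, its fields, and the `main` below are made up for the example), a shift-based bit iterator keeps the current byte loaded and shifts it once per bit instead of re-deriving a byte index and mask from a position:
   
   ```rust
   /// Hypothetical LSB-first bit iterator driven by shifting, not indexing.
   struct ShiftBitIter<'a> {
       bytes: &'a [u8],
       current: u8,             // byte currently being consumed
       bits_in_current: usize,  // unread bits left in `current`
       bits_left: usize,        // total bits still to yield
   }
   
   impl<'a> ShiftBitIter<'a> {
       fn new(bytes: &'a [u8], len_bits: usize) -> Self {
           assert!(len_bits <= bytes.len() * 8);
           Self { bytes, current: 0, bits_in_current: 0, bits_left: len_bits }
       }
   }
   
   impl<'a> Iterator for ShiftBitIter<'a> {
       type Item = bool;
   
       fn next(&mut self) -> Option<bool> {
           if self.bits_left == 0 {
               return None;
           }
           if self.bits_in_current == 0 {
               // Load the next byte; `split_first` avoids an indexed bounds check.
               let (first, rest) = self.bytes.split_first()?;
               self.current = *first;
               self.bytes = rest;
               self.bits_in_current = 8;
           }
           // Test bit 0 and shift right, matching Arrow's LSB-first bit order.
           let bit = self.current & 1 != 0;
           self.current >>= 1;
           self.bits_in_current -= 1;
           self.bits_left -= 1;
           Some(bit)
       }
   }
   
   fn main() {
       // 0b0000_0101 read LSB-first: true, false, true, false
       let bits: Vec<bool> = ShiftBitIter::new(&[0b0000_0101], 4).collect();
       assert_eq!(bits, vec![true, false, true, false]);
   }
   ```
   
   The `unwrap_unchecked()` part of the suggestion would then apply at call sites that already know how many bits remain, so the `Option` returned by `next()` never actually needs checking.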





Re: [PR] Undo run end filter performance regression [arrow-rs]

2024-11-10 Thread via GitHub


delamarch3 commented on PR #6691:
URL: https://github.com/apache/arrow-rs/pull/6691#issuecomment-2466781885

   I'm not sure calling `take` on each iteration will work because it takes ownership of the `Iterator`, so it needs to be reassigned on each `run_ends` iteration, but then it will always take the first n elements, unless you had something else in mind? I did try `BitIterator::new(filter_values.values(), start, end - start)` to achieve something similar, but it turned out to be slower.
   
   I had to create a temporary variable holding the count for assigning `keep` after the loop, i.e. `let keep = count > count_before`, for the algorithm to still be correct, but this was also slower.
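   For illustration, this is roughly the shape that variant takes; a small self-contained sketch with simplified stand-in names and types (`keep_for_run`, plain `bool`/`u32`), not the actual kernel code:
   
   ```rust
   // Hypothetical helper: consume one run's worth of predicate bits from a shared
   // iterator, update the running count, and derive `keep` from the count delta
   // after the loop instead of OR-ing inside it.
   fn keep_for_run(preds: &mut impl Iterator<Item = bool>, run_len: usize, count: &mut u32) -> bool {
       let count_before = *count;
       for _ in 0..run_len {
           if let Some(pred) = preds.next() {
               *count += pred as u32;
           }
       }
       *count > count_before
   }
   
   fn main() {
       let filter = [true, false, false, true];
       let mut preds = filter.iter().copied();
       let mut count = 0u32;
       assert!(keep_for_run(&mut preds, 2, &mut count));  // run covering [true, false]
       assert!(!keep_for_run(&mut preds, 1, &mut count)); // run covering [false]
       assert!(keep_for_run(&mut preds, 1, &mut count));  // run covering [true]
       assert_eq!(count, 2);
   }
   ```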





Re: [PR] Undo run end filter performance regression [arrow-rs]

2024-11-09 Thread via GitHub


Dandandan commented on PR #6691:
URL: https://github.com/apache/arrow-rs/pull/6691#issuecomment-2466305290

   ```rust
   for _ in start..end {
       if let Some(pred) = preds.next() {
           count += R::Native::from(pred);
           keep |= pred
       }
   }
   ```
   
   Maybe you could try something like this (two minor changes)?
   
   ```rust
   for pred in preds.take((end - start) as usize) {
       count += R::Native::from(pred);
   }
   let keep = count > 0;
   ```





Re: [PR] Undo run end filter performance regression [arrow-rs]

2024-11-09 Thread via GitHub


delamarch3 commented on PR #6691:
URL: https://github.com/apache/arrow-rs/pull/6691#issuecomment-2466246617

   I've run the `filter_kernel` benchmark I added for the run array in https://github.com/apache/arrow-rs/pull/6706 with the different approaches; here are the results I get:
   
   ```rust
   for pred in filter_values
       .iter()
       .skip(start as usize)
       .take((end - start) as usize)
   {
       count += R::Native::from(pred);
       keep |= pred
   }
   ```
   ```text
   Benchmarking filter run array (kept 1/2): Warming up for 3. s
   Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 52.1s, or reduce sample count to 10.
   filter run array (kept 1/2)
           time:   [542.98 ms 549.50 ms 556.59 ms]
   Found 10 outliers among 100 measurements (10.00%)
     7 (7.00%) high mild
     3 (3.00%) high severe
   
   Benchmarking filter run array high selectivity (kept 1023/1024): Warming up for 3. s
   Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 54.3s, or reduce sample count to 10.
   Benchmarking filter run array high selectivity (kept 1023/1024): Collecting 100 samples in estimated 54.256 s (100 iterations)
   filter run array high selectivity (kept 1023/1024)
           time:   [550.25 ms 555.80 ms 561.74 ms]
   Found 4 outliers among 100 measurements (4.00%)
     4 (4.00%) high mild
   
   Benchmarking filter run array low selectivity (kept 1/1024): Warming up for 3. s
   Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 53.5s, or reduce sample count to 10.
   filter run array low selectivity (kept 1/1024)
           time:   [536.14 ms 540.44 ms 545.14 ms]
   Found 11 outliers among 100 measurements (11.00%)
     6 (6.00%) high mild
     5 (5.00%) high severe
   ```
   
   ```rust
   for _ in start..end {
       if let Some(pred) = preds.next() {
           count += R::Native::from(pred);
           keep |= pred
       }
   }
   ```
   ```text
   filter run array (kept 1/2)
           time:   [598.70 µs 601.93 µs 605.25 µs]
           change: [-99.892% -99.890% -99.889%] (p = 0.00 < 0.05)
           Performance has improved.
   Found 3 outliers among 100 measurements (3.00%)
     3 (3.00%) high severe
   
   Benchmarking filter run array high selectivity (kept 1023/1024): Collecting 100 samples in estimated 6.0573 s (15k iterations)
   filter run array high selectivity (kept 1023/1024)
           time:   [386.55 µs 388.17 µs 389.91 µs]
           change: [-99.931% -99.930% -99.929%] (p = 0.00 < 0.05)
           Performance has improved.
   Found 4 outliers among 100 measurements (4.00%)
     2 (2.00%) high mild
     2 (2.00%) high severe
   
   filter run array low selectivity (kept 1/1024)
           time:   [239.93 µs 240.46 µs 241.04 µs]
           change: [-99.956% -99.955% -99.955%] (p = 0.00 < 0.05)
           Performance has improved.
   Found 12 outliers among 100 measurements (12.00%)
     6 (6.00%) high mild
     6 (6.00%) high severe
   ```
   
   These two are similar, but after running a few times the low selectivity benchmark seems slightly faster with this one:
   ```rust
   end -= end.saturating_sub(filter_values.len() as u64);
   for pred in (start..end).map(|i| unsafe { filter_values.value_unchecked(i as usize) }) {
       count += R::Native::from(pred);
       keep |= pred
   }
   ```
   ```text
   filter run array (kept 1/2)
           time:   [581.12 µs 584.01 µs 586.90 µs]
           change: [-2.5195% -1.1178% +0.1036%] (p = 0.11 > 0.05)
           No change in performance detected.
   Found 5 outliers among 100 measurements (5.00%)
     3 (3.00%) high mild
     2 (2.00%) high severe
   
   Benchmarking filter run array high selectivity (kept 1023/1024): Collecting 100 samples in estimated 5.5900 s (15k iterations)
   filter run array high selectivity (kept 1023/1024)
           time:   [359.79 µs 361.40 µs 363.47 µs]
           change: [-7.7904% -5.5816% -3.1503%] (p = 0.00 < 0.05)
           Performance has improved.
   Found 14 outliers among 100 measurements (14.00%)
     3 (3.00%) high mild
     11 (11.00%) high severe
   
   filter run array low selectivity (kept 1/1024)
           time:   [209.87 µs 210.45 µs 211.09 µs]
           change: [-13.950% -13.255% -12.616%] (p = 0.00 < 0.05)
           Performance has improved.
   Found 6 outliers among 100 measurements (6.00%)
     3 (3.00%) high mild
     3 (3.00%) high severe
   ```



Re: [PR] Undo run end filter performance regression [arrow-rs]

2024-11-09 Thread via GitHub


delamarch3 commented on code in PR #6691:
URL: https://github.com/apache/arrow-rs/pull/6691#discussion_r1835333826


##
arrow-select/src/filter.rs:
##
@@ -423,30 +423,30 @@ fn filter_array(values: &dyn Array, predicate: &FilterPredicate) -> Result<ArrayRef, ArrowError> {
 fn filter_run_end_array<R: RunEndIndexType>(
-    re_arr: &RunArray<R>,
-    pred: &FilterPredicate,
+    array: &RunArray<R>,
+    predicate: &FilterPredicate,
 ) -> Result<RunArray<R>, ArrowError>
 where
     R::Native: Into<i64> + From<bool>,
     R::Native: AddAssign,
 {
-    let run_ends: &RunEndBuffer<R::Native> = re_arr.run_ends();
+    let run_ends: &RunEndBuffer<R::Native> = array.run_ends();
     let mut values_filter = BooleanBufferBuilder::new(run_ends.len());
     let mut new_run_ends = vec![R::default_value(); run_ends.len()];
 
-    let mut start = 0i64;
+    let mut start = 0u64;
     let mut i = 0;
     let mut count = R::default_value();
-    let filter_values = pred.filter.values();
+    let filter_values = predicate.filter.values();
 
-    for end in run_ends.inner().into_iter().map(|i| (*i).into()) {
+    for mut end in run_ends.inner().into_iter().map(|i| (*i).into() as u64) {
         let mut keep = false;
 
-        for pred in filter_values
-            .iter()
-            .skip(start as usize)
-            .take((end - start) as usize)
-        {
+        let difference = end.saturating_sub(filter_values.len() as u64);
+        end -= difference;
+
+        // Safety: we subtract the difference off `end` so we are always within bounds
+        for pred in (start..end).map(|i| unsafe { filter_values.value_unchecked(i as usize) }) {

Review Comment:
   Thanks, I'll try this out and post the results






Re: [PR] Undo run end filter performance regression [arrow-rs]

2024-11-08 Thread via GitHub


Dandandan commented on code in PR #6691:
URL: https://github.com/apache/arrow-rs/pull/6691#discussion_r1835084600


##
arrow-select/src/filter.rs:
##
@@ -423,30 +423,30 @@ fn filter_array(values: &dyn Array, predicate: &FilterPredicate) -> Result<ArrayRef, ArrowError> {
 fn filter_run_end_array<R: RunEndIndexType>(
-    re_arr: &RunArray<R>,
-    pred: &FilterPredicate,
+    array: &RunArray<R>,
+    predicate: &FilterPredicate,
 ) -> Result<RunArray<R>, ArrowError>
 where
     R::Native: Into<i64> + From<bool>,
     R::Native: AddAssign,
 {
-    let run_ends: &RunEndBuffer<R::Native> = re_arr.run_ends();
+    let run_ends: &RunEndBuffer<R::Native> = array.run_ends();
     let mut values_filter = BooleanBufferBuilder::new(run_ends.len());
     let mut new_run_ends = vec![R::default_value(); run_ends.len()];
 
-    let mut start = 0i64;
+    let mut start = 0u64;
     let mut i = 0;
     let mut count = R::default_value();
-    let filter_values = pred.filter.values();
+    let filter_values = predicate.filter.values();
 
-    for end in run_ends.inner().into_iter().map(|i| (*i).into()) {
+    for mut end in run_ends.inner().into_iter().map(|i| (*i).into() as u64) {
         let mut keep = false;
 
-        for pred in filter_values
-            .iter()
-            .skip(start as usize)
-            .take((end - start) as usize)
-        {
+        let difference = end.saturating_sub(filter_values.len() as u64);
+        end -= difference;
+
+        // Safety: we subtract the difference off `end` so we are always within bounds
+        for pred in (start..end).map(|i| unsafe { filter_values.value_unchecked(i as usize) }) {

Review Comment:
   Perhaps we could iterate over `filter_values`? E.g. `let mut preds = filter_values.iter()` and calling `preds.next()` in the loop. That way you can avoid using unsafe, while it might still generate fast code (?).









Re: [PR] Undo run end filter performance regression [arrow-rs]

2024-11-08 Thread via GitHub


alamb commented on PR #6691:
URL: https://github.com/apache/arrow-rs/pull/6691#issuecomment-2465625035

   > I'll try to add one into `filter_kernels`
   
   Thank you -- if you could do so as a separate PR that would be most helpful 
(so it is easy to compare with these changes) 🙏 





Re: [PR] Undo run end filter performance regression [arrow-rs]

2024-11-08 Thread via GitHub


delamarch3 commented on PR #6691:
URL: https://github.com/apache/arrow-rs/pull/6691#issuecomment-2465621676

   I'll try to add one into `filter_kernels`





Re: [PR] Undo run end filter performance regression [arrow-rs]

2024-11-08 Thread via GitHub


delamarch3 commented on PR #6691:
URL: https://github.com/apache/arrow-rs/pull/6691#issuecomment-2465583647

   @alamb Sorry, I wrote a separate benchmark for this but didn't commit it. It doesn't return consistent results on each run, so I thought it needed more work, but there was enough of a difference to compare the approaches. I can add it into this PR though?





Re: [PR] Undo run end filter performance regression [arrow-rs]

2024-11-08 Thread via GitHub


alamb commented on PR #6691:
URL: https://github.com/apache/arrow-rs/pull/6691#issuecomment-2465571567

   I apologize for being dense @delamarch3 -- but I spent a while trying to find the benchmarks you are running and I couldn't figure out which they were. Is it the `filter_kernels`?





Re: [PR] Undo run end filter performance regression [arrow-rs]

2024-11-08 Thread via GitHub


alamb commented on PR #6691:
URL: https://github.com/apache/arrow-rs/pull/6691#issuecomment-2465514372

   I am running the benchmarks on this PR to verify. Thank you @delamarch3 

