[GitHub] [arrow-datafusion] mustafasrepo commented on a diff in pull request #4989: Add support for linear range calculation in WINDOW functions
mustafasrepo commented on code in PR #4989: URL: https://github.com/apache/arrow-datafusion/pull/4989#discussion_r1082729694 ## datafusion/common/src/utils.rs: ## @@ -22,8 +22,16 @@ use arrow::array::ArrayRef; use arrow::compute::SortOptions; use std::cmp::Ordering; +/// Given column vectors, returns row at `idx`. +pub fn get_row_at_idx(columns: &[ArrayRef], idx: usize) -> Result> { +columns +.iter() +.map(|arr| ScalarValue::try_from_array(arr, idx)) +.collect() +} + /// This function compares two tuples depending on the given sort options. -fn compare( +pub fn compare_rows( Review Comment: This function is used both in `datafusion/common` and `datafusion/physical_expr.` Hence unfortunately `pub(crate)` wouldn't work. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow-datafusion] mustafasrepo commented on a diff in pull request #4989: Add support for linear range calculation in WINDOW functions
mustafasrepo commented on code in PR #4989: URL: https://github.com/apache/arrow-datafusion/pull/4989#discussion_r1082669525 ## datafusion/common/src/utils.rs: ## @@ -103,6 +111,53 @@ where Ok(low) } +/// This function searches for a tuple of given values (`target`) among the given +/// rows (`item_columns`) via a linear scan. It assumes that `item_columns` is sorted +/// according to `sort_options` and returns the insertion index of `target`. +/// Template argument `SIDE` being `true`/`false` means left/right insertion. +pub fn linear_search( +item_columns: &[ArrayRef], +target: &[ScalarValue], +sort_options: &[SortOptions], +) -> Result { +let low: usize = 0; +let high: usize = item_columns +.get(0) +.ok_or_else(|| { +DataFusionError::Internal("Column array shouldn't be empty".to_string()) +})? +.len(); +let compare_fn = |current: &[ScalarValue], target: &[ScalarValue]| { Review Comment: `LexicographicalComparator` API, compares values at the two indices and returns their ordering. This useful to find change detection, or partition boundaries. However, in our case we need to search for specific value inside Array (possibly not existing in the array.). However, maybe with some kind of tweak, we maybe able to use `LexicographicalComparator` for our use case. I will think about it in detail. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow-datafusion] mustafasrepo commented on a diff in pull request #4989: Add support for linear range calculation in WINDOW functions
mustafasrepo commented on code in PR #4989: URL: https://github.com/apache/arrow-datafusion/pull/4989#discussion_r1082607725 ## datafusion/common/src/utils.rs: ## @@ -103,6 +111,53 @@ where Ok(low) } +/// This function searches for a tuple of given values (`target`) among the given +/// rows (`item_columns`) via a linear scan. It assumes that `item_columns` is sorted +/// according to `sort_options` and returns the insertion index of `target`. +/// Template argument `SIDE` being `true`/`false` means left/right insertion. +pub fn linear_search( Review Comment: To make it compatible with `bisect` api, we add two versions; where one version takes `sort_options: &[SortOptions]`, other version takes `compare_fn: F`. We wanted to add both versions in case anyone need them. In terms of functionality first version is not necessary. However, since constructing comparator function from `SortOptions` is a bit cumbersome, we wanted to have that version also. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow-datafusion] mustafasrepo commented on a diff in pull request #4989: Add support for linear range calculation in WINDOW functions
mustafasrepo commented on code in PR #4989: URL: https://github.com/apache/arrow-datafusion/pull/4989#discussion_r1082607725 ## datafusion/common/src/utils.rs: ## @@ -103,6 +111,53 @@ where Ok(low) } +/// This function searches for a tuple of given values (`target`) among the given +/// rows (`item_columns`) via a linear scan. It assumes that `item_columns` is sorted +/// according to `sort_options` and returns the insertion index of `target`. +/// Template argument `SIDE` being `true`/`false` means left/right insertion. +pub fn linear_search( Review Comment: To make it compatible with `bisect` api, we add two versions; where one version takes `sort_options: &[SortOptions]`, other version takes `compare_fn: F`. We want to put both in case anyone need it. In terms of functinality first version is not necessary. However, since constructing comparator function from `SortOptions` is a bit cumbersome, we wanted to have that version also. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow-datafusion] mustafasrepo commented on a diff in pull request #4989: Add support for linear range calculation in WINDOW functions
mustafasrepo commented on code in PR #4989: URL: https://github.com/apache/arrow-datafusion/pull/4989#discussion_r1082607725 ## datafusion/common/src/utils.rs: ## @@ -103,6 +111,53 @@ where Ok(low) } +/// This function searches for a tuple of given values (`target`) among the given +/// rows (`item_columns`) via a linear scan. It assumes that `item_columns` is sorted +/// according to `sort_options` and returns the insertion index of `target`. +/// Template argument `SIDE` being `true`/`false` means left/right insertion. +pub fn linear_search( Review Comment: To make it compatible with `bisect` api, we have one version taking `sort_options: &[SortOptions]`, one version taking `compare_fn: F`. We want to put it in case anyone need it. Since constructing comparator function from `SortOptions` is a bit cumbersome. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org