edponce commented on a change in pull request #11019:
URL: https://github.com/apache/arrow/pull/11019#discussion_r706223836
##########
File path: cpp/src/arrow/compute/api_vector.cc
##########
@@ -140,6 +143,29 @@ PartitionNthOptions::PartitionNthOptions(int64_t pivot)
: FunctionOptions(internal::kPartitionNthOptionsType), pivot(pivot) {}
constexpr char PartitionNthOptions::kTypeName[];
+SelectKOptions::SelectKOptions(int64_t k, std::vector<SortKey> sort_keys)
+ : FunctionOptions(internal::kSelectKOptionsType),
+ k(k),
+ sort_keys(std::move(sort_keys)) {}
+
+bool SelectKOptions::is_top_k() const {
+ for (const auto& k : sort_keys) {
+ if (k.order != SortOrder::Descending) {
+ return false;
+ }
Review comment:
Not major important, but I think there is a logic discrepancy for
identifying topK/bottomK in `is_top_x` functions and how it is reasoned about
it in the core SelectK code. Based on these loops, it is topK iff all keys are
in descending order and it is bottomK iff all keys are in ascending order. So
if keys order are mixed then it is neither, but in the SelectK logic
([here](https://github.com/apache/arrow/pull/11019/files#diff-bf5bdbbf3ffa5d706a841eb38face9739c18d515eeb829e31749c7deff2e9384R2347-R2352)
and
[here](https://github.com/apache/arrow/pull/11019/files#diff-bf5bdbbf3ffa5d706a841eb38face9739c18d515eeb829e31749c7deff2e9384R2060-R2063))
if is not one then it is assumed to be the other. Also, the visit code only
looks at the first key.
This discrepancy only has implications if `SortOrder` is not a binary enum
(which I do not foresee happening).
Nevertheless, for consistency, I recommend to make these loops identify
"topness" based solely on the first key.
```c++
bool SelectKOptions::is_top_k() const {
for (const auto& key : sort_keys) {
return key.order == SortOrder::Descending;
}
return true;
}
```
Also, this bypasses the need to iterate through all the keys.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]