zanmato1984 commented on code in PR #44394:
URL: https://github.com/apache/arrow/pull/44394#discussion_r1823743699


##########
cpp/src/arrow/compute/api_vector.h:
##########
@@ -705,5 +736,56 @@ Result<std::shared_ptr<Array>> PairwiseDiff(const Array& array,
                                             bool check_overflow = false,
                                             ExecContext* ctx = NULLPTR);
 
+/// \brief Return the reverse indices of the given indices.
+///
+/// For indices[i] = x, reverse_indices[x] = i. And reverse_indices[x] = null if x does
+/// not appear in the input indices. For indices[i] = x where x < 0 or x >= output_length,
+/// it is ignored. If multiple indices point to the same value, the last one is used.
+///
+/// For example, with indices = [null, 0, 3, 2, 4, 1, 1], the reverse indices is
+///   [1, 6, 3]                    if output_length = 3,
+///   [1, 6, 3, 2, 4, null, null]  if output_length = 7.

Review Comment:
   This is particularly useful when the indices cover only a subset of the
output, in other words when the output contains "holes" (`null`s). Consider the
input `[3, 0]`: imagine that rows `1` and `2` evaluate to `null` in the
`if_else` special form's condition, so they go to neither the `true` nor the
`false` branch and thus have no slots in the input to
`reverse_indices`/`permute`. The desired output is then `[1, null, null, 0]`,
as opposed to the length-2 output `[1, null]` that one would presumably get
without an explicit output length.
   
   This is also noted in my comments in one of the tests: 
https://github.com/apache/arrow/pull/44394/files#diff-ee27abfb87a9105ebbdb8bdcdd26aaec752f2631181a078e3ef5a07b4f00d1ceR995-R1000
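
   To make the semantics concrete, here is a minimal standalone sketch of the
behaviour described in the doc comment above. It is not the PR's
implementation and does not use any Arrow types: `NullableIndices` and
`ReverseIndicesSketch` are hypothetical names introduced only for
illustration, with `std::optional` standing in for a nullable array slot.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical stand-in for a nullable int64 array; not an Arrow type.
using NullableIndices = std::vector<std::optional<int64_t>>;

// Sketch of the documented semantics: for indices[i] = x, out[x] = i.
// Assumption: output_length is supplied explicitly by the caller.
NullableIndices ReverseIndicesSketch(const NullableIndices& indices,
                                     int64_t output_length) {
  assert(output_length >= 0);
  // Every output slot starts as null (a "hole") until some index claims it.
  NullableIndices out(static_cast<std::size_t>(output_length), std::nullopt);
  for (std::size_t i = 0; i < indices.size(); ++i) {
    const std::optional<int64_t>& x = indices[i];
    // Null indices and indices outside [0, output_length) are ignored.
    if (!x.has_value() || *x < 0 || *x >= output_length) continue;
    // A later duplicate overwrites an earlier one, so the last one is used.
    out[static_cast<std::size_t>(*x)] = static_cast<int64_t>(i);
  }
  return out;
}
```

   With the input `[3, 0]`, calling this sketch with `output_length = 4`
yields `[1, null, null, 0]`, while `output_length = 2` yields `[1, null]`
because index `3` falls out of range and is dropped.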



