felipecrv commented on code in PR #44394:
URL: https://github.com/apache/arrow/pull/44394#discussion_r1823600990
##########
cpp/src/arrow/compute/api_vector.h:
##########
@@ -705,5 +736,56 @@ Result<std::shared_ptr<Array>> PairwiseDiff(const Array&
array,
bool check_overflow = false,
ExecContext* ctx = NULLPTR);
+/// \brief Return the reverse indices of the given indices.
Review Comment:
"Reverse index" already means a different thing in database literature [1].
This operation calculates the "inverse permutation" [2], and there are plenty
of old references to it.
[1] https://en.wikipedia.org/wiki/Reverse_index
[2] https://mathworld.wolfram.com/InversePermutation.html
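To make the term concrete, here is a minimal sketch of an inverse permutation on plain `std::vector`s. The helper name `InversePermutation` is hypothetical and not part of Arrow, and the sketch assumes the input holds each value in [0, n) exactly once:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not an Arrow API.
// Assumes `perm` is a true permutation of 0..n-1 (no nulls, no duplicates).
std::vector<int64_t> InversePermutation(const std::vector<int64_t>& perm) {
  std::vector<int64_t> inverse(perm.size());
  for (std::size_t i = 0; i < perm.size(); ++i) {
    // perm maps i -> perm[i], so the inverse maps perm[i] -> i.
    inverse[static_cast<std::size_t>(perm[i])] = static_cast<int64_t>(i);
  }
  return inverse;
}
```

For perm = [2, 0, 1] this gives [1, 2, 0], and applying it twice returns the original permutation.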
##########
cpp/src/arrow/compute/api_vector.cc:
##########
@@ -155,6 +155,11 @@ static auto kPairwiseOptionsType =
GetFunctionOptionsType<PairwiseOptions>(
DataMember("periods", &PairwiseOptions::periods));
static auto kListFlattenOptionsType =
GetFunctionOptionsType<ListFlattenOptions>(
DataMember("recursive", &ListFlattenOptions::recursive));
+static auto kReverseIndicesOptionsType =
GetFunctionOptionsType<ReverseIndicesOptions>(
+ DataMember("output_length", &ReverseIndicesOptions::output_length),
+ DataMember("output_type", &ReverseIndicesOptions::output_type));
+static auto kPermuteOptionsType = GetFunctionOptionsType<PermuteOptions>(
+ DataMember("output_length", &PermuteOptions::output_length));
Review Comment:
It's very confusing to call this function "permute". It's a scatter [1],
right?
[1] https://en.wikipedia.org/wiki/Permute_instruction (gather and scatter are
different forms of permute)
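To illustrate the distinction, a rough sketch of the two forms on plain `std::vector`s. `Gather` and `Scatter` are hypothetical names, not Arrow functions, and filling unwritten scatter slots with 0 is an assumption made only to keep the example small (an Arrow kernel would presumably emit nulls there):

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Gather: out[i] = values[indices[i]]. The indices say where to read from.
std::vector<int> Gather(const std::vector<int>& values,
                        const std::vector<int64_t>& indices) {
  std::vector<int> out(indices.size());
  for (std::size_t i = 0; i < indices.size(); ++i) {
    out[i] = values[static_cast<std::size_t>(indices[i])];
  }
  return out;
}

// Scatter: out[indices[i]] = values[i]. The indices say where to write to.
// Assumes every index is in [0, output_length); unwritten slots stay 0.
std::vector<int> Scatter(const std::vector<int>& values,
                         const std::vector<int64_t>& indices,
                         std::size_t output_length) {
  std::vector<int> out(output_length, 0);
  for (std::size_t i = 0; i < indices.size(); ++i) {
    out[static_cast<std::size_t>(indices[i])] = values[i];
  }
  return out;
}
```

With values = [10, 20, 30] and indices = [2, 0, 1], the gather yields [30, 10, 20] while the scatter yields [20, 30, 10].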
##########
cpp/src/arrow/compute/api_vector.h:
##########
@@ -705,5 +736,56 @@ Result<std::shared_ptr<Array>> PairwiseDiff(const Array&
array,
bool check_overflow = false,
ExecContext* ctx = NULLPTR);
+/// \brief Return the reverse indices of the given indices.
+///
+/// For indices[i] = x, reverse_indices[x] = i. And reverse_indices[x] = null
if x does
+/// not appear in the input indices. For indices[i] = x where x < 0 or x >=
output_length,
+/// it is ignored. If multiple indices point to the same value, the last one
is used.
Review Comment:
I think this explanation is confusing, but we can work on this later.
##########
cpp/src/arrow/compute/api_vector.h:
##########
@@ -705,5 +736,56 @@ Result<std::shared_ptr<Array>> PairwiseDiff(const Array&
array,
bool check_overflow = false,
ExecContext* ctx = NULLPTR);
+/// \brief Return the reverse indices of the given indices.
+///
+/// For indices[i] = x, reverse_indices[x] = i. And reverse_indices[x] = null
if x does
+/// not appear in the input indices. For indices[i] = x where x < 0 or x >=
output_length,
+/// it is ignored. If multiple indices point to the same value, the last one
is used.
+///
+/// For example, with indices = [null, 0, 3, 2, 4, 1, 1], the reverse indices
is
+/// [1, 6, 3] if output_length = 3,
+/// [1, 6, 3, 2, 4, null, null] if output_length = 7.
Review Comment:
Is the `output_length` parameter really that useful, compared to assuming the
same length as the input?
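For reference, a minimal sketch of the semantics as the doc comment describes them, using `std::optional` in place of Arrow nulls (C++17). The name `ReverseIndicesSketch` is hypothetical, and skipping null inputs is an assumption inferred from the example in the diff:

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

using MaybeIndex = std::optional<int64_t>;  // std::nullopt stands in for null

// Hypothetical stand-in for the proposed function, not the PR's implementation.
std::vector<MaybeIndex> ReverseIndicesSketch(const std::vector<MaybeIndex>& indices,
                                             int64_t output_length) {
  // Every output slot starts as null and stays null unless some index hits it.
  std::vector<MaybeIndex> out(static_cast<std::size_t>(output_length), std::nullopt);
  for (std::size_t i = 0; i < indices.size(); ++i) {
    const MaybeIndex& x = indices[i];
    if (!x.has_value()) continue;                  // null input: skipped (assumption)
    if (*x < 0 || *x >= output_length) continue;   // out of range: ignored
    out[static_cast<std::size_t>(*x)] = static_cast<int64_t>(i);  // last writer wins
  }
  return out;
}
```

Run on indices = [null, 0, 3, 2, 4, 1, 1], this reproduces both results quoted in the doc comment, which shows that `output_length` only truncates or null-pads the output.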