lidavidm commented on pull request #11853:
URL: https://github.com/apache/arrow/pull/11853#issuecomment-992498224


   For the failures with chunked arrays, the kernel needs these options set:
   
   ```diff
   diff --git a/cpp/src/arrow/compute/kernels/vector_replace.cc b/cpp/src/arrow/compute/kernels/vector_replace.cc
   index d85965cfd..8ee4bbfe1 100644
   --- a/cpp/src/arrow/compute/kernels/vector_replace.cc
   +++ b/cpp/src/arrow/compute/kernels/vector_replace.cc
   @@ -799,6 +799,8 @@ void RegisterVectorFunction(FunctionRegistry* registry,
        kernel.mem_allocation = MemAllocation::type::PREALLOCATE;
        kernel.signature = Functor<FixedType>::GetSignature(get_id.id);
        kernel.exec = std::move(exec);
   +    kernel.can_execute_chunkwise = false;
   +    kernel.output_chunked = false;
        DCHECK_OK(func->AddKernel(std::move(kernel)));
      };
      auto add_primitive_kernel = [&](detail::GetTypeId get_id) {
   diff --git a/cpp/src/arrow/compute/kernels/vector_replace_test.cc b/cpp/src/arrow/compute/kernels/vector_replace_test.cc
   index 742facf19..48a0b40f5 100644
   --- a/cpp/src/arrow/compute/kernels/vector_replace_test.cc
   +++ b/cpp/src/arrow/compute/kernels/vector_replace_test.cc
   @@ -110,7 +110,7 @@ class TestReplaceKernel : public ::testing::Test {
                                      const std::shared_ptr<ChunkedArray> array,
                                      const std::shared_ptr<ChunkedArray>& expected) {
        ASSERT_OK_AND_ASSIGN(auto actual, func(Datum(*array), nullptr));
   -    AssertChunkedEqual(*expected, *actual.chunked_array());
   +    AssertChunkedEquivalent(*expected, *actual.chunked_array());
      }
   ```
   
   The test still fails, but I'll leave that for further debugging. 
   
   Basically, when the input is a chunked array, the compute infrastructure
by default splits it up and feeds the kernel one chunk at a time instead of
the entire chunked array. In that mode, it expects the kernel to output an
Array, not a ChunkedArray, because it assembles the final ChunkedArray
itself. It also expects the kernel to keep track of its own state across
chunks (in the KernelState held by the KernelContext). Setting the two
options added in the diff tells the compute infrastructure not to do this
splitting and not to expect an Array output.
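
To make the difference concrete, here is a rough toy model of the two execution modes. It does not use Arrow at all: `ToyKernel`, `Execute`, `FillHoles`, and the `kHole` sentinel are hypothetical stand-ins invented for this sketch, and only the flag name `can_execute_chunkwise` is borrowed from the diff. The toy kernel fills marked slots from a list of replacements, so it needs a cursor that survives from one chunk to the next.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Toy model only: these types are hypothetical stand-ins, not Arrow classes.
using Chunk = std::vector<int>;
using Chunked = std::vector<Chunk>;

constexpr int kHole = -1;  // marks a slot that should receive a replacement

struct ToyKernel {
  bool can_execute_chunkwise = true;                  // name mirrors the diff
  std::function<Chunk(const Chunk&)> exec_chunk;      // called once per chunk
  std::function<Chunked(const Chunked&)> exec_whole;  // called once in total
};

// Toy executor: either splits the input and reassembles the per-chunk
// outputs itself, or hands the whole chunked input to the kernel.
Chunked Execute(const ToyKernel& kernel, const Chunked& input) {
  if (!kernel.can_execute_chunkwise) {
    assert(kernel.exec_whole);
    return kernel.exec_whole(input);
  }
  assert(kernel.exec_chunk);
  Chunked out;
  for (const Chunk& chunk : input) out.push_back(kernel.exec_chunk(chunk));
  return out;
}

// Fill holes in one chunk, consuming replacements starting at *next.
Chunk FillHoles(const Chunk& chunk, const std::vector<int>& replacements,
                std::size_t* next) {
  Chunk out = chunk;
  for (int& value : out) {
    if (value == kHole && *next < replacements.size()) {
      value = replacements[(*next)++];
    }
  }
  return out;
}

// Chunkwise kernel that keeps no state: the cursor restarts on every chunk.
ToyKernel MakeStatelessChunkwise(std::vector<int> replacements) {
  ToyKernel kernel;
  kernel.can_execute_chunkwise = true;
  kernel.exec_chunk = [replacements](const Chunk& chunk) {
    std::size_t next = 0;  // reset per chunk, so earlier progress is lost
    return FillHoles(chunk, replacements, &next);
  };
  return kernel;
}

// Whole-input kernel: one cursor is shared across all chunks.
ToyKernel MakeWholeInput(std::vector<int> replacements) {
  ToyKernel kernel;
  kernel.can_execute_chunkwise = false;
  kernel.exec_whole = [replacements](const Chunked& input) {
    std::size_t next = 0;
    Chunked out;
    for (const Chunk& chunk : input) {
      out.push_back(FillHoles(chunk, replacements, &next));
    }
    return out;
  };
  return kernel;
}
```

With the input `{{1, kHole}, {kHole, 4}}` and replacements `{10, 20}`, the stateless chunkwise kernel reuses `10` in the second chunk, while the whole-input kernel moves on to `20`. In this toy, that is the gap a chunkwise kernel would have to close by carrying its cursor in some per-invocation state.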

