Light-City commented on code in PR #35098:
URL: https://github.com/apache/arrow/pull/35098#discussion_r1168122295


##########
cpp/src/arrow/array/data.cc:
##########
@@ -144,6 +144,8 @@ std::shared_ptr<ArrayData> ArrayData::Slice(int64_t off, int64_t len) const {
   } else {
     copy->null_count = null_count != 0 ? kUnknownNullCount : 0;
   }
+  for (auto& child : copy->child_data) 
+    child = child->Slice(copy->offset, copy->length);

Review Comment:
   > > Can we provide arraydata with slice operations on child_data?
   > 
   > What is the use case? Both `ArrayData` and `Buffer` do not provide slice operations so it would be non-trivial to do this.
   
   Use case: I have a two-phase avg aggregation.
   
   Phase 1 returns an intermediate result as a struct array<sum, count>.
   
   Phase 2 takes the struct array<sum, count> as input and outputs float8. In 
   Consume, I imitate SumImpl to accumulate into struct<sum, count>, splitting 
   the input into sum_arr and count_arr.
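   
   To make the two phases concrete, here is a minimal standalone sketch in 
   plain C++. It does not use Arrow; `AvgState`, `PartialAvg`, and 
   `FinalizeAvg` are hypothetical names for illustration only, and returning 
   0.0 for empty input is an arbitrary choice of this sketch.
   
   ```cpp
   #include <cstdint>
   #include <vector>

   // Hypothetical intermediate state, mirroring struct<sum, count>.
   struct AvgState {
     double sum = 0.0;
     int64_t count = 0;
   };

   // Phase 1: fold raw values into a partial (sum, count) state.
   AvgState PartialAvg(const std::vector<double>& values) {
     AvgState state;
     for (double v : values) {
       state.sum += v;
       ++state.count;
     }
     return state;
   }

   // Phase 2: add up the partial sums and counts separately, then divide once.
   double FinalizeAvg(const std::vector<AvgState>& partials) {
     AvgState total;
     for (const AvgState& p : partials) {
       total.sum += p.sum;
       total.count += p.count;
     }
     // Sketch-only choice: an empty input yields 0.0 rather than a null.
     return total.count == 0 ? 0.0 : total.sum / static_cast<double>(total.count);
   }
   ```
   
   Averaging the per-partition averages would give the wrong result when 
   partitions differ in size, which is why the intermediate result carries sum 
   and count separately.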
   
   This is just like SumImpl, where 
   `this->sum += SumArray<CType, SumCType, SimdLevel>(*data);` 
   gets its data from `const auto& data = batch[0].array();`. 
   That is an ArrayData, not an Array.
   
   So in our phase 2 Consume, the input is the ArrayData of a struct array, and the code looks like this:
   ```cpp
   auto arraydata = *(batch[0].array());
   ConsumeSumArray(arraydata.child_data[0]);
   ConsumeCountArray(arraydata.child_data[1]);
   ```
   
   But in the grouping scenario my batch may be sliced, so the child ArrayData 
   should be sliced as well. For now I have to do this instead:
   
   ```cpp
   auto array_data = batch[0].array();
   auto struct_array = std::make_shared<StructArray>(array_data);
   ConsumeSumArray(struct_array->field(0)->data());
   ConsumeCountArray(struct_array->field(1)->data());
   ```
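   
   The difference between the two snippets above can be shown with a toy model 
   in plain C++. This is only a sketch under my own assumptions: `ToyData`, 
   `Slice`, `Field`, and `Sum` are hypothetical stand-ins, not Arrow's real 
   types. The assumption is that slicing a parent only moves its own 
   offset/length window, while a field accessor applies that window to the 
   child when it is read.
   
   ```cpp
   #include <cassert>
   #include <cstdint>
   #include <memory>
   #include <vector>

   // Toy stand-in for an array-data node: an (offset, length) window over
   // shared values, plus child nodes. Not Arrow's real ArrayData.
   struct ToyData {
     std::shared_ptr<std::vector<int64_t>> values;  // unset for the struct parent
     int64_t offset = 0;
     int64_t length = 0;
     std::vector<std::shared_ptr<ToyData>> children;
   };

   // Slice only adjusts this node's window; children are shared untouched.
   std::shared_ptr<ToyData> Slice(const ToyData& d, int64_t off, int64_t len) {
     assert(off >= 0 && len >= 0 && off + len <= d.length);
     auto copy = std::make_shared<ToyData>(d);
     copy->offset = d.offset + off;
     copy->length = len;
     return copy;
   }

   // Field view: apply the parent's window to child i when reading it.
   std::shared_ptr<ToyData> Field(const ToyData& parent, int i) {
     return Slice(*parent.children[i], parent.offset, parent.length);
   }

   // Sum the values visible through a leaf node's window.
   int64_t Sum(const ToyData& d) {
     int64_t total = 0;
     for (int64_t i = 0; i < d.length; ++i) total += (*d.values)[d.offset + i];
     return total;
   }
   ```
   
   With a child holding {10, 20, 30, 40} and the parent sliced to offset 1, 
   length 2, reading `children[0]` directly still sums all four values, while 
   going through `Field` sums only 20 and 30.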


