Hi Wenbo,

> I'd like to know what the *three* `buffers` in ArraySpan are. What does
> `1` mean when `GetValues` is called?

The meaning of the buffers in an ArraySpan depends on the layout of its data type. ArraySpan always carries three buffer slots; a layout that needs fewer leaves the rest unused. FixedSizeBinary is a fixed-size primitive type, so it has two buffers: a validity buffer (index 0) and a data buffer (index 1). So GetValues(1) returns a pointer to the data buffer. The buffer layouts of each data type are listed here [1].

> what is the actual type should I get from `GetValues`?

Buffer data is stored as raw bytes (uint8_t) but can be reinterpreted as any type to suit your needs. The template parameter of GetValues is simply forwarded to reinterpret_cast. There are discussions [2] on the soundness of using uint8_t to represent bytes, but it is what we use now. Since you are only doing a memcpy, uint8_t should be fine.

> Maybe, `auto *out_values = out->array_span_mutable()->GetValues<uint8_t *>(1);`
> and `memcpy(*out_values++, some_ptr, 32);`?

I may be missing something, but why copy to `*out_values++` instead of copying to the pointer itself and adding 32 to it afterwards? With `GetValues<uint8_t>(1)`, `out_values` is already a `uint8_t*` that can be passed to memcpy directly, and each value occupies 32 bytes. Otherwise I agree this is the way to go.

[1] https://arrow.apache.org/docs/format/Columnar.html#buffer-listing-for-each-layout
[2] https://github.com/apache/arrow/issues/36123

On Mon, Jul 17, 2023 at 4:44 PM Wenbo Hu <huwenbo1...@gmail.com> wrote:
> Hi,
> I'm using Acero as the stream executor to run large-scale data
> transformations. The core data used in a UDF is the `ArraySpan` in
> `ExecSpan`, but there is not much documentation on ArraySpan. I'd like to know
> what the *three* `buffers` in ArraySpan are. What does `1` mean when
> `GetValues` is called?
> For input data, I can use an `ArraySpanVisitor` to iterate over
> different input types. But for output data, I don't know how to write
> to the `array_span_mutable()` if it is not a simple c_type.
> For example, I'm implementing a sha256 UDF, whose input is
> `arrow::utf8()` and whose output is `arrow::fixed_size_binary(32)`. Then
> how can I directly write to the out buffers, and what is the actual
> type I should get from `GetValues`?
> Maybe, `auto *out_values =
> out->array_span_mutable()->GetValues<uint8_t *>(1);` and
> `memcpy(*out_values++, some_ptr, 32);`?
>
> --
> ---------------------
> Best Regards,
> Wenbo Hu,
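
For illustration, here is a minimal sketch of the write pattern discussed in the reply: copy each 32-byte value with memcpy and advance the destination pointer by 32 bytes. It uses a plain `std::vector<uint8_t>` as a stand-in for the output data buffer so that it compiles without Arrow. `kWidth`, `WriteFixedSizeValues`, and `FakeDigest` are hypothetical names made up for this example, not Arrow APIs. In a real kernel the starting pointer would come from `out->array_span_mutable()->GetValues<uint8_t>(1)`; the sketch assumes the output span has a zero offset and does not touch the validity buffer.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Width in bytes of one fixed_size_binary(32) value.
constexpr std::size_t kWidth = 32;

// Hypothetical helper (not an Arrow API): copies each 32-byte value into a
// contiguous data buffer. `out_values` plays the role of the uint8_t*
// obtained from GetValues<uint8_t>(1) on the output span.
inline void WriteFixedSizeValues(
    const std::vector<std::array<uint8_t, kWidth>>& values,
    uint8_t* out_values) {
  for (const auto& value : values) {
    // Pass the pointer itself to memcpy; do not dereference it.
    std::memcpy(out_values, value.data(), kWidth);
    // Advance by one whole value (32 bytes), not by one byte.
    out_values += kWidth;
  }
}

// Hypothetical stand-in for a real digest: all 32 bytes set to `fill`.
inline std::array<uint8_t, kWidth> FakeDigest(uint8_t fill) {
  std::array<uint8_t, kWidth> digest;
  digest.fill(fill);
  return digest;
}
```

To try it, allocate `values.size() * kWidth` bytes, pass the buffer's `data()` pointer as `out_values`, and check that value `i` starts at byte `i * kWidth`.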