Well, working with malloc buffers is a separate beast altogether :-) But I
suggest you follow the arrow/buffer.h header.

When we malloc the memory ourselves, the Arrow buffer does not take
ownership of it: our client code manages the lifetime of the allocation and
has to free it once nothing refers to the buffer any more. So we just pass
the start of the allocated memory, and its size, to the Arrow buffer.

I'd do the following.

  int elms = 640;
  // Allocate the specified number of bytes of memory.
  uint8_t *mem_res = (uint8_t *) malloc(elms * sizeof(uint8_t));
  // Construct a mutable buffer from the allocated memory --> use std::make_shared.
  // There's no reason to std::move the uint8_t*; it doesn't serve any purpose.
  auto data_buf = std::make_shared<arrow::MutableBuffer>(mem_res, elms * sizeof(uint8_t));
  // Get a pointer to the buffer's data; this assert should pass.
  assert(mem_res == data_buf->mutable_data());
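
One thing the snippet above leaves implicit is who calls free(). Below is a
minimal sketch of one way to handle that, using only the standard library.
FreeDeleter, MallocBlock, AllocateBlock, BorrowedView and MakeView are
hypothetical names made up for illustration, and BorrowedView is only a
stand-in for a non-owning buffer (pointer plus size), not Arrow's class.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <memory>

// Deleter that releases malloc'd memory with free() instead of delete.
struct FreeDeleter {
  void operator()(void* p) const { std::free(p); }
};

// Owning handle for a malloc'd block (hypothetical helper type).
using MallocBlock = std::unique_ptr<std::uint8_t, FreeDeleter>;

// Allocate nbytes with malloc; the returned handle is empty on failure.
inline MallocBlock AllocateBlock(std::size_t nbytes) {
  return MallocBlock(static_cast<std::uint8_t*>(std::malloc(nbytes)));
}

// Stand-in for a non-owning buffer: it only records a pointer and a size,
// so destroying it never frees the underlying memory.
struct BorrowedView {
  std::uint8_t* data;
  std::int64_t size;
};

// Build a view over the block; the block keeps ownership of the memory.
inline BorrowedView MakeView(const MallocBlock& block, std::int64_t size) {
  assert(block && "allocation failed");
  return BorrowedView{block.get(), size};
}
```

With Arrow, you would pass block.get() and the size to the
arrow::MutableBuffer constructor as in the snippet above, and keep the
MallocBlock alive until the last reference to that buffer is gone.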

On Tue, Sep 28, 2021 at 10:12 AM Weber, Eugene F Jr CIV (USA) <
[email protected]> wrote:

> Hi Niranda,
>
> Thanks. Ultimately what I'm trying to do is use memory that has been
> allocated by another STL. So first I was trying to simply use a created
> Arrow Buffer as an Arrow Array, which your example showed me how to do. Next
> I'm trying to create an Arrow Buffer from malloc'd memory. Finally, I was
> going to try to create an Arrow Buffer from the STL allocated memory. Seems
> a logical learning progression.
>
> I hate to run here for answers. I am researching and reading, but
> increasing my C++ knowledge and learning Arrow is proving to be a slow go.
> (Forehead bloody from pounding it against screen full of compiler errors,
> Lol) So I do greatly appreciate the help.
>
> Using AllocateBuffer returns a Result<std::shared_ptr<Buffer>>
>
>   int elms = 640;
>   auto data_res = arrow::AllocateBuffer(elms * sizeof(uint8_t));
>
> ArrayData::Make requires a shared_ptr, and the Result class allows for the
> error handling your code demonstrates. When I try and implement the
> functionality of the two lines above using Malloc I come up with this:
>
>   int elms = 640;
>   // Allocate specified number of bytes of memory.
>   uint8_t *mem_res = (uint8_t *) malloc(elms * sizeof(uint8_t));
>   // Construct a mutable buffer from the allocated memory.
>   arrow::MutableBuffer data_buf(std::move(mem_res), elms);
>   // Get a pointer to the buffer's data.
>   auto data_res = data_buf.mutable_data();
>
> But the return type from mutable_data() is a raw pointer to the buffer
> data, not a shared_ptr. If I change the last line to try to construct a
> Result class I get an error because of that.
>
>   // auto data_res = data_buf.mutable_data();
>   arrow::Result<std::shared_ptr<arrow::Buffer>> data_res = data_buf.mutable_data();
>
> error: conversion from ‘uint8_t*’ {aka ‘unsigned char*’} to non-scalar
> type ‘arrow::Result<std::shared_ptr<arrow::Buffer> >’ requested
>
> How can I convert the raw pointer to a shared_ptr?
>
> Thanks,
>
> Gene
>
>
> ------------------------------
> *From:* Niranda Perera [[email protected]]
> *Sent:* Monday, September 27, 2021 10:21 AM
> *To:* [email protected]
> *Subject:* Re: [Non-DoD Source] Re: C++ Arrays & Buffers
>
> ------------------------------
>
>
> Furthermore, if you now want to change data in the buffer, you can access
> the data buffer in the ArrayData as follows:
> int64_t* values = array_data->GetMutableValues<int64_t>(1); // 1 indicates the data buffer
>
> On Mon, Sep 27, 2021 at 10:18 AM Niranda Perera <[email protected]> wrote:
>
>> Hi Gene,
>>
>> I didn't quite understand what your requirement is. But let me try to
>> give an example of how buffers can be used to create an ArrayData object.
>> Maybe that could help your case.
>>
>> Say I want to create an Int64 array of 500 elements.
>>
>>   int elms = 500;
>>   auto data_res = arrow::AllocateBuffer(elms * sizeof(int64_t));
>>   if (!data_res.ok()) {
>>     // handle error
>>   }
>>   auto data_buf = std::move(data_res).ValueOrDie();
>>
>>   // Alternatively, you can do this in a function that returns a Status:
>>   //   ARROW_ASSIGN_OR_RAISE(auto data_buf, arrow::AllocateBuffer(elms * sizeof(int64_t)));
>>
>>   // Now, if we don't need a validity buffer (i.e. all elems are valid), we could
>>   // keep the validity buffer as nullptr. Else you could use AllocateBitmap for that.
>>   auto array_data = arrow::ArrayData::Make(arrow::int64(), elms,
>>                                            {nullptr, std::move(data_buf)},
>>                                            /*null_count=*/0,
>>                                            /*offset=*/0);
>>
>>
>> If you want to allocate a buffer yourself, using malloc, you could use
>> this constructor in Buffer:
>> https://github.com/apache/arrow/blob/master/cpp/src/arrow/buffer.h#L58
>>
>> I hope this helps.
>>
>> On Mon, Sep 27, 2021 at 9:28 AM Weber, Eugene F Jr CIV (USA) <[email protected]> wrote:
>>
>>> Hi,
>>>
>>> I'm trying to wrap a buffer inside an ArrayData container as suggested,
>>> but my C++ skills are still a bit weak. I've read through files and tried
>>> numerous inputs for the first parameter to ArrayData::Make, but cannot get
>>> it to compile. I'll keep working on this, but would appreciate any help
>>> offered.
>>>
>>> Thanks,
>>>
>>> Gene
>>> --------
>>>
>>> #include <cstdint>
>>> #include <iostream>
>>> #include <vector>
>>>
>>> #include <arrow/api.h>
>>> #include <arrow/type_fwd.h>
>>> #include <arrow/array/data.h>
>>>
>>> using arrow::BufferBuilder;
>>> using arrow::Buffer;
>>>
>>> int main(int argc, char** argv) {
>>>
>>>   arrow::Result<std::unique_ptr<Buffer>> maybe_buffer =
>>>       arrow::AllocateBuffer(121751207936);
>>>   if (!maybe_buffer.ok()) {
>>>    std::cout << "Error Allocating Buffer from Memory Pool\n";
>>>    // Could check /proc/meminfo for: MemAvailable:   128252780 kB
>>>    exit(1);
>>>   }
>>>
>>>   std::shared_ptr<arrow::Buffer> buffer = *std::move(maybe_buffer);
>>>   int64_t buf_size = buffer->size();
>>>
>>>   arrow::ArrayData adc;
>>>   adc.Make(??????, 1000, buffer, -1, 0);
>>>
>>> //  static std::shared_ptr<ArrayData> Make(std::shared_ptr<DataType> type,
>>> //                                         int64_t length,
>>> //                                         std::vector<std::shared_ptr<Buffer>> buffers,
>>> //                                         int64_t null_count = kUnknownNullCount,
>>> //                                         int64_t offset = 0);
>>>
>>>   return EXIT_SUCCESS;
>>> }
>>> ________________________________________
>>> From: Antoine Pitrou [[email protected]]
>>> Sent: Monday, September 20, 2021 6:00 AM
>>> To: [email protected]
>>> Subject: [Non-DoD Source] Re: C++ Arrays & Buffers
>>>
>>> Hello Eugene,
>>>
>>> On Mon, 20 Sep 2021 09:33:26 +0000
>>> "Weber, Eugene F Jr CIV (USA)" <[email protected]> wrote:
>>> >
>>> > I've gone through the documentation but I'm still unclear about the
>>> > usage of buffers with respect to arrays.
>>> >
>>> > "A Buffer encapsulates a pointer and data size .... Buffers are
>>> > untyped: they simply denote a physical memory area"
>>> >
>>> > "The central type in Arrow is the class arrow::Array. An array
>>> > represents a known-length sequence of values all having the same type.
>>> > Internally, those values are represented by one or several buffers ...."
>>> >
>>> > The Array section then goes on to explain how to build Arrays with the
>>> > ArrayBuilder base class, and concrete subclasses. There doesn't appear to
>>> > be any need to allocate buffer space first. There is also a BufferBuilder
>>> > class. But since there does not appear to be a way to associate a created
>>> > buffer with an array, I don't understand when explicit buffer creation
>>> > would be used?
>>>
>>> The C++ documentation is unfortunately incomplete.  Using ArrayBuilder
>>> subclasses is one way of creating arrays if you want to populate them
>>> with logical values (e.g. int64_t values for an Int64Array).
>>> But you can also create buffers directly and wrap them inside an
>>> ArrayData container:
>>> https://github.com/apache/arrow/blob/master/cpp/src/arrow/array/data.h#L73
>>>
>>> (this is useful if e.g. you want your array to point to *existing*
>>> memory)
>>>
>>> Then you can call MakeArray to get an actual Array subclass from the
>>> ArrayData container:
>>> https://github.com/apache/arrow/blob/master/cpp/src/arrow/array/util.h#L38
>>>
>>> Regards
>>>
>>> Antoine.
>>
>>
>>
>> --
>> Niranda Perera
>> https://niranda.dev/
>> @n1r44 <https://twitter.com/N1R44>
>>
>>
>

-- 
Niranda Perera
https://niranda.dev/
@n1r44 <https://twitter.com/N1R44>
