hi Andrew,

slightly related but probably also slightly off-topic:
(for inspiration) you may want to look at how this is done in groot/rarrow,
which exports tools to:
- expose a ROOT "schema" as an Arrow Schema
- expose a ROOT Tree as an Arrow Table

groot/rarrow doesn't operate zero-copy on ROOT data, though.
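
For reference, here is a minimal sketch of the "value and offset buffers"
layout you mention, as Arrow lays it out for a variable-size list column:
N+1 int32 offsets plus a flat values buffer, both little-endian. It is
neither groot/rarrow code nor the Arrow Java API; the class and method names
(JaggedLayout, offsetsBuffer, valuesBuffer) are hypothetical and it uses only
java.nio.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper, for illustration only.
class JaggedLayout {
    // Offsets buffer: N+1 int32 entries; row i spans
    // values[offsets[i] .. offsets[i+1]).
    static ByteBuffer offsetsBuffer(float[][] rows) {
        ByteBuffer buf = ByteBuffer.allocate(4 * (rows.length + 1))
                                   .order(ByteOrder.LITTLE_ENDIAN);
        int end = 0;
        buf.putInt(end);
        for (float[] row : rows) {
            end += row.length;
            buf.putInt(end);
        }
        buf.flip();
        return buf;
    }

    // Values buffer: all rows concatenated as float32.
    static ByteBuffer valuesBuffer(float[][] rows) {
        int n = 0;
        for (float[] row : rows) {
            n += row.length;
        }
        ByteBuffer buf = ByteBuffer.allocate(4 * n)
                                   .order(ByteOrder.LITTLE_ENDIAN);
        for (float[] row : rows) {
            for (float v : row) {
                buf.putFloat(v);
            }
        }
        buf.flip();
        return buf;
    }

    public static void main(String[] args) {
        float[][] rows = { {1.0f, 2.0f}, {}, {3.0f} };
        ByteBuffer offsets = offsetsBuffer(rows);
        ByteBuffer values = valuesBuffer(rows);
        // Read back row 2 through the offsets.
        int start = offsets.getInt(4 * 2);
        int stop = offsets.getInt(4 * 3);
        for (int i = start; i < stop; i++) {
            System.out.println(values.getFloat(4 * i));
        }
    }
}
```

Buffers coming out of a ROOT file would need to match this layout (element
width, byte order, N+1 offsets starting at 0) before they could be dropped
into an Arrow record batch unchanged.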

hth,
-s

On Thu, Jan 23, 2020 at 2:03 PM Andrew Melo <[email protected]> wrote:

> Hello all,
>
> I work in particle physics, which has standardized on the ROOT (
> http://root.cern) file format to store/process our data. The format
> itself is quite complicated, but the relevant part here is that after
> parsing/decompression, we end up with value and offset buffers holding our
> data.
>
> What I'd like to do is represent these data in-memory in the Arrow format.
> I've written a very rough POC where I manually put an Arrow stream into a
> ByteBuffer, then replaced the values and offset buffers with the bytes from
> my files, and I'm wondering what the "proper" way to do this is. From my
> reading of the code, it appears (?) that what I want to do is produce an
> org.apache.arrow.vector.types.pojo.Schema object and N ArrowRecordBatch
> objects, then use MessageSerializer to stick them into a ByteBuffer one
> after the other.
>
> Is this correct? Or, is there another API I'm missing?
>
> Thanks!
> Andrew
>
