Hi Mahmut,

I wonder if you can describe at a higher level what you are doing that
requires so many allocations or rebuilds. The example you provide of
modifying the underlying offset pointer seems a little strange to me, as I
thought one of the architectural goals of those structures was to be
immutable.

It might also help to show/explain some examples of what types of
performance improvements would be enabled.

Depending on what exactly you are doing, I wonder if you could use the Rust
unsafe API for your advanced use cases rather than having to extend arrow
itself.

Andrew

On Thu, Oct 8, 2020 at 9:10 AM vertexclique vertexclique <
vertexcli...@gmail.com> wrote:

> Hi;
>
> Let me start with my aim and how my thinking has evolved.
> Through extensive use of the Arrow API, I've realized that we are doing
> many unnecessary allocations and rebuilds for simple things like offset
> changes. (At least that's what I am doing.)
>
> That said, it is tough to accept the iterator overhead of reconstruction
> and the other extra costs that come with ArrayData and Array
> construction. I see that tests are also very long because of the
> reconstruction of intermediate results.
>
> Use case 1: the code below has no effect:
>
>         std::mem::swap(&mut child_data.offset(), &mut 40);
>
> Because the fields are private, a simple operation like the one above,
> which would enable advanced use cases for developers, is blocked.
>
> I propose the following:
>
> A feature-gate macro would expose the fields, making this possible:
>
>         std::mem::swap(&mut child_data.offset, &mut 40);
>
> The macro will check a feature called `exposed` to enable conditional
> compilation for the fields.
> This can apply to any field. We would also put a disclaimer in the README
> stating that the exposed API shouldn't be used unless you know what you
> are doing.
>
> An important part of this is that it will enable many things from a
> performance perspective, which we can also use internally when the
> exposed feature is enabled.
>
> What do you think of it? If you feel good about it, I want to incorporate
> this into the codebase asap.
>
> Best,
> Mahmut Bulut
>

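For readers following the thread, the two snippets in the quoted message can
be put side by side in a small sketch. This is only an illustration under
stated assumptions: the `ArrayData` below is a hypothetical stand-in with a
single field, not the real arrow type, the function names `swap_via_getter`
and `swap_via_field` are made up for the example, and the proposed macro is
approximated with plain `#[cfg(feature = "exposed")]` attributes.

```rust
// Hypothetical stand-in for arrow's ArrayData; NOT the real type.
mod array {
    pub struct ArrayData {
        // With the (assumed) `exposed` feature the field is public ...
        #[cfg(feature = "exposed")]
        pub offset: usize,
        // ... otherwise it stays private behind the getter.
        #[cfg(not(feature = "exposed"))]
        offset: usize,
    }

    impl ArrayData {
        pub fn new(offset: usize) -> Self {
            ArrayData { offset }
        }

        /// Returns a copy of the offset, not a reference to it.
        pub fn offset(&self) -> usize {
            self.offset
        }
    }
}

use array::ArrayData;

/// "Use case 1": both arguments to `swap` are temporaries, so the
/// stored offset is never touched.
pub fn swap_via_getter(data: &ArrayData) -> usize {
    std::mem::swap(&mut data.offset(), &mut 40);
    data.offset() // still the original value
}

/// Only compiled when the `exposed` feature is on: the swap now
/// writes through to the field itself.
#[cfg(feature = "exposed")]
pub fn swap_via_field(data: &mut ArrayData) -> usize {
    std::mem::swap(&mut data.offset, &mut 40);
    data.offset()
}
```

Built without the feature, `swap_via_getter(&ArrayData::new(0))` leaves the
offset at 0 and any direct access to `data.offset` from outside the module is
a compile error. With the feature enabled, `swap_via_field` becomes available
and the field can be overwritten in place, which is also why the immutability
concern raised in the reply applies to it.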