> If we go forward with these changes, it would be a good
> opportunity for us to clarify in our docs/website that the "Arrow format"
> is not a single thing.

The idea of using Arrow as a common memory format for interchange between
C/C++ implementations makes lots of sense to me.

What if we took a middle ground with a StringView that could hold either 1)
an inlined string, 2) an offset, or 3) an arbitrary pointer, and left the
arbitrary pointer "implementation defined"?

That way Arrow would still have the same IPC / ABI, though implementations
other than C/C++ would likely reject the arbitrary pointer representation as
"not implemented".
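
To make the idea concrete, here is a rough Rust sketch of what I have in
mind. Everything in it is hypothetical: the names (StringView, ViewError,
resolve, inline), the 12-byte inline capacity, and the field layout are my
own assumptions for illustration, not a proposed Arrow layout.

```rust
// Hypothetical sketch only: names, field sizes, and layout are assumptions
// made for illustration and are not part of any Arrow specification.

/// A string view with the three representations discussed above.
#[derive(Debug)]
pub enum StringView {
    /// 1) Short string stored inline (12 bytes is an assumed capacity).
    Inline { len: u8, bytes: [u8; 12] },
    /// 2) Offset and length into a shared data buffer.
    Offset { offset: usize, len: usize },
    /// 3) Arbitrary pointer whose meaning is "implementation defined".
    RawPointer { ptr: *const u8, len: usize },
}

#[derive(Debug, PartialEq)]
pub enum ViewError {
    NotImplemented,
    OutOfBounds,
    InvalidUtf8,
}

/// Build an inline view, or None if the string does not fit.
pub fn inline(s: &str) -> Option<StringView> {
    if s.len() > 12 {
        return None;
    }
    let mut bytes = [0u8; 12];
    bytes[..s.len()].copy_from_slice(s.as_bytes());
    Some(StringView::Inline { len: s.len() as u8, bytes })
}

/// Resolve a view to a &str the way a safety-conscious implementation
/// might: inline and offset views are bounds-checked, while the raw
/// pointer form is rejected rather than dereferenced.
pub fn resolve<'a>(view: &'a StringView, buffer: &'a [u8]) -> Result<&'a str, ViewError> {
    let bytes: &'a [u8] = match view {
        StringView::Inline { len, bytes } => {
            let n = usize::from(*len);
            bytes.get(..n).ok_or(ViewError::OutOfBounds)?
        }
        StringView::Offset { offset, len } => {
            let end = offset.checked_add(*len).ok_or(ViewError::OutOfBounds)?;
            buffer.get(*offset..end).ok_or(ViewError::OutOfBounds)?
        }
        StringView::RawPointer { .. } => return Err(ViewError::NotImplemented),
    };
    std::str::from_utf8(bytes).map_err(|_| ViewError::InvalidUtf8)
}
```

With this shape, the inline and offset cases stay fully checked, and only
the pointer case needs an implementation-specific decision about whether
to trust the producer.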

Andrew

To be clear, I am fairly sure the Rust implementation could consume this
new format (by assuming the pointers are valid), but I think it is very
unlikely to produce the new format, given Rust's memory safety model.

On Thu, Dec 23, 2021 at 11:59 AM Neal Richardson <
neal.p.richard...@gmail.com> wrote:

> > I think in this particular case, we should consider the C ABI /
> > in-memory representation and IPC format as separate beasts. If an
> > implementation of Arrow does not want to use this string-view array
> > type at all (for example, if it created memory safety issues in Rust),
> > then it can choose to convert to the existing string array
> > representation when receiving a C ABI payload. Whether or not there is
> > an alternate IPC format for this data type seems like a separate
> > question -- my preference actually would be to support this for
> > in-memory / C ABI use but not to alter the IPC format.
> >
>
> I think this idea deserves some clarification or at least more exposition.
> On first reading, it was not clear to me that we might add things to the
> in-memory Arrow format but not IPC, that that was even an option. I'm
> guessing I'm not the only one who missed that.
>
> If these new types are only part of the Arrow in-memory format, then it's
> not the case that reading/writing IPC files involves no serialization
> overhead. I recognize that that's technically already the case since IPC
> supports compression now, but it's not generally how we talk about the
> relationship between the IPC and in-memory formats (see our own FAQ [1],
> for example). If we go forward with these changes, it would be a good
> opportunity for us to clarify in our docs/website that the "Arrow format"
> is not a single thing.
>
> Neal
>
> [1]: https://arrow.apache.org/faq/
>
