I also think this would be a worthwhile addition and would help the project
expand into more areas. Beyond the Apache Spark optimization use case, having
Arrow interoperability with the Python data science stack on big-endian (BE)
platforms would be very useful. I have looked at the remaining PRs for Java,
and they seem pretty minimal and straightforward. Implementing the equivalent
record batch swapping as done in C++ at [1] would be a little more involved,
but still reasonable. Would it make sense to create a branch that applies all
remaining changes, with CI, to get a better picture before deciding on merging
into the master branch? I could help out with shepherding this effort and
assist with maintenance if we decide to accept it.

Bryan

[1] https://github.com/apache/arrow/pull/7507
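
To illustrate the kind of work involved, here is a minimal Java sketch of the
per-value byte swap that a record batch endianness conversion performs on a
fixed-width (int32) buffer. The class and method names are hypothetical for
illustration only; this is not the actual implementation from [1], which also
has to walk the batch's buffers per field type:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class EndianSwapSketch {
    // Hypothetical helper: read a buffer of 32-bit values in big-endian
    // order and rewrite them in little-endian order (i.e., swap the byte
    // order of each 4-byte value).
    static ByteBuffer swapInt32Buffer(ByteBuffer src) {
        ByteBuffer be = src.duplicate().order(ByteOrder.BIG_ENDIAN);
        ByteBuffer out = ByteBuffer.allocate(be.remaining())
                                   .order(ByteOrder.LITTLE_ENDIAN);
        while (be.hasRemaining()) {
            out.putInt(be.getInt()); // read BE, write LE
        }
        out.flip();
        return out;
    }

    public static void main(String[] args) {
        // Two int32 values serialized big-endian, as they would arrive
        // from a BE producer.
        ByteBuffer be = ByteBuffer.allocate(8).order(ByteOrder.BIG_ENDIAN);
        be.putInt(1).putInt(2).flip();
        ByteBuffer le = swapInt32Buffer(be);
        System.out.println(le.getInt()); // reads back 1 on the LE side
    }
}
```

The real conversion has to dispatch on the schema (different widths for
int16/int64/float64, no swap for validity bitmaps or raw binary data), which
is where most of the extra effort would go.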

On Mon, Aug 31, 2020 at 1:42 PM Wes McKinney <wesmck...@gmail.com> wrote:

> I think it's well within the right of an implementation to reject BE
> data (or non-native-endian), but if an implementation chooses to
> implement and maintain the endianness conversions, then it does not
> seem so bad to me.
>
> On Mon, Aug 31, 2020 at 3:33 PM Jacques Nadeau <jacq...@apache.org> wrote:
> >
> > And yes, for those of you looking closely, I commented on ARROW-245 when
> it
> > was committed. I just forgot about it.
> >
> > It looks like I had mostly the same concerns then that I do now :) Now
> I'm
> > just more worried about format sprawl...
> >
> > On Mon, Aug 31, 2020 at 1:30 PM Jacques Nadeau <jacq...@apache.org>
> wrote:
> >
> > > What do you mean?  The Endianness field (a Big|Little enum) was added 4
> > >> years ago:
> > >> https://issues.apache.org/jira/browse/ARROW-245
> > >
> > >
> > > I didn't realize that was done, my bad. Good example of format rot
> from my
> > > pov.
> > >