Thanks to Kazuaki for the survey and to Micah for starting the discussion. I do not oppose supporting BE. In fact, I am generally optimistic about the performance impact (for Java). IMO, this is going to be a painful path (many byte-order-related problems are tricky to debug), so I hope we can keep it short.
It is good that someone is willing to take this on, and I would like to provide help if needed.

Best,
Liya Fan

On Tue, Sep 1, 2020 at 7:25 AM Bryan Cutler <cutl...@gmail.com> wrote:

> I also think this would be a worthwhile addition and help the project
> expand in more areas. Beyond the Apache Spark optimization use case, having
> Arrow interoperability with the Python data science stack on BE would be
> very useful. I have looked at the remaining PRs for Java and they seem
> pretty minimal and straightforward. Implementing the equivalent record
> batch swapping as done in C++ at [1] would be a little more involved, but
> still reasonable. Would it make sense to create a branch to apply all
> remaining changes with CI to get a better picture before deciding on
> bringing into master branch? I could help out with shepherding this effort
> and assist in maintenance, if we decide to accept.
>
> Bryan
>
> [1] https://github.com/apache/arrow/pull/7507
>
> On Mon, Aug 31, 2020 at 1:42 PM Wes McKinney <wesmck...@gmail.com> wrote:
>
> > I think it's well within the right of an implementation to reject BE
> > data (or non-native-endian), but if an implementation chooses to
> > implement and maintain the endianness conversions, then it does not
> > seem so bad to me.
> >
> > On Mon, Aug 31, 2020 at 3:33 PM Jacques Nadeau <jacq...@apache.org> wrote:
> >
> > > And yes, for those of you looking closely, I commented on ARROW-245
> > > when it was committed. I just forgot about it.
> > >
> > > It looks like I had mostly the same concerns then that I do now :)
> > > Now I'm just more worried about format sprawl...
> > >
> > > On Mon, Aug 31, 2020 at 1:30 PM Jacques Nadeau <jacq...@apache.org> wrote:
> > >
> > > >> What do you mean? The Endianness field (a Big|Little enum) was
> > > >> added 4 years ago:
> > > >> https://issues.apache.org/jira/browse/ARROW-245
> > > >
> > > > I didn't realize that was done, my bad. Good example of format rot
> > > > from my pov.