Hi Micah,

Thank you. Your proposal also sounds reasonable to me.

Best Regards,
Kazuaki Ishizaki

Fan Liya <liya.fa...@gmail.com> wrote on 2020/09/22 15:51:58:

> From: Fan Liya <liya.fa...@gmail.com>
> To: dev <dev@arrow.apache.org>, Micah Kornfield <emkornfi...@gmail.com>
> Date: 2020/09/22 15:52
> Subject: [EXTERNAL] Re: [DISCUSS] Big Endian support in Arrow (was: Re: [Java] Supporting Big Endian)
>
> Hi Micah,
>
> Thanks for your summary. Your proposal sounds reasonable to me.
>
> Best,
> Liya Fan
>
>
> On Tue, Sep 22, 2020 at 1:16 PM Micah Kornfield <emkornfi...@gmail.com> wrote:
>
> > I wanted to give this thread a bump, does the proposal I made below sound
> > reasonable?
> >
> > On Sun, Sep 13, 2020 at 9:57 PM Micah Kornfield <emkornfi...@gmail.com> wrote:
> >
> > > If I read the responses so far it seems like the following might be a good
> > > compromise/summary:
> > >
> > > 1. It does not seem too invasive to support native endianness in
> > > implementation libraries. As long as there is appropriate performance
> > > testing and CI infrastructure to demonstrate the changes work.
> > > 2. It is up to implementation maintainers if they wish to accept PRs that
> > > handle byte swapping between different architectures. (Right now it sounds
> > > like C++ is potentially OK with it and for Java at least Jacques is opposed
> > > to it?
> > >
> > > Testing changes that break big-endian can be a potential drag on developer
> > > productivity but there are methods to run locally (at least on more recent
> > > OSes).
> > >
> > > Thoughts?
> > >
> > > Thanks,
> > > Micah
> > >
> > > On Mon, Aug 31, 2020 at 7:08 PM Fan Liya <liya.fa...@gmail.com> wrote:
> > >
> > >> Thank Kazuaki for the survey and thank Micah for starting the discussion.
> > >>
> > >> I do not oppose supporting BE. In fact, I am in general optimistic about
> > >> the performance impact (for Java).
> > >> IMO, this is going to be a painful way (many byte order related problems
> > >> are tricky to debug), so I hope we can make it short.
> > >>
> > >> It is good that someone is willing to take this on, and I would like to
> > >> provide help if needed.
> > >>
> > >> Best,
> > >> Liya Fan
> > >>
> > >>
> > >>
> > >> On Tue, Sep 1, 2020 at 7:25 AM Bryan Cutler <cutl...@gmail.com> wrote:
> > >>
> > >> > I also think this would be a worthwhile addition and help the project
> > >> > expand in more areas. Beyond the Apache Spark optimization use case, having
> > >> > Arrow interoperability with the Python data science stack on BE would be
> > >> > very useful. I have looked at the remaining PRs for Java and they seem
> > >> > pretty minimal and straightforward. Implementing the equivalent record
> > >> > batch swapping as done in C++ at [1] would be a little more involved, but
> > >> > still reasonable. Would it make sense to create a branch to apply all
> > >> > remaining changes with CI to get a better picture before deciding on
> > >> > bringing into master branch? I could help out with shepherding this effort
> > >> > and assist in maintenance, if we decide to accept.
> > >> >
> > >> > Bryan
> > >> >
> > >> > [1] https://github.com/apache/arrow/pull/7507
> > >> >
> > >> > On Mon, Aug 31, 2020 at 1:42 PM Wes McKinney <wesmck...@gmail.com> wrote:
> > >> >
> > >> > > I think it's well within the right of an implementation to reject BE
> > >> > > data (or non-native-endian), but if an implementation chooses to
> > >> > > implement and maintain the endianness conversions, then it does not
> > >> > > seem so bad to me.
> > >> > >
> > >> > > On Mon, Aug 31, 2020 at 3:33 PM Jacques Nadeau <jacq...@apache.org> wrote:
> > >> > > >
> > >> > > > And yes, for those of you looking closely, I commented on ARROW-245 when it
> > >> > > > was committed. I just forgot about it.
> > >> > > >
> > >> > > > It looks like I had mostly the same concerns then that I do now :) Now I'm
> > >> > > > just more worried about format sprawl...
> > >> > > >
> > >> > > > On Mon, Aug 31, 2020 at 1:30 PM Jacques Nadeau <jacq...@apache.org> wrote:
> > >> > > >
> > >> > > > > What do you mean? The Endianness field (a Big|Little enum) was added 4
> > >> > > > >> years ago:
> > >> > > > >> https://issues.apache.org/jira/browse/ARROW-245
> > >> > > > >
> > >> > > > >
> > >> > > > > I didn't realize that was done, my bad. Good example of format rot
> > >> > > > > from my pov.