Re: [DISCUSS] Big Endian support in Arrow (was: Re: [Java] Supporting Big Endian)

2020-10-07 Thread Micah Kornfield
In case anyone wants to comment further, I've opened https://github.com/apache/arrow/pull/8374 to canonicalize the details. On Mon, Sep 28, 2020 at 9:08 PM Micah Kornfield wrote: > OK, I will try to update documentation

Re: [DISCUSS] Big Endian support in Arrow (was: Re: [Java] Supporting Big Endian)

2020-09-28 Thread Micah Kornfield
OK, I will try to update documentation reflecting this in the next few days (in particular, it would be good to document which implementations are willing to support byte flipping). On Tue, Sep 22, 2020 at 3:30 AM Antoine Pitrou wrote: > > > On 22/09/2020 at 06:36, Micah Kornfield wrote: > > I

Re: [DISCUSS] Big Endian support in Arrow (was: Re: [Java] Supporting Big Endian)

2020-09-22 Thread Antoine Pitrou
On 22/09/2020 at 06:36, Micah Kornfield wrote: > I wanted to give this thread a bump, does the proposal I made below sound > reasonable? It does! Regards Antoine. > > On Sun, Sep 13, 2020 at 9:57 PM Micah Kornfield > wrote: > >> If I read the responses so far it seems like the

Re: [DISCUSS] Big Endian support in Arrow (was: Re: [Java] Supporting Big Endian)

2020-09-22 Thread Kazuaki Ishizaki
> Re: [Java] Supporting Big Endian) > > Hi Micah, > > Thanks for your summary. Your proposal sounds reasonable to me. > > Best, > Liya Fan > > > On Tue, Sep 22, 2020 at 1:16 PM Micah Kornfield > wrote: > > > I wanted to give this thread a b

Re: [DISCUSS] Big Endian support in Arrow (was: Re: [Java] Supporting Big Endian)

2020-09-22 Thread Fan Liya
Hi Micah, Thanks for your summary. Your proposal sounds reasonable to me. Best, Liya Fan On Tue, Sep 22, 2020 at 1:16 PM Micah Kornfield wrote: > I wanted to give this thread a bump, does the proposal I made below sound > reasonable? > > On Sun, Sep 13, 2020 at 9:57 PM Micah Kornfield >

Re: [DISCUSS] Big Endian support in Arrow (was: Re: [Java] Supporting Big Endian)

2020-09-21 Thread Micah Kornfield
I wanted to give this thread a bump, does the proposal I made below sound reasonable? On Sun, Sep 13, 2020 at 9:57 PM Micah Kornfield wrote: > If I read the responses so far it seems like the following might be a good > compromise/summary: > > 1. It does not seem too invasive to support native

Re: [DISCUSS] Big Endian support in Arrow (was: Re: [Java] Supporting Big Endian)

2020-09-13 Thread Micah Kornfield
If I read the responses so far it seems like the following might be a good compromise/summary: 1. It does not seem too invasive to support native endianness in implementation libraries, as long as there is appropriate performance testing and CI infrastructure to demonstrate that the changes work. 2.

Re: [DISCUSS] Big Endian support in Arrow (was: Re: [Java] Supporting Big Endian)

2020-08-31 Thread Fan Liya
Thanks to Kazuaki for the survey and to Micah for starting the discussion. I do not oppose supporting BE. In fact, I am in general optimistic about the performance impact (for Java). IMO, this is going to be a painful path (many byte-order-related problems are tricky to debug), so I hope we can

Re: [DISCUSS] Big Endian support in Arrow (was: Re: [Java] Supporting Big Endian)

2020-08-31 Thread Bryan Cutler
I also think this would be a worthwhile addition and help the project expand in more areas. Beyond the Apache Spark optimization use case, having Arrow interoperability with the Python data science stack on BE would be very useful. I have looked at the remaining PRs for Java and they seem pretty

Re: [DISCUSS] Big Endian support in Arrow (was: Re: [Java] Supporting Big Endian)

2020-08-31 Thread Wes McKinney
I think it's well within the rights of an implementation to reject BE data (or non-native-endian), but if an implementation chooses to implement and maintain the endianness conversions, then it does not seem so bad to me. On Mon, Aug 31, 2020 at 3:33 PM Jacques Nadeau wrote: > > And yes, for
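To illustrate the two options described in this message (reject non-native-endian data, or choose to convert it), here is a minimal sketch in Python. It is not Arrow code: the helper `read_int32_values` and its `allow_swap` flag are hypothetical names made up for this illustration, and a plain buffer of 32-bit integers stands in for real Arrow buffers.

```python
import struct
import sys
from typing import List

# Byte order of the machine running this code: "little" or "big".
NATIVE_ENDIANNESS = sys.byteorder


def read_int32_values(raw: bytes, data_endianness: str, allow_swap: bool) -> List[int]:
    """Decode a buffer of 32-bit signed integers (hypothetical helper, not an Arrow API)."""
    if data_endianness not in ("little", "big"):
        raise ValueError(f"unknown endianness: {data_endianness!r}")
    if len(raw) % 4 != 0:
        raise ValueError("buffer length is not a multiple of 4 bytes")

    # Option 1: the implementation refuses non-native-endian data outright.
    if data_endianness != NATIVE_ENDIANNESS and not allow_swap:
        raise NotImplementedError(
            f"{data_endianness}-endian data is not supported on this "
            f"{NATIVE_ENDIANNESS}-endian platform"
        )

    # Option 2: decode with the declared byte order; struct swaps bytes when needed.
    prefix = "<" if data_endianness == "little" else ">"
    return list(struct.unpack(f"{prefix}{len(raw) // 4}i", raw))


# Usage: build a buffer in the non-native byte order and decode it with swapping enabled.
foreign = "big" if NATIVE_ENDIANNESS == "little" else "little"
raw = struct.pack((">" if foreign == "big" else "<") + "3i", 1, 2, -3)
print(read_int32_values(raw, foreign, allow_swap=True))
```

With `allow_swap=False` the same call raises `NotImplementedError`, which corresponds to an implementation that declines to maintain the conversions.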

Re: [DISCUSS] Big Endian support in Arrow (was: Re: [Java] Supporting Big Endian)

2020-08-31 Thread Jacques Nadeau
And yes, for those of you looking closely, I commented on ARROW-245 when it was committed. I just forgot about it. It looks like I had mostly the same concerns then that I do now :) Now I'm just more worried about format sprawl... On Mon, Aug 31, 2020 at 1:30 PM Jacques Nadeau wrote: > What do

Re: [DISCUSS] Big Endian support in Arrow (was: Re: [Java] Supporting Big Endian)

2020-08-31 Thread Jacques Nadeau
> > What do you mean? The Endianness field (a Big|Little enum) was added 4 > years ago: > https://issues.apache.org/jira/browse/ARROW-245 I didn't realize that was done, my bad. Good example of format rot from my pov.

Re: [DISCUSS] Big Endian support in Arrow (was: Re: [Java] Supporting Big Endian)

2020-08-31 Thread Antoine Pitrou
arks on a little-endian platform to avoid performance regression. >>> >>> [1] https://arrow.apache.org/blog/2017/07/26/spark-arrow/ >>> [2] >>> >> https://databricks.com/blog/2017/10/30/introducing-vectorized-udfs-for-pyspark.html >>> [3] >>>

Re: [DISCUSS] Big Endian support in Arrow (was: Re: [Java] Supporting Big Endian)

2020-08-30 Thread Jacques Nadeau
> [5] https://github.com/apache/arrow/pull/7507#discussion_r46819873 > > [6] https://github.com/apache/arrow/pull/7507 > > [7] https://github.com/apache/arrow/pull/7940#issuecomment-672690540 > > > > Best Regards, > > Kazuaki Ishizaki > > > > Wes McKinney wr

Re: [DISCUSS] Big Endian support in Arrow (was: Re: [Java] Supporting Big Endian)

2020-08-30 Thread Micah Kornfield
sday-morning-keynotes > [5] https://github.com/apache/arrow/pull/7507#discussion_r46819873 > [6] https://github.com/apache/arrow/pull/7507 > [7] https://github.com/apache/arrow/pull/7940#issuecomment-672690540 > > Best Regards, > Kazuaki Ishizaki > > Wes McKinney wrote on 2020/08/26 21:

RE: [DISCUSS] Big Endian support in Arrow (was: Re: [Java] Supporting Big Endian)

2020-08-26 Thread Kazuaki Ishizaki
ney > To: dev , Micah Kornfield > Cc: Fan Liya > Date: 2020/08/26 21:28 > Subject: [EXTERNAL] Re: [DISCUSS] Big Endian support in Arrow (was: > Re: [Java] Supporting Big Endian) > > hi Micah, > > I agree with your reasoning. If supporting BE in some languages (e

Re: [DISCUSS] Big Endian support in Arrow (was: Re: [Java] Supporting Big Endian)

2020-08-26 Thread Kazuaki Ishizaki
ornfield > Cc: Fan Liya > Date: 2020/08/26 21:28 > Subject: [EXTERNAL] Re: [DISCUSS] Big Endian support in Arrow (was: > Re: [Java] Supporting Big Endian) > > hi Micah, > > I agree with your reasoning. If supporting BE in some languages (e.g. > Java) is impractical due t

RE: [DISCUSS] Big Endian support in Arrow (was: Re: [Java] Supporting Big Endian)

2020-08-26 Thread Kazuaki Ishizaki
> Date: 2020/08/26 21:28 > Subject: [EXTERNAL] Re: [DISCUSS] Big Endian support in Arrow (was: > Re: [Java] Supporting Big Endian) > > hi Micah, > > I agree with your reasoning. If supporting BE in some languages (e.g. > Java) is impractical due to performance regressions on L

Re: [DISCUSS] Big Endian support in Arrow (was: Re: [Java] Supporting Big Endian)

2020-08-26 Thread Wes McKinney
hi Micah, I agree with your reasoning. If supporting BE in some languages (e.g. Java) is impractical due to performance regressions on LE platforms, then I don't think it's worth it. But if it can be handled at compile time or without runtime overhead, and tested / maintained properly on an

[DISCUSS] Big Endian support in Arrow (was: Re: [Java] Supporting Big Endian)

2020-08-25 Thread Micah Kornfield
I'm expanding the scope of this thread since it looks like work has also started for making golang support BigEndian architectures. I think as a community we should come to a consensus on whether we want to support Big Endian architectures in general. I don't think it is a good outcome if some

Re: [Java] Supporting Big Endian

2020-08-18 Thread Micah Kornfield
My thoughts on the points raised so far: * Does supporting Big Endian increase the reach of Arrow by a lot? Probably not a significant amount, but it does provide one more avenue of adoption. * Does it increase code complexity? Yes. I agree this is a concern. The PR in question did not seem

Re: [Java] Supporting Big Endian

2020-08-16 Thread Fan Liya
Thanks to Kazuaki Ishizaki for working on this. IMO, supporting big-endian would be a large change, as in many places in the code base we have implicitly assumed a little-endian platform (e.g.

Re: [Java] Supporting Big Endian

2020-08-14 Thread Jacques Nadeau
Hey Micah, thanks for starting the discussion. I just skimmed that thread and it isn't entirely clear that there was a conclusion that the overhead was worth it. I think everybody agrees that it would be nice to have the code work on both platforms. On the flip side, the code noise for a rare case

Re: [Java] Supporting Big Endian

2020-08-14 Thread Kazuaki Ishizaki
:36 Subject:[EXTERNAL] [Java] Supporting Big Endian Kazuaki Ishizaki has started working on Big Endian support in Java (including setting up CI for it). Thank you! We previously discussed support for Big Endian architectures in C++ [1] and generally agreed that it was a reasonable thing

[Java] Supporting Big Endian

2020-08-14 Thread Micah Kornfield
Kazuaki Ishizaki has started working on Big Endian support in Java (including setting up CI for it). Thank you! We previously discussed support for Big Endian architectures in C++ [1] and generally agreed that it was a reasonable thing to do. Similar to C++ I think as long as we have a working