I suspect that relaxing the constraint to native endianness (and
including this in any IPC/RPC metadata, per ARROW-245) will not cause
too many problems. One of the challenges for us will be testing and
continuous integration -- what are the options for running the test
suite on a regular basis on big-endian platforms? I know that in
pandas we occasionally ran into esoteric test failures for the PPC /
big-endian Debian package builds but for the most part there haven't
been any problems.
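
As a rough illustration of what carrying native endianness in the
metadata could look like, here is a minimal Java sketch. The
`Endianness` enum and the `checkReadable` helper are hypothetical names
made up for this example, not existing Arrow APIs. A reader compares
the byte order declared by the producer with its own and rejects a
mismatch instead of byte-swapping.

```java
import java.nio.ByteOrder;

// Sketch only: these names are assumptions for illustration, not Arrow APIs.
final class EndiannessCheck {

    // Hypothetical metadata field a producer would attach to its buffers.
    enum Endianness { LITTLE, BIG }

    // Byte order of the machine this code is running on.
    static Endianness nativeEndianness() {
        return ByteOrder.nativeOrder() == ByteOrder.LITTLE_ENDIAN
                ? Endianness.LITTLE
                : Endianness.BIG;
    }

    // Accept same-endianness data as-is; reject the mixed case
    // rather than attempting any byte swapping.
    static void checkReadable(Endianness producer) {
        Endianness local = nativeEndianness();
        if (producer != local) {
            throw new UnsupportedOperationException(
                    "Producer is " + producer + " endian but this machine is "
                    + local + " endian; byte swapping is not supported");
        }
    }
}
```

With something along these lines, little-to-little and big-to-big
transfers need no conversion, and the mixed case fails loudly until
byte swapping is actually implemented.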

- Wes

On Fri, Aug 5, 2016 at 4:39 AM, Sanjay Rao <getsanjay...@live.com> wrote:
> Some places where there is an explicit check for little endian:
> ./memory/src/main/java/io/netty/buffer/UnsafeDirectLittleEndian.java:
>     if (!NATIVE_ORDER || buf.order() != ByteOrder.BIG_ENDIAN) {
> ./memory/src/main/java/io/netty/buffer/UnsafeDirectLittleEndian.java:
>       throw new IllegalStateException("Arrow only runs on LittleEndian systems.");
> Sanjay
>> From: pchan...@maprtech.com
>> Date: Thu, 4 Aug 2016 17:04:34 -0700
>> Subject: Re: Is there plan to support BigEndian Systems like SUN SPARC 
>> Hardware ?
>> To: dev@arrow.apache.org; emkornfi...@gmail.com
>> CC: jul...@dremio.com
>>
>> Drill's assumption of little endian is in the ValueVector code, and Arrow
>> has inherited the same assertion. (
>> https://github.com/apache/arrow/blob/master/java/memory/src/main/java/io/netty/buffer/UnsafeDirectLittleEndian.java#L58
>> )
>>
>> In the Java implementation, the underlying Netty implementation handles the
>> conversion between endianness fairly well, so potentially this assert can
>> be removed from here and Drill can move this higher up in the Drill code.
>>
>>
>> Parth
>>
>> On Thu, Aug 4, 2016 at 1:14 PM, Micah Kornfield <emkornfi...@gmail.com>
>> wrote:
>>
>> > Hi Julien,
>> > That's the theory.  I don't think that there is anything in the C++ code
>> > base that should break but we don't have access to hardware to verify that.
>> >
>> > The Java Arrow code currently asserts that it is running on a little endian
>> > machine.  I did a very quick scan of the Java code and didn't see anything
>> > there that would break on a big-endian system, but according to at least one
>> > person who is working on Drill, it seems that Drill assumes little
>> > endianness (I don't know if this is in Arrow/ValueVector code or it is
>> > higher up the stack in the Drill code).
>> >
>> > Thanks,
>> > Micah
>> >
>> >
>> > On Thu, Aug 4, 2016 at 11:36 AM, Julien Le Dem <jul...@dremio.com> wrote:
>> >
>> > > So it sounds like right now it just works as long as there is no
>> > > inter-system communication (with different endianness) because both Java
>> > > and C++ code just use the underlying endianness.
>> > > Is that correct?
>> > >
>> > >
>> > > On Thu, Aug 4, 2016 at 11:17 AM, Micah Kornfield <emkornfi...@gmail.com>
>> > > wrote:
>> > >
>> > >> Hi Sanjay,
>> > >> I think we are trying to work that out now.  As you've seen with some of
>> > >> your initial investigation, we have no coverage for big-endian machines
>> > >> yet.  But in the long run, we should be able to make it work (it seems
>> > >> like there might be some difference of opinion on how to make it work).
>> > >>
>> > >> Thanks,
>> > >> Micah
>> > >>
>> > >> On Mon, Aug 1, 2016 at 11:16 AM, Sanjay Rao <getsanjay...@live.com>
>> > >> wrote:
>> > >>
>> > >> > Hi Wes, Hi Micah,
>> > >> > I understood what you meant. So for point 2, Arrow going from a
>> > >> > big-endian machine to a big-endian machine shouldn't be an issue, right?
>> > >> > Please confirm.
>> > >> > Thanks, Sanjay
>> > >> > > From: wesmck...@gmail.com
>> > >> > > Date: Mon, 1 Aug 2016 11:07:07 -0700
>> > >> > > Subject: Re: Is there plan to support BigEndian Systems like SUN SPARC Hardware ?
>> > >> > > To: dev@arrow.apache.org; emkornfi...@gmail.com
>> > >> > >
>> > >> > > hey Micah,
>> > >> > >
>> > >> > > On Mon, Aug 1, 2016 at 11:02 AM, Micah Kornfield <emkornfi...@gmail.com>
>> > >> > > wrote:
>> > >> > > > Hi Wes,
>> > >> > > > The point I was trying to argue from an earlier thread is that the
>> > >> > > > most common cases for relocation are:
>> > >> > > > 1.  Little endian machine to little endian machine (most likely
>> > >> > > > same machine)
>> > >> > > > 2.  big endian machine to big endian machine (most likely same
>> > >> > > > machine)
>> > >> > > > 3.  big endian machine to little endian machine or vice versa
>> > >> > > >
>> > >> > > > The purpose of the metadata would be to make use-cases 1 and 2
>> > >> > > > possible without byte-swapping.  Use case 3 would obviously require
>> > >> > > > byte swapping but for an initial implementation the code could
>> > >> > > > simply indicate that it is not supported.
>> > >> > > >
>> > >> > > > This seems less complex to me than actually implementing any sort
>> > >> > > > of byte-swapping logic while still supporting the widest variety of
>> > >> > > > hardware with the same code for the most common use-cases.
>> > >> > >
>> > >> > > This makes sense. My comments were for the situation that a big
>> > >> > > endian system would be exposing memory to an unknown consumer -- for
>> > >> > > example, if we implemented an RPC wire format for Arrow memory, then
>> > >> > > in general a big endian system would need to send little-endian
>> > >> > > integers to an arbitrary receiver. I'm not sure the best way to
>> > >> > > provide for easy native-endianness support for cases 1/2, but trying
>> > >> > > to fully solve this problem now seems premature until we've
>> > >> > > established some of these tools (so long as we haven't painted
>> > >> > > ourselves into a corner).
>> > >> > >
>> > >> > > - Wes
>> > >> > >
>> > >> > > >
>> > >> > > > Thanks,
>> > >> > > > Micah
>> > >> > > >
>> > >> > > > P.S. If anybody can provide pointers I'd be interested to
>> > >> > > > understand which pieces of the java code make assumptions about
>> > >> > > > little-endianness.
>> > >> >
>> > >> >
>> > >>
>> > >
>> > >
>> > >
>> > > --
>> > > Julien
>> > >
>> >
>
