Re: [Discuss] IPC Specification, flatbuffers and unaligned memory accesses

Antoine Pitrou Tue, 02 Jul 2019 07:32:00 -0700


I guess I still dont understand how the IPC stream format works :-/


To put it clearly: what happens in Flight?  Will a Flight message
automatically get the "stream continuation message" in front of it?


Le 02/07/2019 à 16:15, Wes McKinney a écrit :
> On Tue, Jul 2, 2019 at 4:23 AM Antoine Pitrou <[email protected]> wrote:
>>
>>
>> Le 02/07/2019 à 00:20, Wes McKinney a écrit :
>>> Thanks for the references.
>>>
>>> If we decided to make a change around this, we could call the first 4
>>> bytes a stream continuation marker to make it slightly less ugly
>>>
>>> * 0xFFFFFFFF: continue
>>> * 0x00000000: stop
>>
>> Do you mean it would be a separate IPC message?
> 
> No, I think this is only about how we could change the message prefix
> from 4 bytes to 8 bytes
> 
> https://github.com/apache/arrow/blob/master/docs/source/format/IPC.rst#encapsulated-message-format
> 
> Currently a 0x00000000 (0 metadata size) is used as an end-of-stream
> marker. So what I was saying is that the first 8 bytes could be
> 
> <4 bytes: stream continuation><int32_t metadata size>
> 
>>
>>
>>>
>>> On Mon, Jul 1, 2019 at 4:35 PM Micah Kornfield <[email protected]> 
>>> wrote:
>>>>
>>>> Hi Wes,
>>>> I'm not an expert on this either, my inclination mostly comes from some 
>>>> research I've done.  I think it is important to distinguish two cases:
>>>> 1.  unaligned access at the processor instruction level
>>>> 2.  undefined behavior
>>>>
>>>> From my reading unaligned access is fine on most modern architectures and 
>>>> it seems the performance penalty has mostly been eliminated.
>>>>
>>>> Undefined behavior is a compiler/language concept.  The problem is the 
>>>> compiler can choose to do anything in UB scenarios, not just the "obvious" 
>>>> translation.  Specifically, the compiler is under no obligation to 
>>>> generate the unaligned access instructions, and if it doesn't SEGVs ensue. 
>>>>  Two examples, both of which relate to SIMD optimizations are linked below.
>>>>
>>>> I tend to be on the conservative side with this type of thing but if we 
>>>> have experts on the the ML that can offer a more informed opinion, I would 
>>>> love to hear it.
>>>>
>>>> [1] http://pzemtsov.github.io/2016/11/06/bug-story-alignment-on-x86.html
>>>> [2] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65709
>>>>
>>>> On Mon, Jul 1, 2019 at 1:41 PM Wes McKinney <[email protected]> wrote:
>>>>>
>>>>> The <0xffffffff><int32_t size> solution is downright ugly but I think
>>>>> it's one of the only ways that achieves
>>>>>
>>>>> * backward compatibility (new clients can read old data)
>>>>> * opt-in forward compatibility (if we want to go to the labor of doing
>>>>> so, sort of dangerous)
>>>>> * old clients receiving new data do not blow up (they will see a
>>>>> metadata length of -1)
>>>>>
>>>>> NB 0xFFFFFFFF <length> would look like:
>>>>>
>>>>> In [13]: np.array([(2 << 32) - 1, 128], dtype=np.uint32)
>>>>> Out[13]: array([4294967295,        128], dtype=uint32)
>>>>>
>>>>> In [14]: np.array([(2 << 32) - 1, 128],
>>>>> dtype=np.uint32).view(np.int32)
>>>>> Out[14]: array([ -1, 128], dtype=int32)
>>>>>
>>>>> In [15]: np.array([(2 << 32) - 1, 128], dtype=np.uint32).view(np.uint8)
>>>>> Out[15]: array([255, 255, 255, 255, 128,   0,   0,   0], dtype=uint8)
>>>>>
>>>>> Flatbuffers are 32-bit limited so we don't need all 64 bits.
>>>>>
>>>>> Do you know in what circumstances unaligned reads from Flatbuffers
>>>>> might cause an issue? I do not know enough about UB but my
>>>>> understanding is that it causes issues on some specialized platforms
>>>>> where for most modern x86-64 processors and compilers it is not really
>>>>> an issue (though perhaps a performance issue)
>>>>>
>>>>> On Sun, Jun 30, 2019 at 6:36 PM Micah Kornfield <[email protected]> 
>>>>> wrote:
>>>>>>
>>>>>> At least on the read-side we can make this detectable by using something 
>>>>>> like <0xffffffff><int32_t size> instead of int64_t.  On the write side 
>>>>>> we would need some sort of default mode that we could flip on/off if we 
>>>>>> wanted to maintain compatibility.
>>>>>>
>>>>>> I should say I think we should fix it.  Undefined behavior is unpaid 
>>>>>> debt that might never be collected or might cause things to fail in 
>>>>>> difficult to diagnose ways. And pre-1.0.0 is definitely the time.
>>>>>>
>>>>>> -Micah
>>>>>>
>>>>>> On Sun, Jun 30, 2019 at 3:17 PM Wes McKinney <[email protected]> wrote:
>>>>>>>
>>>>>>> On Sun, Jun 30, 2019 at 5:14 PM Wes McKinney <[email protected]> 
>>>>>>> wrote:
>>>>>>>>
>>>>>>>> hi Micah,
>>>>>>>>
>>>>>>>> This is definitely unfortunate, I wish we had realized the potential
>>>>>>>> implications of having the Flatbuffer message start on a 4-byte
>>>>>>>> (rather than 8-byte) boundary. The cost of making such a change now
>>>>>>>> would be pretty high since all readers and writers in all languages
>>>>>>>> would have to be changed. That being said, the 0.14.0 -> 1.0.0 version
>>>>>>>> bump is the last opportunity we have to make a change like this, so we
>>>>>>>> might as well discuss it now. Note that particular implementations
>>>>>>>> could implement compatibility functions to handle the 4 to 8 byte
>>>>>>>> change so that old clients can still be understood. We'd probably want
>>>>>>>> to do this in C++, for example, since users would pretty quickly
>>>>>>>> acquire a new pyarrow version in Spark applications while they are
>>>>>>>> stuck on an old version of the Java libraries.
>>>>>>>
>>>>>>> NB such a backwards compatibility fix would not be forward-compatible,
>>>>>>> so the PySpark users would need to use a pinned version of pyarrow
>>>>>>> until Spark upgraded to Arrow 1.0.0. Maybe that's OK
>>>>>>>
>>>>>>>>
>>>>>>>> - Wes
>>>>>>>>
>>>>>>>> On Sun, Jun 30, 2019 at 3:01 AM Micah Kornfield 
>>>>>>>> <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>> While working on trying to fix undefined behavior for unaligned memory
>>>>>>>>> accesses [1], I ran into an issue with the IPC specification [2] which
>>>>>>>>> prevents us from ever achieving zero-copy memory mapping and having 
>>>>>>>>> aligned
>>>>>>>>> accesses (i.e. clean UBSan runs).
>>>>>>>>>
>>>>>>>>> Flatbuffer metadata needs 8-byte alignment to guarantee aligned 
>>>>>>>>> accesses.
>>>>>>>>>
>>>>>>>>> In the IPC format we align each message to 8-byte boundaries.  We then
>>>>>>>>> write a int32_t integer to to denote the size of flat buffer metadata,
>>>>>>>>> followed immediately  by the flatbuffer metadata.  This means the
>>>>>>>>> flatbuffer metadata will never be 8 byte aligned.
>>>>>>>>>
>>>>>>>>> Do people care?  A simple fix  would be to use int64_t instead of 
>>>>>>>>> int32_t
>>>>>>>>> for length.  However, any fix essentially breaks all previous client
>>>>>>>>> library versions or incurs a memory copy.
>>>>>>>>>
>>>>>>>>> [1] https://github.com/apache/arrow/pull/4757
>>>>>>>>> [2] https://arrow.apache.org/docs/ipc.html

Re: [Discuss] IPC Specification, flatbuffers and unaligned memory accesses

Reply via email to