On 04.12.2016 14:14, Luca Boccassi wrote:
> 
>> But on how to fix this. Given [0] you seem to be just passing out the
>> pointer to your internal buffer written without padding out to the user
>> via zmq_msg_data (?)
>>
>> The documentation of that function states that you must not access
>> zmq_msg_t directly, so if nobody actually does so regardless, zmq can
>> change this structure and stay compatible?
>> If so can zmq insert alignment padding between the message headers and
>> the payload so zmq_msg_data returns aligned data?
> 
> This is an interesting suggestion so I had a look, but unfortunately I
> don't think it can be done.
> 
> The pointer returned points directly to the buffer as it was written by
> the TCP/IPC socket read. Since it's being written as it is received,
> there is no way at that stage to know what's header and what's payload,
> and insert padding. The actual decoding of the data is done at a later
> stage, when it's too late to get away with just adjusting pointers.
> 

Thanks for the explanation. It is probably better to focus on fixing
tango then.

> 
>> This would be very good for compatibility, on non-x86 arches it might
>> even be better for performance. Unaligned access can be very slow on
>> some of the less powerful cpus.
>>
>> (Also even on x86 you can get into alignment issues due to these
>> buffers, in particular with numerical applications where
>> auto-vectorization by the compilers is involved)
>>
>> [0] http://lists.zeromq.org/pipermail/zeromq-dev/2016-November/031096.html
> 
> Ultimately, performance-wise, this change from Jens allows one less
> malloc and copy per message (it became effectively a zero-copy receive
> from the point of view of userspace), and I'd be extremely surprised if
> the cost of unalignment would outweigh the gains.
> 

You have to consider that when the payload consists of a lot of data
with alignment requirements, e.g. lots of integers or floats, every
access to the payload buffer is unaligned. Where unaligned access is
expensive, you might be better off copying to an aligned buffer than
working on this buffer directly.
But then on x86 unaligned access is almost free, so it most likely is a
net plus in performance on the majority of systems.
A benchmark on e.g. ARM, should this platform be relevant to ZMQ, would
be interesting, to see whether adjusting payload alignment for a
potential ZMQ 3.1 is worth looking at.
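
To illustrate the "copy to an aligned buffer" option on the application
side, here is a minimal sketch in C. The function name
copy_payload_aligned is hypothetical, and it assumes the payload is a
plain array of doubles in host byte order; it is not taken from zmq or
tango.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: copy a possibly unaligned payload of doubles into
 * a freshly allocated buffer. malloc() returns memory suitably aligned
 * for any fundamental type, so the result can be indexed as double[].
 * Trailing bytes that do not make up a whole double are ignored.
 * Returns NULL on allocation failure; the caller frees the result.
 */
double *copy_payload_aligned(const void *payload, size_t size, size_t *count)
{
    size_t n = size / sizeof(double);

    /* Allocate at least one element so an empty payload is not
     * confused with an allocation failure. */
    double *out = malloc((n ? n : 1) * sizeof(double));
    if (out == NULL)
        return NULL;

    /* memcpy has no alignment requirement on its source pointer. */
    if (n > 0)
        memcpy(out, payload, n * sizeof(double));

    *count = n;
    return out;
}
```

With libzmq the arguments would presumably be zmq_msg_data(&msg) and
zmq_msg_size(&msg). This costs one malloc and one copy per message, so
it only pays off where unaligned access really is expensive.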

cheers,
Julian Taylor
