First, make sure you add the -O2 compiler option in godbolt, so that these are actually optimized. If you do that, `direct()` becomes two instructions (on both architectures), while `indirect()` on ARM is still 9 instructions.
It's true that on x86_64, this change will have no negative impact, as you observed. But that's specifically because x86_64 supports unaligned reads and writes, and so on this platform you don't actually need to change anything to support unaligned buffers.

On ARM, your example is generating an out-of-line function call to memcpy. I could be wrong, but I think this will be heavier than you are imagining. There are three issues:

- The function call itself takes several instructions.
- An out-of-line function call will force the compiler to be more conservative about optimizations around it. When a getter is inlined into a larger function body, this could lead to a lot more overhead than is visible in the godbolt example. For example, caller-saved registers used by that outer function would need to be saved and restored around each call.
- The glibc implementation of memcpy() itself needs to be designed to handle any size of memcpy, and is optimized for larger, variable-sized copies, since small fixed copies would normally be inlined. Several branches will be needed even for a small copy.

Here's the code: https://github.com/lattera/glibc/blob/master/string/memcpy.c
And macros it depends on: https://github.com/lattera/glibc/blob/master/sysdeps/generic/memcopy.h

It's hard to say how much effect all this would really have, but it would make me uncomfortable.

But it might not be too hard to convince the compiler to generate a fixed sequence of byte copies, rather than a memcpy call. That could be a lot better. I'm kind of surprised that GCC doesn't optimize it this way automatically, TBH.

BTW it looks like arm64 gets optimized to an unaligned load just like x86_64. So the future seems to be one where we don't need to worry about alignment anymore. Maybe that's a good argument for going ahead with this approach now.
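To make the "fixed sequence of byte copies" idea above a bit more concrete, here is a rough sketch. It is not taken from either godbolt link; the names `load_u32_memcpy` and `load_u32_bytewise` are made up for illustration, and I haven't checked what a given GCC version actually emits for it on a strict-alignment ARM target, so it would need to be verified with -O2 in godbolt.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical getter in the style of the memcpy-based approach: copy four
// bytes out of a possibly unaligned buffer. The result is in host byte order.
inline uint32_t load_u32_memcpy(const unsigned char* p) {
  uint32_t value;
  std::memcpy(&value, p, sizeof(value));
  return value;
}

// Hypothetical alternative: four single-byte loads combined with shifts.
// Byte loads have no alignment requirement, so nothing here needs a library
// call. The value is assembled as little-endian regardless of host byte order.
inline uint32_t load_u32_bytewise(const unsigned char* p) {
  return static_cast<uint32_t>(p[0])
       | (static_cast<uint32_t>(p[1]) << 8)
       | (static_cast<uint32_t>(p[2]) << 16)
       | (static_cast<uint32_t>(p[3]) << 24);
}
```

One difference to keep in mind: the byte-wise version always decodes little-endian, while the memcpy version returns whatever the host byte order gives, so the two only agree on little-endian hosts.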
-Kenton

On Thu, Jan 9, 2020 at 10:03 PM David Renshaw <[email protected]> wrote:

> I want to make it easy and safe for users of capnproto-rust to read messages from unaligned buffers without copying. (See this github issue <https://github.com/capnproto/capnproto-rust/issues/101>.)
>
> Currently, a user must pass their unaligned buffer through unsafe fn bytes_to_words() <https://github.com/capnproto/capnproto-rust/blob/d1988731887b2bbb0ccb35c68b9292d98f317a48/capnp/src/lib.rs#L82-L88>, asserting that they believe their hardware to be okay with unaligned reads. In other words, we require that the user understand some tricky low-level processor details, and that the user preclude their software from running on many platforms.
>
> (With libraries like sqlite, zmq, redis, and many others, there simply is no way to request that a buffer be aligned -- you are just given an array of bytes. You can copy the bytes into an aligned buffer, but that has a performance cost and a complexity cost (who owns the new buffer?).)
>
> I believe that it would be better for capnproto-rust to work natively on unaligned buffers. In fact, I have a work-in-progress branch that achieves this, essentially by changing a bunch of direct memory accesses into tiny memcpy() calls. This c++ godbolt snippet <https://godbolt.org/z/Wki7uy> captures the main idea, and shows that, on x86_64 at least, the extra indirection gets optimized away completely. Indeed, my performance measurements so far support the hypothesis that there will be no performance cost in the x86_64 case. For processors that don't support unaligned access, the extra copy will still be there (e.g. https://godbolt.org/z/qgsGMT), but I hypothesize that it will be fast.
>
> All in all, this change seems to me like a big usability win. So I'm wondering: have I missed anything in the above analysis? Are there good reasons I shouldn't make the change?
>
> - David
