Re: Architectures where unaligned access is (not) OK?
* Riku Voipio <riku.voi...@iki.fi>, 2014-11-24, 11:07:
>>> Sparc is definitely not ok. For evidence, see #721617: liblo was trying to fetch a double from a 4-byte aligned address. Experience with liblo shows that other architectures are just fine (or at least are just slower) with this type of unalignment.
>>
>> IME, sparc buildds are the only ones where unaligned access in a build-time test suite triggers SIGBUS/SIGSEGV.
>
> On armel and armhf you can enable SIGBUS on unaligned access with:
>
>     echo 4 > /proc/cpu/alignment

Can we get this enabled on buildds?

-- 
Jakub Wilk

-- 
To UNSUBSCRIBE, email to debian-devel-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: https://lists.debian.org/20150525114220.ga8...@jwilk.net
Re: Architectures where unaligned access is (not) OK?
Usually when you're loading an unaligned value you're also loading a particular representation and endianness, which might not be the native one, so I've often written code like this:

    uint32_t load32(void *p)
    {
        unsigned char *c = p;
        return c[0] | (uint32_t)c[1] << 8 | (uint32_t)c[2] << 16 | (uint32_t)c[3] << 24;
    }

(The first of the three casts isn't really necessary ...)

Unfortunately, GCC doesn't seem to turn that into a single load instruction even when it could be. (Why not?)

If you know that you want the native representation and endianness there's this possibility:

    uint32_t load32(void *p)
    {
        uint32_t r;
        memcpy(&r, p, sizeof(r));
        return r;
    }

GCC turns that into a single load instruction on i386, amd64 and arm64.

Edmund

Archive: https://lists.debian.org/CAHDciUexgfHwWX_ijWrC1NWqhX7sc_3ktGwU8Ya4Ntjq_UX=f...@mail.gmail.com
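A minimal sketch of the two approaches above, exercised on a deliberately misaligned pointer. The names load32_le and load32_native are made up for illustration; the parameters are const-qualified here, which the original snippets were not.

```c
#include <stdint.h>
#include <string.h>

/* Little-endian load: result does not depend on host byte order,
 * and p may have any alignment because only bytes are read. */
static uint32_t load32_le(const void *p)
{
    const unsigned char *c = p;
    return (uint32_t)c[0] | (uint32_t)c[1] << 8 |
           (uint32_t)c[2] << 16 | (uint32_t)c[3] << 24;
}

/* Native-endian load: memcpy makes no alignment assumption about p,
 * and compilers commonly lower a constant-size memcpy to plain loads
 * where the target allows it. */
static uint32_t load32_native(const void *p)
{
    uint32_t r;
    memcpy(&r, p, sizeof r);
    return r;
}
```

Calling either function on `buf + 1` for some `unsigned char buf[5]` is well defined on every architecture; only the native variant's result depends on the host's endianness.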
Re: Architectures where unaligned access is (not) OK?
On Fri, Nov 21, 2014 at 06:01:11PM +0100, Jakub Wilk wrote:
> * Felipe Sateler <fsate...@debian.org>, 2014-11-21, 14:04:
>> Sparc is definitely not ok. For evidence, see #721617: liblo was trying to fetch a double from a 4-byte aligned address. Experience with liblo shows that other architectures are just fine (or at least are just slower) with this type of unalignment.
>
> IME, sparc buildds are the only ones where unaligned access in a build-time test suite triggers SIGBUS/SIGSEGV.

On armel and armhf you can enable SIGBUS on unaligned access with:

    echo 4 > /proc/cpu/alignment

Riku

Archive: https://lists.debian.org/20141124090723.ga20...@afflict.kos.to
Re: Architectures where unaligned access is (not) OK?
On Sat, Nov 22, 2014 at 11:03:16PM +0100, Bastien ROUCARIES wrote:
> On Sat, Nov 22, 2014 at 10:37 PM, brian m. carlson <sand...@crustytoothpaste.net> wrote:
>> On Sat, Nov 22, 2014 at 09:42:43PM +0100, Jakub Wilk wrote:
>>> * Henrique de Moraes Holschuh <h...@debian.org>, 2014-11-21, 17:34:
>>>> i386: __asm__("pushf\norl $0x40000,(%esp)\npopf");
>>>
>>> It works! Actually, it works so well it makes puts("hello world") die with SIGBUS. :-(
>>
>> Yeah, that's my experience, too. glibc is not alignment check-safe on i386 and amd64. If you turn it on in an LD_PRELOAD using _init, it segfaults before main.
>
> I have improved https://wiki.debian.org/ArchitectureSpecificsMemo
> feel free to add to it.

Thanks for the updates. However, I'm not sure the current "unaligned accesses" letters make things clear:

    Y=Yes, O=Often (generally OK but may fail in some specific cases), T=Traps (may be fixed up by the kernel, super slow), M=Maybe (generally not OK), N=No (raises SIGBUS). See details below.

The M/T for armel/armhf/arm64 is ambiguous and misleading. Better to make it clear:

    armel: no
    armhf: yes (exceptions as listed by Leif)
    arm64: yes (exceptions as listed by Leif)

And then link to Leif's mail rather than copying the text unattributed onto an already long wiki page.

Riku

Archive: https://lists.debian.org/20141124093825.gb20...@afflict.kos.to
Re: Architectures where unaligned access is (not) OK?
On Fri, Nov 21, 2014 at 02:40:01PM +0000, Wookey wrote:
> Whilst that is nice and correct I'm not sure the meaning is clear to a software engineer, which is: Don't do unaligned access on arm64. It's always inefficient, sometimes extremely inefficient, and sometimes won't work at all.

That's an exaggeration. libav/ffmpeg for example has:

    fast_unaligned_if_any="aarch64 ppc x86"
    armv[6-8]*) enable fast_clz fast_unaligned ;;

So the developers consider unaligned accesses a faster option than aligned ones on armv6+. Feel free to benchmark with and without the flag :)

I can understand the pain CPU/cache designers have in implementing unaligned access, but the reality is that ARMv7+ supports unaligned accesses quite well, and this sometimes allows software engineers to write cleaner and more maintainable code.

This doesn't mean that unaligned accesses always make sense. There are plenty of unaligned accesses that could be converted to cleaner code that does aligned accesses...

> arm64 simply doesn't permit unaligned access in the core. Some types of access are wrapped (in hardware) so that 2 accesses will get done and you are returned the relevant pieces of the result, but this doesn't apply to all access types and is obviously slower/energy inefficient.

If your data is split over alignment boundaries, you will need multiple memory accesses anyway.

Riku

Archive: https://lists.debian.org/20141124103056.gc20...@afflict.kos.to
Re: Architectures where unaligned access is (not) OK?
On Mon, Nov 24, 2014 at 10:38 AM, Riku Voipio <riku.voi...@iki.fi> wrote:
> On Sat, Nov 22, 2014 at 11:03:16PM +0100, Bastien ROUCARIES wrote:
>> On Sat, Nov 22, 2014 at 10:37 PM, brian m. carlson <sand...@crustytoothpaste.net> wrote:
>>> On Sat, Nov 22, 2014 at 09:42:43PM +0100, Jakub Wilk wrote:
>>>> * Henrique de Moraes Holschuh <h...@debian.org>, 2014-11-21, 17:34:
>>>>> i386: __asm__("pushf\norl $0x40000,(%esp)\npopf");
>>>>
>>>> It works! Actually, it works so well it makes puts("hello world") die with SIGBUS. :-(
>>>
>>> Yeah, that's my experience, too. glibc is not alignment check-safe on i386 and amd64. If you turn it on in an LD_PRELOAD using _init, it segfaults before main.
>>
>> I have improved https://wiki.debian.org/ArchitectureSpecificsMemo
>> feel free to add to it.
>
> Thanks for the updates. However, I'm not sure the current "unaligned accesses" letters make things clear:
>
>     Y=Yes, O=Often (generally OK but may fail in some specific cases), T=Traps (may be fixed up by the kernel, super slow), M=Maybe (generally not OK), N=No (raises SIGBUS). See details below.

I am open to suggestions.

> The M/T for armel/armhf/arm64 is ambiguous and misleading. Better to make it clear:
>
> armel: no

Does "no" mean SIGBUS, or a trap?

> armhf: yes (exceptions as listed by Leif)

So it is "often", like i386.

> arm64: yes (exceptions as listed by Leif)

"Often" as well.

> And then link to Leif's mail rather than copying the text unattributed onto an already long wiki page.

OK, I will add a link. But I think the page will grow with FPU stuff...

> Riku

Archive: https://lists.debian.org/cae2spaa-sbo8j8rpnvo2ogqzcftwr4_ztardoxtytemqvac...@mail.gmail.com
Re: Architectures where unaligned access is (not) OK?
On 21/11/14 13:31, Bernhard R. Link wrote:
> Otherwise that memory might afterwards be regarded as lzo_memops_TU2_struct

lzo_memops_TU2_struct is declared with __attribute__((__may_alias__)), so actually the right thing should be happening WRT aliasing in this case.

On 21/11/14 13:21, Thorsten Glaser wrote:
> • for i386 and especially amd64, all subarchitectures supported by Debian/Linux jessie suffer so much from unaligned access, speed-wise, that it’s worth the overhead of forcing aligned access (i386, i486 maybe were not as badly affected)

I was hoping this statement was correct, because if it was, avoiding unaligned accesses would be a clear win regardless, and the right thing to do would be entirely uncontroversial.

Unfortunately, on my x86-64 laptop, my patched liblzo2 with -DLZO_CFG_NO_UNALIGNED on all architectures seems to be half as fast as the unpatched one for a simple test-case (uncompress linux_3.17.orig.tar.xz to linux_3.17.orig.tar in a tmpfs, "time lzop -c linux_3.17.orig.tar > /dev/null", repeat 3 times; results agree within 10%).

I'm trying out a slightly different approach: keeping the unaligned accesses via casts like *(uint16_t *) on architectures where lzodefs.h specifically allows them, but disabling the casts via struct { char[n] } conditional on alignof(that struct) == 1, which seem to be the problematic ones.

The CPUs for which lzodefs.h uses those casts are amd64, arm* conditional on target CPU (so armel but not armhf in Debian terms), arm64, cris, i386, m68k conditional on target CPU (__mc68020__ but not __mcoldfire__), powerpc* if big-endian, and s390*.

    S

Archive: https://lists.debian.org/54721f74.7010...@debian.org
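For readers unfamiliar with the attribute mentioned above, here is a hedged sketch of one common GCC/Clang idiom for unaligned 16-bit access. It is not lzo's actual code: the struct and function names are invented, and both attributes are compiler extensions rather than standard C.

```c
#include <stdint.h>

/* GCC/Clang extension, not ISO C:
 *   packed    -> the struct (and its member) has alignment 1
 *   may_alias -> accesses through this type are exempt from
 *                type-based alias analysis */
struct unaligned_u16 {
    uint16_t v;
} __attribute__((__packed__, __may_alias__));

/* Native-endian 16-bit load from an arbitrarily aligned address. */
static uint16_t load16_native(const void *p)
{
    return ((const struct unaligned_u16 *)p)->v;
}

/* Native-endian 16-bit store to an arbitrarily aligned address. */
static void store16_native(void *p, uint16_t x)
{
    ((struct unaligned_u16 *)p)->v = x;
}
```

On targets that allow unaligned loads the compiler can emit a single instruction for these; on strict-alignment targets it is expected to fall back to byte accesses.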
Re: Architectures where unaligned access is (not) OK?
On 23/11/14 17:55, Simon McVittie wrote:
> Unfortunately, on my x86-64 laptop, my patched liblzo2 with -DLZO_CFG_NO_UNALIGNED on all architectures seems to be half as fast as the unpatched one
[...]
> I'm trying out a slightly different approach: keeping the unaligned accesses via casts like *(uint16_t *) on architectures where lzodefs.h specifically allows them, but disabling the casts via struct { char[n] } conditional on alignof(that struct) == 1, which seem to be the problematic ones.

That fixed the performance regression on amd64 while still working correctly on armv5tel, so I've uploaded it as a DELAYED/7 NMU. See https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=757037 for nmudiff.

If anyone has better ideas, I'm happy to cancel the delayed upload and let someone take over fixing the bug.

    S

Archive: https://lists.debian.org/54725b89.5020...@debian.org
Re: Architectures where unaligned access is (not) OK?
On 23.11.2014 23:11, Simon McVittie wrote:
> On 23/11/14 17:55, Simon McVittie wrote:
>> Unfortunately, on my x86-64 laptop, my patched liblzo2 with -DLZO_CFG_NO_UNALIGNED on all architectures seems to be half as fast as the unpatched one
> [...]
>> I'm trying out a slightly different approach: keeping the unaligned accesses via casts like *(uint16_t *) on architectures where lzodefs.h specifically allows them, but disabling the casts via struct { char[n] } conditional on alignof(that struct) == 1, which seem to be the problematic ones.
>
> That fixed the performance regression on amd64 while still working correctly on armv5tel, so I've uploaded it as a DELAYED/7 NMU. See https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=757037 for nmudiff.
>
> If anyone has better ideas, I'm happy to cancel the delayed upload and let someone take over fixing the bug.

What works well is just replacing the offending memory loads with a memcpy call. As the size of the memcpy call is constant, the compiler will take care of emitting code appropriate for the platform. It doesn't speed up the code on the trapping arches, but at least it does not penalize the ones where unaligned access is allowed.

Archive: https://lists.debian.org/54725fea.8050...@googlemail.com
Re: Architectures where unaligned access is (not) OK?
On 23/11/14 22:30, Julian Taylor wrote:
> what works well is just replacing the offending memory loads with the memcpy call. As the size of the memcpy call is constant the compiler will take care of emitting code appropriate for the platform.

Ah, even better; the timing on x86-64 comes out the same. I'll cancel the NMU and retry.

    S

Archive: https://lists.debian.org/547262e4.40...@debian.org
Re: Architectures where unaligned access is (not) OK?
On Sun, 2014-11-23 at 23:30 +0100, Julian Taylor wrote:
> On 23.11.2014 23:11, Simon McVittie wrote:
>> On 23/11/14 17:55, Simon McVittie wrote:
>>> Unfortunately, on my x86-64 laptop, my patched liblzo2 with -DLZO_CFG_NO_UNALIGNED on all architectures seems to be half as fast as the unpatched one
>> [...]
>>> I'm trying out a slightly different approach: keeping the unaligned accesses via casts like *(uint16_t *) on architectures where lzodefs.h specifically allows them, but disabling the casts via struct { char[n] } conditional on alignof(that struct) == 1, which seem to be the problematic ones.
>>
>> That fixed the performance regression on amd64 while still working correctly on armv5tel, so I've uploaded it as a DELAYED/7 NMU. See https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=757037 for nmudiff.
>>
>> If anyone has better ideas, I'm happy to cancel the delayed upload and let someone take over fixing the bug.
>
> what works well is just replacing the offending memory loads with the memcpy call.
[...]

That is not necessarily true, e.g. in this function

    void copy_foo(struct foo *dst, const struct foo *src)
    {
        memcpy(dst, src, sizeof(*dst));
    }

the compiler is still allowed to assume that src has the proper alignment for struct foo and to optimise the memcpy() accordingly. And yes, this is something that gcc really does.

Pointers to an unaligned instance of a structure generally need to be declared as void *, char * or unsigned char * (or const-qualified versions).

Ben.

-- 
Ben Hutchings
Never put off till tomorrow what you can avoid all together.
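A minimal sketch of the point about pointer types, assuming a made-up struct foo (the thread never shows its definition) and made-up helper names. The possibly-unaligned side is always a byte pointer, so the compiler has no alignment to assume for it.

```c
#include <string.h>

/* Hypothetical structure standing in for "struct foo" above. */
struct foo {
    int id;
    double value;
};

/* Read a struct foo from bytes at any alignment. Only dst carries
 * the struct type, and dst really is properly aligned. */
static void load_foo(struct foo *dst, const unsigned char *src)
{
    memcpy(dst, src, sizeof *dst);
}

/* Write a struct foo to bytes at any alignment. */
static void store_foo(unsigned char *dst, const struct foo *src)
{
    memcpy(dst, src, sizeof *src);
}
```

The important detail is that a misaligned address is never converted to struct foo * at all; it stays unsigned char * (or void *) from the buffer to the memcpy call.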
Re: Architectures where unaligned access is (not) OK?
On 23/11/14 22:54, Ben Hutchings wrote:
> in this function
>
>     void copy_foo(struct foo *dst, const struct foo *src)
>     {
>         memcpy(dst, src, sizeof(*dst));
>     }
>
> the compiler is still allowed to assume that src has the proper alignment for struct foo and to optimise the memcpy() accordingly.

I don't *think* lzo relies on that; the struct assignment I mentioned in a previous mail is part of its fallback implementation of what is basically 1-, 2-, 4- and 8-byte memcpy. The arguments seem to be unsigned char * in practice.

liblzo2 seems to be one of these codebases that bases its idea of how C works on portability folklore and the assumption that the compiler and standard C library are the most naive implementation possible :-(

    S

Archive: https://lists.debian.org/5472686e.5040...@debian.org
Re: Architectures where unaligned access is (not) OK?
* Henrique de Moraes Holschuh <h...@debian.org>, 2014-11-21, 17:34:
> i386: __asm__("pushf\norl $0x40000,(%esp)\npopf");

It works! Actually, it works so well it makes puts("hello world") die with SIGBUS. :-(

-- 
Jakub Wilk

Archive: https://lists.debian.org/20141122204243.ga1...@jwilk.net
Re: Architectures where unaligned access is (not) OK?
On Sat, Nov 22, 2014 at 09:42:43PM +0100, Jakub Wilk wrote:
> * Henrique de Moraes Holschuh <h...@debian.org>, 2014-11-21, 17:34:
>> i386: __asm__("pushf\norl $0x40000,(%esp)\npopf");
>
> It works! Actually, it works so well it makes puts("hello world") die with SIGBUS. :-(

Yeah, that's my experience, too. glibc is not alignment check-safe on i386 and amd64. If you turn it on in an LD_PRELOAD using _init, it segfaults before main.

-- 
brian m. carlson / brian with sandals: Houston, Texas, US
+1 832 623 2791 | http://www.crustytoothpaste.net/~bmc | My opinion only
OpenPGP: RSA v4 4096b: 88AC E9B2 9196 305B A994 7552 F1BA 225C 0223 B187
Re: Architectures where unaligned access is (not) OK?
On Sat, Nov 22, 2014 at 10:37 PM, brian m. carlson <sand...@crustytoothpaste.net> wrote:
> On Sat, Nov 22, 2014 at 09:42:43PM +0100, Jakub Wilk wrote:
>> * Henrique de Moraes Holschuh <h...@debian.org>, 2014-11-21, 17:34:
>>> i386: __asm__("pushf\norl $0x40000,(%esp)\npopf");
>>
>> It works! Actually, it works so well it makes puts("hello world") die with SIGBUS. :-(
>
> Yeah, that's my experience, too. glibc is not alignment check-safe on i386 and amd64. If you turn it on in an LD_PRELOAD using _init, it segfaults before main.

I have improved https://wiki.debian.org/ArchitectureSpecificsMemo
Feel free to add to it.

Bastien

Archive: https://lists.debian.org/CAE2SPAZzY4ubyQS=L+822=hfPRn5=wu18vxed8jphiteauj...@mail.gmail.com
Re: Architectures where unaligned access is (not) OK?
* Simon McVittie:
> - OK: any-i386, any-amd64

SSE2 is part of amd64 and i386, and has strict alignment requirements. This is why stack alignment bugs in the toolchain are usually fatal. (We still support SSE2-less i386 installations, I think, but some libraries will use SSE2 when available.)

i386 also supports alignment checking. It used to be possible to run quite a bit of code with that flag switched on, but nowadays, glibc string functions use unaligned accesses in some cases because on current implementations, they are faster than the alternatives, so this way of debugging alignment issues no longer works.

> - not OK: armel

Many architectures take a significant performance hit. Usually, this is because unaligned accesses are emulated in a kernel trap (which can be switched off to debug these performance issues, hence the differences in system behavior). Some older i386/amd64 implementations have relatively costly unaligned access, too, but only on the order of a couple of cycles, not the hundreds or thousands that kernel emulation will require.

It is very difficult to write correct C code which uses unaligned pointers because they are an aliasing violation as well.

Archive: https://lists.debian.org/87k32mfl41@mid.deneb.enyo.de
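Since hardware alignment checking is no longer a practical debugging aid, one alternative is to test addresses explicitly before taking an aligned-only code path. The helper below is only a sketch; is_aligned is an invented name, and it assumes that converting a pointer to uintptr_t yields the numeric address, which holds on mainstream platforms but is implementation-defined in C.

```c
#include <stddef.h>
#include <stdint.h>

/* Nonzero if p is a multiple of align.
 * align must be a power of two (1, 2, 4, 8, 16, ...). */
static int is_aligned(const void *p, size_t align)
{
    return ((uintptr_t)p & (uintptr_t)(align - 1)) == 0;
}
```

A caller might use `is_aligned(p, 16)` to choose between an aligned SSE2 path and a generic fallback, or assert on it in a test suite to catch misaligned pointers on architectures that would otherwise hide them.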
Architectures where unaligned access is (not) OK?
A couple of questions for people who know low-level things:

* Of Debian's architectures (official and otherwise), which ones are known/defined/designed to be OK with unaligned accesses from user-space, and which ones (can be configured to) crash or give wrong answers?

* Would it be safer to assume that future architectures are in the "unaligned accesses are OK" category, or the "not OK" category?

The ones I know for sure are:

- OK: any-i386, any-amd64
- not OK: armel

I believe powerpc, s390 and arm64 might be in the "OK" category, and mips* and sparc in the "not OK" category.

I've seen conflicting information about which category armhf is in: on #757037, Marc Kleine-Budde said that ARMv6 and up guarantee that unaligned access is fine, but I've also found Ubuntu bug reports about unaligned access failures on armhf.

The context is that https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=757037 describes lzo2 failing to start up on armel due to unaligned memory accesses. lzo2 has a cpp macro, LZO_CFG_NO_UNALIGNED, which can be defined to stop it doing clever things with casting pointers.

If the maintainer doesn't object (or fix the bug of course), I intend to NMU lzo2 to use that macro on at least armel; I would like to sanity-check whether I should be using a blacklist or whitelist approach, and which architectures other than armel should be on the blacklist, or which architectures other than x86 should be on the whitelist.

Relatedly, if we have

    typedef struct lzo_memops_TU2_struct {
        unsigned char a[2];
    } lzo_memops_TU2;

    *(lzo_memops_TU2 *) (void *) dest = *(const lzo_memops_TU2 *) (void *) src;

is gcc within its rights to optimize that into an aligned 16-bit load and an aligned 16-bit store, even though alignof(lzo_memops_TU2) == 1; or should gcc be emitting pessimistic byte-by-byte code for that? (Unfortunately this is itself a slight simplification, because this part of lzo2 is a maze of twisty typedefs all different.)

Thanks,
    S

Archive: https://lists.debian.org/546f333a.9050...@debian.org
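For comparison with the struct-assignment trick asked about above, a hedged sketch of a two-byte copy that does not depend on the alignment of the struct type. The typedef is the one from the message; copy2 and tu2_alignment are invented names, and _Alignof requires C11.

```c
#include <stddef.h>
#include <string.h>

typedef struct lzo_memops_TU2_struct {
    unsigned char a[2];
} lzo_memops_TU2;

/* Alignment the ABI assigns to the struct. It is 1 on most ABIs,
 * but ISO C does not guarantee that for a struct of chars. */
static size_t tu2_alignment(void)
{
    return _Alignof(lzo_memops_TU2);
}

/* Copy exactly two bytes; neither pointer needs any alignment,
 * and no lzo_memops_TU2 lvalue is ever formed. */
static void copy2(void *dst, const void *src)
{
    memcpy(dst, src, sizeof(lzo_memops_TU2));
}
```

Because the size is a compile-time constant, compilers typically turn the memcpy into one 16-bit load and store where the target permits unaligned access, and into byte accesses where it does not.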
Re: Architectures where unaligned access is (not) OK?
On Fri, 21 Nov 2014, Simon McVittie wrote:
> failing to start up on armel due to unaligned memory accesses. lzo2 has a cpp macro, LZO_CFG_NO_UNALIGNED which can be defined to stop it doing clever things with casting pointers.
>
> If the maintainer doesn't object

Please define this macro unconditionally:

• “clever” things with pointers are often Undefined Behaviour™ and GCC is a repeat offender at optimising these into brokenness; LLVM also takes advantage of the C standard here

• for i386 and especially amd64, all subarchitectures supported by Debian/Linux jessie suffer so much from unaligned access, speed-wise, that it’s worth the overhead of forcing aligned access (i386, i486 maybe were not as badly affected)

• it’s good to have the same codepath independent of the arch compiled for, instead of slightly differing across arches

Thanks,
//mirabilos
-- 
Sometimes they [people] care too much: pretty printers [and syntax highlighting, d.A.] mechanically produce pretty output that accentuates irrelevant detail in the program, which is as sensible as putting all the prepositions in English text in bold font.
    -- Rob Pike in "Notes on Programming in C"

Archive: https://lists.debian.org/alpine.deb.2.11.1411211418570.1...@tglase.lan.tarent.de
Re: Architectures where unaligned access is (not) OK?
On Fri, Nov 21, 2014 at 12:42:34PM +0000, Simon McVittie wrote:
> A couple of questions for people who know low-level things:
>
> * Of Debian's architectures (official and otherwise), which ones are known/defined/designed to be OK with unaligned accesses from user-space, and which ones (can be configured to) crash or give wrong answers?

Not that I would be such a person, but people who are have created and published a memorandum at: https://wiki.debian.org/ArchitectureSpecificsMemo

"Please help enhancing it by adding lines and filling tables." -- Wikipage

Best regards

David Kalnischkies
Re: Architectures where unaligned access is (not) OK?
* Simon McVittie <s...@debian.org> [141121 13:42]:
> A couple of questions for people who know low-level things:
>
> * Of Debian's architectures (official and otherwise), which ones are known/defined/designed to be OK with unaligned accesses from user-space, and which ones (can be configured to) crash or give wrong answers?
[...]
> The ones I know for sure are:
>
> - OK: any-i386, any-amd64

Are you sure about those, especially amd64? AFAIK some newer instructions (I think something about vector code) have much tighter requirements than the old 80386 modus operandi of "everything might be unaligned, it is just slower then".

> The context is that https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=757037 describes lzo2 failing to start up on armel due to unaligned memory accesses. lzo2 has a cpp macro, LZO_CFG_NO_UNALIGNED which can be defined to stop it doing clever things with casting pointers.

'clever things with casting pointers' sounds like someone thought C was an assembler and not a language with quite tight requirements on what you are allowed to do.

If something does 'clever things' I strongly recommend compiling it with -O0. Compiler optimisation is only supposed to preserve the behaviour of conforming code; with 'clever' things everything is possible.

> If the maintainer doesn't object (or fix the bug of course), I intend to NMU lzo2 to use that macro on at least armel; I would like to sanity-check whether I should be using a blacklist or whitelist approach, and which architectures other than armel should be on the blacklist, or which architectures other than x86 should be on the whitelist.
>
> Relatedly, if we have
>
>     typedef struct lzo_memops_TU2_struct {
>         unsigned char a[2];
>     } lzo_memops_TU2;
>
>     *(lzo_memops_TU2 *) (void *) dest = *(const lzo_memops_TU2 *) (void *) src;
>
> is gcc within its rights to optimize that into an aligned 16-bit load and an aligned 16-bit store, even though alignof(lzo_memops_TU2) == 1; or should gcc be emitting pessimistic byte-by-byte code for that?

With the *(const lzo_memops_TU2 *) (void *) src you tell the compiler that src is pointing to a memory address that is a lzo_memops_TU2_struct. In a case where it is not, gcc is free to assume that this code path is never intended to be executed and may replace it in any way it sees fit to cope with other code paths (luckily replacing the code with 'system("rm -rf /");' is unlikely to happen, though that would be totally legit for a compiler in cases where src points to anything else).

For dest things are a bit more complicated. If the alignment is wrong gcc might replace the code with anything it wants. Otherwise that memory might afterwards be regarded as lzo_memops_TU2_struct, and accessing it as anything else is fair game for the compiler to assume it is dead code.

Bernhard R. Link
-- 
F8AC 04D5 0B9B 064B 3383 C3DA AFFC 96D1 151D FFDC

Archive: https://lists.debian.org/20141121133100.ga8...@client.brlink.eu
Re: Architectures where unaligned access is (not) OK?
https://wiki.debian.org/ArchitectureSpecificsMemo

Archive: https://lists.debian.org/546f3ec4.5030...@zoho.com
Re: Architectures where unaligned access is (not) OK?
I thought there was a flag bit you could set on x86 that causes unaligned access to trap there too.

--Sam

Archive: https://lists.debian.org/0149d2a9901a-acd65aff-6bd6-41af-9767-5df9484e76b1-000...@email.amazonses.com
Re: Architectures where unaligned access is (not) OK?
On Fri, 21 Nov 2014 12:42:34 +0000, Simon McVittie wrote:
> A couple of questions for people who know low-level things:
>
> * Of Debian's architectures (official and otherwise), which ones are known/defined/designed to be OK with unaligned accesses from user-space, and which ones (can be configured to) crash or give wrong answers?
>
> * Would it be safer to assume that future architectures are in the "unaligned accesses are OK" category, or the "not OK" category?
>
> The ones I know for sure are:
>
> - OK: any-i386, any-amd64
> - not OK: armel
>
> I believe powerpc, s390 and arm64 might be in the "OK" category, and mips* and sparc in the "not OK" category.

Sparc is definitely not ok. For evidence, see #721617: liblo was trying to fetch a double from a 4-byte aligned address.

Experience with liblo shows that other architectures are just fine (or at least are just slower) with this type of unalignment.

-- 
Saludos,
Felipe Sateler

Archive: https://lists.debian.org/m4ngps$85h$1...@ger.gmane.org
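To make the failure mode concrete: a double normally wants 8-byte alignment, so dereferencing a double * that is only 4-byte aligned can raise SIGBUS on sparc. The following is a sketch of a portable way to fetch such a value; it is not liblo's actual fix, and load_double is an invented name.

```c
#include <string.h>

/* Fetch a native-format double from an address of any alignment.
 * The copy goes through bytes, so no aligned 8-byte load is implied. */
static double load_double(const void *p)
{
    double d;
    memcpy(&d, p, sizeof d);
    return d;
}
```

Passing a pointer that is 4-byte but not 8-byte aligned, such as offset 4 into an 8-byte-aligned buffer, is fine here, whereas `*(const double *)p` on the same address is undefined behaviour.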
Re: Architectures where unaligned access is (not) OK?
On Fri, Nov 21, 2014 at 12:42:34PM +0000, Simon McVittie wrote:
> A couple of questions for people who know low-level things:
>
> * Of Debian's architectures (official and otherwise), which ones are known/defined/designed to be OK with unaligned accesses from user-space, and which ones (can be configured to) crash or give wrong answers?
>
> * Would it be safer to assume that future architectures are in the "unaligned accesses are OK" category, or the "not OK" category?
>
> The ones I know for sure are:
>
> - OK: any-i386, any-amd64
> - not OK: armel
>
> I believe powerpc, s390 and arm64 might be in the "OK" category, and mips* and sparc in the "not OK" category.
>
> I've seen conflicting information about which category armhf is in: on #757037, Marc Kleine-Budde said that ARMv6 and up guarantee that unaligned access is fine, but I've also found Ubuntu bug reports about unaligned access failures on armhf.

So, the short answer for the ARM case is that ARMv6 and up guarantee that the processor can be configured such that _some_ unaligned accesses are fine (or at least just inefficient). Linux by default enables this mode, so:

- All <=32-bit load/store operations can be unaligned.
- NEON load-stores can be unaligned (unless an explicit alignment is specified in the encoding).

But:

- VFP load/stores must be naturally aligned.
- Load-multiple/store-multiple and ldrd/strd do not work with <32-bit alignment, but can sometimes get fixed up by the kernel _at_a_substantial_performance_penalty_.
- PUSH/POP usually requires 32-bit alignment, but then the ABI requires 64-bit alignment of the stack, so that is less of an issue.
- Unaligned load-exclusive/store-exclusive operations are not supported.
- No unaligned accesses are permitted to memory regions of Device or Strongly-ordered type. But some permutation of this will be true for most architectures, and in user-space this would only affect things prodding /dev/mem or otherwise mmaping from a device driver.

ARM64 similarly makes it possible to trap unaligned accesses, but if that is not enabled (which it isn't by Linux), everything apart from load-exclusive/store-exclusive, load-acquire/store-release and Device memory accesses will be handled by hardware.

There are additional restrictions with regards to guarantees about the atomicity of memory accesses, but I won't go into that now since it is a bit of a special case.

/
    Leif

Archive: https://lists.debian.org/20141121140055.gk22...@bivouac.eciton.net
Re: Architectures where unaligned access is (not) OK?
+++ Leif Lindholm [2014-11-21 14:00 +]: On Fri, Nov 21, 2014 at 12:42:34PM +, Simon McVittie wrote: A couple of questions for people who know low-level things: * Of Debian's architectures (official and otherwise), which ones are known/defined/designed to be OK with unaligned accesses from user-space, and which ones (can be configured to) crash or give wrong answers? * Would it be safer to assume that future architectures are in the unaligned accesses are OK category, or the not OK category? It would be safer to assume that unaligned accesses are a bad plan and will always come with a performace hit if they work at all. So, the short answer for the ARM case is that ARMv6 and up guarantee that the processor can be configured such that_some_ unaligned accesses are fine (or at least just inefficient). snip nice complete summary High-level summary for software engineers: Don't do unaligned access on armhf or armel. It's always inefficient, sometimes extremely inefficent, and in various cases is not permitted at all. ARM64 similarly makes it possible to trap unaligned accesses, but if that is not enabled (which it isn't by Linux), everything apart from load-exclusive/store-exclusive, load-acquire/store-release and Device memory accesses will be handled by hardware. Whilst that is nice and correct I'm not sure the meaning is clear to a software engineer, which is: Don't do unaligned access on arm64. It's always inefficient, sometimes extremely inefficient, and sometimes won't work at all. arm64 simply doesn't permit unaligned access in the core. Some types of access are wrapped (in hardware) so that 2 access will get done and you are returned the relevant pieces of the result, but this doesn't apply to all access types and is obviously slower/energy inefficent. If the kernel ends up fixing up unaligned accesses that's impresively inefficient, as on other arches. 
Wookey -- Principal hats: Linaro, Debian, Wookware, ARM http://wookware.org/ -- Archive: https://lists.debian.org/20141121144001.gj27...@stoneboat.aleph1.co.uk
Re: Architectures where unaligned access is (not) OK?
* Felipe Sateler fsate...@debian.org, 2014-11-21, 14:04: Sparc is definitely not ok. For evidence, see #721617: liblo was trying to fetch a double from a 4-byte aligned address. Experience with liblo shows that other architectures are just fine (or at least are just slower) with this type of unalignment. IME, sparc buildds are the only ones where unaligned access in a build-time test suite triggers SIGBUS/SIGSEGV. Now that sparc is no longer a release architecture, these kinds of problems are likely to go unnoticed. :-( -- Jakub Wilk -- Archive: https://lists.debian.org/20141121170111.ga4...@jwilk.net
Re: Architectures where unaligned access is (not) OK?
On Fri, Nov 21, 2014 at 6:01 PM, Jakub Wilk jw...@debian.org wrote: * Felipe Sateler fsate...@debian.org, 2014-11-21, 14:04: Sparc is definitely not ok. For evidence, see #721617: liblo was trying to fetch a double from a 4-byte aligned address. Experience with liblo shows that other architectures are just fine (or at least are just slower) with this type of unalignment. IME, sparc buildds are the only ones where unaligned access in a build-time test suite triggers SIGBUS/SIGSEGV. Now that sparc is no longer a release architecture, these kinds of problems are likely to go unnoticed. :-( Sparc was also really good at detecting FPU errors, so this is a loss for scientific software. s390 (without the x) was also good at detecting implicit casts between ptrdiff_t and size_t (see for instance http://savannah.gnu.org/bugs/?36883). Bastien -- Archive: https://lists.debian.org/cae2spaab14vbouzndyrnqnp7hlpfpruxdcrkvbxnt_fr2yn...@mail.gmail.com
Re: Architectures where unaligned access is (not) OK?
On 21.11.2014 18:01, Jakub Wilk wrote: * Felipe Sateler fsate...@debian.org, 2014-11-21, 14:04: Sparc is definitely not ok. For evidence, see #721617: liblo was trying to fetch a double from a 4-byte aligned address. Experience with liblo shows that other architectures are just fine (or at least are just slower) with this type of unalignment. IME, sparc buildds are the only ones where unaligned access in a build-time test suite triggers SIGBUS/SIGSEGV. Now that sparc is no longer a release architecture, these kinds of problems are likely to go unnoticed. :-( The not-yet-released GCC 5.0 has -fsanitize=alignment, which can tell you about unaligned accesses on all arches; it can be quite useful if you care about this. -- Archive: https://lists.debian.org/546f82e8.7060...@googlemail.com
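To make the -fsanitize=alignment suggestion concrete, here is a minimal sketch in C. Both function names are made up for illustration. load_u32_cast is the kind of access the sanitizer is meant to flag when it is handed a misaligned pointer; load_u32_memcpy is the portable replacement. The assumption is a GCC (5 or later) or Clang build with -fsanitize=alignment, which should then report the first function at run time if p is not 4-byte aligned.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: dereferences p as a uint32_t directly.
 * This is undefined behaviour in C when p is not suitably aligned,
 * and is the pattern -fsanitize=alignment is expected to report. */
uint32_t load_u32_cast(const void *p)
{
    return *(const uint32_t *)p;
}

/* Hypothetical helper: copies the bytes into an aligned local.
 * Valid for any alignment of p; compilers commonly turn this into a
 * single load on targets where unaligned loads are cheap. */
uint32_t load_u32_memcpy(const void *p)
{
    uint32_t v;
    assert(p != NULL);
    memcpy(&v, p, sizeof v);
    return v;
}
```

Only the memcpy variant is safe to call on arbitrary offsets into a byte buffer; the cast variant is fine only when the caller can prove alignment.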
Re: Architectures where unaligned access is (not) OK?
On 21/11/14 13:21, Thorsten Glaser wrote: On Fri, 21 Nov 2014, Simon McVittie wrote: failing to start up on armel due to unaligned memory accesses. lzo2 has a cpp macro, LZO_CFG_NO_UNALIGNED, which can be defined to stop it doing clever things with casting pointers. Please define this macro unconditionally: [followed by a bunch of reasons that look sensible] Thanks. Based on this and the similar feedback from Bernhard and Wookey, I'm going to propose an NMU that does that. Thanks, S -- Archive: https://lists.debian.org/546f8448.7010...@debian.org
Re: Architectures where unaligned access is (not) OK?
On Fri, 2014-11-21 at 18:01:11 +0100, Jakub Wilk wrote: * Felipe Sateler fsate...@debian.org, 2014-11-21, 14:04: Sparc is definitely not ok. For evidence, see #721617: liblo was trying to fetch a double from a 4-byte aligned address. Experience with liblo shows that other architectures are just fine (or at least are just slower) with this type of unalignment. IME, sparc buildds are the only ones where unaligned access in a build-time test suite triggers SIGBUS/SIGSEGV. Now that sparc is no longer a release architecture, these kinds of problems are likely to go unnoticed. :-( Generating SIGBUS should be configurable at least on ia64, hppa, powerpc and alpha. Most of those are not available in Debian anymore. :( See PR_SET_UNALIGN in prctl(2). At least ia64 used to have a prctl(8) tool. Thanks, Guillem -- Archive: https://lists.debian.org/20141121183004.ga21...@gaara.hadrons.org
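For reference, the PR_SET_UNALIGN call mentioned above could be used from a test harness roughly as follows. This is a sketch only: the function name is hypothetical, it assumes Linux with the <sys/prctl.h> header, and on architectures that do not implement the control (x86, for example) the call is expected to fail and the function simply reports that.

```c
#include <assert.h>
#include <sys/prctl.h>

/* Hypothetical helper: ask the kernel to deliver SIGBUS on unaligned
 * user-space accesses for the calling thread.
 * Returns 0 if the kernel accepted the request, -1 otherwise
 * (including on architectures without this control). */
int request_sigbus_on_unaligned(void)
{
#if defined(PR_SET_UNALIGN) && defined(PR_UNALIGN_SIGBUS)
    unsigned int mode = 0;

    if (prctl(PR_SET_UNALIGN, PR_UNALIGN_SIGBUS, 0, 0, 0) != 0)
        return -1;

    /* Read the setting back; if the read succeeds, the SIGBUS bit
     * should now be set. */
    if (prctl(PR_GET_UNALIGN, &mode, 0, 0, 0) == 0)
        assert(mode & PR_UNALIGN_SIGBUS);
    return 0;
#else
    return -1;
#endif
}
```

Calling this at the start of a test suite's main() would, where supported, turn silent fixups into a visible crash.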
Re: Architectures where unaligned access is (not) OK?
On Fri, 21 Nov 2014, Sam Hartman wrote: I thought there was a flag bit you could set on x86 that causes unaligned access to trap there too. 1. CR0.AM must be set. 2. Ask For The Pain! i386: __asm__("pushf\norl $0x40000,(%esp)\npopf"); x86-64: __asm__("pushf\norl $0x40000,(%rsp)\npopf"); -- One disk to rule them all, One disk to find them. One disk to bring them all and in the darkness grind them. In the Land of Redmond where the shadows lie. -- The Silicon Valley Tarot Henrique Holschuh -- Archive: https://lists.debian.org/20141121193440.ga14...@khazad-dum.debian.net
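As a hedged illustration of the EFLAGS trick above, the following sketch sets the AC bit (bit 18 of EFLAGS, mask 0x40000), reads the flags back, and clears the bit again, all inside one asm block. The function name is hypothetical, and the code assumes GCC or Clang on x86-64; other builds just return -1. The two lea instructions step over the 128-byte red zone of the System V x86-64 ABI, which a bare pushf inside inline asm could otherwise overwrite. AC is deliberately not left set, because library routines may perform unaligned accesses of their own.

```c
#include <stdint.h>

#define EFLAGS_AC UINT64_C(0x40000)   /* alignment-check flag, bit 18 */

/* Hypothetical helper: returns 1 if AC was observed set after being
 * requested, 0 if it was not, -1 if this build is not x86-64 GNU C. */
int eflags_ac_roundtrip(void)
{
#if defined(__x86_64__) && defined(__GNUC__)
    uint64_t flags_while_set;

    __asm__ volatile(
        "lea -128(%%rsp), %%rsp\n\t"     /* skip the red zone */
        "pushfq\n\t"
        "orq $0x40000, (%%rsp)\n\t"      /* set AC in the saved flags */
        "popfq\n\t"
        "pushfq\n\t"
        "popq %0\n\t"                    /* read EFLAGS back */
        "pushfq\n\t"
        "andq $~0x40000, (%%rsp)\n\t"    /* clear AC again */
        "popfq\n\t"
        "lea 128(%%rsp), %%rsp"          /* restore the stack pointer */
        : "=r"(flags_while_set)
        :
        : "memory", "cc");

    return (flags_while_set & EFLAGS_AC) != 0;
#else
    return -1;
#endif
}
```

A real unit test would keep AC set only around the code under test and clear it before calling back into libc.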
Re: Architectures where unaligned access is (not) OK?
On Fri, Nov 21, 2014 at 12:42:34PM +, Simon McVittie wrote: A couple of questions for people who know low-level things: * Of Debian's architectures (official and otherwise), which ones are known/defined/designed to be OK with unaligned accesses from user-space, and which ones (can be configured to) crash or give wrong answers?

This is what OpenSSL uses:

    #define STRICT_ALIGNMENT 1
    #ifndef PEDANTIC
    #if defined(__i386) || defined(__i386__) || \
        defined(__x86_64) || defined(__x86_64__) || \
        defined(_M_IX86) || defined(_M_AMD64) || defined(_M_X64) || \
        defined(__aarch64__) || \
        defined(__s390__) || defined(__s390x__)
    # undef STRICT_ALIGNMENT
    #endif
    #endif

I assume that's the list they ended up with, not having had issues with it so far. I think I've seen one issue or another on almost all the architectures, even if the wiki claims it works.

man prctl says: PR_SET_UNALIGN (Only on: ia64, since Linux 2.3.48; parisc, since Linux 2.6.15; PowerPC, since Linux 2.6.18; Alpha, since Linux 2.6.22)

Kurt -- Archive: https://lists.debian.org/20141121220349.ga13...@roeckx.be
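A switch like the STRICT_ALIGNMENT macro quoted above is typically used to pick between a word-at-a-time fast path and a byte-wise fallback. The sketch below shows one way that could look. It is not OpenSSL's actual code: the xor_bytes function is hypothetical, the architecture list is shortened, and the fast path goes through memcpy so that it stays well-defined C for any alignment.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical switch in the spirit of the macro quoted above. */
#define STRICT_ALIGNMENT 1
#if defined(__i386__) || defined(__x86_64__) || defined(__aarch64__) || \
    defined(__s390__) || defined(__s390x__)
# undef STRICT_ALIGNMENT
#endif

/* out[i] = a[i] ^ b[i] for 0 <= i < len; any alignment is accepted. */
void xor_bytes(unsigned char *out, const unsigned char *a,
               const unsigned char *b, size_t len)
{
    size_t i = 0;

    assert(out != NULL && a != NULL && b != NULL);

#ifndef STRICT_ALIGNMENT
    /* Fast path: one machine word per iteration, only compiled in
     * where unaligned word accesses are believed to be cheap. */
    for (; i + sizeof(size_t) <= len; i += sizeof(size_t)) {
        size_t x, y;
        memcpy(&x, a + i, sizeof x);
        memcpy(&y, b + i, sizeof y);
        x ^= y;
        memcpy(out + i, &x, sizeof x);
    }
#endif

    /* Byte-wise tail, and the whole loop on strict-alignment targets. */
    for (; i < len; i++)
        out[i] = a[i] ^ b[i];
}
```

The result is the same on every target; the macro only decides whether the word-sized loop is worth compiling in.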
Re: Architectures where unaligned access is (not) OK?
❦ 21 November 2014 17:34 -0200, Henrique de Moraes Holschuh h...@debian.org : I thought there was a flag bit you could set on x86 that causes unaligned access to trap there too. 1. CR0.AM must be set. 2. Ask For The Pain! i386: __asm__("pushf\norl $0x40000,(%esp)\npopf"); x86-64: __asm__("pushf\norl $0x40000,(%rsp)\npopf"); That's pretty interesting for unit tests. Is one of those flags restored during context switch so that it could affect only a selected process? -- # Okay, what on Earth is this one supposed to be used for? 2.4.0 linux/drivers/char/cp437.uni
Re: Architectures where unaligned access is (not) OK?
On Fri, 21 Nov 2014, Vincent Bernat wrote: ❦ 21 November 2014 17:34 -0200, Henrique de Moraes Holschuh h...@debian.org : I thought there was a flag bit you could set on x86 that causes unaligned access to trap there too. 1. CR0.AM must be set. 2. Ask For The Pain! i386: __asm__("pushf\norl $0x40000,(%esp)\npopf"); x86-64: __asm__("pushf\norl $0x40000,(%rsp)\npopf"); That's pretty interesting for unit tests. Is one of those flags restored during context switch so that it could affect only a selected process? I never tried it. I suggest asking around on LKML. Also, read about Intel SMAP first. I have never looked into it, but since it has an AC flag and adds the STAC and CLAC instructions, it might have overloaded or repurposed EFLAGS.AC. Make sure the stuff in the vDSO won't object to it as well: it runs without a context switch, so it would #AC. Too bad unaligned access checks have not made it into valgrind memcheck yet (first proposed in 2011!)... -- One disk to rule them all, One disk to find them. One disk to bring them all and in the darkness grind them. In the Land of Redmond where the shadows lie. -- The Silicon Valley Tarot Henrique Holschuh -- Archive: https://lists.debian.org/20141121230736.ga5...@khazad-dum.debian.net