Re: Architectures where unaligned access is (not) OK?

2015-05-25 Thread Jakub Wilk

* Riku Voipio riku.voi...@iki.fi, 2014-11-24, 11:07:
Sparc is definitely not ok. For evidence, see #721617, liblo was 
trying to fetch a double from a 4-byte aligned address. Experience 
with liblo shows that other architectures are just fine (or at least 
are just slower) with this type of unalignment.


IME, sparc buildds are the only ones where unaligned access in a
build-time test suite triggers SIGBUS/SIGSEGV.


On armel and armhf you can enable SIGBUS on unaligned access with:

echo 4 > /proc/cpu/alignment


Can we get this enabled on buildds?

--
Jakub Wilk


--
To UNSUBSCRIBE, email to debian-devel-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: https://lists.debian.org/20150525114220.ga8...@jwilk.net



Re: Architectures where unaligned access is (not) OK?

2014-12-01 Thread Edmund Grimley Evans
Usually when you're loading an unaligned value you're also loading a
particular representation and endianness, which might not be the
native one, so I've often written code like this:

uint32_t load32(void *p)
{
  unsigned char *c = p;
  return
    c[0] | (uint32_t)c[1] << 8 | (uint32_t)c[2] << 16 | (uint32_t)c[3] << 24;
}

(The first of the three casts isn't really necessary ...)

Unfortunately, GCC doesn't seem to turn that into a single load
instruction even when it could be. (Why not?)

If you know that you want the native representation and endianness
there's this possibility:

uint32_t load32(void *p)
{
  uint32_t r;
  memcpy(&r, p, sizeof(r));
  return r;
}

GCC turns that into a single load instruction on i386, amd64 and arm64.
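
For comparison, the two variants above can be written out as one self-contained sketch. The names load32_le and load32_native and the const-qualified parameters are illustrative additions, not taken from any library:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Read a 32-bit little-endian value from any address, aligned or not.
 * Only byte accesses are used, so no alignment is assumed. */
static uint32_t load32_le(const void *p)
{
    const unsigned char *c = p;
    assert(c != NULL);
    return (uint32_t)c[0] | (uint32_t)c[1] << 8 |
           (uint32_t)c[2] << 16 | (uint32_t)c[3] << 24;
}

/* Read a 32-bit value in native byte order from any address.
 * memcpy() with a constant size leaves the choice of load
 * instruction(s) to the compiler. */
static uint32_t load32_native(const void *p)
{
    uint32_t r;
    assert(p != NULL);
    memcpy(&r, p, sizeof(r));
    return r;
}
```

Whether either function compiles down to a single load depends on the compiler version and target, so it is worth checking the generated assembly rather than assuming it.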

Edmund


-- 
Archive: 
https://lists.debian.org/CAHDciUexgfHwWX_ijWrC1NWqhX7sc_3ktGwU8Ya4Ntjq_UX=f...@mail.gmail.com



Re: Architectures where unaligned access is (not) OK?

2014-11-24 Thread Riku Voipio
On Fri, Nov 21, 2014 at 06:01:11PM +0100, Jakub Wilk wrote:
 * Felipe Sateler fsate...@debian.org, 2014-11-21, 14:04:
 Sparc is definitely not ok. For evidence, see #721617, liblo was
 trying to fetch a double from a 4-byte aligned address. Experience
 with liblo shows that other architectures are just fine (or at
 least are just slower) with this type of unalignment.
 
 IME, sparc buildds are the only ones where unaligned access in
 a build-time test suite triggers SIGBUS/SIGSEGV.

On armel and armhf you can enable SIGBUS on unaligned access with:

echo 4 > /proc/cpu/alignment

Riku


-- 
Archive: https://lists.debian.org/20141124090723.ga20...@afflict.kos.to



Re: Architectures where unaligned access is (not) OK?

2014-11-24 Thread Riku Voipio
On Sat, Nov 22, 2014 at 11:03:16PM +0100, Bastien ROUCARIES wrote:
 On Sat, Nov 22, 2014 at 10:37 PM, brian m. carlson
 sand...@crustytoothpaste.net wrote:
  On Sat, Nov 22, 2014 at 09:42:43PM +0100, Jakub Wilk wrote:
  * Henrique de Moraes Holschuh h...@debian.org, 2014-11-21, 17:34:
  i386:
 __asm__("pushf\norl $0x40000,(%esp)\npopf");
 
  It works! Actually, it works so well it makes puts("hello world") die with
  SIGBUS. :-(
 
  Yeah, that's my experience, too.  glibc is not alignment check-safe on
  i386 and amd64.  If you turn it on in an LD_PRELOAD using _init, it
  segfaults before main.
 
 I have improved https://wiki.debian.org/ArchitectureSpecificsMemo; feel
 free to add to it.

Thanks for updates. However, I'm not sure if the current unaligned
accesses letters make it clear:

Y=Yes, O=Often generally ok but ma fail in some specific case, T=Traps
may be fixed by kernel (super slow),M=Maybe generally not ok, N=No raise
SIGBUS. See detail below

Now, the M/T for armel/armhf/arm64 is ambiguous and misleading.
Better to make it clear:

armel: no
armhf: yes (exceptions as listed by Leif)
arm64: yes (exceptions as listed by Leif)

And then link to Leif's mail rather than copy the text unattributed on
an already long wiki page.

Riku


-- 
Archive: https://lists.debian.org/20141124093825.gb20...@afflict.kos.to



Re: Architectures where unaligned access is (not) OK?

2014-11-24 Thread Riku Voipio
On Fri, Nov 21, 2014 at 02:40:01PM +0000, Wookey wrote:
 Whilst that is nice and correct I'm not sure the meaning is clear to a
 software engineer, which is: Don't do unaligned access on arm64. It's
 always inefficient, sometimes extremely inefficient, and sometimes
 won't work at all.

That's an exaggeration.

libav/ffmpeg for example has:

  fast_unaligned_if_any="aarch64 ppc x86"
  armv[6-8]*) enable fast_clz fast_unaligned ;;

So the developers consider unaligned accesses a faster option than
aligned on armv6+. Feel free to benchmark with and without the flag :)

I can understand the pain CPU/cache designers face in implementing unaligned
access, but the reality is that ARMv7+ supports unaligned accesses quite well,
and this sometimes allows software engineers to write cleaner and more
maintainable code.

This doesn't mean that unaligned accesses always make sense. There are plenty
of unaligned accesses that could be converted to cleaner code that does
aligned accesses...
 
 arm64 simply doesn't permit unaligned access in the core. Some types
 of access are wrapped (in hardware) so that 2 access will get done and
 you are returned the relevant pieces of the result, but this doesn't
 apply to all access types and is obviously slower/energy
 inefficient.

If your data is split across alignment boundaries, you will need multiple
memory accesses anyway.

Riku


-- 
Archive: https://lists.debian.org/20141124103056.gc20...@afflict.kos.to



Re: Architectures where unaligned access is (not) OK?

2014-11-24 Thread Bastien ROUCARIES
On Mon, Nov 24, 2014 at 10:38 AM, Riku Voipio riku.voi...@iki.fi wrote:
 On Sat, Nov 22, 2014 at 11:03:16PM +0100, Bastien ROUCARIES wrote:
 On Sat, Nov 22, 2014 at 10:37 PM, brian m. carlson
 sand...@crustytoothpaste.net wrote:
  On Sat, Nov 22, 2014 at 09:42:43PM +0100, Jakub Wilk wrote:
  * Henrique de Moraes Holschuh h...@debian.org, 2014-11-21, 17:34:
  i386:
 __asm__("pushf\norl $0x40000,(%esp)\npopf");
 
  It works! Actually, it works so well it makes puts("hello world") die with
  SIGBUS. :-(
 
  Yeah, that's my experience, too.  glibc is not alignment check-safe on
  i386 and amd64.  If you turn it on in an LD_PRELOAD using _init, it
  segfaults before main.

 I have improved https://wiki.debian.org/ArchitectureSpecificsMemo; feel
 free to add to it.

 Thanks for updates. However, I'm not sure if the current unaligned
 accesses letters make it clear:

 Y=Yes, O=Often generally ok but ma fail in some specific case, T=Traps
 may be fixed by kernel (super slow),M=Maybe generally not ok, N=No 
 raise
 SIGBUS. See detail below

I am open to suggestions.

 Now, the M/T for armel/armhf/arm64 is ambiguous and misleading.
 Better to make it clear:

 armel: no

No = SIGBUS? Or trap?
 armhf: yes (exceptions as listed by Leif)
So it is often like i386
 arm64: yes (exceptions as listed by Leif)
often

 And then link to Leif's mail rather than copy the text unattributed on
 an already long wiki page.

OK will add link. But I think the page will grow with fpu stuff...

 Riku


-- 
Archive: 
https://lists.debian.org/cae2spaa-sbo8j8rpnvo2ogqzcftwr4_ztardoxtytemqvac...@mail.gmail.com



Re: Architectures where unaligned access is (not) OK?

2014-11-23 Thread Simon McVittie
On 21/11/14 13:31, Bernhard R. Link wrote:
 Otherwise that memory
 might afterwards be regarded as lzo_memops_TU2_struct

lzo_memops_TU2_struct is declared with __attribute__((__may_alias__)),
so actually the right thing should be happening WRT aliasing in this case.
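
As a rough illustration of what such an attribute buys, here is a minimal sketch assuming GCC or Clang. The type and function names (unaligned_u16, load16_packed, store16_packed) are made up for this example and are not lzo's; the "packed" attribute is an addition here, used so that the type's alignment requirement drops to 1:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* GCC/Clang extension, not standard C: "packed" lowers the alignment
 * requirement of the type to 1, and "may_alias" exempts accesses made
 * through it from type-based alias analysis. */
typedef struct {
    uint16_t v;
} __attribute__((__packed__, __may_alias__)) unaligned_u16;

/* Load a native-endian 16-bit value from a possibly unaligned address. */
static uint16_t load16_packed(const void *p)
{
    assert(p != NULL);
    return ((const unaligned_u16 *)p)->v;
}

/* Store a native-endian 16-bit value to a possibly unaligned address. */
static void store16_packed(void *p, uint16_t x)
{
    assert(p != NULL);
    ((unaligned_u16 *)p)->v = x;
}
```

Because this relies on compiler extensions, a memcpy-based helper is the more portable choice where other compilers matter.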

On 21/11/14 13:21, Thorsten Glaser wrote:
 • for i386 and especially amd64, all subarchitectures supported
   by Debian/Linux jessie suffer so much from unaligned access,
   speed-wise, that it’s worth the overhead of forcing aligned
   access (i386, i486 maybe were not as badly affected)

I was hoping this statement was correct, because if it was, avoiding
unaligned accesses would be a clear win regardless, and the right thing
to do would be entirely uncontroversial.

Unfortunately, on my x86-64 laptop, my patched liblzo2 with
-DLZO_CFG_NO_UNALIGNED on all architectures seems to be half as fast as
the unpatched one for a simple test-case (uncompress
linux_3.17.orig.tar.xz to linux_3.17.orig.tar in a tmpfs, time lzop -c
linux_3.17.orig.tar > /dev/null, repeat 3 times; results agree within 10%).

I'm trying out a slightly different approach: keeping the unaligned
accesses via casts like *(uint16_t *) on architectures where lzodefs.h
specifically allows them, but disabling the casts via
struct { char[n] } conditional on alignof(that struct) == 1, which seem
to be the problematic ones.

The CPUs for which lzodefs.h uses those casts are amd64, arm*
conditional on target CPU (so armel but not armhf in Debian terms),
arm64, cris, i386, m68k conditional on target CPU (__mc68020__ but not
__mcoldfire__), powerpc* if big-endian, and s390*.

S


-- 
Archive: https://lists.debian.org/54721f74.7010...@debian.org



Re: Architectures where unaligned access is (not) OK?

2014-11-23 Thread Simon McVittie
On 23/11/14 17:55, Simon McVittie wrote:
 Unfortunately, on my x86-64 laptop, my patched liblzo2 with
 -DLZO_CFG_NO_UNALIGNED on all architectures seems to be half as fast as
 the unpatched one
[...]
 I'm trying out a slightly different approach: keeping the unaligned
 accesses via casts like *(uint16_t *) on architectures where lzodefs.h
 specifically allows them, but disabling the casts via
 struct { char[n] } conditional on alignof(that struct) == 1, which seem
 to be the problematic ones.

That fixed the performance regression on amd64 while still working
correctly on armv5tel, so I've uploaded it as a DELAYED/7 NMU. See
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=757037 for nmudiff.

If anyone has better ideas, I'm happy to cancel the delayed upload and
let someone take over fixing the bug.

S


-- 
Archive: https://lists.debian.org/54725b89.5020...@debian.org



Re: Architectures where unaligned access is (not) OK?

2014-11-23 Thread Julian Taylor
On 23.11.2014 23:11, Simon McVittie wrote:
 On 23/11/14 17:55, Simon McVittie wrote:
 Unfortunately, on my x86-64 laptop, my patched liblzo2 with
 -DLZO_CFG_NO_UNALIGNED on all architectures seems to be half as fast as
 the unpatched one
 [...]
 I'm trying out a slightly different approach: keeping the unaligned
 accesses via casts like *(uint16_t *) on architectures where lzodefs.h
 specifically allows them, but disabling the casts via
 struct { char[n] } conditional on alignof(that struct) == 1, which seem
 to be the problematic ones.
 
 That fixed the performance regression on amd64 while still working
 correctly on armv5tel, so I've uploaded it as a DELAYED/7 NMU. See
 https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=757037 for nmudiff.
 
 If anyone has better ideas, I'm happy to cancel the delayed upload and
 let someone take over fixing the bug.
 


What works well is simply replacing the offending memory loads with a
memcpy call. As the size of the memcpy call is constant, the compiler
will take care of emitting code appropriate for the platform.
It doesn't help speed up the code on the trapping arches, but at least
it does not penalize the ones where unaligned access is allowed.


-- 
Archive: https://lists.debian.org/54725fea.8050...@googlemail.com



Re: Architectures where unaligned access is (not) OK?

2014-11-23 Thread Simon McVittie
On 23/11/14 22:30, Julian Taylor wrote:
 what works well is just replacing the offending memory loads with the
 memcpy call. As the size of the memcpy call is constant the compiler
 will take care of emitting code appropriate for the platform.

Ah, even better; the timing on x86-64 comes out the same. I'll cancel
the NMU and retry.

S


-- 
Archive: https://lists.debian.org/547262e4.40...@debian.org



Re: Architectures where unaligned access is (not) OK?

2014-11-23 Thread Ben Hutchings
On Sun, 2014-11-23 at 23:30 +0100, Julian Taylor wrote:
 On 23.11.2014 23:11, Simon McVittie wrote:
  On 23/11/14 17:55, Simon McVittie wrote:
  Unfortunately, on my x86-64 laptop, my patched liblzo2 with
  -DLZO_CFG_NO_UNALIGNED on all architectures seems to be half as fast as
  the unpatched one
  [...]
  I'm trying out a slightly different approach: keeping the unaligned
  accesses via casts like *(uint16_t *) on architectures where lzodefs.h
  specifically allows them, but disabling the casts via
  struct { char[n] } conditional on alignof(that struct) == 1, which seem
  to be the problematic ones.
  
  That fixed the performance regression on amd64 while still working
  correctly on armv5tel, so I've uploaded it as a DELAYED/7 NMU. See
  https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=757037 for nmudiff.
  
  If anyone has better ideas, I'm happy to cancel the delayed upload and
  let someone take over fixing the bug.
  
 
 
 what works well is just replacing the offending memory loads with the
 memcpy call.
[...]

That is not necessarily true, e.g. in this function

void copy_foo(struct foo *dst, const struct foo *src)
{
    memcpy(dst, src, sizeof(*dst));
}

the compiler is still allowed to assume that src has the proper
alignment for struct foo and to optimise the memcpy() accordingly.  And
yes, this is something that gcc really does.

Pointers to an unaligned instance of a structure generally need to be
declared as void *, char * or unsigned char * (or const-qualified
versions).
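
A minimal sketch of that advice, with a hypothetical struct foo and a made-up helper name (read_foo); the source is declared as a byte pointer, so nothing about struct foo's alignment is promised to the compiler:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

struct foo {            /* hypothetical record, for illustration only */
    uint32_t id;
    uint16_t flags;
};

/* src may point anywhere inside a byte buffer.  Because it is declared
 * as const unsigned char *, the memcpy() cannot be optimised on the
 * assumption that src is aligned for struct foo. */
static void read_foo(struct foo *dst, const unsigned char *src)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, sizeof(*dst));
}
```

The destination is still a properly aligned struct foo object, so only the source side needs the byte-pointer treatment in this sketch.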

Ben.

-- 
Ben Hutchings
Never put off till tomorrow what you can avoid all together.




Re: Architectures where unaligned access is (not) OK?

2014-11-23 Thread Simon McVittie
On 23/11/14 22:54, Ben Hutchings wrote:
 in this function
 
   void copy_foo(struct foo *dst, const struct foo *src)
   {
   memcpy(dst, src, sizeof(*dst));
   }
 
 the compiler is still allowed to assume that src has the proper
 alignment for struct foo and to optimise the memcpy() accordingly.

I don't *think* lzo relies on that; the struct assignment I mentioned in
a previous mail is part of its fallback implementation of what is
basically 1-, 2-, 4- and 8-byte memcpy. The arguments seem to be
unsigned char * in practice.

liblzo2 seems to be one of these codebases that bases its idea of how C
works on portability folklore and the assumption that the compiler and
standard C library are the most naive implementation possible :-(

S


-- 
Archive: https://lists.debian.org/5472686e.5040...@debian.org



Re: Architectures where unaligned access is (not) OK?

2014-11-22 Thread Jakub Wilk

* Henrique de Moraes Holschuh h...@debian.org, 2014-11-21, 17:34:

i386:
   __asm__("pushf\norl $0x40000,(%esp)\npopf");


It works! Actually, it works so well it makes puts("hello world") die
with SIGBUS. :-(


--
Jakub Wilk


--
Archive: https://lists.debian.org/20141122204243.ga1...@jwilk.net



Re: Architectures where unaligned access is (not) OK?

2014-11-22 Thread brian m. carlson
On Sat, Nov 22, 2014 at 09:42:43PM +0100, Jakub Wilk wrote:
 * Henrique de Moraes Holschuh h...@debian.org, 2014-11-21, 17:34:
 i386:
__asm__("pushf\norl $0x40000,(%esp)\npopf");
 
 It works! Actually, it works so well it makes puts("hello world") die with
 SIGBUS. :-(

Yeah, that's my experience, too.  glibc is not alignment check-safe on
i386 and amd64.  If you turn it on in an LD_PRELOAD using _init, it
segfaults before main.
-- 
brian m. carlson / brian with sandals: Houston, Texas, US
+1 832 623 2791 | http://www.crustytoothpaste.net/~bmc | My opinion only
OpenPGP: RSA v4 4096b: 88AC E9B2 9196 305B A994 7552 F1BA 225C 0223 B187




Re: Architectures where unaligned access is (not) OK?

2014-11-22 Thread Bastien ROUCARIES
On Sat, Nov 22, 2014 at 10:37 PM, brian m. carlson
sand...@crustytoothpaste.net wrote:
 On Sat, Nov 22, 2014 at 09:42:43PM +0100, Jakub Wilk wrote:
 * Henrique de Moraes Holschuh h...@debian.org, 2014-11-21, 17:34:
 i386:
__asm__("pushf\norl $0x40000,(%esp)\npopf");

 It works! Actually, it works so well it makes puts("hello world") die with
 SIGBUS. :-(

 Yeah, that's my experience, too.  glibc is not alignment check-safe on
 i386 and amd64.  If you turn it on in an LD_PRELOAD using _init, it
 segfaults before main.

I have improved https://wiki.debian.org/ArchitectureSpecificsMemo; feel
free to add to it.

Bastien

 brian m. carlson / brian with sandals: Houston, Texas, US
 +1 832 623 2791 | http://www.crustytoothpaste.net/~bmc | My opinion only
 OpenPGP: RSA v4 4096b: 88AC E9B2 9196 305B A994 7552 F1BA 225C 0223 B187


-- 
Archive: 
https://lists.debian.org/CAE2SPAZzY4ubyQS=L+822=hfPRn5=wu18vxed8jphiteauj...@mail.gmail.com



Re: Architectures where unaligned access is (not) OK?

2014-11-22 Thread Florian Weimer
* Simon McVittie:

 - OK: any-i386, any-amd64

SSE2 is part of amd64 and i386, and has strict alignment requirements.
This is why stack alignment bugs in the toolchain are usually fatal.
(We still support SSE2-less i386 installations, I think, but some
libraries will use SSE2 when available.)

i386 also supports alignment checking.  It used to be possible to run
quite a bit of code with that flag switched on, but nowadays, glibc
string functions use unaligned accesses in some cases because on
current implementations, they are faster than the alternatives, so
this way of debugging alignment issues no longer works.

 - not OK: armel

Many architectures take a significant performance hit.  Usually, this
is because unaligned accesses are emulated in a kernel trap (which can
be switched off to debug these performance issues, hence the
differences in system behavior).

Some older i386/amd64 implementations have relatively costly unaligned
access, too, but only on the order of a couple of cycles, not the
hundreds or thousands that kernel emulation will require.

It is very difficult to write correct C code which uses unaligned
pointers because they are an aliasing violation as well.
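
With the hardware flag no longer practical for debugging, one software alternative is an explicit check in debug builds. This is only a sketch: is_aligned is a made-up helper, and it covers alignment only, not the aliasing problem mentioned above:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Return nonzero if the address p is a multiple of align.
 * align must be a power of two.  Converting a pointer to uintptr_t is
 * implementation-defined, but yields the plain address on the
 * flat-memory platforms discussed in this thread. */
static int is_aligned(const void *p, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return ((uintptr_t)p & (align - 1)) == 0;
}
```

A caller could then write something like assert(is_aligned(ptr, sizeof(double))) before a cast-based access, so that misuse shows up on every architecture instead of only on the ones that trap.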


-- 
Archive: https://lists.debian.org/87k32mfl41@mid.deneb.enyo.de



Architectures where unaligned access is (not) OK?

2014-11-21 Thread Simon McVittie
A couple of questions for people who know low-level things:

* Of Debian's architectures (official and otherwise), which ones are
  known/defined/designed to be OK with unaligned accesses from
  user-space, and which ones (can be configured to) crash or give wrong
  answers?

* Would it be safer to assume that future architectures are in the
  "unaligned accesses are OK" category, or the "not OK"
  category?

The ones I know for sure are:

- OK: any-i386, any-amd64
- not OK: armel

I believe powerpc, s390 and arm64 might be in the "OK" category, and
mips* and sparc in the "not OK" category. I've seen conflicting
information about which category armhf is in: on #757037, Marc
Kleine-Budde said that ARMv6 and up guarantee that unaligned access is
fine, but I've also found Ubuntu bug reports about unaligned access
failures on armhf.

The context is that
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=757037 describes lzo2
failing to start up on armel due to unaligned memory accesses. lzo2 has
a cpp macro, LZO_CFG_NO_UNALIGNED which can be defined to stop it doing
clever things with casting pointers. If the maintainer doesn't object
(or fix the bug of course), I intend to NMU lzo2 to use that macro on at
least armel; I would like to sanity-check whether I should be using a
blacklist or whitelist approach, and which architectures other than
armel should be on the blacklist, or which architectures other than x86
should be on the whitelist.

Relatedly, if we have

typedef struct lzo_memops_TU2_struct {
  unsigned char a[2];
} lzo_memops_TU2;

*(lzo_memops_TU2 *) (void *) dest =
*(const lzo_memops_TU2 *) (void *) src;

is gcc within its rights to optimize that into an aligned 16-bit load
and an aligned 16-bit store, even though alignof(lzo_memops_TU2) == 1;
or should gcc be emitting pessimistic byte-by-byte code for that?

(Unfortunately this is itself a slight simplification, because this part
of lzo2 is a maze of twisty typedefs all different.)

Thanks,
S


-- 
Archive: https://lists.debian.org/546f333a.9050...@debian.org



Re: Architectures where unaligned access is (not) OK?

2014-11-21 Thread Thorsten Glaser
On Fri, 21 Nov 2014, Simon McVittie wrote:

 failing to start up on armel due to unaligned memory accesses. lzo2 has
 a cpp macro, LZO_CFG_NO_UNALIGNED which can be defined to stop it doing
 clever things with casting pointers. If the maintainer doesn't object

Please define this macro unconditionally:

• “clever” things with pointers are often Undefined Behaviour™
  and GCC is a repeat offender at optimising these into brokenness;
  LLVM also takes advantage of the C standard here

• for i386 and especially amd64, all subarchitectures supported
  by Debian/Linux jessie suffer so much from unaligned access,
  speed-wise, that it’s worth the overhead of forcing aligned
  access (i386, i486 maybe were not as badly affected)

• it’s good to have the same codepath independent of the arch
  compiled for, instead of slightly differing across arches

Thanks,
//mirabilos
-- 
Sometimes they [people] care too much: pretty printers [and syntax highligh-
ting, d.A.] mechanically produce pretty output that accentuates irrelevant
detail in the program, which is as sensible as putting all the prepositions
in English text in bold font.   -- Rob Pike in Notes on Programming in C


--
Archive: 
https://lists.debian.org/alpine.deb.2.11.1411211418570.1...@tglase.lan.tarent.de



Re: Architectures where unaligned access is (not) OK?

2014-11-21 Thread David Kalnischkies
On Fri, Nov 21, 2014 at 12:42:34PM +0000, Simon McVittie wrote:
 A couple of questions for people who know low-level things:
 
 * Of Debian's architectures (official and otherwise), which ones are
   known/defined/designed to be OK with unaligned accesses from
   user-space, and which ones (can be configured to) crash or give wrong
   answers?

Not that I would be such a person, but people who are
have created and published a memorandum at:
https://wiki.debian.org/ArchitectureSpecificsMemo

Please help enhancing it by adding lines and filling tables.
 -- Wikipage


Best regards

David Kalnischkies




Re: Architectures where unaligned access is (not) OK?

2014-11-21 Thread Bernhard R. Link
* Simon McVittie s...@debian.org [141121 13:42]:
 A couple of questions for people who know low-level things:

 * Of Debian's architectures (official and otherwise), which ones are
   known/defined/designed to be OK with unaligned accesses from
   user-space, and which ones (can be configured to) crash or give wrong
   answers?
[...]
 The ones I know for sure are:

 - OK: any-i386, any-amd64

Are you sure about those, especially amd64? AFAIK some newer
instructions (I think something about vector code) have much tighter
requirements than the old 80386 modus operandi of "everything might be
unaligned, it is just slower then".

 The context is that
 https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=757037 describes lzo2
 failing to start up on armel due to unaligned memory accesses. lzo2 has
 a cpp macro, LZO_CFG_NO_UNALIGNED which can be defined to stop it doing
 clever things with casting pointers.

'clever things with casting pointers' sounds like someone thought C
was an assembler and not a language with quite tight requirements on
what you are allowed to do. If something does 'clever things' I strongly
recommend to compile it with -O0. Optimisation of the compiler is only
supposed to keep the behaviour of conforming code, with 'clever' things
everything is possible.

 If the maintainer doesn't object
 (or fix the bug of course), I intend to NMU lzo2 to use that macro on at
 least armel; I would like to sanity-check whether I should be using a
 blacklist or whitelist approach, and which architectures other than
 armel should be on the blacklist, or which architectures other than x86
 should be on the whitelist.
 
 Relatedly, if we have
 
 typedef struct lzo_memops_TU2_struct {
   unsigned char a[2];
 } lzo_memops_TU2;
 
 *(lzo_memops_TU2 *) (void *) dest =
 *(const lzo_memops_TU2 *) (void *) src;
 
 is gcc within its rights to optimize that into an aligned 16-bit load
 and an aligned 16-bit store, even though alignof(lzo_memops_TU2) == 1;
 or should gcc be emitting pessimistic byte-by-byte code for that?

With the *(const lzo_memops_TU2 *) (void *) src you tell the compiler
that src is pointing to a memory address that is a lzo_memops_TU2_struct.
In a case where it is not, gcc is free to assume that this code path is
never intended to be executed and may replace it in any way it sees fit
to cope with other code paths (luckily replacing the code with
'system("rm -rf /");' is unlikely to happen, though that would be totally
legit for a compiler in cases where src points to anything else).
For dest things are a bit more complicated. If the alignment is wrong
gcc might replace the code with anything it wants. Otherwise that memory
might afterwards be regarded as lzo_memops_TU2_struct and accessing it
as anything else is fair game for the compiler to assume it is dead
code.

Bernhard R. Link
-- 
F8AC 04D5 0B9B 064B 3383  C3DA AFFC 96D1 151D FFDC


-- 
Archive: https://lists.debian.org/20141121133100.ga8...@client.brlink.eu



Re: Architectures where unaligned access is (not) OK?

2014-11-21 Thread Rebecca N. Palmer

https://wiki.debian.org/ArchitectureSpecificsMemo


--
Archive: https://lists.debian.org/546f3ec4.5030...@zoho.com



Re: Architectures where unaligned access is (not) OK?

2014-11-21 Thread Sam Hartman
I thought there was a flag bit you could set on x86 that causes
unaligned access to trap there too.

--Sam


-- 
Archive: 
https://lists.debian.org/0149d2a9901a-acd65aff-6bd6-41af-9767-5df9484e76b1-000...@email.amazonses.com



Re: Architectures where unaligned access is (not) OK?

2014-11-21 Thread Felipe Sateler
On Fri, 21 Nov 2014 12:42:34 +0000, Simon McVittie wrote:

 A couple of questions for people who know low-level things:
 
 * Of Debian's architectures (official and otherwise), which ones are
   known/defined/designed to be OK with unaligned accesses from
   user-space, and which ones (can be configured to) crash or give wrong
   answers?
 
 * Would it be safer to assume that future architectures are in the
   unaligned accesses are OK category, or the not OK
   category?
 
 The ones I know for sure are:
 
 - OK: any-i386, any-amd64 - not OK: armel
 
 I believe powerpc, s390 and arm64 might be in the OK category, and
 mips* and sparc in the not OK category.

Sparc is definitely not ok. For evidence, see #721617, liblo was trying 
to fetch a double from a 4-byte aligned address. Experience with liblo 
shows that other architectures are just fine (or at least are just 
slower) with this type of unalignment.
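
For the "double at a 4-byte aligned address" case, a hedged sketch of a fetch that should not trap on any of these architectures. The helper name read_double_at and its bounds check are assumptions for illustration, not liblo's actual code, and byte order is left native:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy a double out of a byte buffer at an arbitrary offset.
 * Returns 0 on success, -1 if the value would extend past the buffer.
 * The bytes are interpreted in native byte order. */
static int read_double_at(const unsigned char *msg, size_t len,
                          size_t off, double *out)
{
    assert(msg != NULL && out != NULL);
    if (off > len || len - off < sizeof(*out))
        return -1;
    memcpy(out, msg + off, sizeof(*out));
    return 0;
}
```

The cast-based equivalent, *(const double *)(msg + off), is the form that is expected to raise SIGBUS on sparc when off is only a multiple of 4.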


-- 
Saludos,
Felipe Sateler


-- 
Archive: https://lists.debian.org/m4ngps$85h$1...@ger.gmane.org



Re: Architectures where unaligned access is (not) OK?

2014-11-21 Thread Leif Lindholm
On Fri, Nov 21, 2014 at 12:42:34PM +0000, Simon McVittie wrote:
 A couple of questions for people who know low-level things:
 
 * Of Debian's architectures (official and otherwise), which ones are
   known/defined/designed to be OK with unaligned accesses from
   user-space, and which ones (can be configured to) crash or give wrong
   answers?
 
 * Would it be safer to assume that future architectures are in the
   unaligned accesses are OK category, or the not OK
   category?
 
 The ones I know for sure are:
 
 - OK: any-i386, any-amd64
 - not OK: armel
 
 I believe powerpc, s390 and arm64 might be in the OK category, and
 mips* and sparc in the not OK category. I've seen conflicting
 information about which category armhf is in: on #757037, Marc
 Kleine-Budde said that ARMv6 and up guarantee that unaligned access is
 fine, but I've also found Ubuntu bug reports about unaligned access
 failures on armhf.

So, the short answer for the ARM case is that ARMv6 and up guarantee
that the processor can be configured such that _some_ unaligned
accesses are fine (or at least just inefficient). Linux by default
enables this mode - so:
- All <=32-bit load/store operations can be unaligned.
- NEON load-stores can be unaligned (unless an explicit alignment is
  specified in the encoding).

But:
- VFP load/stores must be naturally aligned.
- Load-multiple/store-multiple and ldrd/strd do not work with 32-bit
  alignment, but can sometimes get fixed up by the kernel
  _at_a_substantial_performance_penalty_.

PUSH/POP usually requires 32-bit alignment, but then the ABI requires
64-bit alignment of the stack, so that is less of an issue.

Unaligned load-exclusive/store-exclusive operations are not supported.

No unaligned accesses are permitted to memory regions of Device or
Strongly-ordered type. But some permutation of this will be true for
most architectures, and in user-space this would only affect things
prodding /dev/mem or otherwise mmaping from a device driver.

ARM64 similarly makes it possible to trap unaligned accesses, but if
that is not enabled (which it isn't by Linux), everything apart from
load-exclusive/store-exclusive, load-acquire/store-release and Device
memory accesses will be handled by hardware.

There are additional restrictions with regards to guarantees about the
atomicity of memory accesses, but I won't go into that now since it is
a bit of a special case.

/
Leif


-- 
Archive: https://lists.debian.org/20141121140055.gk22...@bivouac.eciton.net



Re: Architectures where unaligned access is (not) OK?

2014-11-21 Thread Wookey
+++ Leif Lindholm [2014-11-21 14:00 +]:
 On Fri, Nov 21, 2014 at 12:42:34PM +, Simon McVittie wrote:
  A couple of questions for people who know low-level things:
  
  * Of Debian's architectures (official and otherwise), which ones are
known/defined/designed to be OK with unaligned accesses from
user-space, and which ones (can be configured to) crash or give wrong
answers?
  
  * Would it be safer to assume that future architectures are in the
    "unaligned accesses are OK" category, or the "not OK"
    category?

It would be safer to assume that unaligned accesses are a bad plan and
will always come with a performance hit if they work at all.

 So, the short answer for the ARM case is that ARMv6 and up guarantee
 that the processor can be configured such that _some_ unaligned
 accesses are fine (or at least just inefficient).

<snip nice complete summary>

High-level summary for software engineers: Don't do unaligned access
on armhf or armel. It's always inefficient, sometimes extremely
inefficient, and in various cases is not permitted at all.

 ARM64 similarly makes it possible to trap unaligned accesses, but if
 that is not enabled (which it isn't by Linux), everything apart from
 load-exclusive/store-exclusive, load-acquire/store-release and Device
 memory accesses will be handled by hardware.

Whilst that is nice and correct I'm not sure the meaning is clear to a
software engineer, which is: Don't do unaligned access on arm64. It's
always inefficient, sometimes extremely inefficient, and sometimes
won't work at all.

arm64 simply doesn't permit unaligned access in the core. Some types
of access are wrapped (in hardware) so that two accesses get done and
you are returned the relevant pieces of the result, but this doesn't
apply to all access types and is obviously slower and less energy
efficient. If the kernel ends up fixing up unaligned accesses that's
impressively inefficient, as on other arches.
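
One cheap way to follow that advice is to check alignment (for example
in an assert in debug builds) before casting a byte pointer to a wider
type. A small sketch, assuming C99 or later; is_aligned is a name made up
for this example:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: non-zero if p is a multiple of `alignment` bytes.
 * `alignment` must be non-zero. Assumes the usual flat conversion of
 * pointers to uintptr_t, which holds on the architectures discussed here. */
int is_aligned(const void *p, size_t alignment)
{
    return ((uintptr_t)p % alignment) == 0;
}
```

Typical use would be something like assert(is_aligned(p, sizeof(uint32_t)))
just before a cast to uint32_t *.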

Wookey
-- 
Principal hats:  Linaro, Debian, Wookware, ARM
http://wookware.org/


-- 
Archive: https://lists.debian.org/20141121144001.gj27...@stoneboat.aleph1.co.uk



Re: Architectures where unaligned access is (not) OK?

2014-11-21 Thread Jakub Wilk

* Felipe Sateler fsate...@debian.org, 2014-11-21, 14:04:
Sparc is definitely not ok. For evidence, see #721617, liblo was trying 
to fetch a double from a 4-byte aligned address. Experience with liblo 
shows that other architectures are just fine (or at least are just 
slower) with this type of unalignment.


IME, sparc buildds are the only ones where unaligned access in a 
build-time test suite triggers SIGBUS/SIGSEGV.


Now that sparc is no longer a release architecture, these kinds of 
problems are likely to go unnoticed. :-(


--
Jakub Wilk


--
Archive: https://lists.debian.org/20141121170111.ga4...@jwilk.net



Re: Architectures where unaligned access is (not) OK?

2014-11-21 Thread Bastien ROUCARIES
On Fri, Nov 21, 2014 at 6:01 PM, Jakub Wilk jw...@debian.org wrote:
 * Felipe Sateler fsate...@debian.org, 2014-11-21, 14:04:

 Sparc is definitely not ok. For evidence, see #721617, liblo was trying to
 fetch a double from a 4-byte aligned address. Experience with liblo shows
 that other architectures are just fine (or at least are just slower) with
 this type of unalignment.


 IME, sparc buildds are the only ones where unaligned access in a
 build-time test suite triggers SIGBUS/SIGSEGV.

 Now that sparc is no longer a release architecture, these kinds of problems
 are likely to go unnoticed. :-(

Sparc was also really good at detecting FPU errors, so dropping it is a
loss for scientific software.

s390 (without x) was also good at detecting implicit casts between
ptrdiff_t and size_t (see for instance http://savannah.gnu.org/bugs/?36883).

Bastien

 --
 Jakub Wilk





-- 
Archive: 
https://lists.debian.org/cae2spaab14vbouzndyrnqnp7hlpfpruxdcrkvbxnt_fr2yn...@mail.gmail.com



Re: Architectures where unaligned access is (not) OK?

2014-11-21 Thread Julian Taylor
On 21.11.2014 18:01, Jakub Wilk wrote:
 * Felipe Sateler fsate...@debian.org, 2014-11-21, 14:04:
 Sparc is definitely not ok. For evidence, see #721617, liblo was
 trying to fetch a double from a 4-byte aligned address. Experience
 with liblo shows that other architectures are just fine (or at least
 are just slower) with this type of unalignment.
 
 IME, sparc buildds are the only ones where unaligned access in a
 build-time test suite triggers SIGBUS/SIGSEGV.
 
 Now that sparc is no longer a release architecture, these kinds of problems
 are likely to go unnoticed. :-(
 

The not-yet-released GCC 5.0 has -fsanitize=alignment, which can report
unaligned accesses on all arches. It can be quite useful if you care
about this.
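
As a rough sketch of what that option catches (the function names and the
file name are made up for illustration): built with something like
"gcc -fsanitize=alignment -g demo.c", the first function below should
produce a runtime diagnostic when called with a misaligned pointer, while
the second is fine for any address.

```c
#include <stdint.h>
#include <string.h>

/* Undefined behaviour when p is not suitably aligned for uint32_t; this
 * is the kind of access the alignment sanitizer is meant to report. */
uint32_t read_u32_cast(const unsigned char *p)
{
    return *(const uint32_t *)p;
}

/* Alignment-safe variant: copy the bytes into a properly aligned object
 * (native representation and endianness). */
uint32_t read_u32_copy(const unsigned char *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}
```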


-- 
Archive: https://lists.debian.org/546f82e8.7060...@googlemail.com



Re: Architectures where unaligned access is (not) OK?

2014-11-21 Thread Simon McVittie
On 21/11/14 13:21, Thorsten Glaser wrote:
 On Fri, 21 Nov 2014, Simon McVittie wrote:
 failing to start up on armel due to unaligned memory accesses. lzo2 has
 a cpp macro, LZO_CFG_NO_UNALIGNED which can be defined to stop it doing
 clever things with casting pointers.
 
 Please define this macro unconditionally:
[followed by a bunch of reasons that look sensible]

Thanks. Based on this and the similar feedback from Bernhard and Wookey,
I'm going to propose an NMU that does that.

Thanks,
S


-- 
Archive: https://lists.debian.org/546f8448.7010...@debian.org



Re: Architectures where unaligned access is (not) OK?

2014-11-21 Thread Guillem Jover
On Fri, 2014-11-21 at 18:01:11 +0100, Jakub Wilk wrote:
 * Felipe Sateler fsate...@debian.org, 2014-11-21, 14:04:
 Sparc is definitely not ok. For evidence, see #721617, liblo was trying to
 fetch a double from a 4-byte aligned address. Experience with liblo shows
 that other architectures are just fine (or at least are just slower) with
 this type of unalignment.
 
 IME, sparc buildds are the only ones where unaligned access in a
 build-time test suite triggers SIGBUS/SIGSEGV.
 
 Now that sparc is no longer a release architecture, these kinds of problems are
 likely to go unnoticed. :-(

Generating SIGBUS should be configurable at least on ia64, hppa, powerpc
and alpha. Most of those are not available in Debian anymore. :(

See PR_SET_UNALIGN in prctl(2). At least ia64 used to have a prctl(8)
tool.
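
For reference, a minimal sketch of the prctl(2) call on Linux;
request_sigbus_on_unaligned is a made-up wrapper name. On architectures
that do not implement PR_SET_UNALIGN the call is expected to fail (with
EINVAL, as far as I know), so the return value needs checking:

```c
#include <sys/prctl.h>

/* Hypothetical wrapper: ask the kernel to deliver SIGBUS to this process
 * on unaligned user-space accesses instead of fixing them up.
 * Returns 0 on success, -1 (with errno set by prctl) otherwise. */
int request_sigbus_on_unaligned(void)
{
    return prctl(PR_SET_UNALIGN, PR_UNALIGN_SIGBUS, 0, 0, 0) == 0 ? 0 : -1;
}
```

Calling this at the start of a test suite's main() would be one way to
make unaligned accesses show up on the architectures that support it.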

Thanks,
Guillem


-- 
Archive: https://lists.debian.org/20141121183004.ga21...@gaara.hadrons.org



Re: Architectures where unaligned access is (not) OK?

2014-11-21 Thread Henrique de Moraes Holschuh
On Fri, 21 Nov 2014, Sam Hartman wrote:
 I thought there was a flag bit you could set on x86 that causes
 unaligned access to trap there too.

1. CR0.AM must be set.

2. Ask For The Pain!

i386:
__asm__("pushf\norl $0x40000,(%esp)\npopf");

x86-64:
__asm__("pushf\norl $0x40000,(%rsp)\npopf");

-- 
  One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie. -- The Silicon Valley Tarot
  Henrique Holschuh


-- 
Archive: https://lists.debian.org/20141121193440.ga14...@khazad-dum.debian.net



Re: Architectures where unaligned access is (not) OK?

2014-11-21 Thread Kurt Roeckx
On Fri, Nov 21, 2014 at 12:42:34PM +, Simon McVittie wrote:
 A couple of questions for people who know low-level things:
 
 * Of Debian's architectures (official and otherwise), which ones are
   known/defined/designed to be OK with unaligned accesses from
   user-space, and which ones (can be configured to) crash or give wrong
   answers?

This is what OpenSSL uses:
#define STRICT_ALIGNMENT 1
#ifndef PEDANTIC
# if defined(__i386)    || defined(__i386__)   || \
     defined(__x86_64)  || defined(__x86_64__) || \
     defined(_M_IX86)   || defined(_M_AMD64)   || defined(_M_X64) || \
     defined(__aarch64__)                      || \
     defined(__s390__)  || defined(__s390x__)
#  undef STRICT_ALIGNMENT
# endif
#endif

I assume that's the list of architectures they have not had issues
with so far. I think I've seen one issue or another on almost
all the architectures, even where the wiki claims it works.

man prctl says:
   PR_SET_UNALIGN
      (Only on: ia64, since Linux 2.3.48; parisc, since Linux 2.6.15;
      PowerPC, since Linux 2.6.18; Alpha, since Linux 2.6.22)



Kurt


-- 
Archive: https://lists.debian.org/20141121220349.ga13...@roeckx.be



Re: Architectures where unaligned access is (not) OK?

2014-11-21 Thread Vincent Bernat
 ❦ 21 November 2014 17:34 -0200, Henrique de Moraes Holschuh h...@debian.org :

 I thought there was a flag bit you could set on x86 that causes
 unaligned access to trap there too.

 1. CR0.AM must be set.

 2. Ask For The Pain!

 i386:
 __asm__("pushf\norl $0x40000,(%esp)\npopf");

 x86-64:
 __asm__("pushf\norl $0x40000,(%rsp)\npopf");

That's pretty interesting for unit tests. Is that flag saved and restored
on context switch, so that it would affect only a selected process?
-- 
# Okay, what on Earth is this one supposed to be used for?
2.4.0 linux/drivers/char/cp437.uni




Re: Architectures where unaligned access is (not) OK?

2014-11-21 Thread Henrique de Moraes Holschuh
On Fri, 21 Nov 2014, Vincent Bernat wrote:
  ❦ 21 November 2014 17:34 -0200, Henrique de Moraes Holschuh 
 h...@debian.org :
  I thought there was a flag bit you could set on x86 that causes
  unaligned access to trap there too.
 
  1. CR0.AM must be set.
 
  2. Ask For The Pain!
 
  i386:
  __asm__("pushf\norl $0x40000,(%esp)\npopf");
 
  x86-64:
  __asm__("pushf\norl $0x40000,(%rsp)\npopf");
 
 That's pretty interesting for unittests. Is one of those flags restored
 during context switch so that it could affect only a selected process?

I never tried it.  I suggest asking around in LKML.

Also, read about Intel SMAP first.  I have never looked into it, but since
it has an AC flag, and added instructions STAC and CLAC, it might have
overloaded or repurposed EFLAGS.AC.

Make sure the stuff in the VDSO won't object to it, as well.  It runs
without a context switch, so it would #AC.

Too bad unaligned access checks have not made it into valgrind memcheck yet
(first proposed in 2011!)...

-- 
  One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie. -- The Silicon Valley Tarot
  Henrique Holschuh


-- 
Archive: https://lists.debian.org/20141121230736.ga5...@khazad-dum.debian.net