Shaun Pinney writes:
 > Hello all,
 > 
 > Essentially, we have code which works fine on x86/PowerPC but fails on ARM
 > due to differences in how misaligned accesses are handled.  The failures
 > occur in multiple large modules developed outside of our team and we need to
 > find a solution.  The best question to sum this up is: how can we use the
 > compiler to arrive at a complete solution to quickly identify all code
 > locations which generate misaligned accesses and/or prevent the compiler
 > from generating misaligned accesses?  Thanks for any advice.  I'll go into
 > more detail below.
 > 
 > ---
 > We're using an ARM9 core (ARMv5) and notice that GCC generates misaligned
 > load instructions for certain modules in our platform.  For these modules,
 > which work correctly on x86/PowerPC, the misaligned loads cause failures.
 > This is because the ARM rounds down misaligned addresses to the correct
 > alignment, performs the memory load, and rotates the data before placing it
 > in a register.  As a result, a misaligned multi-byte load instruction on ARM
 > actually loads memory below the requested address and does not load all
 > upper bytes from "address" to "address + size - 1", so it appears to these
 > modules as incorrect data.  On x86/PowerPC, loads do provide bytes from
 > "address" to "address + size - 1" regardless of alignment, so there are no
 > problems.
 > 
 > Fixing the code manually for ARM alignment has difficulties.  Due to the
 > large code volume of these external modules, it is difficult to identify all
 > locations which may be affected by misaligned accesses so the code can be
 > rewritten.  Currently, the only way to detect these issues is to use
 > -Wcast-align and view the output to get a list of potential alignment
 > issues.  This appears to list a large number of false positives, so sorting
 > through and doing code investigation to locate true problems looks very
 > time-consuming.  On the runtime side, we've enabled alignment exceptions to
 > catch some additional cases, but the problem is that exceptions are only
 > thrown for running code.  There is always the chance there is some more
 > unexecuted 'hidden' code waiting to fail when the right circumstance occurs.
 > I'd like to provably remove the problem entirely and quickly.
 > 
 > One idea, to guarantee no load/store alignment problems will affect our
 > product, was to force the compiler to generate single-byte load/store
 > instructions in place of multi-byte load/store instructions when the
 > alignment cannot be verified by the compiler.  Examples include pointer
 > typecasts where the alignment is increased (e.g. char * to int *), accesses
 > to misaligned fields of packed data structures, accesses to structure fields
 > not allocated on the stack, etc.  Is this available?  Obviously, this will
 > add performance overhead, but would clearly resolve the issue for affected
 > modules.
 > 
 > Does the ARM compiler provide any other techniques to help with these types
 > of problems?  It'd be very helpful to find a fast and complete way to do
 > this work.  Thanks!
 > 
 > Thanks again for your advice.
 > 
 > Best regards,
 > Shaun
 > 
 > BTW - our ARM also allows us to change the behavior of multi-byte load/store
 > instructions so they read from 'address' to 'address + size - 1'.  However,
 > our OS indicates that it intentionally uses misaligned loads/stores, so
 > changing the ARM's load/store behavior to fix the module alignment problems
 > would break the OS in unknown places.  Also, because of this we cannot
 > permanently enable alignment exceptions either.  I plan to discuss this more
 > with our OS vendor.

You don't name the platform OS but the obvious solution (to me anyway) is to run
the code on ARM/Linux. On that platform you can instruct the kernel to take
various actions on alignment faults. In particular, by

    echo 5 > /proc/cpu/alignment

you tell the kernel to log misalignment traps and then kill the offending
process.

So you:

1. Run the application. It gets killed.
2. Retrieve the fault PC from the kernel message log.
3. Map it back to the application source. Fix the problem or add debugging code.
4. Repeat from step 1 until all alignment faults have been eliminated.
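
For the "fix the problem" part of step 3, one common pattern (a sketch, not
necessarily the right fix for every site) is to replace a cast-and-dereference
such as *(uint32_t *)p with a copy through memcpy, or with explicit byte
assembly when the data has a defined byte order. The helper names below are
hypothetical and only for illustration:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read a 32-bit value in host byte order from an
 * address that may be misaligned. memcpy takes a byte pointer, so the
 * compiler should not assume 4-byte alignment for the source. */
static uint32_t load_u32_unaligned(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Hypothetical helper: read a 32-bit little-endian value one byte at a
 * time. Only byte loads are used, and the result does not depend on the
 * host's endianness. */
static uint32_t load_u32_le(const unsigned char *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```

Keep the pointer typed as a byte pointer (or void *) all the way to the helper.
If it has already been cast to int * upstream, the compiler may still assume
the alignment of int.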

You can also instruct the kernel to (correctly) handle and emulate misaligned
loads/stores without killing the process. That allows you to run the code
correctly, though the fault handling will induce some performance overhead.
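
The number written to /proc/cpu/alignment is a bit mask. As described in the
kernel's ARM alignment documentation, bit 0 (1) logs a warning, bit 1 (2) fixes
up the access, and bit 2 (4) sends SIGBUS, so 5 is "warn + signal" and 2 or 3
selects the emulation mode. A small sketch that decodes a mode value (the enum
and function names are mine, not the kernel's):

```c
#include <stddef.h>
#include <stdio.h>

/* Bit values accepted by /proc/cpu/alignment on ARM Linux.
 * The identifiers are illustrative, not taken from kernel headers. */
enum {
    ALIGN_WARN   = 1,  /* log process name, pid, pc and address */
    ALIGN_FIXUP  = 2,  /* emulate the misaligned access in the kernel */
    ALIGN_SIGNAL = 4   /* send SIGBUS to the offending process */
};

/* Write a human-readable list of the actions selected by 'mode' into
 * 'out'. An empty string means none of the three bits is set. */
static void describe_alignment_mode(unsigned mode, char *out, size_t outlen)
{
    snprintf(out, outlen, "%s%s%s",
             (mode & ALIGN_WARN)   ? "warn "   : "",
             (mode & ALIGN_FIXUP)  ? "fixup "  : "",
             (mode & ALIGN_SIGNAL) ? "signal " : "");
}
```

Mode 3 (warn + fixup) is handy during porting: the code keeps running, and the
log still shows every site that needs attention.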

If you can't run Linux on your target HW then you could do the debugging in an
ARM emulator such as QEMU.
