Is there a document or standard (or group of standards) that defines the
collective ABIs of GNU/Linux systems using ELF binary formats across
various CPU architectures, including at least:
IA32/x86-64 (i386/i686/AMD64/EM64T, etc.)
ARM (v5, v5t, v7, etc.)
What is the policy of the GNU toolchain? Does it attempt to support a
superset of features wherever code was contributed, without directly
setting or enforcing policy itself, that being a matter for a
distribution creator to establish?
My concern comes from what look to be non-backwards-compatible changes
being made by the MeeGo distribution of Linux, specifically the
enablement of SSSE3 IA32 instructions for "general purpose code
generation". One possible motive is that a particular hardware vendor
can then claim in marketing that the platform is optimized for whatever
technology/product they are actively trying to sell that day. This
doesn't necessarily make for a good engineering choice.
While I accept that any distribution can do what it wants, given the
choice and resources I might wish to rebuild the entire open source
project around a better set of rules on this matter. I have an interest
in using the software, but as a developer I want the fewest headaches
looking into the future. I also want to be able to provide a complete,
fully native SSSE3-optimized system, while having other systems behave
in a repeatable and consistent way when binaries ultimately end up
shared across them (desktop Linux).
Does GCC support the use of newer CPU instructions for "general purpose
code generation"? If so, in what kinds of situations might they be
selected for use? It is possible I am misinterpreting the situation, if
GCC cannot actually schedule newer instructions during code generation.
So my next question is: what support is there in the various formats,
technologies and runtime libraries to provide a backwards-compatible
solution, such that when a binary from one system is put on another, any
hardware incompatibility is detected at the earliest opportunity, for
example upon execution, soon after execution, or during DSO loading:
1) ELF magic or hwcap (this would allow the kernel to fail the exec()
system call, knowing that the format is not supported by the system);
loaded DSOs would fail the same check. Ideally a hwcap system wouldn't
be a rigid bitmask but some kind of extensible ASN.1-style scheme where
anyone can register their own hierarchical domain and assign whatever
they want within it.
2) Use of a bespoke/custom dynamic linker path. I would guess any system
doing this would be free to implement an alternative ABI; mixing
binaries between systems would result in them not working, for lack of a
dynamic linker at that path.
3) Do the ABIs directly discuss or explain how such matters should be
addressed, i.e. by guarding the execution of new CPU instructions with
runtime checks? This might mean loading whole optimized DSOs instead, or
redirecting a bunch of symbols to optimized code within the same DSO, or
an inline change in the flow of execution. While I understand ABIs may
be open/loose/ambiguous to allow new technologies and ideas to exist,
when known interoperability problems appear, guidelines should exist
that provide technical answers to guide implementers so that good
citizenship may follow.
4) Use of some ELF section to describe additional runtime checking rules.
The problem stems from hit-and-miss users getting SIGILL when unguarded
IA32 instructions are executed on incompatible (older) CPUs. The kernel
doesn't provide any trap-and-emulate, so these general-purpose
applications abort, with possible data loss. Do guidelines exist within
the GNU/Linux ABI on how to be a "good citizen" and help systems
identify incompatible binaries so they simply don't run, instead of
causing a SIGILL potentially months after execution started, because
across the entire executable only a tiny handful of these instructions
were selected by the compiler, and that code path wasn't reached until
long after the executable started?
The next matter: has anyone done any studies on the performance
difference when newer instructions are enabled for "general purpose code
generation"? I'm not so interested in specialized use cases such as
codecs, compression, encryption, graphics, etc.; I consider these
specialized use cases for which many applications and libraries already
have a workable solution, by guarding the execution of the instructions
that optimize such algorithms behind runtime CPU-support checks. I'm
interested in facts on how much benefit regular code gets from this
choice.
Thanks,
Darryl