Is there a document or standard (or group of standards) that defines the collective ABIs of GNU/Linux systems using the ELF binary format across various CPU architectures, including at least:
 IA32 and x86-64 (i386/i686/AMD64/EM64T, etc...)
 ARM (v5, v5t, v7, etc...)

What is the policy of the GNU toolchain? Does it simply attempt to support a superset of features wherever code has been contributed, without setting or enforcing any policy itself, that being a matter for a distribution creator to establish?



My concern comes from what look to be non-backwards-compatible changes being made by the MeeGo distribution of Linux: specifically, enabling SSSE3 IA32 instructions for "general purpose code generation". One possible motive is that a particular hardware vendor can then claim in marketing that the platform is optimized for whatever technology/product they are actively trying to sell that day. That doesn't necessarily make it a good engineering choice.

While I accept that any distribution can do what it wants, given the choice and resources I might wish to rebuild the entire open-source stack around a better set of rules on this matter, since I have an interest in using the software but, as a developer, want the fewest headaches looking into the future. I also want to be able to provide a fully native, SSSE3-optimized complete system, while still having other systems behave in a repeatable and consistent way when binaries ultimately end up shared across them (desktop Linux).



Does GCC support the use of newer CPU instructions for "general purpose code generation"? If so, in what kinds of situations might they be selected for use? It is possible I am misinterpreting the situation, if GCC cannot actually schedule newer instructions during code generation.
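
To make that concrete, here is a minimal sketch of the kind of thing I mean, assuming a reasonably recent GCC; the file name and function are hypothetical:

/* abs32.c - plain C, no intrinsics, no inline assembly.
 *
 * Compile twice and diff the assembly:
 *     gcc -O3 -S -o baseline.s abs32.c
 *     gcc -O3 -mssse3 -S -o ssse3.s abs32.c
 *
 * With -mssse3 (implied by e.g. -march=atom) the vectorizer is free to
 * use SSSE3 instructions such as PABSD for this ordinary loop, so the
 * binary may SIGILL on a pre-SSSE3 CPU even though the source never
 * asked for SSSE3 explicitly.
 */
#include <stddef.h>

void abs32(int *dst, const int *src, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        dst[i] = (src[i] < 0) ? -src[i] : src[i];
}

If GCC does select PABSD here, that would confirm that "general purpose code generation" really can pick up newer instructions for unremarkable code.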



So my next question is: what support is there in the various formats, technologies and runtime libraries to provide a backwards-compatible solution, such that when a binary from one system is put on another, any hardware incompatibility is detected at the soonest opportunity, for example upon execution, soon after execution, or during DSO loading:

1) ELF magic or hwcap (this would allow the kernel to error during the exec() system call, knowing that the format is not supported by the system). DSOs being loaded would also error under the same checking. Ideally a hwcap system would not be a rigid bitmask but some kind of extensible ASN.1-style scheme where anyone can register their own hierarchical domain and assign whatever they want within it. (See the hwcap sketch after this list.)

2) Use of a bespoke/custom dynamic linker path. I would guess any system doing this would be free to implement an alternative ABI; mixing binaries between systems would result in them simply failing to run, due to the lack of a dynamic linker at that path.

3) Do the ABIs directly discuss or explain how such matters should be addressed, i.e. by guarding the execution of new CPU instructions with runtime checks? This might mean whole optimized DSOs are loaded instead, or a set of symbols is redirected to optimized code within the same DSO, or the flow of execution is changed inline (see the dispatch sketch after this list). While I understand ABIs may be open/loose/ambiguous to allow new technologies and ideas to exist, when known interoperability problems appear, guidelines should exist that provide technical answers to guide implementers so that good citizenship may follow.

4) Use of some ELF section to describe additional runtime checking rules.
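
On point 1, here is a minimal sketch of the hwcap mechanism that exists today, assuming glibc 2.16+ for getauxval():

/* hwcap.c - print the hardware-capability bitmask the kernel passes in
 * the ELF auxiliary vector. This only *reports* capabilities: nothing
 * stops exec() from running an incompatible binary. As I understand it,
 * on IA32 the AT_HWCAP word mirrors the CPUID.1 EDX feature flags, so
 * it has no bit at all for SSSE3 (a CPUID.1 ECX feature), which
 * illustrates how a rigid bitmask runs out of room.
 */
#include <stdio.h>
#include <sys/auxv.h>

int main(void)
{
    unsigned long hwcap = getauxval(AT_HWCAP);
    printf("AT_HWCAP = %#lx\n", hwcap);
    return 0;
}

The glibc dynamic linker can also use hwcap bits to prefer optimized DSOs from capability-named subdirectories of the library search path, which is one existing answer to "whole optimized DSOs are loaded instead".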
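
On point 3, a minimal sketch of guarding by runtime checking, using a function pointer chosen once at start-up; the function names are hypothetical, and __builtin_cpu_supports() assumes GCC 4.8+. GNU IFUNC (STT_GNU_IFUNC) lets the dynamic linker perform the same per-symbol redirection at relocation time.

/* dispatch.c - pick an implementation once, before main() runs. */
#include <stdio.h>

static void convolve_generic(void) { puts("baseline path"); }

/* In real use this would live in its own file built with -mssse3. */
static void convolve_ssse3(void)   { puts("SSSE3 path"); }

static void (*convolve)(void);

__attribute__((constructor))
static void pick_convolve(void)
{
    convolve = __builtin_cpu_supports("ssse3") ? convolve_ssse3
                                               : convolve_generic;
}

int main(void)
{
    convolve();
    return 0;
}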



The problem manifests as hit-and-miss users getting SIGILL when unguarded IA32 instructions are executed on incompatible (older) CPUs. The kernel doesn't provide any trap-and-emulate, so general-purpose applications abort, resulting in possible data loss. Do guidelines exist within the GNU/Linux ABIs on how to be a "good citizen" and help systems differentiate incompatible binaries, so that they simply don't run instead of causing a SIGILL potentially months after execution started, because across the entire executable only a tiny handful of these instructions was selected by the compiler, in code that didn't run for a long time after the executable started?
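
The same building blocks could at least make a binary fail fast rather than fail late; a sketch, again assuming GCC 4.8+ for __builtin_cpu_supports(), with a hypothetical message (this translation unit must itself be built for the baseline ISA):

/* guard.c - refuse to run at start-up with a clear diagnostic, rather
 * than taking a SIGILL months later when a rarely-executed optimized
 * path is finally reached. */
#include <stdio.h>
#include <stdlib.h>

__attribute__((constructor))
static void require_ssse3(void)
{
    if (!__builtin_cpu_supports("ssse3")) {
        fprintf(stderr, "fatal: this binary requires an SSSE3-capable CPU\n");
        exit(EXIT_FAILURE);
    }
}

int main(void)
{
    puts("running on an SSSE3-capable CPU");
    return 0;
}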



The next matter: has anyone done any studies on the performance difference when newer instructions are enabled for "general purpose code generation"? I'm not so interested in specialized use cases such as codecs, compression, encryption, graphics, etc... I consider these specialized use cases for which many applications and libraries already have a workable solution, by "guarding" the execution of the instructions that optimize such algorithms with a runtime check of CPU support. I'm interested in facts about how much benefit regular code gets from this choice.

Thanks,

Darryl
