>>>> SIGBUS normally denotes unaligned access, but instruction in question
>>>> pulls 16-bit value and effective address is 16-bit aligned...
>
> I just tried a test .S file with
>
> ldda [%sp+0+16]%asi, %f0
> ldda [%sp+0+8]%asi, %f0
> ldda [%sp+0+4]%asi, %f0
> ldda [%sp+0+2]%asi, %f0
>
> And +4 is the first one it SIGBUS'd on. So if the alignment in
> sparcv9a-mont is increased to +8, it would also work on T1.

Yes, but sparcv9a-mont *relies* on +2, +4 and even +6. The offsets are used to pick the 16-bit words constituting a single [naturally aligned] 64-bit value, i.e. the words reside at adjacent +2n offsets [with n=0-3]. It does work on UltraSPARC-I-IV and SPARC64 V-VII. But getting bn_mul_mont_fpu working on T1 is *not* the goal, because performance would be *horrible* (1/10th or worse). The idea implemented in the updated sparcv9cap.c is to use this SIGBUS to heuristically detect T1 and to disable the FP code in favor of the pure IALU bn_mul_mont_int...

... But wait... The fact that I remember the 1/10th coefficient must mean that sparcv9a-mont did work under Solaris on T1. The question is how. Chances are that the Solaris kernel transparently fixes up the unaligned ldda access in its trap handler. Meaning that *if/when* Linux chooses to do the same, the above-mentioned heuristic test will fail to detect T1...

A.
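
P.S. For illustration, the SIGBUS heuristic amounts to roughly the following. This is only a sketch under assumptions, not the actual sparcv9cap.c code: the names probe_survives, probe_faulting and probe_clean are made up here, and the faulting probe merely raises SIGBUS itself so that the sketch builds on any POSIX system. On SPARC the probe would instead execute the 16-bit ldda at a +2 offset.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>

/* Jump target used to escape from the SIGBUS handler. */
static sigjmp_buf probe_env;

static void on_sigbus(int sig)
{
    (void)sig;
    siglongjmp(probe_env, 1);   /* abandon the probe */
}

/*
 * Run probe() with SIGBUS trapped (hypothetical helper).
 * Returns 1 if the probe completed, 0 if it raised SIGBUS,
 * -1 if the handler could not be installed.
 */
static int probe_survives(void (*probe)(void))
{
    struct sigaction act, old;
    int ok;

    assert(probe != NULL);

    act.sa_handler = on_sigbus;
    sigemptyset(&act.sa_mask);
    act.sa_flags = 0;
    if (sigaction(SIGBUS, &act, &old) != 0)
        return -1;

    /* savemask=1, so SIGBUS is unblocked again after siglongjmp */
    if (sigsetjmp(probe_env, 1) == 0) {
        probe();
        ok = 1;                 /* no trap: instruction is usable */
    } else {
        ok = 0;                 /* trapped: treat the CPU as T1-like */
    }

    sigaction(SIGBUS, &old, NULL);  /* restore previous handler */
    return ok;
}

/* Stand-ins for the real test, which needs SPARC assembly. */
static void probe_faulting(void) { raise(SIGBUS); }
static void probe_clean(void) { }
```

A caller would select bn_mul_mont_int when probe_survives() returns 0 and keep the FP path otherwise. Note the caveat above still applies: if the kernel fixes up the access in its trap handler, no SIGBUS is ever delivered and the probe reports success.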