Andy Polyakov via RT wrote:
        I see a very strange bug in crypto/sparcv9cap.c. OpenSSL 1.0.0d probes
SPARC capabilities by catching the SIGILL signal. On sparc64 (both Linux and
Solaris, with UltraSPARC III+ and T1 CPUs), the SIGILL handler is called
and the program terminates with SIGILL in _sparcv9_fmadd_probe:

00000001002a32d0 <_sparcv9_fmadd_probe>:
1002a32d0: 81 b0 0d 80 impdep1 108, %f0, %f0, %f0
1002a32d4: 85 b0 8d 82 impdep1 108, %f2, %f2, %f2
1002a32d8: 81 c3 e0 08 retl
1002a32dc: 81 b8 04 40 impdep2 34, %f0, %f0, %f0   <= here

        If I add a printf() to the signal handler, I see that it is called, and that
siglongjmp() works. With my printf(), my program no longer aborts with
SIGILL but with SIGBUS (?!).

Could you run 'truss -v sigaction,sigprocmask apps/openssl version' and
submit the output?

        I cannot. My server doesn't run Solaris but Linux/sparc. I saw the
same bug a long time ago on Solaris, but I no longer have any
Solaris server.

Then run 'strace -v apps/openssl version'.

I'll do it, but I have to rebuild openssl without no-asm. I have no time to rebuild and test this evening, but I shall test as soon as possible.

Modification:
static void common_handler(int sig)
{ printf("Signal handler\n"); siglongjmp(common_jmp,sig); }

        I don't understand why, with this trivial modification, my program runs
fine (and of course prints "Signal handler" on stdout).

I don't understand. First you say that it fails with SIGBUS (instead of
the expected SIGILL) and then you say that the program runs fine. Could you
kindly clarify?

        OpenSSL uses a SIGILL signal handler to check processor capabilities. Out
of the box, the OpenSSL library aborts during initialization with SIGILL.
And if you check the sources, you'll see that this SIGILL is supposed to be
caught by a signal handler.

        To debug, I have added a simple printf() in this signal handler and I
have seen that it is called when SIGILL is raised. But I obtain a new
SIGBUS signal (!).

And SIGBUS should be caught too. In other words it does not work, i.e.
program does *not* "run fine", right?

        I agree, but I don't understand why:
- with printf() in the signal handler, the signal handler is called;
- without printf() in the signal handler, SIGILL is not caught.

        I saw this bug some months ago (Dec 2010) on a SPARC T1 running
Solaris, but I can't remember how I fixed it...

But do you have the binary left? In the worst case I can disassemble it
and identify the change... Not that I really understand what's going on,
as I can't reproduce the problem on UltraSPARC-IIe and III...

        I don't have it, sorry. But I'm pretty sure that this bug is in the SPARC
assembly.

The code was verified to work on US-IIe and III on Solaris. At some
point I had the opportunity to test the signal catching even on US-T1
running Linux, but I'm not sure whether the check for fmadd was added later or
not. But either way it sounds more like the signal mask getting screwed up
[than a bug in machine code]: if you set up a SIGILL handler and mask the
signal, the program will be terminated as if the handler was not set.

Yes, but by default, userland is only 32-bit on Linux/sparc, and I don't understand why you probe for sparcv9 instructions in 32-bit mode. My OpenSSL library is built in 32-bit mode.

As for generalizing to Solaris: is it possible that what you saw on
Solaris back then was a different problem?

Maybe, but I'm not sure. The SIGILL comes from the same instruction (only in 32-bit mode). If I remember correctly, OpenSSL worked fine in 64-bit mode.

There was a Solaris-specific problem with sparcv9cap.c that was fixed last
year. It was found that libdevinfo.so was buggy, and that code was abandoned
in favour of generic SIGILL-based detection.

        Regards,

        JKB
______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
Development Mailing List                       openssl-dev@openssl.org
Automated List Manager                           majord...@openssl.org