Re: Core dump in RSA_check_key

Andy Polyakov Tue, 24 Jul 2012 02:21:06 -0700

>>>> I am seeing a core dump in RSA_check_key() function. The backtrace is
>>>> as below.
>> What OS? Which OpenSSL version? Is it correct assumption that it's
>> custom/own program? Can you reproduce the problem with openssl utility,
>> with 'openssl rsa -in file.pem -check -noout'? Can you reproduce it
>> with 'openssl genrsa 2048 | openssl rsa -check -noout'? Then there is big
>> question if sha1-x86_64.s:2240 is reliable. To answer that question you
>> have to run 'disassemble' at gdb prompt and proceed till you see failed
>> instruction (marked with => or something) and write it down. What's
>> 'info reg' at that point? If we are talking about this segment:
> Sorry to get back after a long while. I am seeing this on Linux with custom 
> kernel.


Question was not about kernel, but about *program* suffering from crash.
Is it openssl that suffers from crash or something you've written
yourself [or got from somebody else]? The thing is that if you can't
reproduce the problem with openssl, it's likely to be problem with your
application. You say that 'openssl rsa -in file.pem -check -noout'
works, so... Problem with your application is pretty much your
responsibility, not much help can be offered [at least I can't offer
much]...

> OpenSSL version: 1.0.0g-fips.

I'd agree with Stephen that you should try something that we actually
stand behind [or turn to party responsible for 1.0.0g-fips]. Provided
that it's unlikely to be problem with assembler code [see below], there
is chance that you'll be able to reproduce problem with pure C debug
build, one that wouldn't show "optimized out" values. So this should be
the next step, i.e. try to reproduce it with a build that would allow you
to accurately examine the complete back-trace.

> The backtrace and disassembly is in the file attached with this email.
> I am not so good at debugging in assembly. This time I am seeing the 
> corruption flowing through RSA_eay_private_decrypt().
> 
>>         je      .Ldone_ssse3
>>         movdqa  64(%r11),%xmm6
>>         movdqa  0(%r11),%xmm9
>> =>      movdqu  0(%r9),%xmm0
>>         movdqu  16(%r9),%xmm1
>>         movdqu  32(%r9),%xmm2
>>         movdqu  48(%r9),%xmm3
>>
>> then there is only one possibility: corrupted input. I mean there is no
>> room for sha1_block_data_order_ssse3 to screw input parameters, its
>> caller has to do it...
> I do not see any invalid instruction like this.

But you see

rip            0x3717e68163     0x3717e68163

0x0000003717e68147:  je     0x3717e682e0
0x0000003717e6814d:  movdqa 0x40(%r11),%xmm6
0x0000003717e68153:  movdqa (%r11),%xmm9
0x0000003717e68158:  movdqu (%r9),%xmm0
0x0000003717e6815d:  movdqu 0x10(%r9),%xmm1
*0x000003717e68163*: movdqu 0x20(%r9),%xmm2
0x0000003717e68169:  movdqu 0x30(%r9),%xmm3

Tracing %r9 and %r10 gives the following:

0x0000003717e675bc: mov    %rsi,%r9  (input vector)
0x0000003717e675bf: mov    %rdx,%r10 (length of input in blocks)
0x0000003717e675c2: shl    $0x6,%r10
0x0000003717e675c6: add    %r9,%r10
...
0x0000003717e675f0: movdqu (%r9),%xmm0
0x0000003717e675f5: movdqu 0x10(%r9),%xmm1
0x0000003717e675fb: movdqu 0x20(%r9),%xmm2
0x0000003717e67601: movdqu 0x30(%r9),%xmm3
0x0000003717e6760c: add    $0x40,%r9
...
0x0000003717e68144: cmp    %r10,%r9
0x0000003717e68147: je     0x3717e682e0
...
0x0000003717e68158:  movdqu (%r9),%xmm0
0x0000003717e6815d:  movdqu 0x10(%r9),%xmm1
0x0000003717e68163:  movdqu 0x20(%r9),%xmm2
0x0000003717e68169:  movdqu 0x30(%r9),%xmm3
0x0000003717e68174:  add    $0x40,%r9
...
0x0000003717e682ce:  jmpq   0x3717e67650

Then we look at registers

r9             0x3718197fd2     236627525586
r10            0x3718194512     236627510546
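
As a quick sanity check of those two values, here is a small Python
sketch. It is not part of the original analysis, and the helper name
overshoot is made up for illustration. It assumes 64-byte blocks, as
implied by the shl $0x6 in the trace, and computes how far %r9 sits past
%r10:

```python
# Register values copied from the 'info reg' output above.
R9 = 0x3718197FD2    # current input pointer
R10 = 0x3718194512   # end pointer, computed as input + (num << 6)
BLOCK = 64           # bytes per SHA-1 block, matching shl $0x6

def overshoot(ptr, end, block=BLOCK):
    """Return (bytes, whole blocks, leftover bytes) by which ptr is past end."""
    delta = ptr - end
    return delta, delta // block, delta % block

if __name__ == "__main__":
    nbytes, nblocks, rem = overshoot(R9, R10)
    print(f"%r9 is {nbytes} bytes past %r10 ({nblocks} blocks, remainder {rem})")
```

The distance is a whole number of blocks, which is what one would expect
if the pointer stepped past the end address 64 bytes at a time rather
than being overwritten with an arbitrary value.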

As %r9 is larger than %r10, there is only one possibility: the length
argument was 0 or "negative". It's unlikely to be 0, because the caller,
HASH_UPDATE in md32_common.h, makes sure it isn't, so it ought to be
"negative". "Negative" is in quotes, because the length is treated as
unsigned and looking at the sign is not really appropriate. Well, one can
argue that a set sign bit is definitely wrong, because there are no
processors that offer that much *physically* addressable memory, but
from a formal programming viewpoint it would be inappropriate to examine
the sign bit. Either way, the original statement that it must be the
result of corruption elsewhere holds true.
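
To illustrate why a zero or "negative" block count makes the pointer run
past the end, here is a simplified Python model of the loop structure
traced above. It is only a sketch under assumptions: the function name
blocks_until_done and the limit parameter are hypothetical, and the model
mirrors just the pointer arithmetic (shl, add, cmp/je), not the hashing
itself.

```python
MASK64 = (1 << 64) - 1   # registers are 64 bits wide and wrap around

def blocks_until_done(inp, num, limit=1000):
    """Model the pointer loop of the traced code.

    Returns the number of blocks consumed before the end pointer is hit,
    or None if it is not hit within `limit` blocks (i.e. an overrun).
    """
    # shl $0x6,%r10 ; add %r9,%r10
    end = (inp + ((num << 6) & MASK64)) & MASK64
    ptr = inp
    consumed = 0
    while consumed < limit:
        # The block at ptr is loaded before any comparison takes place,
        # then the pointer advances: add $0x40,%r9
        ptr = (ptr + 64) & MASK64
        consumed += 1
        # cmp %r10,%r9 ; je done
        if ptr == end:
            return consumed
    return None

if __name__ == "__main__":
    print(blocks_until_done(0x1000, 3))        # sane count
    print(blocks_until_done(0x1000, 0))        # zero count
    print(blocks_until_done(0x1000, MASK64))   # "negative" count (-1)
```

Because the exit test is an equality check made only after the pointer
has advanced, a count of 0 or a wrapped-around count leaves the end
address at or behind the start, and the loop keeps reading until it
faults.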

______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
Development Mailing List                       openssl-dev@openssl.org
Automated List Manager                           majord...@openssl.org
