Re: SPARC assembly trick in libcrypto breaks IBM Rational Purify

Kyle Hamilton Tue, 17 Mar 2009 02:34:35 -0700

You need to get off your high horse.

-Kyle H


On Mon, Mar 16, 2009 at 2:23 PM, Kenneth Robinette
<supp...@securenetterm.com> wrote:
>
> You need to take this discussion offline.
>
> Ken
>
>
>
> --- On Mon, 3/16/09, Allan K Pratt <apr...@us.ibm.com> wrote:
>
>> From: Allan K Pratt <apr...@us.ibm.com>
>> Subject: Re: SPARC assembly trick in libcrypto breaks IBM Rational Purify
>> To: openssl-dev@openssl.org
>> Date: Monday, March 16, 2009, 3:49 PM
>> > And the only difference would be to relax pattern recognition so
>> > that delay slot is examined for %o7-based arithmetic for all call
>> > instructions, not only call .+8 in particular. Is this correctly
>> > understood?
>>
>> Yes, you correctly understand this. But it's not as easy as that. I
>> don't need to get into Purify implementation details, but remember, if
>> the target gets pushed more than 13 bits away we need to turn the
>> call/add into sethi/or/call/add or something like that. The fact that
>> this is in a delay slot and also that the %o7 value from the call is a
>> source register complicates things even more. It would not be
>> impossible to handle this but there is the ROI to consider.
>>
>> You asked when, specifically, Purify stretches code. The short answer
>> is: anywhere we need to. Definitely at the top of a function, and at
>> every memory load or store instruction, and after function calls.
>> Beyond that, we might do insertion on any instruction at all, subject
>> to our needs.
>>
>> <sales_pitch>
>> The basic Purify insertion is on load and store instructions;
>> everything else is in support of that. Purify's whole value proposition
>> is to pinpoint memory errors like reading uninitialized memory, or
>> touching beyond the end of a block or the end of the current stack, or
>> touching memory you've already freed. In contrast, malloc-debug
>> libraries only report bad writes, and only after the fact. They spray
>> patterns into freed memory in the hopes that bad reads will cause
>> visible misbehavior in the program's future. Unlike those, Purify sees
>> both reads and writes when they happen, pinpointing the faulting
>> instruction instead of telling you "a bad thing happened sometime in
>> the past."
>> </sales_pitch>
>>
>> Best case (on SPARC) is that we insert two instructions before each
>> load or store. Worst case, we "unravel" instructions out of delay
>> slots, add more instructions to "shadow" certain types of register
>> usage, and deal with offsets that have grown too large by inserting
>> additional math.
>>
>> You asked how you can know that Purify will *not* do insertion or
>> stretch your code. That's a little tricky. If you have two non-global
>> symbols that identify data blocks, and there are no global symbols or
>> code (instructions) between them, there won't be any stretching from
>> today's Purify. But any instructions at all are subject to insertion,
>> and in some cases we insert dead space (a "red zone") before a global
>> data symbol.
>>
>> Now, back to libcrypto: while you and I have been talking, our resident
>> genius instrumentation engine guy has actually coded some modifications
>> to support the .PIC.me.up pattern as it appears in 0.9.8j. This
>> supports our current customers who use past, released versions of
>> libcrypto on SPARC. I expect this change to appear in an upcoming
>> release of PurifyPlus, but I can't commit to it or give a date because
>> I'm not authorized to commit to future product features or support in a
>> public forum.
>>
>> The new pattern recognizer is pretty specific, intending to support
>> existing customers with libcrypto binaries. It's not a general-purpose
>> recognizer for optimized interprocedural PIC sequences. It recognizes
>> patterns that stay very close to this:
>>
>>    call target
>>    mov offset,%o0
>>    ...
>>
>> target:
>>    add %o0,%o7,%o0
>>
>> The new code recognizes this when "offset" is the distance from the
>> call instruction to "target," and the "add" really is the very first
>> instruction at the call target. We'll even patch the offset if the
>> distance from the caller to the target grows past 13 bits.
>>
>> The developer also coded changes to recognize and patch the
>> self-relative offset in data from .PIC.DES_SPtrans to DES_SPtrans. I
>> don't know the details and restrictions on this one. Like I said, it's
>> really meant for customers with current libcrypto binaries.
>>
>> Regardless of any new recognizers which might appear in the future,
>> there are two Purify-safe ways to do PIC stuff on SPARC:
>>
>> Short form:
>> L1:     call8
>>         add     %o7,(target-L1),regZ
>>
>> Long form:
>>         sethi   %hi(target-L2),regX
>>         or      regX,%lo(target-L2),regY
>> L2:     call8
>>         add     regY,%o7,regZ
>>
>> The short form will work even if Purify stretches the distance farther
>> than 13 bits will reach. Purify is flexible: regX, regY, and regZ can
>> be different or they can overlap, and the call8 can happen any time
>> before the add, and you can move the o7 result of the call8 to another
>> register if you want and then use that: it doesn't have to stay in o7.
>> You can use the same call8-derived base register for multiple PIC
>> computations, but you can't use one computed address (like regZ) as the
>> base for another. These two patterns work for both 32-bit and 64-bit
>> programs.
>>
>> Regarding the patch you referred to
>> (http://cvs.openssl.org/chngview?cn=17898): I'm sorry to say Purify is
>> not as flexible as you might want. In the short form we recognize "add"
>> using %o7 after call8, but not "sub." So the patched aes_sparcv9 module
>> is *not* Purify-friendly yet. To fix this, change "sub" to "add" and
>> reverse the subtraction that computes the offset:
>>
>> BAD:
>> 1:      call    .+8
>>         sub     %o7,1b-AES_Te,%o4
>>
>> GOOD:
>> 1:      call    .+8
>>         add     %o7,AES_Te-1b,%o4
>>
>> Thanks for working with us on this. Let me know if you have more
>> thoughts or questions.
>>
>> -- Allan Pratt, apr...@us.ibm.com
>> Rational software division of IBM
>>
>> ______________________________________________________________________
>> OpenSSL Project
>> http://www.openssl.org
>> Development Mailing List
>> openssl-dev@openssl.org
>> Automated List Manager
>> majord...@openssl.org
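
The "13 bits" limit and the short/long forms discussed in the quoted message come down to simple address arithmetic, which can be sketched in Python. This is only an illustration, not Purify's or OpenSSL's code: the function names (fits_simm13, hi22, lo10, pic_address) are hypothetical, and the arithmetic is modeled modulo 2**32 as an assumption covering the 32-bit case. It relies on two facts about SPARC: an immediate operand of "add" is a signed 13-bit value (-4096 to 4095), and %hi()/%lo() select the upper 22 and lower 10 bits of a 32-bit value.

```python
# Hypothetical helpers; 32-bit wraparound is an assumption of this sketch.
MASK32 = 0xFFFFFFFF
SIMM13_MIN, SIMM13_MAX = -4096, 4095  # signed 13-bit immediate range


def fits_simm13(offset: int) -> bool:
    """True if offset can be encoded directly in an add immediate."""
    return SIMM13_MIN <= offset <= SIMM13_MAX


def hi22(value: int) -> int:
    """Upper 22 bits of a 32-bit value, as %hi() yields for sethi."""
    return (value & MASK32) >> 10


def lo10(value: int) -> int:
    """Lower 10 bits of a value, as %lo() yields for or."""
    return value & 0x3FF


def pic_address(o7: int, offset: int) -> int:
    """Address computed from the call's %o7 value plus (target - label)."""
    if fits_simm13(offset):
        # Short form: add %o7, offset, regZ
        return (o7 + offset) & MASK32
    # Long form: sethi %hi(offset), regX ; or regX, %lo(offset), regY
    #            add regY, %o7, regZ
    reg_y = (hi22(offset) << 10) | lo10(offset)
    return (reg_y + o7) & MASK32
```

For a label at 0x10000 and a target at 0x22345, pic_address(0x10000, 0x22345 - 0x10000) takes the long-form path and returns the target address. The same holds for a backward target, which is why "add %o7,AES_Te-1b" with a negative offset computes the same address as "sub %o7,1b-AES_Te".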
