> In the file "crypto/des/asm/des_enc.m4" there is a function called 
> ".PIC.me.up" which uses a trick to get its own address. Then it uses that 
> address as the base for getting other addresses in a position-independent 
> way. The trouble is, the trick is too tricky for Purify (and possibly 
> other analysis tools that operate by decoding the instruction stream).
> 
> The trick being used is that the caller puts a constant in a register 
> before making the call. The constant is the distance from the call 
> instruction to the start of the .PIC.me.up function. The first instruction 
> of .PIC.me.up is an ADD of this value to the caller's return PC. The 
> result is the .PIC.me.up function's own address.

What exactly makes Purify go crazy here? What does the "purified"
machine code look like in this case?

> This is not a usual way for a function to get its own address on SPARC, 

Consider 'void *foo(){return foo;}'. gcc-3.3.2 -fPIC generates the
following code:

.LLGETPC0:
        retl
        add     %o7, %l7, %l7
foo:
        save    %sp, -112, %sp
        sethi   %hi(_GLOBAL_OFFSET_TABLE_-4), %l7
        call    .LLGETPC0
        add     %l7, %lo(_GLOBAL_OFFSET_TABLE_+4), %l7
        sethi   %hi(foo), %g1
        or      %g1, %lo(foo), %g1
        ld      [%l7+%g1], %g1
        mov     %g1, %i0
        ret
        restore

Even though .LLGETPC0 does not return its own address, the code is not
any different from .PIC.me.up, is it? Indeed, its input is set in the
delay slot of the call and is then added to %o7... So .PIC.me.up is not
that unusual/tricky; at least it's not unique in its class [of
position-independent subroutines]...
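
To make the comparison concrete, here is a small Python model of the
arithmetic the two helpers perform. It is only a sketch: every address in
it is made up, register windows are ignored, and the value left in %l7 by
the sethi/add pair is my reading of how the -4/+4 addends work out, not
something taken from the gcc sources.

```python
# Toy model of two SPARC "get an address relative to the call site" helpers.
# Every address below is hypothetical and chosen only for illustration.

def call(call_site):
    # SPARC 'call' writes the address of the call instruction itself to %o7.
    return call_site

def pic_me_up(o7, distance):
    # .PIC.me.up as described in the quoted text: add the caller-supplied
    # distance (set in the delay slot of the call) to %o7.
    return o7 + distance

def llgetpc0(o7, l7):
    # gcc's .LLGETPC0: add %o7, %l7, %l7 (executed in the delay slot of retl).
    return o7 + l7

PIC_ME_UP = 0x2000   # hypothetical address of .PIC.me.up
DES_CALL = 0x1F40    # hypothetical address of a call to .PIC.me.up
GOT = 0x30000        # hypothetical address of _GLOBAL_OFFSET_TABLE_
FOO_CALL = 0x1008    # hypothetical address of 'call .LLGETPC0' in foo

own_address = pic_me_up(call(DES_CALL), PIC_ME_UP - DES_CALL)

# Assumption: the sethi/add pair around the call leaves GOT - FOO_CALL in %l7.
# The -4/+4 addends compensate for the two instructions sitting one word
# before and one word after the call.
offset = GOT - FOO_CALL
l7 = (offset & ~0x3FF) + (offset & 0x3FF)   # %hi part plus %lo part
got_address = llgetpc0(call(FOO_CALL), l7)

print(hex(own_address), hex(got_address))
```

In both cases the callee adds %o7 to a value the caller prepared around
the call, which is the whole point of the comparison.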

For reference. call .+8 is not desired because it effectively corrupts
the so-called return address stack in the branch prediction unit; the
stack is used to predict the branch target for ret[l]. The penalty for
corruption would be the stack depth times the mispredicted-branch
penalty. Once again, this is for reference.
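
A rough way to see where "stack depth times penalty" comes from is the
following Python model of a return address stack. It is a deliberately
simplified sketch: the depth of 4, the event format and the function name
are all assumptions for illustration, and real predictors differ in detail.

```python
# Simplified return address stack (RAS) model; not any specific CPU.

def count_mispredictions(events, depth=4):
    """events is a list of ("call", return_addr) or ("ret", actual_target)."""
    ras = []
    misses = 0
    for kind, addr in events:
        if kind == "call":
            ras.append(addr)          # every call pushes a predicted return
            if len(ras) > depth:
                ras.pop(0)            # oldest entry falls off the bottom
        else:
            predicted = ras.pop() if ras else None
            if predicted != addr:
                misses += 1           # ret[l] went somewhere else
    return misses

# Three nested calls that return normally: pushes and pops pair up.
balanced = [("call", "A"), ("call", "B"), ("call", "C"),
            ("ret", "C"), ("ret", "B"), ("ret", "A")]

# Same nesting, but the innermost routine executes 'call .+8' to read the
# PC. That pushes an entry "X" that no ret[l] will ever consume.
with_call_plus_8 = [("call", "A"), ("call", "B"), ("call", "C"),
                    ("call", "X"),
                    ("ret", "C"), ("ret", "B"), ("ret", "A")]

print(count_mispredictions(balanced), count_mispredictions(with_call_plus_8))
```

In this model one unmatched push shifts every later prediction by one
entry, so each outstanding return mispredicts, up to the depth of the stack.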

> and Purify does not handle this kind of inter-procedural address 
> computation. We could add custom analysis logic in Purify to make this 
> work, but it would be very specific and fragile: a future change in 
> des_enc.m4 could make it break again.

But without understanding exactly what's wrong, there is no guarantee
that some future change would work either. So please, bear with me:-)

> The following two changes to the custom SPARC assembly code in des_enc.m4 
> will make Purify work.
> 
> 1. Delete the first instruction of .PIC.me.up and replace it with this 
> sequence:
>         mov     %o7,global1
>         call    .+8
>         sub     %o7,4,out0

Does Purify treat call .+8 specially? Or does it treat [add|sub] %o7,...
in the delay slot specially? In other words, how come sub works in the
delay slot here, while mov in the delay slot of calls to .PIC.me.up
doesn't? What about the gcc-generated code above?

>         mov     global1,%o7
> 
> 2. Change the delay-slot instructions of the calls to .PIC.me.up to NOP 
> instructions. Or just leave them alone: the current instructions there 
> become useless but harmless.

Would moving the instruction above the call work? A.
______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
Development Mailing List                       openssl-dev@openssl.org
Automated List Manager                           majord...@openssl.org
