> The bad thing that happens when Purify doesn't recognize something like 
> this is that we fail to patch offsets to account for the code stretching 
> we do. If something used to be 100 bytes away, maybe after Purify 
> insertion it's 180, and we need to patch all offsets in relative 
> computations that compute its address.

I see. Would it handle

        call    label1
        sub     %o7,.-4-label2,%??

I have such code in another module, namely aes-sparcv9.pl in HEAD. If
not, would it manage if label1 and label2 were the same?
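
To spell out the arithmetic behind the call/sub pair above, here is a
minimal Python sketch. It assumes the usual SPARC behaviour that call
leaves its own address in %o7, and that the assembler folds
(.-4)-label2 into a plain constant. The function name and the addresses
are made up for illustration.

```python
def delay_slot_sub(o7, call_link_addr, label2_link_addr):
    """Model of 'sub %o7,(.-4)-label2,%??' executed in the delay slot of a call.

    o7               -- run-time address of the call instruction (what call puts in %o7)
    call_link_addr   -- address of the call as the assembler saw it, i.e. '.-4'
    label2_link_addr -- address of label2 as the assembler saw it
    """
    imm = call_link_addr - label2_link_addr  # constant baked in at assembly time
    return o7 - imm

# Hypothetical layout: label2 at 0x1000, the call at 0x1064 (100 bytes later).
LABEL2, CALL = 0x1000, 0x1064

# Code loaded as one piece at some base: every address shifts by the same amount.
base = 0x40000
result = delay_slot_sub(base + CALL, CALL, LABEL2)
print(hex(result))  # run-time address of label2, as long as the distance is unchanged
```

The result is only right while the run-time distance between the call
and label2 equals the assembly-time one, which is exactly what is in
question here.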

> In this case, the constant in the 
> delay slot of the call to .PIC.me.up is a PC-relative offset, but we don't 
> recognize it as such as we don't change it like we should. Thus the 
> .PIC.me.up function doesn't compute its own address as intended, and the 
> wrong addresses end up in out2 and global1. Then bad things happen later. 
> (With luck, you crash;

But wouldn't a crash mean that .PIC.me.up was moved by quite a large
offset? In that case it would be impossible to correct the offset in
the above-mentioned sub %o7 anyway, because the immediate is limited to
13 bits... Is this a correct assumption/understanding? On the other
hand, if .PIC.me.up was actually moved, then wouldn't it be busted
anyway? Indeed, if label1-label2 is resolved at assembly time (and it
*is*!), it's encoded just like any other constant, and then there is no
way for you to know that, for example, "sethi
%hi(.des_and-.PIC.me.up),%??; or %??,%lo(.des_and-.PIC.me.up),%??"
needs to be corrected. But in that case not even call .+8 would do...
So... Does Purify really move the machine code around? Wouldn't it be
sufficient for you to adjust only the call instructions, so that
control is temporarily passed to some [dynamically generated]
shim/trampoline stubs between all subroutines, and leave the rest of
the machine code alone? In that case referring to %o7 would indeed
break in .PIC.me.up, because %o7 would point at the shim/trampoline
code it would have to return through, while the above-mentioned
reference to %o7 in the delay slot would work fine... Could you see if
the attached patch works?
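
The two concerns above can be sketched in Python: whether a patched
offset still fits the 13-bit signed immediate, and what happens when
the distance changes but the constant does not. The 100 and 180 byte
distances are the ones from your example; the helper name and the
addresses are hypothetical.

```python
SIMM13_MIN, SIMM13_MAX = -4096, 4095  # range of a 13-bit signed immediate

def fits_simm13(value):
    """True if value can be encoded in the simm13 field of an instruction like sub."""
    return SIMM13_MIN <= value <= SIMM13_MAX

# Distance between the label and the call grows from 100 to 180 bytes
# after instrumentation.
old_distance, new_distance = 100, 180

label_runtime = 0x50000             # hypothetical run-time address of the label
o7 = label_runtime + new_distance   # the call now sits 180 bytes past the label

stale = o7 - old_distance           # immediate left unpatched
patched = o7 - new_distance         # immediate corrected

print(fits_simm13(new_distance))    # a small stretch is still encodable
print(fits_simm13(5000))            # a large move is not
print(stale - label_runtime, patched - label_runtime)  # error in bytes for each case
```

With the stale constant the computed pointer is off by the amount of
stretching, so everything derived from it is off by the same amount.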

> without luck, you silently get bad encryptions or 
> decryptions.)
> 
> Purify definitely does treat call8 specially: we know its behavior and its 
> purpose, distinct from ordinary "call" instructions.

Is it "call .+8" in particular, or is it any call that doesn't leave a
basic block [or meets some other criterion]? Well, a call itself denotes
a basic block boundary, so it has to be some other criterion... If there
is one, that is... Either way, misunderstand me correctly, I simply want
to understand the rules of the game. The question essentially is
whether there is anything that could prevent Purify from "purifying"
some particular subroutine and keep two subroutines inseparable, or
keep two labels at the same distance, for the reasons discussed above...
Of course, if it doesn't actually move the basic blocks apart, then
this question wouldn't even be posed.

> Are you sure about 
> call8 disrupting the retl prediction stack?

Not 100%, but I'm pretty sure...

> I ask because it's such a 
> common instruction in PIC code, you'd think the SPARC guys would know it's 
> not really a call and wouldn't push to the stack when they see it.

Why? It would only make hardware more complicated...

> If the 
> retl prediction stack problem is the only objection to using call8, I'd 
> like to try to check this out. I don't know where to look for internal 
> SPARC chip details like this, though. (I'm sure somebody on the PurifyPlus 
> team does; I just don't happen to.)

I wrote the comment mostly to explain why .PIC.me.up looks the way it
does, i.e. doesn't have call .+8. I didn't mean that call .+8 is
absolutely banned, only to explain why it's not preferred.

> I didn't understand your question at the end about moving the last 
> instruction of my new sequence.

Because it wasn't about your new sequence. It was about "fixing" the old
one. I mean you said "change the delay slot to nop," and I asked if
"moving the mov instruction from the delay slot upward" would work. I
can see *now* that it was an ignorant question, but I had not the
slightest idea of how Purify works. Now I have *some*, but still... bear
with me :-) Well, not on the last question, but on the upper half of the
message...

A.
--- des_enc.m4.orig	2005-12-15 23:55:16.000000000 +0100
+++ des_enc.m4	2009-03-09 23:16:05.000000000 +0100
@@ -1181,7 +1181,7 @@
 	save	%sp, FRAME, %sp
 
 	call	.PIC.me.up
-	mov	.PIC.me.up-(.-4),out0
+	sub	%o7,(.-4)-.PIC.me.up,out0
 
 	ld	[in0], in5                ! left
 	cmp	in2, 0                    ! enc
@@ -1239,7 +1239,7 @@
 	save	%sp, FRAME, %sp
 
 	call	.PIC.me.up
-	mov	.PIC.me.up-(.-4),out0
+	sub	%o7,(.-4)-.PIC.me.up,out0
 
 	! Set sbox address 1 to 6 and rotate halfs 3 left
 	! Errors caught by destest? Yes. Still? *NO*
@@ -1354,7 +1354,7 @@
 	save	%sp, FRAME, %sp
 	
 	call	.PIC.me.up
-	mov	.PIC.me.up-(.-4),out0
+	sub	%o7,(.-4)-.PIC.me.up,out0
 
 	ld	[in0], in5                ! left
 	add	in2, 120, in4             ! ks2
@@ -1396,7 +1396,7 @@
 	save	%sp, FRAME, %sp
 	
 	call	.PIC.me.up
-	mov	.PIC.me.up-(.-4),out0
+	sub	%o7,(.-4)-.PIC.me.up,out0
 
 	ld	[in0], in5                ! left
 	add	in3, 120, in4             ! ks3
@@ -1425,13 +1425,12 @@
 .DES_decrypt3.end:
 	.size	 DES_decrypt3,.DES_decrypt3.end-DES_decrypt3
 
-! input:	out0	offset between .PIC.me.up and caller
+! input:	out0	pointer to .PIC.me.up
 ! output:	out0	pointer to .PIC.me.up
 !		out2	pointer to .des_and
 !		global1	pointer to DES_SPtrans
 	.align	32
 .PIC.me.up:
-	add	out0,%o7,out0			! pointer to .PIC.me.up
 	sethi	%hi(.des_and-.PIC.me.up),out2
 	or	out2,%lo(.des_and-.PIC.me.up),out2
 	add	out0,out2,out2
@@ -1455,7 +1454,7 @@
 	define({IVEC},   { [%sp+BIAS+ARG0+4*ARGSZ] })
 
 	call	.PIC.me.up
-	mov	.PIC.me.up-(.-4),out0
+	sub	%o7,(.-4)-.PIC.me.up,out0
 
 	cmp	in5, 0                    ! enc   
 
@@ -1677,7 +1676,7 @@
 	define({KS3}, { [%sp+BIAS+ARG0+5*ARGSZ] })
 
 	call	.PIC.me.up
-	mov	.PIC.me.up-(.-4),out0
+	sub	%o7,(.-4)-.PIC.me.up,out0
 
 	LDPTR	[%fp+BIAS+ARG0+7*ARGSZ], local3          ! enc
 	LDPTR	[%fp+BIAS+ARG0+6*ARGSZ], local4          ! ivec
