Hi Paul,

> In my experience, dcbz slows down the hot-cache case because it adds a
> few cycles to the execution time of the inner loop, and on most 64-bit
> PowerPC implementations, it doesn't actually help even in the
> cold-cache case because the store queue does enough write combining

I agree with you that on POWER the dcbz is probably not helping.

On PowerPC my experience is different.
From what I have seen, DCBZ helps enormously on the 970, PA-Semi and Cell.
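
To make the discussion concrete, here is a minimal sketch of the kind of copy loop in question. It is not the code from the patch. The names copy_with_dcbz and zero_line are hypothetical, and LINE_SIZE is an assumed 128-byte cache block; real code should query the block size (for example via AT_DCACHEBSIZE on Linux). The idea is that dcbz establishes the destination line in the cache as zeros, so the stores that follow do not need the old contents fetched from memory first. On non-PowerPC builds the helper is a no-op so the sketch still compiles.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: 128-byte cache block. If the real block is larger than this,
 * dcbz would zero bytes outside the line being copied, so query it in real code. */
#define LINE_SIZE 128u

/* Hypothetical helper: zero one destination cache block without reading memory. */
static inline void zero_line(void *p)
{
#if defined(__powerpc__) || defined(__powerpc64__)
    __asm__ volatile("dcbz 0,%0" : : "r"(p) : "memory");
#else
    (void)p; /* no-op elsewhere, so the sketch builds on any host */
#endif
}

/* Copy n bytes from src to dst. The buffers must not overlap and dst must be
 * ordinary cacheable memory, otherwise dcbz may fault or destroy source data. */
void *copy_with_dcbz(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* Head: plain copy until the destination is block-aligned. */
    size_t head = (LINE_SIZE - ((uintptr_t)d % LINE_SIZE)) % LINE_SIZE;
    if (head > n)
        head = n;
    memcpy(d, s, head);
    d += head; s += head; n -= head;

    /* Body: only blocks that are overwritten completely get a dcbz. */
    while (n >= LINE_SIZE) {
        assert((uintptr_t)d % LINE_SIZE == 0);
        zero_line(d);              /* the extra instruction in the inner loop */
        memcpy(d, s, LINE_SIZE);
        d += LINE_SIZE; s += LINE_SIZE; n -= LINE_SIZE;
    }

    /* Tail: remaining bytes, no dcbz because the block is only partly written. */
    memcpy(d, s, n);
    return dst;
}
```

The dcbz in the body loop is the extra work Paul refers to in the hot-cache case. Whether it pays off in the cold-cache case depends on whether the store queue already combines the writes for a full line.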


Cheers
Gunnar



                                                                           
Paul Mackerras <[EMAIL PROTECTED]>
24/06/2008 01:49

To:      Gunnar von Boehn/Germany/Contr/[EMAIL PROTECTED]
cc:      [EMAIL PROTECTED], Mark Nelson <[EMAIL PROTECTED]>,
         linuxppc-dev@ozlabs.org, Michael Ellerman <[EMAIL PROTECTED]>,
         [EMAIL PROTECTED], Arnd Bergmann <[EMAIL PROTECTED]>
Subject: Re: [RFC 1/3] powerpc: __copy_tofrom_user tweaked for Cell




Gunnar von Boehn writes:

> Interesting points.
> Can you help me to understand where the negative effect of DCBZ does come
> from?

In my experience, dcbz slows down the hot-cache case because it adds a
few cycles to the execution time of the inner loop, and on most 64-bit
PowerPC implementations, it doesn't actually help even in the
cold-cache case because the store queue does enough write combining
that the cache doesn't end up reading the line from memory.  I don't
know whether the Cell PPE can do that, but I could believe that it
can't.

Paul.


_______________________________________________
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev
