Re: linux-next: x86-latest/powerpc-next merge conflict

2008-04-21 Thread Ingo Molnar

* Stephen Rothwell [EMAIL PROTECTED] wrote:

> Hi all,
>
> Today's linux-next merge of the x86-latest tree got a conflict in
> include/asm-powerpc/bitops.h between commit
> cd008c0f03f3d451e5fbd108b8e74079d402be64 (generic: implement __fls on
> all 64-bit archs) from the x86-latest tree and commit
> 9f264be6101c42cb9e471c58322fb83a5cde1461 ([POWERPC] Optimize fls64()
> on 64-bit processors) from the powerpc-next tree.  The fixup was not
> quite trivial and is worth a look to see if I got it right.

Paul, do you agree with those generic bitops changes? Just in case it's 
not obvious from previous discussions: we'll push them upstream via a 
separate pull request, not via usual x86.git changes. They originated 
from x86.git but grew into a more generic improvement for all. They sit 
in x86.git for tester convenience but are of course not pure x86 changes 
anymore.

Ingo
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


linux-next: x86-latest/powerpc-next merge conflict

2008-04-21 Thread Stephen Rothwell
Hi all,

Today's linux-next merge of the x86-latest tree got a conflict in
include/asm-powerpc/bitops.h between commit
cd008c0f03f3d451e5fbd108b8e74079d402be64 (generic: implement __fls on
all 64-bit archs) from the x86-latest tree and commit
9f264be6101c42cb9e471c58322fb83a5cde1461 ([POWERPC] Optimize fls64() on
64-bit processors) from the powerpc-next tree.  The fixup was not quite
trivial and is worth a look to see if I got it right.

-- 
Cheers,
Stephen Rothwell [EMAIL PROTECTED]
http://www.canb.auug.org.au/~sfr/



Re: linux-next: x86-latest/powerpc-next merge conflict

2008-04-21 Thread Paul Mackerras
Alexander van Heukelum writes:

> Powerpc would pick up an optimized version via this chain: generic fls64
> -> powerpc __fls --> __ilog2 --> asm (PPC_CNTLZL "%0,%1" : "=r" (lz) :
> "r" (x)).

Why wouldn't powerpc continue to use the fls64 that I have in there
now?

> However, the generic version of fls64 first tests the argument for zero.
> From your code I derive that the count-leading-zeroes instruction for
> argument zero is defined as cntlzl(0) == BITS_PER_LONG.

That is correct.  If the argument is 0 then all of the zero bits are
leading zeroes. :)
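
A minimal sketch of why this matters for fls64: if the leading-zero
count of 0 is the full word width, then 64 minus that count is already
0 for a zero argument, so no separate zero test is needed.
model_cntlzd() below is a hypothetical software stand-in for a 64-bit
count-leading-zeros instruction, not the powerpc code itself. GCC's
__builtin_clzll is deliberately avoided because its result is
undefined for 0.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical software model of a 64-bit count-leading-zeros
 * operation that returns 64 when the argument is 0. */
static int model_cntlzd(uint64_t x)
{
        int n = 0;
        uint64_t mask = UINT64_C(1) << 63;

        /* Walk down from the top bit until a set bit is found or the
         * mask has been shifted out entirely (x == 0 gives n == 64). */
        while (mask && !(x & mask)) {
                n++;
                mask >>= 1;
        }
        return n;
}

/* fls64 convention: 1-based position of the highest set bit, 0 for 0.
 * No explicit zero test: 64 - 64 == 0 falls out of the arithmetic. */
int model_fls64(uint64_t x)
{
        return 64 - model_cntlzd(x);
}

/* Small self-check of the boundary cases. */
void check_model_fls64(void)
{
        assert(model_fls64(0) == 0);
        assert(model_fls64(1) == 1);
        assert(model_fls64(UINT64_C(1) << 63) == 64);
}
```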

Regards,
Paul.


Re: linux-next: x86-latest/powerpc-next merge conflict

2008-04-21 Thread Paul Mackerras
Ingo Molnar writes:

> Paul, do you agree with those generic bitops changes? Just in case it's

Well, it looks OK, but I'm sure people are going to get confused with
fls vs. fls64 vs. __fls all being subtly different.  I'd say it's
worth putting a little file in the Documentation directory to explain
it all.
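
To make the distinction concrete, here is a minimal sketch in portable
C of the three conventions as they are generally understood: fls() and
fls64() return a 1-based bit position and yield 0 for a zero argument,
while __fls() returns a 0-based bit index and is undefined for zero.
The sketch_* names are hypothetical and the loops are illustrative
only; the real implementations use per-architecture instructions.

```c
#include <assert.h>
#include <stdint.h>

/* fls convention: 1-based position of the highest set bit in a
 * 32-bit word; returns 0 when x == 0. */
static int sketch_fls(uint32_t x)
{
        int r = 0;

        while (x) {
                r++;
                x >>= 1;
        }
        return r;
}

/* fls64 convention: same as fls, but for a 64-bit value. */
static int sketch_fls64(uint64_t x)
{
        int r = 0;

        while (x) {
                r++;
                x >>= 1;
        }
        return r;
}

/* __fls convention: 0-based index of the highest set bit in an
 * unsigned long. The caller must guarantee x != 0; this sketch
 * happens to return 0 for 0, but that value must not be relied on. */
static unsigned long sketch___fls(unsigned long x)
{
        unsigned long r = 0;

        while (x >>= 1)
                r++;
        return r;
}

/* The off-by-one between the two families, for nonzero x. */
void check_conventions(void)
{
        assert(sketch_fls(1) == 1);
        assert(sketch___fls(1UL) == 0);
        assert(sketch_fls64(UINT64_C(1) << 32) == 33);
}
```

For nonzero x that fits the type, the relation is fls(x) == __fls(x) + 1.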

> not obvious from previous discussions: we'll push them upstream via a
> separate pull request, not via usual x86.git changes. They originated
> from x86.git but grew into a more generic improvement for all. They sit
> in x86.git for tester convenience but are of course not pure x86 changes
> anymore.

I'm not sure why the "add __fls to all 64-bit architectures" change
has to be done as a single patch rather than a patch per architecture
going through the architecture maintainers.  I suppose that avoids any
problem with some maintainers not sending it upstream quickly.  I
would expect a single cross-architecture patch to go through Andrew
Morton, though.  But if Andrew wants you to handle it then I'm happy
to give you an Acked-by for it.

Regards,
Paul.


Re: linux-next: x86-latest/powerpc-next merge conflict

2008-04-21 Thread Alexander van Heukelum
On Mon, 21 Apr 2008 15:36:06 +0200, Gabriel Paubert [EMAIL PROTECTED]
said:
> On Mon, Apr 21, 2008 at 03:07:13PM +0200, Alexander van Heukelum wrote:
> > On Mon, 21 Apr 2008 22:13:06 +1000, Paul Mackerras [EMAIL PROTECTED]
> > said:
> > > Alexander van Heukelum writes:
> > > > Powerpc would pick up an optimized version via this chain: generic fls64
> > > > -> powerpc __fls --> __ilog2 --> asm (PPC_CNTLZL "%0,%1" : "=r" (lz) :
> > > > "r" (x)).
> > >
> > > Why wouldn't powerpc continue to use the fls64 that I have in there
> > > now?
> >
> > In Linus' tree that would be the generic one that uses (the 32-bit)
> > fls():
> >
> > static inline int fls64(__u64 x)
> > {
> >         __u32 h = x >> 32;
> >         if (h)
> >                 return fls(h) + 32;
> >         return fls(x);
> > }
> >
> > > > However, the generic version of fls64 first tests the argument for zero.
> > > > From your code I derive that the count-leading-zeroes instruction for
> > > > argument zero is defined as cntlzl(0) == BITS_PER_LONG.
> > >
> > > That is correct.  If the argument is 0 then all of the zero bits are
> > > leading zeroes. :)
> >
> > So... for 64-bit powerpc it makes sense to have its own implementation
> > and ignore the (improved) generic one and for 32-bit powerpc the generic
> > implementation of fls64 is fine. The current situation in linux-next
> > seems optimal to me.
>
>
> Not so sure, the optimal version of fls64 for 32 bit PPC seems to be:
>
>         cntlzw  ch,h      ; ch = fls32(h) where h = x>>32
>         cntlzw  cl,l      ; cl = fls32(l) where l = (__u32)x
>         srwi    t1,ch,5
>         neg     t1,t1     ; t1 = (h==0) ? -1 : 0
>         and     cl,t1,cl  ; cl = (h==0) ? cl : 0
>         add     result,ch,cl
>
> That's only 6 instructions without any branch, although the dependency
> chain is 5 instructions long. Good luck getting the compiler to
> generate something as compact as this.

I should not have said the magic word "optimal", I guess ;). The code
you show would fit nicely as an arch-specific optimized version of
fls64 for 32-bit powerpc in include/asm-powerpc/bitops.h.
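
For readers who do not read PowerPC assembly, here is a hedged
line-by-line rendering of that sequence in portable C. model_cntlzw()
is a hypothetical software stand-in for the cntlzw instruction
(returning 32 for 0), and sketch_fls64_32bit() is an illustrative
name. Note that the six instructions as written add up leading-zero
counts, so they yield a 64-bit leading-zero count; the final
subtraction from 64, which converts that to the fls64 convention, is
an assumption added here and is not part of the quoted sequence.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical software model of cntlzw: number of leading zero bits
 * in a 32-bit word, with model_cntlzw(0) == 32. */
static uint32_t model_cntlzw(uint32_t x)
{
        uint32_t n = 0;
        uint32_t mask = UINT32_C(1) << 31;

        while (mask && !(x & mask)) {
                n++;
                mask >>= 1;
        }
        return n;
}

/* Branch-free combination of the two 32-bit counts, mirroring the
 * quoted instruction sequence one statement per instruction. */
int sketch_fls64_32bit(uint64_t x)
{
        uint32_t h = (uint32_t)(x >> 32);
        uint32_t l = (uint32_t)x;

        uint32_t ch = model_cntlzw(h);  /* cntlzw ch,h               */
        uint32_t cl = model_cntlzw(l);  /* cntlzw cl,l               */
        uint32_t t1 = ch >> 5;          /* srwi: 1 only if ch == 32  */
        t1 = 0u - t1;                   /* neg: all ones iff h == 0  */
        cl &= t1;                       /* and: keep cl iff h == 0   */
        uint32_t lz = ch + cl;          /* add: 64-bit leading zeros */

        /* Assumed extra step: convert leading zeros to fls64. */
        return 64 - (int)lz;
}

/* Boundary cases: zero, low word only, high word only. */
void check_sketch_fls64_32bit(void)
{
        assert(sketch_fls64_32bit(0) == 0);
        assert(sketch_fls64_32bit(1) == 1);
        assert(sketch_fls64_32bit(UINT64_C(1) << 32) == 33);
}
```

The shift by 5 works because ch is at most 32, so bit 5 of ch is set
only when the high word is zero.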

Greetings,
Alexander

(who is not going to write and test a patch with
powerpc inline assembly soon. srwi?)

> Don't worry about the number of cntlzw, it's one clock on all 32 bit
> PPC processors I know, some may even be able to perform 2 or 3 cntlzw
> per clock.
>
>         Regards,
>         Gabriel
>
-- 
  Alexander van Heukelum
  [EMAIL PROTECTED]

