J. Mayer wrote:
The latest patches in clo make gcc 3.4.6 fail to build the mips64
targets on my amd64 host (looks like a register allocation clash in the
optimizer code).
Your version is likely faster as well.
Furthermore, the clz micro-op for Mips seems very suspect to me,
according to
On Sat, 2007-10-27 at 12:19 +0100, Thiemo Seufer wrote:
J. Mayer wrote:
The latest patches in clo make gcc 3.4.6 fail to build the mips64
targets on my amd64 host (looks like a register allocation clash in the
optimizer code).
Your version is likely faster as well.
Furthermore,
On 10/27/07, J. Mayer [EMAIL PROTECTED] wrote:
I also got optimized versions of bit population count which could also
be shared:
static always_inline int ctpop32 (uint32_t val)
{
    int i;
    /* Clear the lowest set bit on each pass; i counts the passes. */
    for (i = 0; val != 0; i++)
        val = val & (val - 1);
    return i;
}
If you prefer, I
J. Mayer wrote:
On Sat, 2007-10-27 at 12:19 +0100, Thiemo Seufer wrote:
J. Mayer wrote:
The latest patches in clo make gcc 3.4.6 fail to build the mips64
targets on my amd64 host (looks like a register allocation clash in the
optimizer code).
Your version is likely faster as
On Sat, 2007-10-27 at 16:01 +0300, Blue Swirl wrote:
On 10/27/07, J. Mayer [EMAIL PROTECTED] wrote:
I also got optimized versions of bit population count which could also
be shared:
static always_inline int ctpop32 (uint32_t val)
{
    int i;
    for (i = 0; val != 0; i++)
The sparc64 popc works in O(lg(n)), the optimized code below works in
O(n). It could be better to generalize the sparc64 code, like this:
static always_inline int ctpop32 (uint32_t val)
{
    uint32_t i;
    i = (val & 0x55555555) + ((val >> 1) & 0x55555555);
    i = (i & 0x33333333) +
On Sat, 2007-10-27 at 15:27 +0200, Christian Eddie Dost wrote:
The sparc64 popc works in O(lg(n))
No, it has a fixed cost, whatever the operand is.
It has another advantage: it does not need any intermediate variable,
which is great when running on a CISC host in the Qemu execution
environment.
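
For reference, here is a minimal sketch of the two approaches being
compared in this thread. It is not the code from any posted patch. The
names ctpop32_loop and ctpop32_parallel are made up for illustration,
and plain "static inline" stands in for the always_inline macro. The loop
version runs once per set bit, so its cost depends on the operand. The
parallel-sum version always does five mask-and-add steps for a 32-bit
value and reuses its argument instead of a separate temporary.

```c
#include <stdint.h>

/* Loop version: each pass clears the lowest set bit,
 * so the loop body runs once per set bit in val. */
static inline int ctpop32_loop(uint32_t val)
{
    int i;

    for (i = 0; val != 0; i++)
        val &= val - 1;
    return i;
}

/* Parallel-sum version: adds neighbouring bit fields of width
 * 1, 2, 4, 8 and 16, so the cost is the same for every operand. */
static inline int ctpop32_parallel(uint32_t val)
{
    val = (val & 0x55555555u) + ((val >> 1) & 0x55555555u);
    val = (val & 0x33333333u) + ((val >> 2) & 0x33333333u);
    val = (val & 0x0f0f0f0fu) + ((val >> 4) & 0x0f0f0f0fu);
    val = (val & 0x00ff00ffu) + ((val >> 8) & 0x00ff00ffu);
    val = (val & 0x0000ffffu) + ((val >> 16) & 0x0000ffffu);
    return (int)val;
}
```

Both functions should agree on every input; which one is faster on a
given host would have to be measured.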
The latest patches in clo make gcc 3.4.6 fail to build the mips64
targets on my amd64 host (looks like a register allocation clash in the
optimizer code).
Furthermore, the clz micro-op for Mips seems very suspect to me,
according to the changes made in the clo implementation.
I did change the