Hi Brian, hi everybody,

I cleaned up the 3dnow stuff and removed some workarounds. We now have the
following problem: the routines

   gl_3dnow_transform_points3_general_raw
   gl_3dnow_transform_points4_general_raw
   gl_3dnow_transform_points3_3d_raw
   gl_3dnow_transform_points4_3d_raw
   gl_v16_3dnow_general_xform

might be assembled incorrectly. I can reproduce this with two binutils
packages:

  GNU assembler version 2.9.1 (i586-pc-linux-gnu), using BFD version 2.9.1
produces broken code.

  GNU assembler version 2.9.1 (i486-linux), using BFD version 2.9.1.0.25
works correctly. For those who are interested, a diff of the two objdumps
(3dnow_xform_raw4.S) is included at the end of this mail. You can see some
wrong opcodes on the left and the translation on the right.

I'm not sure what to do. I'll try to contact one of the binutils
maintainers. But since the next Mesa release is not far away, we should
think about how we can prevent wrong builds. At the very least we should
add a warning to the docs (who really reads documentation? -- perhaps it
is better placed on the download web page?). In addition, the configure
script could test the binutils version, or test whether the produced
object file is correct. A lot of the code needed for such diagnostics
already exists in src/debug_xform.c.
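
To make the "test whether the produced object file is correct" idea a bit
more concrete, here is a minimal sketch. It only compares a byte sequence
against a known-good reference; the two arrays are copied from offset 0x7b
in the objdump diff at the end of this mail. The helper name
count_mismatches is made up for illustration and is not taken from
src/debug_xform.c, and how a real test would get at the assembled bytes is
left open.

```c
#include <stddef.h>

/* Bytes at offset 0x7b of 3dnow_xform_raw4.o, copied from the objdump
 * diff in this mail: output of the working assembler (BFD 2.9.1.0.25)
 * and of the broken one (BFD 2.9.1). */
static const unsigned char good_bytes[] = {
    0x69, 0x28, 0xb4, 0x0f, 0x0f, 0xd0, 0x9e, 0x0f, 0x0f, 0x71, 0x30
};
static const unsigned char bad_bytes[] = {
    0x41, 0x28, 0xb4, 0x0f, 0x0f, 0xd0, 0x9e, 0x0f, 0x0f, 0x41, 0x30
};

/* Hypothetical helper: count the bytes that differ from the reference.
 * A result of 0 means the assembled code matches the known-good bytes. */
static size_t count_mismatches(const unsigned char *expected,
                               const unsigned char *actual, size_t len)
{
    size_t i, n = 0;

    for (i = 0; i < len; i++) {
        if (expected[i] != actual[i])
            n++;
    }
    return n;
}
```

A build-time check along these lines could refuse to enable the 3dnow code
whenever count_mismatches(good_bytes, actual, sizeof good_bytes) is not
zero for the freshly assembled bytes.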

Comments ??  Ideas ??


- Holger


btw: The new code is about 5% faster than the old code on apps with _many_
     polygons; Q3 is about 2 fps faster on an Athlon w/ Voodoo3
     (benchmarks performed by Dieter Nuetzel).
     Because of this I'd like to see it in the 3.2 release.

---------------------------------------------------------------------------
(objdump -d for both object files, then diff)

2,3c2,3
< 3dnow_xform_raw4.o.bfd-2.9.1.0.25:     file format elf32-i386
< 3dnow_xform_raw4.o.bfd-2.9.1.0.25
---
> 3dnow_xform_raw4.o.bfd-2.9.1:     file format elf32-i386
> 3dnow_xform_raw4.o.bfd-2.9.1
63c63
<   5b: 49              decl   %ecx
---
>   5b: 41              incl   %ecx
66c66
<   63: 51              pushl  %ecx
---
>   63: 41              incl   %ecx
69c69
<   6b: 59              popl   %ecx
---
>   6b: 41              incl   %ecx
72c72
<   73: 61              popa
---
>   73: 41              incl   %ecx
75,79c75,78
<   7b: 69 28 b4 0f 0f  imull  $0xd00f0fb4,(%eax),%ebp
<   80: d0
<   81: 9e              sahf
<   82: 0f 0f           (bad)
<   84: 71 30           jno    b6 <gl_3dnow_transform_points4_general_raw+0xb6>
---
>   7b: 41              incl   %ecx
>   7c: 28 b4 0f 0f d0  subb   %dh,0xf9ed00f(%edi,%ecx,1)
>   81: 9e 0f
>   83: 0f 41 30        cmovno (%eax),%esi
81,82c80,81
<   88: 0f d9 9e 0f 0f  psubusw 0x38790f0f(%esi),%mm3
<   8d: 79 38
---
>   88: 0f d9 9e 0f 0f  psubusw 0x38410f0f(%esi),%mm3
>   8d: 41 38
477c476
<  35b: 49              decl   %ecx
---
>  35b: 41              incl   %ecx
485c484
<  36b: 59              popl   %ecx
---
>  36b: 41              incl   %ecx
488c487
<  373: 61              popa
---
>  373: 41              incl   %ecx
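
One possible reading of the diff above, offered as an assumption and not as
a confirmed diagnosis: every byte that differs (49, 51, 59, 61, 69, 71, 79
in one file, always 41 in the other) would fit the ModRM byte of a 3DNow!
instruction, which this objdump does not decode. The sketch below splits a
ModRM byte into its standard mod/reg/rm fields; the function names are made
up for illustration.

```c
/* Split an x86 ModRM byte into its three fields:
 * bits 7-6 = mod, bits 5-3 = reg, bits 2-0 = rm. */
static unsigned modrm_mod(unsigned char b) { return (b >> 6) & 0x3; }
static unsigned modrm_reg(unsigned char b) { return (b >> 3) & 0x7; }
static unsigned modrm_rm(unsigned char b)  { return b & 0x7; }
```

Under that assumption, 49, 51, 59, 61, 69, 71 and 79 all have mod = 1 and
rm = 1 (an 8-bit displacement off %ecx) with reg = 1 to 7, while 41 has the
same mod and rm but reg = 0. That would mean the register field is lost in
one of the two object files.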



