On Wed, 2006-01-04 at 16:20 -0500, Michael Jennings wrote:
> On Monday, 02 January 2006, at 04:28:52 (-0700),
> Tres Melton wrote:
> > Patch 1)
> Applied.
> > Patch 2)
> I think you're correct.  Applied.
> > Patch 3)
> Applied.
> > Patch 4)
> Applied, and copyright dates have been changed. :-)

Thanks. :)

> > I have a patch that works here that checks for alignment or not and
> > then calls the existing unaligned routine or a different aligned
> > routine but that might be disruptive so I want to submit that as
> > part of a much larger set.  I've included things like cache
> > prefetching, x86 SSE2 and I have a couple of instructions left in
> > translating sse2 to sse so that x86 can use sse which should double
> > their shading speeds.
>
> Sounds good.  You're not referring to the 15/16 patch you sent, right?

The 15/16 patch that I just submitted is the last bit that can be
applied without restructuring the checks for, and macros defining, the
SIMD routines.  It 'should' work (it does here) but the optimal solution
is to guarantee that the data submitted is already aligned (so we don't
have to check like this patch does).  That can't be done in Eterm as
Eterm doesn't allocate the data storage; X allocates the storage and
that is where any more improvements need to be made.  You know how I dip
my toe before jumping in, well I just got on their mailing lists and
joined their IRC channel; it could be a while before I get it into X.

> > That is going to create a problem with the HAVE_SSE macros as it
> > will now be valid on both x86 and x86-64.  Anyway, I'm going to be
> > releasing a testing program that tests the different things and, as
> > Raster requested, some profiling code so we know what we are
> > actually gaining.
>
> If needed, both configure-time and run-time checks for MMX/SSE/SSE2
> can be added.

I have thought about making generic function pointers like
shade_ximage_15() and then point them to the appropriate C/MMX/SSE
routines at runtime but that isn't a great solution as some of the code
will not compile/assemble without hardware support.
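
A minimal sketch of the function-pointer idea described above, for comparison with the compile-time approach. Only the name shade_ximage_15 comes from this thread; the simplified signature, the _c and _simd variants, shade_init() and its have_simd flag are hypothetical stand-ins. Both variants are plain C here so the sketch builds anywhere; a real MMX/SSE variant would still need a compile-time guard, which is the drawback noted above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical signature: scale n 15-bit (5-5-5) pixels by mult/256. */
typedef void (*shade_fn)(uint16_t *px, size_t n, unsigned mult);

static void shade_ximage_15_c(uint16_t *px, size_t n, unsigned mult)
{
    assert(mult <= 256);  /* larger values would overflow a 5-bit channel */
    for (size_t i = 0; i < n; i++) {
        unsigned r = (((px[i] >> 10) & 0x1f) * mult) >> 8;
        unsigned g = (((px[i] >> 5) & 0x1f) * mult) >> 8;
        unsigned b = ((px[i] & 0x1f) * mult) >> 8;
        px[i] = (uint16_t)((r << 10) | (g << 5) | b);
    }
}

/* Stand-in for a SIMD variant; it simply forwards to the C routine. */
static void shade_ximage_15_simd(uint16_t *px, size_t n, unsigned mult)
{
    shade_ximage_15_c(px, n, mult);
}

/* Generic entry point; defaults to the portable C routine. */
static shade_fn shade_ximage_15 = shade_ximage_15_c;

/* have_simd would come from a real CPU feature probe; here it is passed in. */
static void shade_init(int have_simd)
{
    shade_ximage_15 = have_simd ? shade_ximage_15_simd : shade_ximage_15_c;
}
```

Callers would run shade_init() once at startup and then call shade_ximage_15(px, n, mult) without knowing which routine is bound.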

Compile time checks are going to be necessary and I don't really see any
reason to have them switchable at runtime other than to compare the two
(my test code will enable that when I finish it), so I think the best
solution is going to be some compile time checks and just use macros,
like I did with the aligned/unaligned functions, to maneuver the code
path to the correct code.  Ultimately I'd like to have simple calls in
the pixmap code without all the HAVE_SSE crap and then let the macros
sort it out.  To that end, I would like to add a cmod.h file (or
whatever you think it should be named) to the tree that has all the
function prototypes in it as well as all the macro logic, so that the
selection becomes completely transparent everywhere except that file.
I have included a rough draft of the macro structure for your comments.

> It would probably be better to commit and see how it goes. :)

Everything that I have submitted prior to this email I consider ready
to be committed.  The first set I think can go into stable pretty
quickly and the last one (15/16 aligned/unaligned) seems right to me but
I don't want to be responsible for hosing anyone's Eterm. :)  This is
all I can submit until the profiling/testing code is out and I get some
feedback.  I would like your thoughts on this submission but it is
nowhere near ready for a commit of any kind.

> Michael

Best Regards,
--
Tres Melton
IRC & Gentoo: RiverRat

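
The aligned/unaligned split mentioned in this message can be sketched roughly as follows. The alignment of 16 matches ETERM_ALIGNMENT in the draft macros; IS_ALIGNED, shade_aligned(), shade_unaligned() and shade_row() are hypothetical names, and the two routines only report which path was taken instead of shading anything.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ETERM_ALIGNMENT 16  /* same value as in the draft macros */

/* Hypothetical helper: true when p is a multiple of ETERM_ALIGNMENT. */
#define IS_ALIGNED(p) \
    ((((uintptr_t)(const void *)(p)) & (uintptr_t)(ETERM_ALIGNMENT - 1)) == 0)

enum { PATH_UNALIGNED = 0, PATH_ALIGNED = 1 };

/* Stand-in for a routine that may use aligned loads and stores. */
static int shade_aligned(void *data, size_t len)
{
    assert(IS_ALIGNED(data));  /* precondition of the aligned path */
    (void)len;
    return PATH_ALIGNED;
}

/* Stand-in for the existing routine that tolerates any address. */
static int shade_unaligned(void *data, size_t len)
{
    (void)data;
    (void)len;
    return PATH_UNALIGNED;
}

/* Check the pointer once per call and pick the matching routine. */
static int shade_row(void *data, size_t len)
{
    return IS_ALIGNED(data) ? shade_aligned(data, len)
                            : shade_unaligned(data, len);
}
```

The check costs one mask and compare per call, which is the overhead that guaranteed-aligned storage from X would remove.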
#include <stdio.h>

/* Architecture identifiers */
#define ETERM_ARCH_UNKNOWN      1
#define ETERM_ARCH_x86        201
#define ETERM_ARCH_x86_64     301

/* SIMD instruction set identifiers */
#define ETERM_SIMD_UNKNOWN      1
#define ETERM_SIMD_MMX        201
#define ETERM_SIMD_MMX_PLUS   202
#define ETERM_SIMD_SSE        301
#define ETERM_SIMD_SSE2       302
#define ETERM_SIMD_SSE3       303

/* Values that configure would normally set */
#define ETERM_ARCH            ETERM_ARCH_x86
#define ETERM_SIMD            ETERM_SIMD_SSE
#define ETERM_ALIGNMENT       16

int test1( void )
{
#if ( defined ETERM_ARCH ) && ( ETERM_ARCH == ETERM_ARCH_x86 )
# if   ( defined ETERM_SIMD ) && ( ETERM_SIMD == ETERM_SIMD_MMX )
    printf( "x86 MMX Routines.\n" );
# elif ( defined ETERM_SIMD ) && ( ETERM_SIMD == ETERM_SIMD_SSE )
    printf( "x86 SSE Routines.\n" );
# elif ( defined ETERM_SIMD ) && ( ETERM_SIMD == ETERM_SIMD_SSE2 )
    printf( "x86 SSE2 Routines.\n" );
# else
    printf( " C Routines.\n" );    // C routines
# endif
#elif ( defined ETERM_ARCH ) && ( ETERM_ARCH == ETERM_ARCH_x86_64 )
# if   ( defined ETERM_SIMD ) && ( ETERM_SIMD == ETERM_SIMD_SSE2 )
    printf( "x86-64 SSE2 Routines.\n" );
# else  /* Other, lesser, combinations make no sense */
    printf( " C Routines.\n" );    // C routines
# endif
#else
    printf( " C Routines.\n" );    // C routines
#endif
    return 1;
}

int main( void )
{
    return( test1());
}
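
Building on the draft above, one way the proposed cmod.h could hide the selection is to map a single generic name onto a specific routine at compile time. This is only a sketch: the ETERM_* constants are copied from the draft, while the SHADE_XIMAGE_15 macro, the shade_ximage_15_* routine names and cmod_selftest() are hypothetical, and each routine is a stub that returns the SIMD level it stands for rather than real shading code.

```c
#include <assert.h>

/* Constants copied from the draft. */
#define ETERM_ARCH_x86      201
#define ETERM_ARCH_x86_64   301
#define ETERM_SIMD_UNKNOWN    1
#define ETERM_SIMD_MMX      201
#define ETERM_SIMD_SSE      301
#define ETERM_SIMD_SSE2     302

#define ETERM_ARCH ETERM_ARCH_x86
#define ETERM_SIMD ETERM_SIMD_SSE

/* Hypothetical stubs: each returns the SIMD level it represents. */
int shade_ximage_15_c(void)    { return ETERM_SIMD_UNKNOWN; }
int shade_ximage_15_mmx(void)  { return ETERM_SIMD_MMX; }
int shade_ximage_15_sse(void)  { return ETERM_SIMD_SSE; }
int shade_ximage_15_sse2(void) { return ETERM_SIMD_SSE2; }

/* All selection logic lives here; callers see only SHADE_XIMAGE_15. */
#if ETERM_ARCH == ETERM_ARCH_x86 && ETERM_SIMD == ETERM_SIMD_MMX
# define SHADE_XIMAGE_15 shade_ximage_15_mmx
#elif ETERM_ARCH == ETERM_ARCH_x86 && ETERM_SIMD == ETERM_SIMD_SSE
# define SHADE_XIMAGE_15 shade_ximage_15_sse
#elif (ETERM_ARCH == ETERM_ARCH_x86 || ETERM_ARCH == ETERM_ARCH_x86_64) \
      && ETERM_SIMD == ETERM_SIMD_SSE2
# define SHADE_XIMAGE_15 shade_ximage_15_sse2
#else
# define SHADE_XIMAGE_15 shade_ximage_15_c
#endif

/* With the settings above, the generic name resolves to the SSE stub. */
void cmod_selftest(void)
{
    assert(SHADE_XIMAGE_15() == ETERM_SIMD_SSE);
}
```

In a real header the stubs would be prototypes for routines compiled only when the matching instruction set is available, so the pixmap code could call SHADE_XIMAGE_15 with no HAVE_SSE tests of its own.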