Quoting Jaroslav Kysela <[EMAIL PROTECTED]>:
>
> I've implemented the whole transfer and mix loop in assembly and it works
> without any drastic impact on CPU usage. I tried to optimize the assembler
> part as much as I can, but if some assembler guru want to give a glance,
> I'll appreciate it. The function is named mix_areas1() in
> alsa-lib/src/pcm/pcm_dmix.c.
>
It seems to me it would make sense to code it for MMX (to use the
saturation it offers, for example). If you go for pure 386 there's
little to win.

Did you look at the assembly generated by gcc when compiling with
optimizations? I usually make this my starting point when moving
time-critical code to assembly, and if it looks optimized enough I
leave it at that, unless I can use tricks not available to the
compiler - like, again, MMX.

I don't know how well gcc optimizes for Intel processors, but I
remember that you really had to work your ass off to beat inner loops
optimized by the Watcom compilers (BTW I heard they're coming back as
open source compilers :-). Not to mention the proprietary Intel
compilers, which can take into account things like word alignment for
data and code, cache hit / miss situations, branch prediction and all
kinds of magical stuff.

I'll take a closer look at the code when I have more time though.

--------------
Fycio (J.Sobierski)
[EMAIL PROTECTED]

_______________________________________________
Alsa-devel mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/alsa-devel
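
To illustrate the saturation point raised in the reply above, here is a minimal sketch in plain C of what a saturating mix of signed 16-bit samples could look like. It is not the code from mix_areas1(); the names sat_add16() and mix_s16_saturated() are hypothetical, and the 16-bit signed sample format is an assumption. The MMX instruction paddsw performs this same clamped addition on four 16-bit words at once, which is the kind of trick a plain 386 loop cannot use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Add two signed 16-bit samples, clamping to the int16_t range
 * instead of wrapping around on overflow. */
static int16_t sat_add16(int16_t a, int16_t b)
{
    int32_t sum = (int32_t)a + (int32_t)b;  /* widen so the sum cannot overflow */
    if (sum > INT16_MAX)
        return INT16_MAX;
    if (sum < INT16_MIN)
        return INT16_MIN;
    return (int16_t)sum;
}

/* Hypothetical helper (not the alsa-lib function): mix n samples
 * from src into dst, one sample at a time, with saturation. */
static void mix_s16_saturated(int16_t *dst, const int16_t *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < n; i++)
        dst[i] = sat_add16(dst[i], src[i]);
}
```

Compiling a loop like this with optimizations enabled and reading the generated assembly is one way to judge whether a hand-written version is worth the effort.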