Two questions. Currently, wherever CPUDevice/CPUDeviceTask can use an optimized kernel, it first calls the system_cpu_support_sse2() check, and only if that is unavailable does it try system_cpu_support_sse3(). I guess this is a two-part question in itself: 1) Is it wrong to assume SSE3 would be better than SSE2? If SSE3 is better, shouldn't that check come first, followed by the less ideal SSE2 (followed by the even less ideal basic impl)? 2) If a CPU supports SSE3, would it always also support SSE2, so that as-is the SSE3 implementation effectively never gets used (because SSE2 always gets picked instead)?
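To make the ordering concern concrete, here is a rough sketch, not the actual Cycles code. CpuCaps, Kernel, pick_kernel_current() and pick_kernel_proposed() are made-up names for illustration; the two bools stand in for whatever system_cpu_support_sse2() and system_cpu_support_sse3() return at runtime.

```cpp
// Hypothetical stand-in for the results of system_cpu_support_sse2() and
// system_cpu_support_sse3(); the real functions query the CPU at runtime.
struct CpuCaps {
    bool sse2;
    bool sse3;
};

// Hypothetical tag for which implementation gets picked.
enum class Kernel { Basic, SSE2, SSE3 };

// Order as described above: SSE2 is tested first, so a CPU that reports
// both SSE2 and SSE3 never reaches the SSE3 branch.
Kernel pick_kernel_current(const CpuCaps &caps)
{
    if (caps.sse2)
        return Kernel::SSE2;
    else if (caps.sse3)
        return Kernel::SSE3;
    return Kernel::Basic;
}

// Suggested order: most capable instruction set first, then fall back.
Kernel pick_kernel_proposed(const CpuCaps &caps)
{
    if (caps.sse3)
        return Kernel::SSE3;
    else if (caps.sse2)
        return Kernel::SSE2;
    return Kernel::Basic;
}
```

With both flags set, pick_kernel_current() returns Kernel::SSE2 while pick_kernel_proposed() returns Kernel::SSE3, which is exactly the difference being asked about.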
The other thing is that, since CPUDeviceTask is already OO-based, rather than checking which implementation to use each time an optimizable method is called, wouldn't it be cleaner to make CPUDeviceTask semi-abstract and create three sub-classes (e.g. BasicCPUDeviceTask, SSE2CPUDeviceTask, SSE3CPUDeviceTask), each with its own custom impl, and just have task_add() [or something] decide which one to create? Depending on how often these methods are called, skipping those per-call checks may or may not save much time, but it would seem more maintainable than having several related #ifdefs and system_cpu_support_*() calls scattered about. It might also eventually allow other implementations to be dropped in without modifying large chunks of the core CPUDeviceTask (i.e. if pluggable support for devices is ever reached/to be reached).
Also, for CPUs that support (and thus require) SSE, how hard would it be to compile the non-optimized calls (and functions) out to reduce the final executable size, since that code will never be called in these cases? For example, the final 'else' part of the 'if/else if' could be removed and an #else (on WITH_OPTIMIZED_KERNEL) used instead for the non-optimized parts.
Ok.. this makes it 3.5 questions total! =)
-Chad
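P.S. A minimal sketch of the sub-class idea, to show the shape rather than a real patch. Everything here is an assumption for illustration: the real CPUDeviceTask has a different interface, kernel_name() and make_task() are invented, and CpuCaps stands in for the results of the system_cpu_support_*() checks.

```cpp
#include <memory>
#include <string>

// Hypothetical stand-in for the runtime system_cpu_support_*() results.
struct CpuCaps {
    bool sse2;
    bool sse3;
};

// Semi-abstract base: shared task logic would live here, while the
// optimizable methods become virtual. kernel_name() is a made-up example.
class CPUDeviceTask {
public:
    virtual ~CPUDeviceTask() {}
    virtual std::string kernel_name() const = 0;
};

class BasicCPUDeviceTask : public CPUDeviceTask {
public:
    std::string kernel_name() const override { return "basic"; }
};

class SSE2CPUDeviceTask : public CPUDeviceTask {
public:
    std::string kernel_name() const override { return "sse2"; }
};

class SSE3CPUDeviceTask : public CPUDeviceTask {
public:
    std::string kernel_name() const override { return "sse3"; }
};

// The capability check happens once, at creation time (e.g. from
// task_add()), instead of inside every optimizable method.
std::unique_ptr<CPUDeviceTask> make_task(const CpuCaps &caps)
{
    if (caps.sse3)
        return std::unique_ptr<CPUDeviceTask>(new SSE3CPUDeviceTask());
    if (caps.sse2)
        return std::unique_ptr<CPUDeviceTask>(new SSE2CPUDeviceTask());
    return std::unique_ptr<CPUDeviceTask>(new BasicCPUDeviceTask());
}
```

The trade-off is one virtual call per method in exchange for dropping the repeated capability checks; whether that is a net win would depend on how hot those methods are.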
