> Hi Ian,
> ----- Original Message -----
> > because it supports multiple device architectures, a code optimized
> > for the GPU won't run fast on the CPU.
>
> I thought you could write kernels optimised for various architectures, and
> choose the best one at run time. So each node could have one kernel for the
> GPU and one for the CPU. But an OpenCL GPU kernel will at least run on the
> CPU, even if it does so sub-optimally; and vice versa.

Yes, ideally one would write different kernels optimized for different architectures, and this is the goal of OpenCL. The main issue is that when you have a hammer everything looks like a nail, so we must be careful not to think of OpenCL as a magic bullet, but rather as a really nice tool for some situations. The most dramatic speedups will come at first from highly data-parallel algorithms which can be moved to the GPU, with slower but still accelerated CPU versions taking advantage of multiple cores.

> > Then there is the question of users having the hardware to even run
> > it, necessitating a CPU only fall-back.
>
> Do you mean two entirely separate codes? Or could we have one
> implementation that uses OpenCL, but with a CPU kernel (also written in
> OpenCL) to fall back on? That would seem ideal, since you can develop just
> one kernel to start with, and add architecture-specific kernels at a later
> time.
> Is the problem that there are no free OpenCL libraries (e.g. for use
> without a GPU)?

I do mean two entirely separate codes, at least for a while. Keep in mind that just about everything is already implemented on the CPU, with some multiprocessor support from OpenMP. So if we want to accelerate some feature, it doesn't make sense to throw away the existing code; we should just switch to OpenCL if it's available. This is especially true since OpenCL implementations are not yet ubiquitous (they are free, from NVIDIA, ATI, Intel and Apple to name a few), so we don't want to disadvantage any users who don't have one yet.

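To make the "switch to OpenCL if it's available" idea concrete, here is a rough sketch in Python (chosen only for brevity, it is not what the actual code would look like). Every name in it is hypothetical: `scale_cpu` stands in for an existing CPU code path, `scale_opencl` is just a placeholder for an accelerated one, and using the `pyopencl` package to detect OpenCL is my assumption for illustration.

```python
import importlib.util


def scale_cpu(values, factor):
    """Stand-in for the existing CPU code path, which is always usable."""
    return [v * factor for v in values]


def scale_opencl(values, factor):
    """Placeholder for an accelerated path (hypothetical).

    A real version would build and enqueue an OpenCL kernel; here it simply
    reuses the CPU result so the sketch runs on any machine.
    """
    return scale_cpu(values, factor)


def opencl_available():
    """Assumption: OpenCL counts as available if pyopencl imports
    and reports at least one platform."""
    if importlib.util.find_spec("pyopencl") is None:
        return False
    try:
        import pyopencl as cl
        return len(cl.get_platforms()) > 0
    except Exception:
        # Any driver or loader problem means we keep the CPU path.
        return False


def choose_backend(have_opencl):
    """Prefer the accelerated path, otherwise keep the existing CPU code."""
    if have_opencl:
        return "opencl", scale_opencl
    return "cpu", scale_cpu


# Pick a backend once at startup, then call it like any other function.
name, scale = choose_backend(opencl_available())
result = scale([1.0, 2.0, 3.0], 2.0)
```

The point of the sketch is only that the CPU path stays in place and is the default; the accelerated path is picked when the runtime check succeeds, so users without OpenCL lose nothing.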
In the future, if and when OpenCL is everywhere, it would make sense to just write a CPU kernel and a GPU kernel and switch between them (or whatever kind of kernel suits a CPU+GPU chip like NVIDIA's Project Denver, ATI's Fusion or Intel's Sandy Bridge). Until then we should provide a solid infrastructure for acceleration but not throw out the baby with the bath water.

> Cheers,
> Alex

--
Ian Johnson
http://enja.org
_______________________________________________
Bf-committers mailing list
[email protected]
http://lists.blender.org/mailman/listinfo/bf-committers
