On Wed, Sep 23, 2009 at 12:20 PM, Nicolas Boulay <[email protected]> wrote:
>
> So you want some vectorised conditional move, and some "Conditional
> vector masks", like Intel for Larrabee :) (it's a kind of predicate)
> (VPU is the name of this instruction set)

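For concreteness, here is roughly what a conditional vector mask amounts to. This is only a sketch written as a Python model, not real SSE or Larrabee code. The names (lane_mask, select, vec_max_branchy, vec_max_predicated) are made up for illustration, and lanes are assumed to be unsigned 32-bit values.

```python
MASK32 = 0xFFFFFFFF  # assumed lane width: 32 bits, unsigned


def lane_mask(cond):
    """Return an all-ones lane if cond is true, an all-zeros lane otherwise."""
    return -int(bool(cond)) & MASK32


def select(mask, a, b):
    """Bitwise select: take bits of a where mask is set, bits of b elsewhere."""
    return (a & mask) | (b & ~mask & MASK32)


def vec_max_branchy(xs, ys):
    """Per-lane max written with ordinary flow control."""
    out = []
    for x, y in zip(xs, ys):
        if x > y:
            out.append(x)
        else:
            out.append(y)
    return out


def vec_max_predicated(xs, ys):
    """Per-lane max with no per-lane branch: compare, then mask and merge."""
    masks = [lane_mask(x > y) for x, y in zip(xs, ys)]
    return [select(m, x, y) for m, x, y in zip(masks, xs, ys)]
```

Both versions give the same answer. The predicated one evaluates both sides for every lane and merges them with logical ops, which is where the extra code comes from.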
Predication isn't a bad idea. The thing is, predication helps most to
avoid branch mispredicts. We don't need to predict branches at all,
because the condition will always be known by the time we get to the
branch instruction. (Remember, we're like Niagara here.)

I've written SSE code before, and I've encountered situations where
predicated code (or something equivalent, using bit masks and logical
ops) was MUCH larger than the equivalent code with branches. Still, the
predicated code was faster since there was no flow control. Our shader
architecture will be faster with smaller code, period, since our flow
control can't suffer any mispredict penalty.

> You should only implement 1-cycle operations. If you really need div,
> pipeline (1/x) with MUL with enough guard bits to have the required
> precision. There are lots of 1-cycle operations for complex functions
> (1/x, 1/sqrt(x)).

We can do 1/x by lookup table. But the precision is poor.

-- 
Timothy Normand Miller
http://www.cse.ohio-state.edu/~millerti
Open Graphics Project
_______________________________________________
Open-graphics mailing list
[email protected]
http://lists.duskglow.com/mailman/listinfo/open-graphics
List service provided by Duskglow Consulting, LLC (www.duskglow.com)
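To put a rough number on "the precision is poor" for a table-based 1/x, here is a small Python model. It is a sketch under assumptions, not a description of any actual design: the table size (TABLE_BITS = 8), the midpoint-based table entries, and the names recip_lut, div_lut and max_rel_error are all hypothetical.

```python
import math

TABLE_BITS = 8  # assumed: 256 entries, indexed by the top mantissa bits

# Entry i approximates 1/m for a mantissa m in [1 + i/256, 1 + (i+1)/256),
# using the reciprocal of the interval midpoint.
RECIP_TABLE = [1.0 / (1.0 + (i + 0.5) / (1 << TABLE_BITS))
               for i in range(1 << TABLE_BITS)]


def recip_lut(x):
    """Approximate 1/x for finite, nonzero x with a single table lookup."""
    m, e = math.frexp(abs(x))   # abs(x) == m * 2**e, with m in [0.5, 1)
    m *= 2.0                    # rescale the mantissa to [1, 2)
    e -= 1
    index = int((m - 1.0) * (1 << TABLE_BITS))
    r = math.ldexp(RECIP_TABLE[index], -e)  # undo the exponent
    return math.copysign(r, x)


def div_lut(a, b):
    """Division as a multiply by the table reciprocal (the 1/x-then-MUL idea)."""
    return a * recip_lut(b)


def max_rel_error(samples=10000):
    """Worst relative error of recip_lut over one binade, [1, 2)."""
    worst = 0.0
    for k in range(samples):
        x = 1.0 + k / samples
        worst = max(worst, abs(recip_lut(x) * x - 1.0))
    return worst
```

With these assumptions the worst-case relative error is about 1/513, i.e. roughly 9 good bits, compared with the 24-bit significand of a single-precision float. Each extra bit of accuracy doubles the table, which is why a plain lookup does not get you to full precision on its own.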
