> > By my reading even fairly simple iterative implementation should give
> > a single precision result in about 4 iterations (2 modified multiply-add
> > per iteration, probably 9 or 10 instructions total) 
> > 
> That's really interesting.  What about integer division, though?

Not sure.  Many of the chips I deal with just Don't Do That :-)

Do we actually need integer division in practice?
I'd guess that the data workloads are going to be float based, and control 
code is almost entirely division by a constant, which the compiler can change 
to a fixed-point multiply (i.e. widening multiply+shift). The nVidia programming 
guides say that "integer division and modulo operations are particularly 
costly and should be avoided", which I take as meaning it's probably done in 
software.
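
To illustrate the division-by-constant case, here is a minimal sketch, assuming
unsigned 32-bit operands and a divisor of 10. The function names div10 and mod10
are made up for this example; the constant is ceil(2^35 / 10), which is the kind
of value a compiler would typically pick for this divisor.

```c
#include <stdint.h>

/* x / 10 for any 32-bit unsigned x, without a divide instruction.
 * 0xCCCCCCCD is ceil(2^35 / 10). The 32x32->64 widening multiply keeps
 * the full product, and the shift by 35 discards the fractional bits. */
static inline uint32_t div10(uint32_t x)
{
    return (uint32_t)(((uint64_t)x * 0xCCCCCCCDu) >> 35);
}

/* x % 10 follows from the quotient with one more multiply and a subtract. */
static inline uint32_t mod10(uint32_t x)
{
    return x - div10(x) * 10u;
}
```

So the hardware only needs a widening multiply (or a multiply-high) and a
shift. Other constant divisors need a different constant and shift count, and
signed operands need an extra correction step that is not shown here.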

Paul
_______________________________________________
Open-graphics mailing list
[email protected]
http://lists.duskglow.com/mailman/listinfo/open-graphics
List service provided by Duskglow Consulting, LLC (www.duskglow.com)