------- Comment #3 from jb at gcc dot gnu dot org  2007-04-27 11:27 -------
(In reply to comment #2)
> Note that SSE can vectorize only the float precision variant, not the double
> precision one.  So one needs to carefully either disable vectorization for the
> double variant to get reciprocal code or the other way around.

AFAICS these reciprocal instructions are available only for single precision,
in both scalar and packed variants. Altivec is single precision only; the SSE
instructions are:

rcpss (single precision scalar reciprocal)
rcpps (single precision packed reciprocal)
rsqrtss (single precision scalar reciprocal square root)
rsqrtps (single precision packed reciprocal square root)

There are no equivalent double precision versions of any of these instructions.
Or do you think there would be a speed benefit for double precision to

1. Convert to single precision
2. Calculate rcp(s|p)s or rsqrt(p|s)s
3. Refine with Newton iteration

vs. just using div(p|s)d or sqrt(p|s)d?
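
To make the three steps concrete, here is a rough scalar sketch in C. It is
only an illustration, not what GCC emits or would emit: the function names are
made up, the count of three Newton steps is my assumption (the estimate is
good to roughly 12 bits and each step roughly doubles that), and the non-SSE
fallback exists only so the sketch compiles elsewhere. Inputs outside the
float range would overflow or underflow in step 1, and the result is not
guaranteed to be correctly rounded.

```c
#include <math.h>
#if defined(__SSE__)
#include <xmmintrin.h>  /* _mm_set_ss, _mm_rcp_ss, _mm_rsqrt_ss, _mm_cvtss_f32 */
#endif

/* Hypothetical helper: single precision reciprocal estimate (rcpss). */
static float approx_rcp_float(float x)
{
#if defined(__SSE__)
    return _mm_cvtss_f32(_mm_rcp_ss(_mm_set_ss(x)));
#else
    return 1.0f / x;            /* fallback, not an estimate instruction */
#endif
}

/* Hypothetical helper: single precision reciprocal sqrt estimate (rsqrtss). */
static float approx_rsqrt_float(float x)
{
#if defined(__SSE__)
    return _mm_cvtss_f32(_mm_rsqrt_ss(_mm_set_ss(x)));
#else
    return 1.0f / sqrtf(x);     /* fallback, not an estimate instruction */
#endif
}

/* 1/x in double: convert, estimate, refine. */
double approx_rcp_double(double x)
{
    double y = (double)approx_rcp_float((float)x);  /* steps 1 and 2 */
    for (int i = 0; i < 3; i++)                     /* step 3 */
        y = y * (2.0 - x * y);                      /* Newton step for 1/x */
    return y;
}

/* 1/sqrt(x) in double: convert, estimate, refine. */
double approx_rsqrt_double(double x)
{
    double y = (double)approx_rsqrt_float((float)x); /* steps 1 and 2 */
    for (int i = 0; i < 3; i++)                      /* step 3 */
        y = y * (1.5 - 0.5 * x * y * y);             /* Newton step for 1/sqrt(x) */
    return y;
}
```

Each Newton step costs a few dependent multiplies and a subtract, so whether
this beats div(p|s)d or sqrt(p|s)d would have to be measured on the target
CPU.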


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31723
