https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83661

--- Comment #4 from Christophe Monat <christophe.monat at st dot com> ---
Hi Pratamesh,

You're absolutely right - maybe it's more efficient when some hardware sincos
is available (Intel FSINCOS?), but I would also carefully check the actual
performance.

Indeed, it looks to me that you either have to use two different polynomials,
or shift one argument and use only sin or only cos; either way, a polynomial
gets evaluated twice.

We studied that in a slightly different context with Claude-Pierre Jeannerod
from ENS Lyon and our PhD student Jingyan Lu-Jourdan a while ago: "Simultaneous
floating-point sine and cosine for VLIW integer processors", available here:
https://hal.archives-ouvertes.fr/hal-00672327. We were able to gain
significant performance by exploiting the low-level parallelism of the
processor. Admittedly, this is not a full IEEE implementation, but the
important ideas are there.
