tbp writes:
>Apparently enough for a small vendor like Intel to propose such things
>as orps, andps, andnps, and xorps.

Paolo Bonzini writes:
>I think you're running too far with your sarcasm. SSE's instructions
>do not go so far as to specify integer vs. floating point.  To me, "ps"
>means "32-bit SIMD", independent of integerness

The IA-32 instruction set does distinguish between integer and
floating point bitwise operations.  In addition to the single-precision
floating-point bitwise instructions that tbp mentioned (ORPS, ANDPS,
ANDNPS and XORPS) there are both distinct double-precision floating-point
bitwise instructions (ORPD, ANDPD, ANDNPD and XORPD) and integer bitwise
instructions (POR, PAND, PANDN and PXOR).  While these operations all
produce the same bits, they can differ in performance depending on which
kind of instruction produced or will consume the data.
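
As a rough illustration of the "same result" half of that statement, here
is a minimal sketch in C that XORs the same 16 bytes through each of the
three instruction families using the standard intrinsics.  The helper
names (xor_as_ps and so on) are made up for this example, and it assumes
an x86 target with SSE2 enabled.  It says nothing about latency; it only
checks that the bits agree.

```c
#include <emmintrin.h>  /* SSE2 intrinsics; also pulls in the SSE ones */
#include <string.h>

/* XOR two 16-byte buffers with the single-precision form (XORPS). */
void xor_as_ps(const void *a, const void *b, void *out)
{
    __m128 r = _mm_xor_ps(_mm_loadu_ps((const float *)a),
                          _mm_loadu_ps((const float *)b));
    _mm_storeu_ps((float *)out, r);
}

/* Same operation with the double-precision form (XORPD). */
void xor_as_pd(const void *a, const void *b, void *out)
{
    __m128d r = _mm_xor_pd(_mm_loadu_pd((const double *)a),
                           _mm_loadu_pd((const double *)b));
    _mm_storeu_pd((double *)out, r);
}

/* Same operation with the integer form (PXOR). */
void xor_as_int(const void *a, const void *b, void *out)
{
    __m128i r = _mm_xor_si128(_mm_loadu_si128((const __m128i *)a),
                              _mm_loadu_si128((const __m128i *)b));
    _mm_storeu_si128((__m128i *)out, r);
}

/* Returns 1 if all three variants produce identical bytes, else 0. */
int xor_variants_agree(const unsigned char a[16], const unsigned char b[16])
{
    unsigned char ps[16], pd[16], in[16];

    xor_as_ps(a, b, ps);
    xor_as_pd(a, b, pd);
    xor_as_int(a, b, in);
    return memcmp(ps, pd, 16) == 0 && memcmp(ps, in, 16) == 0;
}
```

Whether a compiler keeps the instruction each intrinsic names can depend
on the compiler version and flags, so it is worth checking the generated
assembly (for example with gcc -O2 -S) rather than assuming.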

Intel's IA-32 Software Developer's Manual gives this warning:

        In this example: XORPS or PXOR can be used in place of XORPD
        and yield the same correct result. However, because of the type
        mismatch between the operand data type and the instruction data
        type, a latency penalty will be incurred due to implementations
        of the instructions at the microarchitecture level.

>>And now i guess the only sanctioned access to those ops is via
>>builtins/intrinsics.
>
>No, you can do so with casts.

tbp is correct.  Using casts gets you the integer bitwise instructions,
not the single-precision bitwise instructions that are better suited to
flipping bits in single-precision vectors.  If you want GCC to generate
better code using single-precision bitwise instructions you're now forced
to use the intrinsics.
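
To make the two routes concrete, here is a minimal sketch that negates a
vector of four floats both ways.  The type names v4sf and v4si and the
function names are illustrative, and it assumes GCC's vector extensions
on an x86 target with SSE.  Both functions return the same bits; the
difference described above is only in which instruction the compiler
picks, which may vary between GCC versions.

```c
#include <stdint.h>
#include <xmmintrin.h>  /* SSE: __m128, _mm_xor_ps, _mm_set1_ps */

/* GCC vector-extension types; the names are illustrative. */
typedef float   v4sf __attribute__((vector_size(16)));
typedef int32_t v4si __attribute__((vector_size(16)));

/* Cast route: reinterpret as integers, XOR the sign bits, cast back.
   The XOR is written on an integer vector type. */
v4sf negate_with_casts(v4sf x)
{
    const v4si sign = { INT32_MIN, INT32_MIN, INT32_MIN, INT32_MIN };
    return (v4sf)((v4si)x ^ sign);
}

/* Intrinsic route: explicitly asks for the single-precision XOR.
   -0.0f has only the sign bit set in each lane. */
__m128 negate_with_intrinsic(__m128 x)
{
    return _mm_xor_ps(x, _mm_set1_ps(-0.0f));
}
```

Compiling both with gcc -O2 -S and comparing the output is the simplest
way to see which bitwise instruction a given GCC actually emits for each.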

                                        Ross Ridge
