------- Comment #4 from adam at consulting dot net dot nz 2010-01-05 04:17 -------
/* Workaround discovered! */
void test_int_vectors_containing_fp_data_using_local_reg_var_overlay() {
    //Create local register variables of the required floating point type,
    //bound to the same registers as the global register variables.
    register xmm_2f64_t local_xmm_c __asm__("xmm6");
    register xmm_2f64_t local_xmm_d __asm__("xmm7");
    //Same calculation, performed on the local register variables.
    //No casts are required.
    local_xmm_c = local_xmm_c + local_xmm_d;
    //The local changes above will be optimised away unless the global
    //register variables are updated. The casts below should be a no-op, as
    //the local register variables are aliased to the global register
    //variables.
    xmm_c = (xmm_2i64_t) local_xmm_c;
    xmm_d = (xmm_2i64_t) local_xmm_d;
}
With this workaround, the generated code is still optimal even though the
global register variables have an integer vector type:
0000000000400550 <test_int_vectors_containing_fp_data_using_local_reg_var_overlay>:
400550: 66 0f 58 f7 addpd xmm6,xmm7
400554: c3 ret
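
The casts in the workaround rely on GCC treating a cast between two vector
types of the same size as a reinterpretation of the bits rather than a value
conversion. The sketch below illustrates that behaviour with ordinary local
variables instead of register variables. The two typedefs are assumptions (this
comment does not show the real definitions of xmm_2f64_t and xmm_2i64_t), and
the helper function names are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Assumed definitions: 16-byte vectors of two doubles / two 64-bit ints. */
typedef double    xmm_2f64_t __attribute__((vector_size(16)));
typedef long long xmm_2i64_t __attribute__((vector_size(16)));

/* Hypothetical helper: returns 1 if casting an FP vector to the integer
   vector type leaves every bit unchanged. */
static int cast_preserves_bits(void) {
    xmm_2f64_t f = {1.5, -2.25};
    xmm_2i64_t i = (xmm_2i64_t) f;  /* reinterprets bits, no value conversion */
    assert(sizeof f == sizeof i);
    return memcmp(&f, &i, sizeof f) == 0;
}

/* Hypothetical helper: FP data stored in integer vectors, then added through
   a floating point view of the same bits. */
static double add_via_fp_view(void) {
    xmm_2f64_t a = {1.0, 2.0};
    xmm_2f64_t b = {0.5, 0.25};
    xmm_2i64_t ia = (xmm_2i64_t) a;  /* integer-typed storage of FP data */
    xmm_2i64_t ib = (xmm_2i64_t) b;
    xmm_2f64_t sum = (xmm_2f64_t) ia + (xmm_2f64_t) ib;
    return sum[0] + sum[1];          /* (1.0 + 0.5) + (2.0 + 0.25) */
}
```

Calling cast_preserves_bits() and add_via_fp_view() from a small main() is
enough to try this out. Vector subscripting such as sum[0] needs a reasonably
recent GCC or Clang.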
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42596