The SSE4.1 phminposuw instruction finds the minimum unsigned 16-bit
element in the source vector.  It puts the value of that element in
the low 16 bits of the destination vector and the index of that
element in the next three bits, and zeroes the rest of the
destination.  The helper for this operation fills the destination from
high to low, so when the source and destination are the same register,
the minimum source element can be overwritten before it is copied to
the destination.  This patch fills the destination from low to high
instead, so the minimum source element is always copied first.

This fixes one GCC test failure in my GCC 6-based testing (and so
concludes the present sequence of patches, as I have no further GCC
test failures left in that testing that I attribute to QEMU bugs).
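
For illustration only, here is a minimal standalone sketch of the
aliasing problem.  It is not the QEMU helper: Vec128, min_index and
the two phminposuw_* functions are hypothetical stand-ins, with a
plain array of eight 16-bit words in place of the W/L/Q accessors
(W(1) is w[1], L(1) is w[2..3], Q(1) is w[4..7]).

```c
#include <stdint.h>

/* Hypothetical stand-in for QEMU's Reg: eight 16-bit words, w[0] lowest. */
typedef struct {
    uint16_t w[8];
} Vec128;

/* Index of the minimum unsigned word; ties go to the lowest index. */
static int min_index(const Vec128 *s)
{
    int idx = 0;
    for (int i = 1; i < 8; i++) {
        if (s->w[i] < s->w[idx]) {
            idx = i;
        }
    }
    return idx;
}

/* Old order: fill the destination from high to low.  If d == s, the
 * minimum element may already be clobbered when it is finally read. */
static void phminposuw_high_to_low(Vec128 *d, const Vec128 *s)
{
    int idx = min_index(s);
    d->w[4] = d->w[5] = d->w[6] = d->w[7] = 0;  /* models d->Q(1) = 0 */
    d->w[2] = d->w[3] = 0;                      /* models d->L(1) = 0 */
    d->w[1] = (uint16_t)idx;                    /* models d->W(1) = idx */
    d->w[0] = s->w[idx];                        /* read happens last */
}

/* New order: fill from low to high, so s->w[idx] is read before any
 * word at index 1 or above is written.  Writing w[0] first is safe
 * because it either is the minimum or is no longer needed. */
static void phminposuw_low_to_high(Vec128 *d, const Vec128 *s)
{
    int idx = min_index(s);
    d->w[0] = s->w[idx];                        /* read happens first */
    d->w[1] = (uint16_t)idx;
    d->w[2] = d->w[3] = 0;
    d->w[4] = d->w[5] = d->w[6] = d->w[7] = 0;
}
```

With the input words {9, 8, 7, 6, 5, 4, 3, 1} and the same object
passed as both d and s, the high-to-low version zeroes w[7] before
reading it and so leaves 0 in w[0], while the low-to-high version
leaves the expected minimum 1 in w[0].  Both store index 7 in w[1].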

Signed-off-by: Joseph Myers <>


diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h
index 16509d0..ed05989 100644
--- a/target/i386/ops_sse.h
+++ b/target/i386/ops_sse.h
@@ -1707,10 +1710,10 @@ void glue(helper_phminposuw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
         idx = 7;
-    d->Q(1) = 0;
-    d->L(1) = 0;
-    d->W(1) = idx;
     d->W(0) = s->W(idx);
+    d->W(1) = idx;
+    d->L(1) = 0;
+    d->Q(1) = 0;
 void glue(helper_roundps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,

Joseph S. Myers
