[Bug target/78020] [AArch64] vuzp{1,2}q_f64 implementation identical to vzip{1,2}q_f64 in arm_neon.h and probably incorrect

jgreenhalgh at gcc dot gnu.org Tue, 18 Oct 2016 08:17:49 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78020


James Greenhalgh <jgreenhalgh at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jgreenhalgh at gcc dot gnu.org

--- Comment #5 from James Greenhalgh <jgreenhalgh at gcc dot gnu.org> ---
This bug looks invalid to me. I think you're both failing to grasp the
intuition behind these intrinsics. Ignore the descriptions in the reference
manual for a second, and think only in terms of vector operations. If you
start with:

  v1 = {a0, a1, a2, a3}
  v2 = {b0, b1, b2, b3}

Then a zipped vector would look like:

  zipped = {a0, b0, a1, b1, a2, b2, a3, b3}

zip1 takes the low elements of the zipped vector (the first half), and zip2
takes the high elements of the zipped vector (the second half), i.e.

  zip1 = {a0, b0, a1, b1}
  zip2 = {a2, b2, a3, b3}

And for clarity, with two-element vectors, you're looking at:

  v1 = {a0, a1}
  v2 = {b0, b1}
  zipped = {a0, b0, a1, b1}
  zip1 = {a0, b0}
  zip2 = {a1, b1}
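
The zip operation described above can be sketched in a few lines of Python,
with lists standing in for vector registers. This is only an illustrative
model: the helper names zipped, zip1 and zip2 are assumptions made for the
example and are not the arm_neon.h intrinsics themselves.

```python
def zipped(v1, v2):
    """Interleave two equal-length vectors: {a0, b0, a1, b1, ...}."""
    assert len(v1) == len(v2)
    return [x for pair in zip(v1, v2) for x in pair]

def zip1(v1, v2):
    """Low half (first len(v1) elements) of the interleaved vector."""
    return zipped(v1, v2)[:len(v1)]

def zip2(v1, v2):
    """High half (last len(v1) elements) of the interleaved vector."""
    return zipped(v1, v2)[len(v1):]

# Four-element example from the text.
a = ["a0", "a1", "a2", "a3"]
b = ["b0", "b1", "b2", "b3"]
print(zip1(a, b))  # ['a0', 'b0', 'a1', 'b1']
print(zip2(a, b))  # ['a2', 'b2', 'a3', 'b3']
```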

uzp is the inverse operation, which takes a vector which has been zipped, and
unzips it into two components. That is to say:

  v1 = {a0, b0, a1, b1}
  v2 = {a2, b2, a3, b3}
  concatenated = {a0, b0, a1, b1, a2, b2, a3, b3}
  unzipped = {a0, a1, a2, a3, b0, b1, b2, b3}
  uzp1 = {a0, a1, a2, a3}
  uzp2 = {b0, b1, b2, b3}

Running this example with two-element vectors:

  v1 = {a0, b0}
  v2 = {a1, b1}
  concatenated = {a0, b0, a1, b1}
  unzipped = {a0, a1, b0, b1}
  uzp1 = {a0, a1}
  uzp2 = {b0, b1}
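
The unzip step can be modelled the same way, again with Python lists standing
in for vector registers. The names uzp1 and uzp2 here are hypothetical helpers
for this sketch, not the intrinsics: uzp1 keeps the even-indexed elements of
the concatenation and uzp2 keeps the odd-indexed ones.

```python
def uzp1(v1, v2):
    """Even-indexed elements of the concatenation v1 ++ v2."""
    assert len(v1) == len(v2)
    return (v1 + v2)[0::2]

def uzp2(v1, v2):
    """Odd-indexed elements of the concatenation v1 ++ v2."""
    assert len(v1) == len(v2)
    return (v1 + v2)[1::2]

# Four-element example from the text: the inputs are the two halves
# of a previously zipped vector.
v1 = ["a0", "b0", "a1", "b1"]
v2 = ["a2", "b2", "a3", "b3"]
print(uzp1(v1, v2))  # ['a0', 'a1', 'a2', 'a3']
print(uzp2(v1, v2))  # ['b0', 'b1', 'b2', 'b3']
```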

And finally, with the labelling of elements identical to the labelling of
elements we used for the zip example:

  v1 = {a0, a1}
  v2 = {b0, b1}
  concatenated = {a0, a1, b0, b1}
  unzipped = {a0, b0, a1, b1}
  uzp1 = {a0, b0}
  uzp2 = {a1, b1}

This should make it clear that for a two-element vector uzp1 == zip1 and uzp2
== zip2, which is how they are implemented in GCC.
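
A quick way to check this conclusion is to model both operations on plain
Python lists and compare the results for different vector lengths. This is a
sketch under the definitions given in this comment; zip_pair, uzp_pair and
same_result are names invented for the example.

```python
def zip_pair(v1, v2):
    """Return (zip1, zip2): the two halves of the interleaved vector."""
    interleaved = [x for pair in zip(v1, v2) for x in pair]
    return interleaved[:len(v1)], interleaved[len(v1):]

def uzp_pair(v1, v2):
    """Return (uzp1, uzp2): even and odd elements of v1 ++ v2."""
    concatenated = v1 + v2
    return concatenated[0::2], concatenated[1::2]

def same_result(lanes):
    """True if zip and uzp agree for vectors with this many elements."""
    v1 = [f"a{i}" for i in range(lanes)]
    v2 = [f"b{i}" for i in range(lanes)]
    return zip_pair(v1, v2) == uzp_pair(v1, v2)

print(same_result(2))  # True: two-element vectors, e.g. float64x2_t
print(same_result(4))  # False: the operations differ for four elements
```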
