ktkachov at gcc dot changed:

           What    |Removed                     |Added
             Status|UNCONFIRMED                 |NEW
           Keywords|                            |wrong-code
   Last reconfirmed|                            |2016-10-18
                 CC|                            |ktkachov at gcc dot
     Ever confirmed|0                           |1
            Summary|[Aarch64, ARM64]            |[AArch64] vuzp{1,2}q_f64
                   |vuzp{1,2}q_f64              |implementation identical to
                   |implementation identical to |vzip{1,2}q_f64 in
                   |vzip{1,2}q_f64 in           |arm_neon.h and probably
                   |arm_neon.h and probably     |incorrect
                   |incorrect                   |
   Target Milestone|---                         |7.0
      Known to fail|                            |4.8.5, 4.9.4, 5.3.1, 6.0,
                   |                            |7.0

--- Comment #1 from ktkachov at gcc dot ---
After reading the ARM ARM (the same version as you), I think you're right.
I suppose the minimal testcase is something like:

#include "arm_neon.h"

double amem[] = {1.0, 2.0};
double bmem[] = {3.0, 4.0};

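int
main (void)
{
  float64x2_t a = vld1q_f64 (amem);
  float64x2_t b = vld1q_f64 (bmem);

  float64x2_t res = vuzp1q_f64 (a, b);
  /* Expected values if lane 0 of the result comes from the second
     operand (Vm) and lane 1 from the first (Vn).  */
  if (vgetq_lane_f64 (res, 0) != 3.0
      || vgetq_lane_f64 (res, 1) != 1.0)
    __builtin_abort ();

  return 0;
}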

I guess it can be a bit fiddly to get right because in the instruction:
UZP1 <Vd>.<T>, <Vn>.<T>, <Vm>.<T>

the 'zipped' vector that we unzip is Vm:Vn, i.e. Vm goes first, and so
the first element of the unzipped vector is from Vm.
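
To make the operand-order question concrete, here is a rough scalar sketch
of the even-lane selection in plain C. The helper name uzp1_model_f64 and
its low/high parameters are hypothetical and not part of arm_neon.h. Which
of Vn and Vm actually supplies the low half of the concatenation is an
assumption that should be checked against the ARM ARM pseudocode.

```c
#include <assert.h>
#include <stddef.h>

/* Scalar model of UZP1 (hypothetical helper, not part of arm_neon.h).
   'low' supplies lanes 0..lanes-1 of the concatenated vector and 'high'
   supplies lanes lanes..2*lanes-1.  The result takes the even-numbered
   lanes of that concatenation.  */
static void
uzp1_model_f64 (const double *low, const double *high, size_t lanes,
                double *out)
{
  assert (lanes > 0);
  for (size_t e = 0; e < lanes; e++)
    {
      size_t idx = 2 * e;   /* Even lane of the 2*lanes-wide vector.  */
      out[e] = idx < lanes ? low[idx] : high[idx - lanes];
    }
}
```

With a = {1.0, 2.0} and b = {3.0, 4.0}, uzp1_model_f64 (a, b, 2, out)
gives {1.0, 3.0} and uzp1_model_f64 (b, a, 2, out) gives {3.0, 1.0}. The
testcase above expects the latter, so it only passes if Vm is the low half.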
