https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82199

--- Comment #5 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Richard Sandiford <rsand...@gcc.gnu.org>:

https://gcc.gnu.org/g:7efc03fd2cb69fa0f790d32627a3e8131724e7e1

commit r11-2191-g7efc03fd2cb69fa0f790d32627a3e8131724e7e1
Author: Dmitrij Pochepko <dmitrij.poche...@bell-sw.com>
Date:   Fri Jul 17 10:20:12 2020 +0100

    __builtin_shuffle sometimes should produce zip1 rather than TBL (PR82199)

    This patch optimizes constant vector permutations by re-encoding them
    with a different vector element size when applicable.  In those cases
    a simpler instruction can replace the general TBL sequence.

    example:

    vector float f(vector float a, vector float b)
    {
      return __builtin_shuffle (a, b, (vector int){0, 1, 4, 5});
    }

    was compiled into:
    ...
            adrp    x0, .LC0
            ldr     q2, [x0, #:lo12:.LC0]
            tbl     v0.16b, {v0.16b - v1.16b}, v2.16b
    ...

    and after patch:
    ...
            zip1    v0.2d, v0.2d, v1.2d
    ...
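
The element-size change in the example above can be illustrated with a small standalone sketch. It is not the code added to aarch64.c; the helper name `reencode_selector` and its interface are hypothetical and chosen only for illustration. It assumes the rule suggested by the example: a selector over narrow elements can be rewritten over elements of twice the width when every consecutive pair of indices picks an aligned, adjacent pair of narrow elements.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not the GCC function): try to rewrite a permute
   selector over N narrow elements as a selector over N/2 elements of
   twice the width.  Returns true and fills OUT[0..N/2-1] on success.  */
static bool
reencode_selector (const unsigned *sel, size_t n, unsigned *out)
{
  assert (sel != NULL && out != NULL);

  /* An empty or odd-length selector cannot be paired up.  */
  if (n == 0 || n % 2 != 0)
    return false;

  for (size_t i = 0; i < n / 2; i++)
    {
      unsigned lo = sel[2 * i];
      unsigned hi = sel[2 * i + 1];

      /* Each pair must pick an aligned, adjacent pair of narrow
         elements, i.e. exactly one wide element.  */
      if (lo % 2 != 0 || hi != lo + 1)
        return false;

      out[i] = lo / 2;
    }

  return true;
}
```

For the selector {0, 1, 4, 5} over four 32-bit elements, this sketch yields {0, 2} over two 64-bit elements, that is, the low 64-bit half of each input, which is the operation zip1 performs on .2d operands. A selector such as {0, 4, 1, 5} is rejected because its pairs do not map onto whole 64-bit elements.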

    Bootstrapped and tested on aarch64-linux-gnu with no regressions.

    gcc/ChangeLog:

    2020-07-17  Andrew Pinski  <apin...@marvell.com>

            PR target/82199
            * config/aarch64/aarch64.c (aarch64_evpc_reencode): New function.
            (aarch64_expand_vec_perm_const_1): Call it.

    gcc/testsuite/ChangeLog:

    2020-07-17  Andrew Pinski  <apin...@marvell.com>

            PR target/82199
            * gcc.target/aarch64/vdup_n_3.c: New test.
            * gcc.target/aarch64/vzip_1.c: New test.
            * gcc.target/aarch64/vzip_2.c: New test.
            * gcc.target/aarch64/vzip_3.c: New test.
            * gcc.target/aarch64/vzip_4.c: New test.

    Co-Authored-By: Dmitrij Pochepko <dmitrij.poche...@bell-sw.com>
