From: Pan Xiuli <xiuli....@intel.com>

This patch set adds the intel_subgroups_short extension: subgroup
broadcast, subgroup shuffle, and subgroup block read/write for short
types. It also adds A64 block read logic for future use.
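As a rough illustration of the broadcast and shuffle semantics on 16-bit values, the sketch below is a host-side reference model in C, of the kind a unit test might use to compute expected results. It is not taken from the patches: the function names ref_broadcast_us and ref_shuffle_us and the fixed SUBGROUP_SIZE of 16 are assumptions made for this example only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed subgroup (SIMD) width for this sketch; the real width
 * depends on how the kernel was compiled. */
#define SUBGROUP_SIZE 16

/* Hypothetical reference for a subgroup broadcast of ushort values:
 * every lane receives the value held by lane src_lane. */
static void ref_broadcast_us(const uint16_t *in, uint16_t *out, size_t src_lane)
{
    assert(src_lane < SUBGROUP_SIZE);
    for (size_t lane = 0; lane < SUBGROUP_SIZE; ++lane)
        out[lane] = in[src_lane];
}

/* Hypothetical reference for a subgroup shuffle of ushort values:
 * lane i receives the value held by lane idx[i]. */
static void ref_shuffle_us(const uint16_t *in, const uint32_t *idx, uint16_t *out)
{
    for (size_t lane = 0; lane < SUBGROUP_SIZE; ++lane) {
        assert(idx[lane] < SUBGROUP_SIZE);
        out[lane] = in[idx[lane]];
    }
}
```

A test could fill one subgroup's worth of input, run the kernel, and compare each lane's output against these reference results.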

Pan Xiuli (15):
  Libocl: Add intel_subgroups_short extension
  Backend: Refine GenRegiter::offset
  Backend: Refine register offset for simd shuffle
  Backend: Refine sub group broadcast code for spec
  Libocl: Add sub group broadcast short builtin function
  Utest: Add check subgroup short helper function
  Utest: Add test case for sub group broadcast short
  Backend: Change the sel ir optimization for unpack register
  Backend: Add short sub group builtin functions
  Utest: Add test case for sub group short builtin functions
  Backend: Add sub groups short shuffle builtin functions
  Utest: Add test case for short type sub group shuffle
  Backend: Add subgroup short block read/write
  Utest: Add subgroup block read/write ushort test case
  Backend: Add A64 subgroup block read/write support

 backend/src/backend/gen8_context.cpp            |  12 +
 backend/src/backend/gen8_encoder.cpp            |  70 +++++
 backend/src/backend/gen8_encoder.hpp            |   4 +
 backend/src/backend/gen8_instruction.hpp        |  13 +
 backend/src/backend/gen_context.cpp             | 289 +++++++++++++++-----
 backend/src/backend/gen_defs.hpp                |   3 +
 backend/src/backend/gen_encoder.cpp             |  34 ++-
 backend/src/backend/gen_encoder.hpp             |   4 +
 backend/src/backend/gen_insn_selection.cpp      |  37 +--
 .../src/backend/gen_insn_selection_optimize.cpp |   2 +-
 backend/src/backend/gen_register.hpp            |   4 +
 backend/src/ir/instruction.cpp                  |  39 +--
 backend/src/ir/instruction.hpp                  |   6 +-
 backend/src/libocl/include/ocl.h                |   1 +
 backend/src/libocl/tmpl/ocl_simd.tmpl.cl        | 292 +++++++++++++++++----
 backend/src/libocl/tmpl/ocl_simd.tmpl.h         | 131 +++++++--
 backend/src/llvm/llvm_gen_backend.cpp           | 126 ++++++---
 backend/src/llvm/llvm_gen_ocl_function.hxx      |  50 ++--
 backend/src/llvm/llvm_scalarize.cpp             |  42 ++-
 kernels/compiler_sub_group_shuffle.cl           |  22 +-
 kernels/compiler_sub_group_shuffle_down.cl      |  23 +-
 kernels/compiler_sub_group_shuffle_up.cl        |  23 +-
 kernels/compiler_sub_group_shuffle_xor.cl       |  23 +-
 kernels/compiler_subgroup_broadcast.cl          |  10 +
 kernels/compiler_subgroup_buffer_block_read.cl  |  47 +++-
 kernels/compiler_subgroup_buffer_block_write.cl |  44 +++-
 kernels/compiler_subgroup_image_block_read.cl   |  49 +++-
 kernels/compiler_subgroup_image_block_write.cl  |  46 +++-
 kernels/compiler_subgroup_reduce.cl             |  22 ++
 kernels/compiler_subgroup_scan_exclusive.cl     |  36 +++
 kernels/compiler_subgroup_scan_inclusive.cl     |  36 +++
 src/cl_extensions.h                             |   5 +-
 utests/compiler_sub_group_shuffle.cpp           |  52 +++-
 utests/compiler_sub_group_shuffle_down.cpp      |  54 +++-
 utests/compiler_sub_group_shuffle_up.cpp        |  54 +++-
 utests/compiler_sub_group_shuffle_xor.cpp       |  54 +++-
 utests/compiler_subgroup_broadcast.cpp          |  11 +
 utests/compiler_subgroup_buffer_block_read.cpp  |  73 +++++-
 utests/compiler_subgroup_buffer_block_write.cpp |  74 +++++-
 utests/compiler_subgroup_image_block_read.cpp   |  98 +++++--
 utests/compiler_subgroup_image_block_write.cpp  |  73 +++++-
 utests/compiler_subgroup_reduce.cpp             |  66 +++++
 utests/compiler_subgroup_scan_exclusive.cpp     |  66 +++++
 utests/compiler_subgroup_scan_inclusive.cpp     |  66 +++++
 utests/utest_helper.cpp                         |  20 ++
 utests/utest_helper.hpp                         |   2 +
 46 files changed, 1942 insertions(+), 366 deletions(-)

-- 
2.7.4

_______________________________________________
Beignet mailing list
Beignet@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/beignet