Hi!

This patch series further reduces the overhead of launching kernels on
GCN devices, on top of the already-landed patches.  It removes a
redundant allocation and makes the target variable table (the table of
addresses of mapped variables in device memory) cheaper to construct by
moving it into host memory and passing it through the kernel arguments.
The table can then piggy-back on the previously added kernel argument
cache, which avoids allocating a new target variable table in most
cases.
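
As a rough illustration of the idea only (this is not the code in the
series; the struct, the function names and the single-entry cache are
all hypothetical), the table can live at the tail of a cached kernel
argument block, so a launch only allocates when the cached block is
missing or too small:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical layout: bookkeeping followed by the table itself.  */
struct kernarg_block
{
  size_t capacity;   /* Number of table slots the block can hold.  */
  size_t tvt_count;  /* Slots used by the current launch.  */
  void *tvt[];       /* Device addresses of mapped variables.  */
};

/* Hypothetical single-entry cache, plus a counter for illustration.  */
static struct kernarg_block *cached_block;
static unsigned alloc_count;

/* Return a block holding the N addresses in DEV_ADDRS, reusing the
   cached block when it is large enough.  Returns NULL on allocation
   failure.  */
static struct kernarg_block *
get_kernargs (void *const *dev_addrs, size_t n)
{
  struct kernarg_block *b = cached_block;
  cached_block = NULL;

  if (b == NULL || b->capacity < n)
    {
      free (b);
      b = malloc (sizeof *b + n * sizeof (void *));
      if (b == NULL)
        return NULL;
      b->capacity = n;
      alloc_count++;
    }

  b->tvt_count = n;
  if (n > 0)
    memcpy (b->tvt, dev_addrs, n * sizeof (void *));
  return b;
}

/* Hand the block back to the cache once the kernel has finished.  */
static void
release_kernargs (struct kernarg_block *b)
{
  assert (b != NULL);
  free (cached_block);
  cached_block = b;
}
```

In this sketch, a second launch with the same or a smaller table reuses
the block from the first launch and performs no allocation.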

It also introduces the concept of "offload sessions", which carry the
state required to start a target region and can be further expanded in
the future.

This reduces the overhead of launching a kernel by ~27%.

The series also reduces the overhead of launching threads on the actual
GCN device, by parallelizing thread initialization, akin to what the
patch proposed by Matthew Malcomson does here:
https://inbox.sourceware.org/gcc-patches/[email protected]/

Arsen Arsenović (4):
  libgomp/gcn: parallelize initializing threads of a team
  libgomp: let plugins handle allocating the target variable table
  libgomp/plugin-gcn: remove unneeded heap allocation in run_kernel
  libgomp/oacc-mem: add missing assert to goacc_enter_datum

 include/gomp-constants.h                      |   2 +-
 libgomp/config/gcn/team.c                     | 121 ++++---
 libgomp/libgomp-plugin.h                      |  81 ++++-
 libgomp/libgomp.h                             |  58 +++-
 libgomp/oacc-host.c                           |  63 +++-
 libgomp/oacc-mem.c                            |  11 +-
 libgomp/oacc-parallel.c                       |  24 +-
 libgomp/plugin/plugin-gcn.c                   | 310 ++++++++++++------
 libgomp/plugin/plugin-nvptx.c                 |  45 ++-
 libgomp/target.c                              | 191 +++++++----
 libgomp/task.c                                |  33 +-
 .../gcn-kernel-launch-no-tvt-alloc.c          |  51 +++
 .../gcn-kernel-launch-tvt-alloc.c             |  16 +
 13 files changed, 750 insertions(+), 256 deletions(-)
 create mode 100644 libgomp/testsuite/libgomp.c-c++-common/gcn-kernel-launch-no-tvt-alloc.c
 create mode 100644 libgomp/testsuite/libgomp.c-c++-common/gcn-kernel-launch-tvt-alloc.c

-- 
2.54.0
