Re: [PATCH,nvptx] Remove use of 'struct map' from plugin (nvptx)

2019-05-06 Thread Thomas Schwinge
Hi!

On Tue, 31 Jul 2018 08:12:51 -0700, Cesar Philippidis 
 wrote:
> This is an old patch which removes the struct map from the nvptx plugin.

(This got committed to trunk in r263212.)

> I believe at one point this was supposed to be used to manage async data
> mappings, but in practice that never worked out.

The original submission seems to be
:

| The attached patch fixes an issue in the managing of
| the page-locked buffer which holds the kernel launch
| mappings. In the process of fixing the issue, I discovered
| that 'struct map' was no longer needed, so that has
| been removed as well.

I can't tell/remember what the "issue in the managing of the page-locked
buffer which holds the kernel launch mappings" would've been.

> --- /dev/null
> +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/mapping-1.c

I wanted to note that on all branches down to GCC 5 this test case PASSes
already without any libgomp code changes.

> @@ -0,0 +1,63 @@
> +/* { dg-do run } */
> +
> +#include 
> +#include 
> +#include 
> +
> +/* Exercise the kernel launch argument mapping.  */
> +
> +int
> +main (int argc, char **argv)
> +{
> +  int a[256], b[256], c[256], d[256], e[256], f[256];
> +  int i;
> +  int n;
> +
> +  /* 48 is the size of the mappings for the first parallel construct.  */
> +  n = sysconf (_SC_PAGESIZE) / 48 - 1;

The rationale for the number 48 doesn't seem right -- the first
'parallel' construct has four mappings, so four times eight bytes is 32
bytes on x86_64?  Ha, or maybe this applies to the 'struct map', so it's
these four plus the 'int async', 'size_t size'?

Anyway, I can't tell what this is trying to achieve?

> +
> +  i = 0;
> +
> +  for (i = 0; i < n; i++)
> +{
> +  #pragma acc parallel copy (a, b, c, d)
> + {
> +   int j;
> +
> +   for (j = 0; j < 256; j++)
> + {
> +   a[j] = j;
> +   b[j] = j;
> +   c[j] = j;
> +   d[j] = j;
> + }
> + }
> +}

Maybe filling up some improperly-managed statically-sized data structure
(said "page-locked buffer"?), which then...

> +
> +#pragma acc parallel copy (a, b, c, d, e, f)

... would overflow here?

> +  {
> +int j;
> +
> +for (j = 0; j < 256; j++)
> +  {
> + a[j] = j;
> + b[j] = j;
> + c[j] = j;
> + d[j] = j;
> + e[j] = j;
> + f[j] = j;
> +  }
> +  }
> +
> +  for (i = 0; i < 256; i++)
> +   {
> + if (a[i] != i) abort();
> + if (b[i] != i) abort();
> + if (c[i] != i) abort();
> + if (d[i] != i) abort();
> + if (e[i] != i) abort();
> + if (f[i] != i) abort();
> +   }
> +
> +  exit (0);
> +}

Anyway -- the libgomp code has been cleaned up; the test case seems like
it can be disregarded.


Grüße
 Thomas


signature.asc
Description: PGP signature


Re: [PATCH,nvptx] Remove use of 'struct map' from plugin (nvptx)

2018-08-01 Thread Tom de Vries
On 08/01/2018 03:43 PM, Cesar Philippidis wrote:
> On 08/01/2018 04:01 AM, Tom de Vries wrote:
>> On 07/31/2018 05:12 PM, Cesar Philippidis wrote:
>>> This is an old patch which removes the struct map from the nvptx plugin.
>>> I believe at one point this was supposed to be used to manage async data
>>> mappings, but in practice that never worked out.
>>
>> I don't quite understand what rationale you're trying to present here.
>>
>> Is this dead code?
> 
> It's dead code.
> 

Then OK.

Thanks,
- Tom


Re: [PATCH,nvptx] Remove use of 'struct map' from plugin (nvptx)

2018-08-01 Thread Cesar Philippidis
On 08/01/2018 04:01 AM, Tom de Vries wrote:
> On 07/31/2018 05:12 PM, Cesar Philippidis wrote:
>> This is an old patch which removes the struct map from the nvptx plugin.
>> I believe at one point this was supposed to be used to manage async data
>> mappings, but in practice that never worked out.
> 
> I don't quite understand what rationale you're trying to present here.
> 
> Is this dead code?

It's dead code.

Cesar


Re: [PATCH,nvptx] Remove use of 'struct map' from plugin (nvptx)

2018-08-01 Thread Tom de Vries
On 07/31/2018 05:12 PM, Cesar Philippidis wrote:
> This is an old patch which removes the struct map from the nvptx plugin.
> I believe at one point this was supposed to be used to manage async data
> mappings, but in practice that never worked out.

I don't quite understand what rationale you're trying to present here.

Is this dead code?

If not, what kind of test-case exercises this? What is the difference in
behaviour for such a test-case with and without the patch?

Thanks,
- Tom


[PATCH,nvptx] Remove use of 'struct map' from plugin (nvptx)

2018-07-31 Thread Cesar Philippidis
This is an old patch which removes the struct map from the nvptx plugin.
I believe at one point this was supposed to be used to manage async data
mappings, but in practice that never worked out.

Is this OK for trunk? I bootstrapped and regtested on x86_64 with nvptx
offloading.

Thanks,
Cesar
[PATCH] Remove use of 'struct map' from plugin (nvptx)

2018-XX-YY  Cesar Philippidis  
	James Norris 	

	libgomp/
	* plugin/plugin-nvptx.c (struct map): Removed.
	(map_init, map_pop): Remove use of struct map. (map_push):
	Likewise and change argument list.
	* testsuite/libgomp.oacc-c-c++-common/mapping-1.c: New

(cherry picked from gomp-4_0-branch r231616)
---
 libgomp/plugin/plugin-nvptx.c  | 33 +++-
 .../libgomp.oacc-c-c++-common/mapping-1.c  | 63 ++
 2 files changed, 69 insertions(+), 27 deletions(-)
 create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/mapping-1.c

diff --git a/libgomp/plugin/plugin-nvptx.c b/libgomp/plugin/plugin-nvptx.c
index a92f054..1237ea10 100644
--- a/libgomp/plugin/plugin-nvptx.c
+++ b/libgomp/plugin/plugin-nvptx.c
@@ -225,13 +225,6 @@ struct nvptx_thread
   struct ptx_device *ptx_dev;
 };
 
-struct map
-{
-  int async;
-  size_t  size;
-  charmappings[0];
-};
-
 static bool
 map_init (struct ptx_stream *s)
 {
@@ -265,16 +258,12 @@ map_fini (struct ptx_stream *s)
 static void
 map_pop (struct ptx_stream *s)
 {
-  struct map *m;
-
   assert (s != NULL);
   assert (s->h_next);
   assert (s->h_prev);
   assert (s->h_tail);
 
-  m = s->h_tail;
-
-  s->h_tail += m->size;
+  s->h_tail = s->h_next;
 
   if (s->h_tail >= s->h_end)
 s->h_tail = s->h_begin + (int) (s->h_tail - s->h_end);
@@ -292,37 +281,27 @@ map_pop (struct ptx_stream *s)
 }
 
 static void
-map_push (struct ptx_stream *s, int async, size_t size, void **h, void **d)
+map_push (struct ptx_stream *s, size_t size, void **h, void **d)
 {
   int left;
   int offset;
-  struct map *m;
 
   assert (s != NULL);
 
   left = s->h_end - s->h_next;
-  size += sizeof (struct map);
 
   assert (s->h_prev);
   assert (s->h_next);
 
   if (size >= left)
 {
-  m = s->h_prev;
-  m->size += left;
-  s->h_next = s->h_begin;
-
-  if (s->h_next + size > s->h_end)
-	GOMP_PLUGIN_fatal ("unable to push map");
+  assert (s->h_next == s->h_prev);
+  s->h_next = s->h_prev = s->h_tail = s->h_begin;
 }
 
   assert (s->h_next);
 
-  m = s->h_next;
-  m->async = async;
-  m->size = size;
-
-  offset = (void *)>mappings[0] - s->h;
+  offset = s->h_next - s->h;
 
   *d = (void *)(s->d + offset);
   *h = (void *)(s->h + offset);
@@ -1291,7 +1270,7 @@ nvptx_exec (void (*fn), size_t mapnum, void **hostaddrs, void **devaddrs,
   /* This reserves a chunk of a pre-allocated page of memory mapped on both
  the host and the device. HP is a host pointer to the new chunk, and DP is
  the corresponding device pointer.  */
-  map_push (dev_str, async, mapnum * sizeof (void *), , );
+  map_push (dev_str, mapnum * sizeof (void *), , );
 
   GOMP_PLUGIN_debug (0, "  %s: prepare mappings\n", __FUNCTION__);
 
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/mapping-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/mapping-1.c
new file mode 100644
index 000..593e7d4
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/mapping-1.c
@@ -0,0 +1,63 @@
+/* { dg-do run } */
+
+#include 
+#include 
+#include 
+
+/* Exercise the kernel launch argument mapping.  */
+
+int
+main (int argc, char **argv)
+{
+  int a[256], b[256], c[256], d[256], e[256], f[256];
+  int i;
+  int n;
+
+  /* 48 is the size of the mappings for the first parallel construct.  */
+  n = sysconf (_SC_PAGESIZE) / 48 - 1;
+
+  i = 0;
+
+  for (i = 0; i < n; i++)
+{
+  #pragma acc parallel copy (a, b, c, d)
+	{
+	  int j;
+
+	  for (j = 0; j < 256; j++)
+	{
+	  a[j] = j;
+	  b[j] = j;
+	  c[j] = j;
+	  d[j] = j;
+	}
+	}
+}
+
+#pragma acc parallel copy (a, b, c, d, e, f)
+  {
+int j;
+
+for (j = 0; j < 256; j++)
+  {
+	a[j] = j;
+	b[j] = j;
+	c[j] = j;
+	d[j] = j;
+	e[j] = j;
+	f[j] = j;
+  }
+  }
+
+  for (i = 0; i < 256; i++)
+   {
+ if (a[i] != i) abort();
+ if (b[i] != i) abort();
+ if (c[i] != i) abort();
+ if (d[i] != i) abort();
+ if (e[i] != i) abort();
+ if (f[i] != i) abort();
+   }
+
+  exit (0);
+}
-- 
2.7.4