On 03/22/2018 10:51 AM, Tom de Vries wrote:
> On 03/22/2018 06:24 PM, Cesar Philippidis wrote:
>> On 03/22/2018 09:18 AM, Tom de Vries wrote:
>>
>>> That's obviously not good enough.
>>>
>>> When I compile this test-case:
>>> ...
>>> int
>>> main (void)
>>> {
>>>    int a[10];
>>> #pragma acc parallel num_workers (16)
>>> #pragma acc loop worker
>>>    for (int i = 0; i < 10; i++)
>>>      a[i] = i;
>>>
>>>    return 0;
>>> }
>>> ...
>>>
>>> I get:
>>> ...
>>>   .maxntid 32, 16, 1
>>> ...
>>>
>>> That's the change you need to isolate.
>>
>> I attached an updated patch which incorporates the
>> cfun->machine->axis_dim changes. It now generates more precise arguments
>> for maxntid.
> 
> I'll try this out.
> 
> Still, this doesn't address my request: "Also, list in the comment a JIT
> driver version, and sm_ version and a testcase for which this is required"

I attached the test case that used to fail without maxntid. But
after looking at it again, the maxntid directive was probably masking that
other PTX JIT bug involving abort and exiting threads that you fixed.
In fact, the test case works without the maxntid patch on my sm_60 GPU.
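
For reference, here is a minimal sketch of how the arguments of the
directive quoted earlier in this thread line up with the launch
dimensions. It is illustration only, not the nvptx backend code: the
helper name format_maxntid and its parameters are hypothetical, and it
assumes the x dimension carries the vector length, the y dimension the
worker count, and z is fixed at 1, as in the quoted output.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not GCC code): format a PTX .maxntid directive
   from the x (vector) and y (worker) dimensions.  The z dimension is
   assumed to be 1.  Returns the number of characters written, or -1
   if the buffer is too small.  */
static int
format_maxntid (char *buf, size_t size, int vector_len, int num_workers)
{
  assert (buf != NULL && size > 0);
  int n = snprintf (buf, size, ".maxntid %d, %d, 1",
		    vector_len, num_workers);
  return (n < 0 || (size_t) n >= size) ? -1 : n;
}
```

Called with a vector length of 32 and 16 workers, this produces the
same ".maxntid 32, 16, 1" line as in the num_workers (16) example.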

I'm going to retest the variable vector length changes without it and
see if it's still necessary. On one hand, maxntid should be fairly
innocuous, but I don't like how it can mask other PTX JIT bugs. At this
point, I'm leaning towards dropping it if it no longer affects the libgomp
regression test suite. What do you want to do?

Cesar
/* This test was failing with nvptx offloading without the .maxntid
   PTX directive.  */

int i;
int main(void)
{
  int j, v;
  i = -1;
  j = -2;
  v = 0;

  /* First region: i and j are copyout-only, so the device must set both.  */
#pragma acc parallel present_or_copyout (v) copyout (i, j) vector_length(128)
  {
    i = 2;
    j = 1;
    if (i != 2 || j != 1)
      __builtin_abort ();
    v = 1;
  }
  if (v != 1 || i != 2 || j != 1)
    __builtin_abort ();
  i = -1;
  j = -2;
  v = 0;
#pragma acc parallel present_or_copyout (v) copy (i, j) vector_length(128)
  {
    if (i != -1 || j != -2)
      __builtin_abort ();
    i = 2;
    j = 1;
    if (i != 2 || j != 1)
      __builtin_abort ();
    v = 1;
  }
  if (v != 1 || i != 2 || j != 1)
    __builtin_abort ();

  return 0;
}
