Running this example, I get the following message:
X10RT: async 37 is not a CUDA kernel.
If I'm not wrong, this message comes from the kernel function, since the
message disappears when I comment out the call to the kernel function.
I suspect the kernel activity is getting attached to the clock, and the
clock implementation is trying to treat the CUDA place as an ordinary host
place by sending it asyncs without @CUDA. This is a bug, we need to make
sure that clocks can be used in the host application without ending up on
the GPU place.
So, could you guys give me some guidance on this?
1. Am I correct in thinking that I cannot use Team when 2 or more GPUs
belong to the same place?
You can do this, you just need to make a team with more than one participant
per place, and then each participant gets its own integer id to use when
calling into the collective operations.
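A rough sketch of that idea in X10 (untested; the Team constructor shown
and the way roles are assigned are assumptions about the role-based
x10.util.Team API of that era -- check the Team class in your release):

```x10
import x10.util.Team;

// Hypothetical: one host place driving two GPUs. We give the team two
// participants for this place by listing it twice; each participant
// then gets its own integer role id (0 and 1 here).
val members = [here, here];
val team = new Team(members);

// Each participant passes its own role into the collective operations:
finish for (role in 0..1) async {
    team.barrier(role); // roles 0 and 1 synchronize with each other
}
```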
2. What is the relationship between a finish in the host code and a
finish in the kernel code? Or maybe this question should be about clocks
rather than finishes?
They ought to just work, but we haven't tested them for a while. Certainly
you cannot do:
clocked async (gpu) { ... }
However if you do not do this, the gpu should never end up being 'part' of
the clock, and it should work fine.
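A sketch of that separation (untested; the gpu place, loop bounds, and
kernel body are placeholders, and the launch shape follows the usual
X10 CUDA pattern, which may vary by version):

```x10
val c = Clock.make();

// Host-side activities may be registered on the clock as usual.
async clocked(c) {
    // ... host work ...
    c.advance();
}

// The GPU launch is a plain (unclocked) async, so the CUDA place
// never becomes 'part' of the host clock.
finish async at (gpu) @CUDA {
    finish for (block in 0..239) async {
        clocked finish for (thread in 0..63) clocked async {
            // ... kernel body; this inner clock is internal to the kernel
        }
    }
}

c.advance();
```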
3. Would you recommend an explicit clock in order to avoid conflict with
the clock in the kernel function?
That would be a reasonable workaround, you could also use teams instead of
clocks.
--
___
X10-users mailing list
X10-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/x10-users