Re: [PATCH 0/8] NVPTX offloading to NVPTX: backend patches

Bernd Schmidt Tue, 18 Oct 2016 04:04:33 -0700

On 10/17/2016 07:06 PM, Alexander Monakov wrote:

I've just pushed two commits to the branch to fix this issue.  Before those, the
last commit left the branch in a state where an incremental build seemed ok
(because libgcc/libgomp weren't rebuilt with the new cc1), but a from-scratch
build was broken like you've shown.  LULESH is known to work.  I also intend to
perform a trunk merge soon.


Ok that did work, however...

I think before merging this work we'll need to have some idea of how well it
works on real-world code.


This patchset and the branch lay the foundation; there's more work to be
done, in particular on the performance improvements side. There should be
an agreement on these fundamental bits first, before moving on to fine-tuning.

The performance I saw was lower by a factor of 80 or so compared to their CUDA
version, and even lower than OpenMP on the host. Does this match what you are
seeing? Do you have a clear plan how this can be improved?

To me this kind of performance doesn't look like something that will be fixed
by fine-tuning; it leaves me undecided whether the chosen approach (what you
call the fundamentals) is viable at all. Performance is still better than the
OpenACC version of the benchmark, but then I think we shouldn't repeat the
mistakes we made with OpenACC, and should avoid merging something until we're
sure it's ready and of benefit to users.



Bernd
