Unlike cuMemFree and other resource-releasing functions called on exit, cuMemFreeHost appears to re-report errors encountered during kernel launch. This leads to a deadlock after GOMP_PLUGIN_fatal is re-entered.
While the behavior on the libgomp side is suboptimal (there's no need to call resource-releasing functions if we're about to destroy the CUDA context anyway), this behavior on cuMemFreeHost's part is not useful and just makes error "recovery" harder. This was reported to NVIDIA (bug ref. 1737876), but we can work around it by simply reporting the error without making it fatal.

	* plugin/plugin-nvptx.c (map_fini): Make cuMemFreeHost error non-fatal.
---
 libgomp/ChangeLog.gomp-nvptx  | 4 ++++
 libgomp/plugin/plugin-nvptx.c | 2 +-
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/libgomp/plugin/plugin-nvptx.c b/libgomp/plugin/plugin-nvptx.c
index adf57b1..4e44242 100644
--- a/libgomp/plugin/plugin-nvptx.c
+++ b/libgomp/plugin/plugin-nvptx.c
@@ -135,7 +135,7 @@ map_fini (struct ptx_stream *s)
 
   r = cuMemFreeHost (s->h);
   if (r != CUDA_SUCCESS)
-    GOMP_PLUGIN_fatal ("cuMemFreeHost error: %s", cuda_error (r));
+    GOMP_PLUGIN_error ("cuMemFreeHost error: %s", cuda_error (r));
 }
 
 static void