On 2015/9/25 04:27 AM, Ilya Verbin wrote:
> On Thu, Aug 27, 2015 at 21:44:50 +0800, Chung-Lin Tang wrote:
>> We've discovered that, for several of the libgomp plugin interface routines,
>> if the target specific routine calls exit() (usually upon a fatal condition),
>> deadlock ensues. We found this using nvptx, but it's possible on intelmic as
>> well.
>>
>> This is because many of the plugin routines are called with the device lock
>> held, and when exit() is called inside the plugin code, the
>> GOMP_unregister_var() destructor tries to iterate through and acquire all
>> device locks to clean up. Since we already hold one of the device locks,
>> this just gets stuck. Also, because gomp_mutex_t is a simple futex based
>> lock implementation (instead of pthreads), we don't have a trylock
>> mechanism to use either.
>>
>> So this patch tries to alleviate this problem by changing the plugin
>> interface; the plugin routines that are called while holding the device
>> lock are adjusted to never exit fatally, but to return a value back to
>> libgomp proper to indicate execution results. The core libgomp code may
>> then unlock and call gomp_fatal().
>>
>> We believe this is the right route to solve the problem, since there are
>> only two accel target plugins so far. Besides the nvptx plugin, I have made
>> some effort to update the intelmic plugin as well, though it's not as
>> thoroughly audited. Intel folks might want to further make sure your plugin
>> code is free of this problem as well.
>>
>> This patch contains the libgomp proper changes. The nvptx and intelmic
>> patches follow. I have tested the libgomp testsuite without regressions for
>> both accel targets. Is this okay for trunk?
>
> (I have no objections)
>
> However, in case of intelmic, these exit()s are just the tip of the iceberg,
> because underlying liboffloadmic contains other exit()s at fatal errors.
> And I don't know what to do with such deadlocks.
>
> -- Ilya

Yes, I think I saw more things to adjust wrt this issue within liboffloadmic,
though I hope this plugin interface change can lay the groundwork for that.

And ping again, for the libgomp proper changes.

Thanks,
Chung-Lin
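
For reference, the interface change described in the quoted patch description can be sketched roughly as below. This is only a minimal, single-threaded illustration and not the actual libgomp code: toy_lock, toy_device, demo_plugin_op and toy_device_run are hypothetical names made up for the sketch, and the toy lock merely asserts at the point where a real non-recursive lock would hang.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Toy, single-threaded stand-in for gomp_mutex_t (hypothetical; the real
   lock is futex based and offers no trylock).  */
struct toy_lock { bool held; };

static void toy_lock_acquire (struct toy_lock *l)
{
  /* A real non-recursive lock would block forever on re-acquisition;
     this sketch asserts instead so the problem is visible.  */
  assert (!l->held);
  l->held = true;
}

static void toy_lock_release (struct toy_lock *l)
{
  assert (l->held);
  l->held = false;
}

/* Hypothetical device descriptor; not libgomp's real structure.  */
struct toy_device
{
  struct toy_lock lock;
  /* Plugin hook: returns false on failure and never calls exit().  */
  bool (*plugin_op) (int arg);
};

/* Example plugin hook: reports the error and returns a status instead
   of exiting while the caller still holds the device lock.  */
static bool demo_plugin_op (int arg)
{
  if (arg < 0)
    {
      fprintf (stderr, "demo plugin: invalid argument %d\n", arg);
      return false;
    }
  return true;
}

/* Core-side wrapper: call the plugin with the lock held, then release
   the lock before acting on the result.  */
static bool toy_device_run (struct toy_device *dev, int arg)
{
  toy_lock_acquire (&dev->lock);
  bool ok = dev->plugin_op (arg);
  toy_lock_release (&dev->lock);
  /* Only here, with no device lock held, would the core call
     gomp_fatal(); the sketch returns the status to the caller instead.  */
  return ok;
}
```

The point of the ordering in toy_device_run is that any exit-time cleanup which walks all device locks runs only after the current lock has been dropped.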