https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104714
Bug ID: 104714
Summary: [nvptx] Means to specify any sm_xx
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Severity: enhancement
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: vries at gcc dot gnu.org
Target Milestone: ---

I'm testing on a couple of boards, with some different settings, and one of
those settings is: test native architecture.

That is, for an NVIDIA T400 with sm_75, test with -misa=sm_75.

But that doesn't work for all boards, because for instance for a GeForce GT
1030, with sm_61, gcc doesn't support -misa=sm_61. It only supports values for
which different code may be generated. So we use -misa=sm_53 instead.

I have some code in a script, which has this mapped out:
...
case $id in
    GeForce-GT-710)
        sm=35
        opt_sm=35
        ;;
    Quadro-K620)
        sm=50
        opt_sm=35 # Next is 53, too high.
        ;;
    GeForce-GT-1030)
        sm=61
        opt_sm=53 # Next is 75, too high.
        ;;
    NVIDIA-T400)
        sm=75
        opt_sm=75
        ;;
    *)
        echo "Unknown id: $id"
        exit 1
        ;;
esac
...

There are two problems with this:
- it's cumbersome to do the mapping, possibly in various locations
- the mapping may have to be updated for newer releases, which introduce
  additional -misa values

It would be nice to be able to just specify what board sm you have, and then
have gcc figure out the closest -misa value it currently supports.

We could do this by just allowing any -misa value, say allow -misa=sm_61 and
internally map it to sm_53.

OTOH, we could use this as an opportunity to sidestep the much regretted name
-misa (given that -mptx is used to specify the ptx isa version, and -misa the
ptx architecture) and introduce say -march for this. This option would then
have to be mutually exclusive with -misa.

There's an open question though: when specifying sm_61, the code generation
internally will switch to sm_53, but what do we emit in the .target field:
...
// BEGIN PREAMBLE
        .version 6.0
        .target sm_xx
        .address_size 64
// END PREAMBLE
...
? sm_53 or sm_61?
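The mapping requested above (board sm to the closest supported -misa value that does not exceed it) could be sketched in shell roughly as follows. This is only an illustration of the rounding-down rule, not how gcc would implement it. The function name closest_misa and the variable supported_sm are hypothetical, and the list of supported values is an assumption limited to the values mentioned in this report; it would need to match the gcc release in use.

```sh
#!/bin/sh
# Assumption: the -misa values gcc accepts, in ascending order. Only the
# values mentioned in this report are listed; adjust for the actual release.
supported_sm="30 35 53 75"

# Hypothetical helper: print the highest supported sm_xx that does not
# exceed the board's sm number given as $1.
closest_misa () {
    board_sm=$1
    best=
    for s in $supported_sm; do
        if [ "$s" -le "$board_sm" ]; then
            best=$s
        fi
    done
    if [ -z "$best" ]; then
        echo "No supported -misa value for sm_$board_sm" >&2
        return 1
    fi
    echo "sm_$best"
}
```

With this list, closest_misa 61 prints sm_53 and closest_misa 50 prints sm_35, matching the opt_sm entries in the case statement above. Adding a value to the list is then the only update needed when a release introduces a new -misa value.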
I'm not entirely sure yet what the benefit would be of having ".target sm_61".

For instance, the driver 510.x has given up on the Kepler architecture, so we
can't use it for a Kepler board. But we can generate code for ".target sm_30"
and have that same driver map it onto a post-Kepler board. So I don't see any
benefits here in terms of allowed driver version.

So for the moment, I'd go with sm_53.

[ FWIW, it would be great if we could simply specify -march=native, and have
gcc query the NVIDIA driver to see what board there is, using
cuDeviceGetAttribute with CU_DEVICE_ATTRIBUTE_COMPUTE_CAPABILITY_MAJOR and
CU_DEVICE_ATTRIBUTE_COMPUTE_CAPABILITY_MINOR. And possibly handle the
situation of multiple boards by using the minimum. But that is much more
involved to realize. ]
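The multiple-board case from the bracketed note reduces to taking the minimum over the boards' sm numbers. A minimal shell sketch, assuming the sm numbers have already been obtained by some other means (the driver query itself is not shown here); the function name min_sm is hypothetical:

```sh
#!/bin/sh
# Hypothetical helper: print the smallest of the sm numbers passed as
# arguments, i.e. the architecture every listed board can run.
min_sm () {
    # At least one board is required.
    [ $# -gt 0 ] || return 1
    min=$1
    shift
    for s in "$@"; do
        if [ "$s" -lt "$min" ]; then
            min=$s
        fi
    done
    echo "$min"
}
```

For a machine holding boards with sm 75, 61 and 35, min_sm 75 61 35 prints 35, so code would be generated for the oldest board present.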