Hi!

On 2024-09-23T09:22:55+0200, Richard Biener <richard.guent...@gmail.com> wrote:
> On Fri, Sep 20, 2024 at 6:50 PM Thomas Schwinge <tschwi...@baylibre.com>
> wrote:
>> (This is orthogonal to yesterday's
>> "GCC 15: nvptx '-mptx=3.1' multilib variants are deprecated".)
>>
>> We'd like to raise nvptx code generation from PTX ISA 6.0, sm_30 "Kepler"
>> to default PTX ISA 7.3, sm_52 "Maxwell", therefore CUDA 11.3 (2021-04).
>> This is, primarily, so that we're able to use 'alloca' and related stack
>> manipulation instructions, and improve upon the current:
>>
>>     sorry ("target cannot support alloca");
>>
>> I see, for example:
>>
>>   - Ubuntu 22.04 "jammy" LTS has 11.5.1-1ubuntu1 packaged
>>   - Debian 12 "stable" ("bookworm", 2023-06) has 11.8.89~11.8.0-5~deb12u1
>>     packaged
>>
>> ..., and sm_52 "Maxwell" has been supported as of CUDA 6.5 (2014-08), and
>> thus supported by most Nvidia GPUs of the last decade, approximately.
>>
>> The question is, whether we continue to build by default also the current
>> sm_30 "Kepler" multilib variants (to be available for use via
>> building/linking with '-march=sm_30'), or if that's truly obsolete by
>> now, and need no longer be available by default?  (It has been deprecated
>> for a long time, and sm_3x "Kepler architecture support is removed from
>> CUDA 12.0", 2022-12.)  There's always the 'configure'-time
>> '--with-arch=sm_30' if you (additionally to sm_52) still need your target
>> libraries built for sm_30 multilib variants; I would argue that's
>> sufficient?
>
> I seem to have Turing so the change works for me

ACK.

> How would the user
> experience be with using the sm_30 multilib?

Let's consider the cases separately, under the assumption that the user's
Nvidia GPU doesn't support sm_52 or higher:

(1) GCC/nvptx 'configure'd without '--with-arch=[...]' (that is, default
'--with-arch=sm_52'), and there are no sm_30 multilib variants:
offloading codes build fine (for sm_52), but at run time libgomp either
(to be decided) emits an error message or silently ignores the device as
incapable.  (That's effectively the same scenario as if you build with
the wrong '-march=[...]' for GCC/GCN offloading.  Probably we should
mirror for nvptx how GCN behaves.)
<https://gcc.gnu.org/PR71646> "incompability between ptx code and GPU
hardware" should get resolved here.

(2) GCC/nvptx 'configure'd '--with-arch=sm_30': generally, offloading
codes continue to build and execute as they do now; only for specific
upcoming OpenACC functionality (array/struct reductions) may you run into
compile-time 'sorry ("target cannot support alloca");'.  (..., which we
may evolve into a more helpful error message.)

(3) GCC/nvptx 'configure'd without '--with-arch=[...]' (that is, default
'--with-arch=sm_52'), but we do build sm_30 multilib variants (either
(3a) by default or (3b) upon 'configure'-time request via an additional
option: '--with-multilib-list=default,sm_30' or similar), and the user
builds with '-foffload-options=nvptx-none=-march=sm_30': behaves as (2).

Now, (1) and (2) behave as expected (other than that we may want to emit
more helpful error messages).  My concern with (3a) is that the sm_30
multilib variants built by default will largely be unused (as they're
obsolete with moderately recent Nvidia GPU hardware), but I'll be happy to
implement (3b) if people think that's still helpful.
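
For concreteness, cases (1) to (3) correspond roughly to the following
command lines.  This is only a sketch: the source directory '../gcc', the
'x86_64-pc-linux-gnu' host, and 'prog.c' are placeholders of mine, all
other 'configure' options are left out, and '--with-multilib-list' for
nvptx is just the (3b) proposal, not something that exists today.

```shell
# Placeholders (assumptions): GCC sources in ../gcc, x86_64 GNU/Linux host.
# The three 'configure' lines are alternatives; pick one per build tree.
common="--target=nvptx-none --enable-as-accelerator-for=x86_64-pc-linux-gnu"

# (1) No '--with-arch=[...]': gets the proposed default, sm_52.
../gcc/configure $common

# (2) Keep sm_30 as the default '-march=[...]'.
../gcc/configure $common --with-arch=sm_30

# (3b) Default sm_52, plus sm_30 multilib variants on request.
# Hypothetical: '--with-multilib-list' is only the option proposed in (3b)
# and may end up with a different name or syntax.
../gcc/configure $common --with-multilib-list=default,sm_30

# User side of (3): request sm_30 code from the nvptx offload compiler.
gcc -fopenmp -foffload-options=nvptx-none=-march=sm_30 prog.c
```

In (1), the last command would not be usable, as no sm_30 target
libraries are available in that configuration.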

In fact, (3b) can then generally support 'configure'-time selection of
further multilib variants to be built (for example,
'--with-multilib-list=default,sm_30,sm_89') -- but not use them as default
'-march=[...]', in contrast to what '--with-arch=sm_30' or
'--with-arch=sm_89' does, for example.  I'll look into that.


Grüße
 Thomas