Bug#1021660: gcc-12-offload-nvptx: offloading to nvidia via nvptx fails with cuda version 11 (default in sid)

2022-10-13 Thread Giacomo Mulas

On Thu, 13 Oct 2022, Thomas Schwinge wrote:


> It does work, but only for the code that GCC/nvptx generates in that
> 'gcc-12' invocation, but not for the support libraries that it links in,
> which are built for sm_30.


Does this mean then that the support libraries of gcc-11-offload-nvptx
include both support for sm_30 and sm_35? Is it possible to compile such
support libraries so that they do support more than one cuda arch level,
instead of having, as in the case of GCC 12 support libraries, _only_ sm_30
as available option (if I understood you correctly)?


> However, that doesn't really help you as a user of GCC, as long as the
> distributions don't (have an easy way to) build more variants for several
> sm_[...].  More work is necessary in GCC/nvptx upstream to make that
> feasible.


Well, Debian itself does support this kind of setup, doesn't it? With
alternatives, Provides in dpkg... Of course, I gather that putting together
the machinery to build several versions of the same package would be
somewhat of a pain to set up and maintain.

But anyway, given all you said, can this issue be solved at all, even acting
on nvptx-tools? If the issue lies in the support libraries, that problem
would still remain regardless of what you do on nvptx-tools, wouldn't it?

Thanks, bye
Giacomo

--
_

Giacomo Mulas 
_

INAF - Osservatorio Astronomico di Cagliari
via della scienza 5 - 09047 Selargius (CA)

tel.   +39 070 71180255
mob. : +39 329  6603810
_

"It's just a shadow of the man you should be
Like a garden in the forest that the world will never see
You have no thought of answers only questions to be filled"
 (Big Country)
_



Bug#1021660: gcc-12-offload-nvptx: offloading to nvidia via nvptx fails with cuda version 11 (default in sid)

2022-10-13 Thread Thomas Schwinge
Hi!

(Tom added in CC; for your information, this is about
.)


On 2022-10-13T12:38:17+0200, Giacomo Mulas  wrote:
> On Thu, 13 Oct 2022, Thomas Schwinge wrote:
>
>> Aha, that's a legit question to wonder about: the reason is that GCC 11
>> defaulted to sm_35 code generation (which CUDA 11 still does support):
>> , just then GCC 12 again
>> reverted to sm_30: .
>
> but then, wouldn't the most straightforward fix be to change GCC 12 again to
> generate sm_35 code by default?

GCC upstream would like to continue supporting old GPU hardware, some of
which doesn't support sm_35.  That's why GCC 12 fixed this "regression"
introduced in GCC 11, and reverted back to sm_30, like GCC 10 and earlier
had.

> And also, is there some obscure command line
> option to explicitly request GCC 12 to generate code of some specific sm
> level (possibly even higher than sm_35)? I did try using
> gcc-12 -fopenmp -foffload=nvptx-none -foffload-options="-misa=sm_35"
> but it still does not work, while in principle it should. Why doesn't it?

It does work, but only for the code that GCC/nvptx generates in that
'gcc-12' invocation, not for the support libraries that it links in,
which are built for sm_30.
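
One way to look at this split on an installed system, as a rough sketch
under assumptions: GCC's nvptx target emits PTX text, and a PTX module
names its architecture in a '.target sm_NN' directive, so scanning the
offload support libraries for that directive should show what they were
built for. The helper name 'ptx_targets' is hypothetical and not part of
GCC or nvptx-tools.

```shell
#!/bin/sh
# ptx_targets FILE...
# Print each distinct sm_NN named by a PTX ".target" directive in FILE...
# Hypothetical helper; it only searches for text, it does not parse PTX.
ptx_targets() {
    # -a: read archives as text, -h: no file names, -o: matched part only
    grep -ahoE '\.target[[:space:]]+sm_[0-9]+' "$@" |
        awk '{ print $2 }' |
        sort -u
}
```

Possible usage, assuming the libraries sit next to the mkoffload path
shown in the error output (this location is a guess, adjust as needed):
ptx_targets /usr/lib/gcc/x86_64-linux-gnu/12/accel/nvptx-none/*.a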

> On the other hand, if I use
> gcc-11 -fopenmp -foffload=nvptx-none="-misa=sm_30"
> then I get the same error message I get with gcc-12.

ACK.
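
For reference, the two option spellings used in this exchange can be
wrapped in a small helper. This is only a sketch: 'offload_isa_flags' is
a hypothetical name, it merely formats the options, and it assumes that
'-foffload-options=' is available from GCC 12 on while older releases
attach the options to '-foffload=' itself.

```shell
#!/bin/sh
# offload_isa_flags GCC_MAJOR ARCH
# Print the nvptx offload options for the given GCC major version.
# Hypothetical helper; it only formats option strings, it runs nothing.
offload_isa_flags() {
    major=$1
    arch=$2
    if [ "$major" -ge 12 ]; then
        # Assumed GCC 12+ form: target selection and target options separate
        printf '%s\n' "-foffload=nvptx-none -foffload-options=nvptx-none=-misa=$arch"
    else
        # GCC 11 form as used above: options attached to -foffload itself
        printf '%s\n' "-foffload=nvptx-none=-misa=$arch"
    fi
}
```

Possible usage:
gcc-12 -fopenmp $(offload_isa_flags 12 sm_35) test_openmp_2.c
As explained above, this only changes the code generated in that
invocation, not the prebuilt support libraries.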

> Is there something wrong in how GCC 12 handles nvptx code generation
> options?

Works as expected, per my understanding.


That said, with the upcoming GCC 13 you'll be able to build (!) GCC/nvptx
with a '--with-arch=[...]' 'configure' option, see
.

'gcc/doc/install.texi', "nvptx-*-none":

The @option{--with-arch} option may be specified to override the
default value for the @option{-march} option, and to also build
corresponding target libraries.
The default is @option{--with-arch=sm_30}.

For example, if @option{--with-arch=sm_70} is specified,
@option{-march=sm_30} and @option{-march=sm_70} target libraries are
built, and code generation defaults to @option{-march=sm_70}.
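
As an illustration only, a configure command line using this option
might be assembled as sketched below. The helper 'nvptx_configure_cmd'
and the '../gcc' source path are hypothetical, the host triplet is taken
from the compiler name in the error output, and a real offload build
needs more than this (nvptx-tools, newlib, and a matching host compiler),
all of which is left out here. The script prints the command instead of
running it.

```shell
#!/bin/sh
# nvptx_configure_cmd SRCDIR ARCH
# Print (not run) a configure command line for a GCC/nvptx offload
# compiler that uses the GCC 13 '--with-arch' option quoted above.
# Hypothetical helper; SRCDIR and the host triplet are assumptions.
nvptx_configure_cmd() {
    srcdir=$1
    arch=$2
    printf '%s\n' "$srcdir/configure --target=nvptx-none --enable-as-accelerator-for=x86_64-linux-gnu --with-arch=$arch"
}

# Example: default code generation and extra target libraries for sm_35
nvptx_configure_cmd ../gcc sm_35
```

Going by the documentation excerpt, such a build would provide sm_30 and
sm_35 target libraries and default to sm_35 code generation.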

However, that doesn't really help you as a user of GCC, as long as the
distributions don't (have an easy way to) build more variants for several
sm_[...].  More work is necessary in GCC/nvptx upstream to make that
feasible.


Grüße
 Thomas
-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955



Bug#1021660: gcc-12-offload-nvptx: offloading to nvidia via nvptx fails with cuda version 11 (default in sid)

2022-10-13 Thread Giacomo Mulas

On Thu, 13 Oct 2022, Thomas Schwinge wrote:


> Aha, that's a legit question to wonder about: the reason is that GCC 11
> defaulted to sm_35 code generation (which CUDA 11 still does support):
> , just then GCC 12 again
> reverted to sm_30: .


but then, wouldn't the most straightforward fix be to change GCC 12 again to
generate sm_35 code by default? And also, is there some obscure command line
option to explicitly request GCC 12 to generate code of some specific sm
level (possibly even higher than sm_35)? I did try using
gcc-12 -fopenmp -foffload=nvptx-none -foffload-options="-misa=sm_35"
but it still does not work, while in principle it should. Why doesn't it?
On the other hand, if I use
gcc-11 -fopenmp -foffload=nvptx-none="-misa=sm_30"
then I get the same error message I get with gcc-12.
Is there something wrong in how GCC 12 handles nvptx code generation 
options?


bye
Giacomo





Bug#1021660: gcc-12-offload-nvptx: offloading to nvidia via nvptx fails with cuda version 11 (default in sid)

2022-10-13 Thread Thomas Schwinge
Hi Giacomo!

On 2022-10-13T12:20:19+0200, Giacomo Mulas  wrote:
> On Thu, 13 Oct 2022, Thomas Schwinge wrote:
>
>> Debian needs to update nvptx-tools to a version that includes
>> 
>> "as: Deal with CUDA 11.0, "Support for Kepler 'sm_30' and 'sm_32'
>> architecture based products is dropped"
>
> ok, but I am puzzled by this: if the issue is with nvptx-tools and not with
> gcc-12-offload-nvptx, why does gcc-11-offload-nvptx work, producing working
> executables that target sm_35 if I compile with gcc-11 on the same laptop?

Aha, that's a legit question to wonder about: the reason is that GCC 11
defaulted to sm_35 code generation (which CUDA 11 still does support):
, just then GCC 12 again
reverted to sm_30: .


Grüße
 Thomas



Bug#1021660: gcc-12-offload-nvptx: offloading to nvidia via nvptx fails with cuda version 11 (default in sid)

2022-10-13 Thread Giacomo Mulas

On Thu, 13 Oct 2022, Thomas Schwinge wrote:


> Debian needs to update nvptx-tools to a version that includes
>
> "as: Deal with CUDA 11.0, "Support for Kepler 'sm_30' and 'sm_32'
> architecture based products is dropped"


ok, but I am puzzled by this: if the issue is with nvptx-tools and not with
gcc-12-offload-nvptx, why does gcc-11-offload-nvptx work, producing working
executables that target sm_35 if I compile with gcc-11 on the same laptop?

Thanks, bye
Giacomo




Bug#1021660: gcc-12-offload-nvptx: offloading to nvidia via nvptx fails with cuda version 11 (default in sid)

2022-10-13 Thread Thomas Schwinge
reassign 1021660 nvptx-tools
stop


Hi!

On 2022-10-12T17:19:02+0200, Giacomo Mulas  wrote:
> the nvptx plugin for gcc-12 currently available for sid mandates a
> cuda level sm_30, which is no longer available in cuda 11 (the one
> now in sid). This means that even a trivial example code like [...]
> fails to compile with
>
> capitanata:~/test$ gcc-12 -fopenmp test_openmp_2.c
> ptxas fatal   : Value 'sm_30' is not defined for option 'gpu-name'
> nvptx-as: ptxas returned 255 exit status

Debian needs to update nvptx-tools to a version that includes

"as: Deal with CUDA 11.0, "Support for Kepler 'sm_30' and 'sm_32'
architecture based products is dropped"


Grüße
 Thomas



Bug#1021660: gcc-12-offload-nvptx: offloading to nvidia via nvptx fails with cuda version 11 (default in sid)

2022-10-12 Thread Giacomo Mulas
Package: gcc-12-offload-nvptx
Version: 12.2.0-5
Severity: grave
Justification: renders package unusable for nvidia

Dear Maintainer,

the nvptx plugin for gcc-12 currently available for sid mandates a 
cuda level sm_30, which is no longer available in cuda 11 (the one
now in sid). This means that even a trivial example code like

#include <stdio.h>
#include <omp.h>

int main(int argc, char **argv){
#pragma omp target parallel
  {
    int i, j;
    i = omp_get_thread_num();
    j = omp_get_num_threads();
    printf("Hello world! I am thread %d out of %d\n", i, j);
  }
}

fails to compile with

capitanata:~/test$ gcc-12 -fopenmp test_openmp_2.c
ptxas fatal   : Value 'sm_30' is not defined for option 'gpu-name'
nvptx-as: ptxas returned 255 exit status
mkoffload: fatal error: x86_64-linux-gnu-accel-nvptx-none-gcc-12 returned 1 exit status
compilation terminated.
lto-wrapper: fatal error: /usr/lib/gcc/x86_64-linux-gnu/12//accel/nvptx-none/mkoffload returned 1 exit status
compilation terminated.
/usr/bin/ld: error: lto-wrapper failed
collect2: error: ld returned 1 exit status

Even trying to set a specific target GPU arch does not seem to work, e.g.

gmulas@capitanata:~/test$ gcc-12 -fopenmp -foffload-options="-misa=sm_35" test_openmp_2.c
ptxas fatal   : Value 'sm_30' is not defined for option 'gpu-name'
nvptx-as: ptxas returned 255 exit status
mkoffload: fatal error: x86_64-linux-gnu-accel-nvptx-none-gcc-12 returned 1 exit status
compilation terminated.
lto-wrapper: fatal error: /usr/lib/gcc/x86_64-linux-gnu/12//accel/nvptx-none/mkoffload returned 1 exit status
compilation terminated.
/usr/bin/ld: error: lto-wrapper failed
collect2: error: ld returned 1 exit status

On the other hand, gcc-11 appears to have sm_35 as default, meaning it works, 
both with and without the -misa option:

capitanata:~/test$ gcc-11 -fopenmp -foffload="-misa=sm_35" test_openmp_2.c
/usr/bin/ld: /tmp/user/1000/ccY5a4YE.crtoffloadtable.o: warning: relocation against `__offload_vars_end' in read-only section `.rodata'
/usr/bin/ld: warning: creating DT_TEXTREL in a PIE

capitanata:~/test$ gcc-11 -fopenmp test_openmp_2.c
/usr/bin/ld: /tmp/user/1000/ccHibGBc.crtoffloadtable.o: warning: relocation against `__offload_vars_end' in read-only section `.rodata'
/usr/bin/ld: warning: creating DT_TEXTREL in a PIE

and the resulting code runs:

capitanata:~/test$ ./a.out 
Hello world! I am thread 4 out of 8
Hello world! I am thread 1 out of 8
Hello world! I am thread 6 out of 8
Hello world! I am thread 7 out of 8
Hello world! I am thread 0 out of 8
Hello world! I am thread 5 out of 8
Hello world! I am thread 2 out of 8
Hello world! I am thread 3 out of 8

Would it be possible to change the default -misa of gcc 12 to sm_35, 
to enable gpu offloading to nvidia to work with gcc-12? And/or, is there
some undocumented, or poorly documented, way to actually specify on the 
command line the requested cuda level architecture so that it works with
cuda 11 libraries?

Thanks in advance

Best regards
Giacomo Mulas


-- System Information:
Debian Release: bookworm/sid
  APT prefers unstable
  APT policy: (401, 'unstable'), (10, 'experimental')
Architecture: amd64 (x86_64)
Foreign Architectures: i386

Kernel: Linux 5.19.0-2-amd64 (SMP w/12 CPU threads; PREEMPT)
Kernel taint flags: TAINT_PROPRIETARY_MODULE, TAINT_OOT_MODULE, TAINT_UNSIGNED_MODULE
Locale: LANG=it_IT.UTF-8, LC_CTYPE=it_IT.UTF-8 (charmap=UTF-8), LANGUAGE not set
Shell: /bin/sh linked to /usr/bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

Versions of packages gcc-12-offload-nvptx depends on:
ii  gcc-12                 12.2.0-5
ii  gcc-12-base            12.2.0-5
ii  libc6                  2.35-3
ii  libc6-dev              2.35-3
ii  libgmp10               2:6.2.1+dfsg1-1.1
ii  libgomp-plugin-nvptx1  12.2.0-5
ii  libmpc3                1.2.1-2
ii  libmpfr6               4.1.0-3
ii  libzstd1               1.5.2+dfsg-1
ii  nvptx-tools            0.20180301-1
ii  zlib1g                 1:1.2.11.dfsg-4.1

gcc-12-offload-nvptx recommends no packages.

gcc-12-offload-nvptx suggests no packages.

-- no debconf information