[valgrind] [Bug 413251] Compilation error using GCC 7.4.0 & OpenMPI 4.0.2

2019-12-28 Thread Carl Ponder
https://bugs.kde.org/show_bug.cgi?id=413251

--- Comment #6 from Carl Ponder  ---
I wouldn't have a clue how.

-- 
You are receiving this mail because:
You are watching all bug changes.

[valgrind] [Bug 413251] Compilation error using GCC 7.4.0 & OpenMPI 4.0.2

2019-11-04 Thread Carl Ponder
https://bugs.kde.org/show_bug.cgi?id=413251

Carl Ponder  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDSINFO                   |REPORTED
         Resolution|NOT A BUG                   |---

--- Comment #4 from Carl Ponder  ---
(I had put this into "NEEDSINFO" state, but evidently that means "NEEDSINFO"
from me not you!)

I'm going to hold this open because the MPI 1 support might not always be
available for OpenMPI, and you ought to consider adjusting your interface for
future use.


[valgrind] [Bug 413251] Compilation error using GCC 7.4.0 & OpenMPI 4.0.2

2019-10-20 Thread Carl Ponder
https://bugs.kde.org/show_bug.cgi?id=413251

Carl Ponder  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |NOT A BUG
             Status|REPORTED                    |NEEDSINFO

--- Comment #2 from Carl Ponder  ---
Building OpenMPI with

--enable-mpi1-compatibility

appears to solve the problem for both Valgrind and PNetCDF.
I'm going to close this issue now.
It looks like OpenMPI broke compatibility going from 4.0.1 -> 4.0.2.
Can anyone comment on the Valgrind dependency though?
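
The "expected expression before '_Static_assert'" errors look consistent with the OpenMPI header turning the removed MPI-1 names into compile-time assertions when the compatibility option is off, so any use of such a name as a value fails to parse. Here is a minimal sketch of that mechanism and of guarding the legacy references. It does not use the real <mpi.h>: Datatype, TYPE_INT, TYPE_UB and HAVE_MPI1_COMPAT are hypothetical stand-ins chosen for illustration only.

```c
#include <stdio.h>

/* Hypothetical stand-ins, NOT the real <mpi.h> definitions. */
typedef int Datatype;
#define TYPE_INT 1

#ifdef HAVE_MPI1_COMPAT              /* hypothetical feature macro */
#  define TYPE_UB 2                  /* legacy constant is a plain value */
#else
/* Poisoned name: expands to a declaration, so it cannot appear
 * inside an expression such as (ty == TYPE_UB). */
#  define TYPE_UB _Static_assert(0, "TYPE_UB was removed")
#endif

/* Return a printable name for a datatype handle. */
const char *show_type(Datatype ty)
{
    if (ty == TYPE_INT)
        return "INT";
#ifdef HAVE_MPI1_COMPAT
    /* Reference the legacy constant only when it really is a value. */
    if (ty == TYPE_UB)
        return "UB";
#endif
    return "?";
}
```

Built as-is, the legacy branch is simply skipped. Dropping the inner #ifdef without defining HAVE_MPI1_COMPAT should produce an error of the same shape as the ones in the build log.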


[valgrind] [Bug 413251] Compilation error using GCC 7.4.0 & OpenMPI 4.0.2

2019-10-20 Thread Carl Ponder
https://bugs.kde.org/show_bug.cgi?id=413251

Carl Ponder  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |cpon...@nvidia.com

--- Comment #1 from Carl Ponder  ---
This is the 3.15.0 release, not just a circa-3.15 snapshot from the SVN
repository. Shouldn't you update the Version list?


[valgrind] [Bug 413251] New: Compilation error using GCC 7.4.0 & OpenMPI 4.0.2

2019-10-20 Thread Carl Ponder
https://bugs.kde.org/show_bug.cgi?id=413251

Bug ID: 413251
   Summary: Compilation error using GCC 7.4.0 & OpenMPI 4.0.2
   Product: valgrind
   Version: 3.15 SVN
  Platform: Ubuntu Packages
OS: Linux
Status: REPORTED
  Severity: normal
  Priority: NOR
 Component: general
  Assignee: jsew...@acm.org
  Reporter: cpon...@nvidia.com
  Target Milestone: ---

SUMMARY
I get these errors in the build:

Making all in mpi
make[2]: Entering directory '/gpfs/fs1/SHARE/Utils/Valgrind/3.15.0/GCC-7.4.0_OpenMPI-4.0.2/distro/mpi'
/gpfs/fs1/SHARE/Utils/OpenMPI/4.0.2/GCC-7.4.0_CUDA-10.1.243.0_418.87.00_UCX-2019-10-19_HWLoc-2.1.0_ZLib-1.2.11_NUMActl-2.0.13/bin/mpicc
   -I../include  -g -O -fno-omit-frame-pointer -Wall -fpic -m64
   -Wno-deprecated-declarations  -MT libmpiwrap_amd64_linux_so-libmpiwrap.o
   -MD -MP -MF .deps/libmpiwrap_amd64_linux_so-libmpiwrap.Tpo -c
   -o libmpiwrap_amd64_linux_so-libmpiwrap.o
   `test -f 'libmpiwrap.c' || echo './'`libmpiwrap.c
In file included from libmpiwrap.c:116:0:
libmpiwrap.c: In function ‘showTy’:
libmpiwrap.c:281:19: error: expected expression before ‘_Static_assert’
else if (ty == MPI_UB) fprintf(f,"UB");
   ^
libmpiwrap.c:282:19: error: expected expression before ‘_Static_assert’
else if (ty == MPI_LB) fprintf(f,"LB");
   ^
libmpiwrap.c: In function ‘showCombiner’:
libmpiwrap.c:354:12: error: expected expression before ‘_Static_assert’
   case MPI_COMBINER_HVECTOR_INTEGER: fprintf(f, "HVECTOR_INTEGER"); break;
^
libmpiwrap.c:354:12: error: expected expression before ‘_Static_assert’
libmpiwrap.c:354:40: error: expected expression before ‘:’ token
   case MPI_COMBINER_HVECTOR_INTEGER: fprintf(f, "HVECTOR_INTEGER"); break;
^
In file included from libmpiwrap.c:116:0:
libmpiwrap.c:359:12: error: expected expression before ‘_Static_assert’
   case MPI_COMBINER_HINDEXED_INTEGER: fprintf(f, "HINDEXED_INTEGER"); break;
^
libmpiwrap.c:359:12: error: expected expression before ‘_Static_assert’
libmpiwrap.c:359:41: error: expected expression before ‘:’ token
   case MPI_COMBINER_HINDEXED_INTEGER: fprintf(f, "HINDEXED_INTEGER"); break;
 ^
In file included from libmpiwrap.c:116:0:
libmpiwrap.c:366:12: error: expected expression before ‘_Static_assert’
   case MPI_COMBINER_STRUCT_INTEGER: fprintf(f, "STRUCT_INTEGER"); break;
^
In file included from libmpiwrap.c:116:0:
libmpiwrap.c:366:12: error: expected expression before ‘_Static_assert’
   case MPI_COMBINER_STRUCT_INTEGER: fprintf(f, "STRUCT_INTEGER"); break;
^
libmpiwrap.c:366:12: error: expected expression before ‘_Static_assert’
libmpiwrap.c:366:39: error: expected expression before ‘:’ token
   case MPI_COMBINER_STRUCT_INTEGER: fprintf(f, "STRUCT_INTEGER"); break;
   ^
libmpiwrap.c: In function ‘extentOfTy’:
libmpiwrap.c:462:8: warning: implicit declaration of function ‘PMPI_Type_extent’; did you mean ‘MPI_Type_extent’? [-Wimplicit-function-declaration]
r = PMPI_Type_extent(ty, );
^~~~
MPI_Type_extent
In file included from libmpiwrap.c:116:0:
libmpiwrap.c: In function ‘walk_type’:
libmpiwrap.c:736:17: error: expected expression before ‘_Static_assert’
   if (ty == MPI_LB || ty == MPI_UB)
 ^
Makefile:645: recipe for target 'libmpiwrap_amd64_linux_so-libmpiwrap.o' failed
make[2]: *** [libmpiwrap_amd64_linux_so-libmpiwrap.o] Error 1
make[2]: Leaving directory '/gpfs/fs1/SHARE/Utils/Valgrind/3.15.0/GCC-7.4.0_OpenMPI-4.0.2/distro/mpi'
Makefile:841: recipe for target 'all-recursive' failed
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory '/gpfs/fs1/SHARE/Utils/Valgrind/3.15.0/GCC-7.4.0_OpenMPI-4.0.2/distro'
Makefile:710: recipe for target 'all' failed
make: *** [all] Error 2


STEPS TO REPRODUCE
I'm using these configuration parameters

+ ./configure
--with-mpicc=/gpfs/fs1/SHARE/Utils/OpenMPI/4.0.2/GCC-7.4.0_CUDA-10.1.243.0_418.87.00_UCX-2019-10-19_HWLoc-2.1.0_ZLib-1.2.11_NUMActl-2.0.13/bin/mpicc
--prefix=/gpfs/fs1/SHARE/Utils/Valgrind/3.15.0/GCC-7.4.0_OpenMPI-4.0.2


SOFTWARE/OS VERSIONS
Linux/KDE Plasma: Ubuntu 18.04
OpenMPI:  4.0.2

ADDITIONAL INFORMATION
I believe the list of constants has changed between OpenMPI 4.0.1 & 4.0.2.
I'm seeing similar breakages building the latest PNetCDF.
I'm thinking there may be a flag to get around this though.


[valgrind] [Bug 371966] No uninitialised values reported with PGI -Mstack_arrays

2016-12-01 Thread Carl Ponder
https://bugs.kde.org/show_bug.cgi?id=371966

--- Comment #24 from Carl Ponder <cpon...@nvidia.com> ---
I can upload an executable, or I can give you the source-code for the test and
instructions on how to build and run it.
You'd still need to have the PGI runtime installed. I can help you get a demo
copy if you need.

About the zeroing of the space, (a) I can see there's nonzero junk in the
array, and (b) PGI insists that they don't zero-out stack arrays. Why do you
keep insisting that they do? NVIDIA owns PGI and I've been in weekly con-calls
with their compiler developers for the last 5 years.


[valgrind] [Bug 371966] No uninitialised values reported with PGI -Mstack_arrays

2016-11-30 Thread Carl Ponder
https://bugs.kde.org/show_bug.cgi?id=371966

--- Comment #22 from Carl Ponder <cpon...@nvidia.com> ---
I know they're not zeroing out the space.
As far as trying to intercept the subroutine-call, I've worked a little on this
level

  coregrind/m_syswrap

but these only intercept system-calls, right?
And you're saying that there's no analogous convention for me to intercept
calls into the PGI runtime and record the uninitialized data state, right?


[valgrind] [Bug 371966] No uninitialised values reported with PGI -Mstack_arrays

2016-11-30 Thread Carl Ponder
https://bugs.kde.org/show_bug.cgi?id=371966

--- Comment #18 from Carl Ponder <cpon...@nvidia.com> ---
PGI confirms that this call to "__builtin_aa" is what's bumping the stack
pointer. It's a subroutine inside the PGI runtime.

Does valgrind have a way for us to intercept this subroutine-call and then mark
the array-elements as being uninitialized? I think this would solve the problem
for us.
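
One mechanism that may apply here is a Memcheck client request: VALGRIND_MAKE_MEM_UNDEFINED from <valgrind/memcheck.h> marks a byte range as undefined. Below is a minimal sketch, assuming a hook can be called with the address and size of the freshly allocated array. mark_fresh_array is a hypothetical helper, not part of Valgrind or the PGI runtime, and how to attach it around "__builtin_aa" (for instance with the function-wrapping macros in valgrind.h) is left open. The fallback macro exists only so the sketch compiles where the Valgrind headers are not installed.

```c
#include <stddef.h>

/* Use the real client-request macro when the Valgrind headers are present. */
#if defined(__has_include)
#  if __has_include(<valgrind/memcheck.h>)
#    include <valgrind/memcheck.h>
#  endif
#endif

/* Fallback (assumption for portability): a no-op that evaluates to 0. */
#ifndef VALGRIND_MAKE_MEM_UNDEFINED
#  define VALGRIND_MAKE_MEM_UNDEFINED(addr, len) ((void)(addr), (void)(len), 0)
#endif

/* Hypothetical hook: tell Memcheck that nbytes starting at base hold
 * no meaningful data yet. Outside Valgrind this does nothing.
 * The memory contents themselves are never modified. */
void *mark_fresh_array(void *base, size_t nbytes)
{
    (void)VALGRIND_MAKE_MEM_UNDEFINED(base, nbytes);
    return base;
}
```

The request only changes Memcheck's bookkeeping for the range, so a later read of an element that was never written should then be reported as uninitialised.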


[valgrind] [Bug 371966] No uninitialised values reported with PGI -Mstack_arrays

2016-11-23 Thread Carl Ponder
https://bugs.kde.org/show_bug.cgi?id=371966

--- Comment #17 from Carl Ponder <cpon...@nvidia.com> ---
I uploaded the two assembly-files. From the "sdiff", I think this is where the
allocations vary:

      -Mnostack_arrays              -Mstack_arrays
      ------------------------------------------------------------
494   ..Dcfi3:                      ..Dcfi3:
495   subq    $48, %rsp           | subq    $32, %rsp
496   movq    %rbx, -24(%rbp)     | movq    %rbx, -16(%rbp)
497   movq    %r12, -32(%rbp)     | movq    %r12, -24(%rbp)
498   movq    %r13, -40(%rbp)     | movq    %r13, -32(%rbp)
499   ##  lineno: 38                ##  lineno: 38
500   movq    %rdi, %rbx            movq    %rdi, %rbx
501   movl    (%rbx), %eax          movl    (%rbx), %eax
502   movl    %eax, -16(%rbp)     | movl    %eax, -8(%rbp)
503   movslq  -16(%rbp), %rax     | movslq  -8(%rbp), %rdi
504   movq    %rax, -8(%rbp)      | shlq    $2, %rdi
505   leaq    -8(%rbp), %rdi      | call    __builtin_aa
      xorl    %eax, %eax          <
      movl    $.C2_299, %esi      <
      call    pgf90_auto_alloc04  <
      movq    %rax, %r12            movq    %rax, %r12

(I'm including the line-numbers, up to the point where they correspond between
the two files).
I'm guessing that these pgf90_auto_alloc04 / __builtin_aa are performing the
allocations, I'll check with PGI on this.
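
In the -Mstack_arrays column, the value handed to __builtin_aa looks like the array size in bytes: N is sign-extended to 64 bits (movslq) and shifted left by 2 (shlq $2), which multiplies it by 4 for 4-byte integers. A small C sketch of that arithmetic follows. alloc_bytes is a hypothetical name, and treating %rdi as the byte count is a reading of the listing, not something PGI has confirmed.

```c
#include <stdint.h>

/* Mirror of the two instructions before the call:
 *   movslq -8(%rbp), %rdi   -> widen the 32-bit N to 64 bits
 *   shlq   $2, %rdi         -> multiply by 4 (bytes per integer(kind=4))
 * Multiplication is used instead of << to stay well-defined for
 * negative inputs in C. */
int64_t alloc_bytes(int32_t n)
{
    int64_t wide = (int64_t)n;
    return wide * 4;
}
```

For N = 10 this gives 40 bytes, which matches the 40-byte range dumped for x with "monitor xb" in the other comments.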


[valgrind] [Bug 371966] No uninitialised values reported with PGI -Mstack_arrays

2016-11-23 Thread Carl Ponder
https://bugs.kde.org/show_bug.cgi?id=371966

--- Comment #16 from Carl Ponder <cpon...@nvidia.com> ---
Created attachment 102409
  --> https://bugs.kde.org/attachment.cgi?id=102409=edit
Assembly generated with stack arrays, where valgrind doesn't work


[valgrind] [Bug 371966] No uninitialised values reported with PGI -Mstack_arrays

2016-11-23 Thread Carl Ponder
https://bugs.kde.org/show_bug.cgi?id=371966

--- Comment #15 from Carl Ponder <cpon...@nvidia.com> ---
Created attachment 102408
  --> https://bugs.kde.org/attachment.cgi?id=102408=edit
Assembly generated without stack-arrays, where valgrind works


[valgrind] [Bug 371966] No uninitialised values reported with PGI -Mstack_arrays

2016-11-22 Thread Carl Ponder
https://bugs.kde.org/show_bug.cgi?id=371966

--- Comment #13 from Carl Ponder <cpon...@nvidia.com> ---
Given that there's junk in the array, I know that the contents aren't being
zero'd out, and the PGI people confirm that -Mstack_arrays are not initialized.
How does valgrind recognize that an array is being initialized under the
circumstances? Is it following the control-flow instruction-by-instruction?


[valgrind] [Bug 371966] No uninitialised values reported with PGI -Mstack_arrays

2016-11-22 Thread Carl Ponder
https://bugs.kde.org/show_bug.cgi?id=371966

--- Comment #11 from Carl Ponder <cpon...@nvidia.com> ---
Back to comment #9, there *is* no instruction initializing the array, which is
why it has some junk entries, regardless of valgrind's lack of mention.

Talking to the PGI people, the -Mstack_arrays flag causes the local arrays to
be allocated on the stack, so the allocation is just a matter of adjusting the
stack-pointer, rather than invoking "malloc" or equivalent.

Does valgrind work by intercepting the malloc calls and then tabulating the
uninitialized memory-cells? And if the arrays are allocated off of the stack in
gfortran or gcc, how would valgrind keep track of this?
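
For comparison, here is a rough C analogue of the two allocation strategies, assuming n >= 1. The function names are hypothetical and this is not the Fortran test-case itself. In the heap variant the array comes from malloc, which Memcheck replaces and can therefore track. In the stack variant a variable-length array is created just by moving the stack pointer, and as far as I understand Memcheck handles that case by watching stack-pointer changes rather than any call.

```c
#include <stdlib.h>

/* Heap variant: storage comes from malloc. Only the first k elements
 * are written, and only those are read back. */
int sum_prefix_heap(int n, int k)
{
    int *x = malloc((size_t)n * sizeof *x);
    int sum = 0;
    if (x == NULL)
        return -1;
    for (int i = 0; i < k && i < n; i++)
        x[i] = i;
    for (int i = 0; i < k && i < n; i++)
        sum += x[i];
    free(x);
    return sum;
}

/* Stack variant: a variable-length array, allocated by adjusting the
 * stack pointer. Extending the second loop to n would read elements
 * that were never written. */
int sum_prefix_stack(int n, int k)
{
    int x[n];
    int sum = 0;
    for (int i = 0; i < k && i < n; i++)
        x[i] = i;
    for (int i = 0; i < k && i < n; i++)
        sum += x[i];
    return sum;
}
```

Both functions return the same value for the same arguments; the difference is only in where the array lives.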


[valgrind] [Bug 371966] No uninitialised values reported with PGI -Mstack_arrays

2016-11-03 Thread Carl Ponder
https://bugs.kde.org/show_bug.cgi?id=371966

--- Comment #10 from Carl Ponder <cpon...@nvidia.com> ---
Stopping at line 70 puts it right after the array-allocation but before the
array-writes are happening:

 62   implicit none
 63   integer, intent(in) :: N
 64   integer ( kind = 4 ) i
 65   integer ( kind = 4 ) :: x(1:N)
 66 
 67 !
 68 !  X = { 0, 1, 2, 3, 4, ?a, ?b, ?c, ?d, ?e }.
 69 !
 70   do i = 1, 5

The data-state still says initialized, even though the array contains junk
values:

(gdb) print x
$2 = (40, 0, 117993993, 0, 117993992, 0, 69349896, 0, 19, 0)
(gdb) print 
$3 = (PTR TO -> ( integer (10))) 0xffeffed90
(gdb) monitor xb 0xffeffed90 40
                00    00    00    00    00    00    00    00
0xFFEFFED90:  0x28  0x00  0x00  0x00  0x00  0x00  0x00  0x00
                00    00    00    00    00    00    00    00
0xFFEFFED98:  0x09  0x72  0x08  0x07  0x00  0x00  0x00  0x00
                00    00    00    00    00    00    00    00
0xFFEFFEDA0:  0x08  0x72  0x08  0x07  0x00  0x00  0x00  0x00
                00    00    00    00    00    00    00    00
0xFFEFFEDA8:  0x08  0x32  0x22  0x04  0x00  0x00  0x00  0x00
                00    00    00    00    00    00    00    00
0xFFEFFEDB0:  0x13  0x00  0x00  0x00  0x00  0x00  0x00  0x00

I'm checking with the compiler guys on this.


[valgrind] [Bug 371966] No uninitialised values reported with PGI -Mstack_arrays

2016-11-03 Thread Carl Ponder
https://bugs.kde.org/show_bug.cgi?id=371966

--- Comment #8 from Carl Ponder <cpon...@nvidia.com> ---
If I *don't* compile with the -Mstack_arrays, I get this at line 77 instead:

(gdb) print x
$1 = (0, 1, 2, 3, 4, 0, 0, 0, 0, 0)
(gdb) print 
$2 = (PTR TO -> ( integer (10))) 0x70881d0

(gdb) monitor xb 0x70881d0 40
              00    00    00    00    00    00    00    00
0x70881D0:  0x00  0x00  0x00  0x00  0x01  0x00  0x00  0x00
              00    00    00    00    00    00    00    00
0x70881D8:  0x02  0x00  0x00  0x00  0x03  0x00  0x00  0x00
              00    00    00    00    ff    ff    ff    ff
0x70881E0:  0x04  0x00  0x00  0x00  0x00  0x00  0x00  0x00
              ff    ff    ff    ff    ff    ff    ff    ff
0x70881E8:  0x00  0x00  0x00  0x00  0x00  0x00  0x00  0x00
              ff    ff    ff    ff    ff    ff    ff    ff
0x70881F0:  0x00  0x00  0x00  0x00  0x00  0x00  0x00  0x00


[valgrind] [Bug 371966] No uninitialised values reported with PGI -Mstack_arrays

2016-11-03 Thread Carl Ponder
https://bugs.kde.org/show_bug.cgi?id=371966

--- Comment #7 from Carl Ponder <cpon...@nvidia.com> ---
Ok here's better -- I can see the data if I compile using "-O0 -g" rather than
"-O0 -gopt", which I'd assumed would be the same thing.
Here's what I'm seeing in the step-through: at line 77, the array contains

  (gdb) print x
  $1 = (0, 1, 2, 3, 4, 0, 69349896, 0, 19, 0)

where x(6:10) are uninitialized values. Here are the bits for the 40-byte range
of x:

(gdb) print 
$6 = (PTR TO -> ( integer (10))) 0xffeffed90
(gdb) monitor xb 0xffeffed90 40
                00    00    00    00    00    00    00    00
0xFFEFFED90:  0x00  0x00  0x00  0x00  0x01  0x00  0x00  0x00
                00    00    00    00    00    00    00    00
0xFFEFFED98:  0x02  0x00  0x00  0x00  0x03  0x00  0x00  0x00
                00    00    00    00    00    00    00    00
0xFFEFFEDA0:  0x04  0x00  0x00  0x00  0x00  0x00  0x00  0x00
                00    00    00    00    00    00    00    00
0xFFEFFEDA8:  0x08  0x32  0x22  0x04  0x00  0x00  0x00  0x00
                00    00    00    00    00    00    00    00
0xFFEFFEDB0:  0x13  0x00  0x00  0x00  0x00  0x00  0x00  0x00

This doesn't look right to me, given that x(4) is assigned but x(8) is not:

(gdb) print x(4)
$18 = 3
(gdb) print (4)
$19 = (PTR TO -> ( integer )) 0xffeffed9c
(gdb) monitor xb 0xffeffed9c 4
                00    00    00    00
0xFFEFFED9C:  0x03  0x00  0x00  0x00

(gdb) print x(8)
$20 = 0
(gdb) print (8)
$21 = (PTR TO -> ( integer )) 0xffeffedac
(gdb) monitor xb 0xffeffedac 4
                00    00    00    00
0xFFEFFEDAC:  0x00  0x00  0x00  0x00

Based on the explanation in the document, I would expect all the bytes to be
assigned FF for X(1:5) and 00 for the rest.
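
To cross-check these dumps against the "print x" output, each group of four bytes can be read as a little-endian 32-bit integer, and the row printed above each line gives the V bits for those bytes. A minimal sketch follows. le32 and count_undefined_bytes are hypothetical helper names, and the sketch assumes Memcheck's documented convention that a V bit of 1 means "undefined", so a V-bit byte of 00 is fully defined and ff is fully undefined.

```c
#include <stddef.h>
#include <stdint.h>

/* Assemble a 32-bit integer from four bytes in the order xb prints them
 * (lowest address first, i.e. little-endian on x86-64). */
uint32_t le32(const uint8_t b[4])
{
    return (uint32_t)b[0]
         | ((uint32_t)b[1] << 8)
         | ((uint32_t)b[2] << 16)
         | ((uint32_t)b[3] << 24);
}

/* Count the bytes whose V bits are not all zero, i.e. bytes that
 * Memcheck regards as at least partly undefined. */
size_t count_undefined_bytes(const uint8_t *vbits, size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        if (vbits[i] != 0x00)
            count++;
    return count;
}
```

For example, the bytes 0x08 0x32 0x22 0x04 at 0xFFEFFEDA8 decode to 69349896, the junk value shown for x(7), while the V-bit row above them is all 00.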


[valgrind] [Bug 371966] No uninitialised values reported with PGI -Mstack_arrays

2016-11-02 Thread Carl Ponder
https://bugs.kde.org/show_bug.cgi?id=371966

--- Comment #4 from Carl Ponder <cpon...@nvidia.com> ---
Can you please list out the commands more precisely?
I ran these commands in one window:

  module purge
  module load pgi/16.9
  module load gcc/4.8.5
  module load valgrind

  pgfortran -o test03.pgi test03.f90 -O0 -gopt -Mstack_arrays
  valgrind --tool=memcheck --vgdb=full --vgdb-error=0 test03.pgi

Then in the second window I ran these commands:

  module purge
  module load pgi/16.9
  module load gcc/4.8.5
  module load valgrind

  gdb test03.pgi
  target remote | vgdb

  b 77
  c

so far so good. But now:

  print N

gives

  Cannot access memory at address 0x4011a000

Why is this? And

  print x(1)

gives

  value being subranged must be in memory

And

  xb 0x4011a000

gives

  Undefined command: "xb".  Try "help".


[valgrind] [Bug 371966] No uninitialised values reported with PGI -Mstack_arrays

2016-11-02 Thread Carl Ponder
https://bugs.kde.org/show_bug.cgi?id=371966

--- Comment #3 from Carl Ponder <cpon...@nvidia.com> ---
This "pgfortran" is the PGI Fortran compiler.
What I'm puzzled about is why valgrind is finding more uninitialized
array-elements when I compiled with gfortran than with pgfortran, and if I use

pgfortran -O0 -gopt -Mstack_arrays ...

valgrind doesn't find any uninitialized array-elements at all.
So this "gdb+vgdb" will show me the valgrind internal tables that keep track of
what's initialized and what isn't?


[valgrind] [Bug 371966] No uninitialised values reported with PGI -Mstack_arrays

2016-11-02 Thread Carl Ponder
https://bugs.kde.org/show_bug.cgi?id=371966

--- Comment #1 from Carl Ponder <cpon...@nvidia.com> ---
I attached the test-case here. You can reproduce the issue as follows:

pgfortran -o test03.pgi test03.f90 -O0 -gopt
valgrind test03.pgi # 12 errors.

pgfortran -o test03.pgi test03.f90 -O0 -gopt -Mstack_arrays
valgrind test03.pgi # 0 errors.

I'm using the PGI 16.9 compiler running on CentOS 7.2. The valgrind was built
with GCC 4.8.5.


[valgrind] [Bug 371966] New: No uninitialised values reported with PGI -Mstack_arrays

2016-11-02 Thread Carl Ponder
https://bugs.kde.org/show_bug.cgi?id=371966

Bug ID: 371966
   Summary: No uninitialised values reported with PGI -Mstack_arrays
   Product: valgrind
   Version: 3.11.0
  Platform: unspecified
OS: Linux
Status: UNCONFIRMED
  Severity: normal
  Priority: NOR
 Component: memcheck
  Assignee: jsew...@acm.org
  Reporter: cpon...@nvidia.com
  Target Milestone: ---

Created attachment 101954
  --> https://bugs.kde.org/attachment.cgi?id=101954=edit
Simple Fortran test-case using array with dynamic bound.

I have a simple Fortran test-case that allocates an array and uses
uninitialized values from it. Using the PGI compiler, if I compile it using the
-Mstack_arrays option, valgrind reports 0 errors.

I also have a HUGE program (WRF) where valgrind is likewise not reporting
anything in spite of the fact that uninitialized array-elements are being used,
so I'm trying to track down issues like this one.

Can you guys explain what's going on? I'm also checking with PGI on this.
