[valgrind] [Bug 479191] vgdb is blocked after several tries

2023-12-31 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=479191

Philippe Waroquiers  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |philippe.waroquiers@skynet.be

--- Comment #2 from Philippe Waroquiers  ---
What are you running on (processor, Linux version, ...)?

The difference between the first and second vgdb traces is that, in the second
case, the threads are blocked in a system call, and vgdb has to do more complex
operations to wake up valgrind.

Also, you launch valgrind with --vgdb-error=0.
This allows you to do some gdb/vgdb operations before the program starts.
What is the reason for this in your case?


If you pass two -d options to vgdb, it will output more debugging info.

Also, you might add debugging options (-v -v -v -d -d -d) on the valgrind side
to see what happens there.

-- 
You are receiving this mail because:
You are watching all bug changes.

[valgrind] [Bug 473944] Handle mold linker split RW PT_LOAD segments correctly

2023-09-01 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=473944

Philippe Waroquiers  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|REPORTED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #5 from Philippe Waroquiers  ---
Fixed in c0b2c786d

[valgrind] [Bug 473944] Handle mold linker split RW PT_LOAD segments correctly

2023-08-31 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=473944

Philippe Waroquiers  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |philippe.waroquiers@skynet.be

--- Comment #3 from Philippe Waroquiers  ---
(In reply to Paul Floyd from comment #2)
> With the patch I get one regression failure
> helgrind/tests/pth_destroy_cond  (stderr)
> 
> There is missing source information, presumably due to a failure reading
> debuginfo.
> 
> The first aspacem map is
> 
> --54635:1: aspacem <<< SHOW_SEGMENTS: Memory layout at client startup (32
> segments)
> --54635:1: aspacem 3 segment names in 3 slots
> --54635:1: aspacem freelist is empty
> --54635:1: aspacem (0,4,7)
> /usr/home/paulf/scratch/valgrind/none/none-amd64-freebsd
> --54635:1: aspacem (1,65,6)
> /usr/home/paulf/scratch/valgrind/helgrind/tests/pth_destroy_cond
> --54635:1: aspacem (2,134,7) /libexec/ld-elf.so.1
> --54635:1: aspacem   0: RSVN 00-1f 2097152 - SmFixed
> --54635:1: aspacem   1: file 20-200fff4096 r
> d=0x6d8ca7de696e301b i=2440208 o=0   (1,65)
> --54635:1: aspacem   2: file 201000-201fff4096 r-x--
> d=0x6d8ca7de696e301b i=2440208 o=0   (1,65)
> --54635:1: aspacem   3: file 202000-203fff8192 rw---
> d=0x6d8ca7de696e301b i=2440208 o=0   (1,65)
> --54635:1: aspacem   4: RSVN 204000-0003ff 61m - SmFixed
> --54635:1: aspacem   5: file 000400-0004006fff   28672 r
> d=0x28a8dde4190bc5c i=1049059 o=0   (2,134)
> --54635:1: aspacem   6: file 0004007000-000401cfff   90112 r-x--
> d=0x28a8dde4190bc5c i=1049059 o=24576   (2,134)
> --54635:1: aspacem   7: file 000401d000-000401dfff4096 rw---
> d=0x28a8dde4190bc5c i=1049059 o=110592  (2,134)
> --54635:1: aspacem   8: file 000401e000-000401efff4096 rw---
> d=0x28a8dde4190bc5c i=1049059 o=110592  (2,134)
> --54635:1: aspacem   9: anon 000401f000-0004014096 rw---
> --54635:1: aspacem  10: anon 000402-0004020fff4096 rwx--
> --54635:1: aspacem  11: RSVN 0004021000-000481 8384512 - SmLower
> --54635:1: aspacem  12:  000482-0037ff823m
> --54635:1: aspacem  13: FILE 003800-00380abfff  704512 r
> d=0x696e301b i=1844040 o=0   (0,4)
> --54635:1: aspacem  14: FILE 00380ac000-0038141fff  614400 r-x--
> d=0x696e301b i=1844040 o=700416  (0,4)
> --54635:1: aspacem  15: file 0038142000-0038142fff4096 r-x--
> d=0x696e301b i=1844040 o=1314816 (0,4)
> --54635:1: aspacem  16: FILE 0038143000-003821bfff  32 r-x--
> d=0x696e301b i=1844040 o=1318912 (0,4)
> --54635:1: aspacem  17: FILE 003821c000-003821cfff4096 rw---
> d=0x696e301b i=1844040 o=2203648 (0,4)
> 
> objdump:
> paulf> objdump -p pth_destroy_cond  
> 
> 
> pth_destroy_cond: file format elf64-x86-64-freebsd
> 
> Program Header:
> PHDR off0x0040 vaddr 0x00200040 paddr
> 0x00200040 align 2**3
>  filesz 0x0268 memsz 0x0268 flags r--
>   INTERP off0x02a8 vaddr 0x002002a8 paddr
> 0x002002a8 align 2**0
>  filesz 0x0015 memsz 0x0015 flags r--
> LOAD off0x vaddr 0x0020 paddr
> 0x0020 align 2**12
>  filesz 0x08ec memsz 0x08ec flags r--
> LOAD off0x08f0 vaddr 0x002018f0 paddr
> 0x002018f0 align 2**12
>  filesz 0x0590 memsz 0x0590 flags r-x
> LOAD off0x0e80 vaddr 0x00202e80 paddr
> 0x00202e80 align 2**12
>  filesz 0x0180 memsz 0x0180 flags rw-
> LOAD off0x1000 vaddr 0x00203000 paddr
> 0x00203000 align 2**12
>  filesz 0x0098 memsz 0x00c8 flags rw-
> 
> And the trace when running with a couple more I added
> --56884-- ++*rw_load_count to 2 for
> /usr/home/paulf/scratch/valgrind/helgrind/tests/pth_destroy_cond
> --56884--  offset 1000 offset roundup 1000
> --56884--  prev + size 203e80 addr 203000
> 
> If I change the condition to
> 
>if (previous_rw_a_phdr.p_memsz > 0 &&
>ehdr_m.e_type == ET_EXEC &&
>previous_rw_a_phdr.p_vaddr + previous_rw_a_phdr.p_filesz
> == a_phdr.p_vaddr)
> 
> then it works.

Thanks for the testing. The above condition also works for the executables
linked by mold 1.5.1 in my setup (RHEL 8.6).
(In my case, the condition has to ensure that the decrement is not done, as
the 2 segments are not merged.)

I will finalize the patch (e.g. add some traces corresponding to the ones you
added) and push.

Thanks
Philippe
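
For readers following the thread, the condition quoted above can be tried out
in isolation. The sketch below is standalone C, not the valgrind source: the
struct phdr_view and the function rw_loads_will_merge are hypothetical names
introduced here, and the is_et_exec flag stands in for the
ehdr_m.e_type == ET_EXEC test. It only illustrates the arithmetic of the
check, on the assumption that a true result means the two RW PT_LOAD segments
are treated as merged.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, trimmed-down view of an ELF program header: only the
   fields the quoted condition reads. */
typedef struct {
    uint64_t p_vaddr;   /* virtual address of the segment */
    uint64_t p_filesz;  /* bytes of the segment present in the file */
    uint64_t p_memsz;   /* bytes of the segment in memory */
} phdr_view;

/* True when the current RW PT_LOAD starts exactly where the file contents
   of the previous RW PT_LOAD end, for an ET_EXEC executable. */
bool rw_loads_will_merge(const phdr_view *prev, const phdr_view *cur,
                         bool is_et_exec)
{
    return prev->p_memsz > 0
        && is_et_exec
        && prev->p_vaddr + prev->p_filesz == cur->p_vaddr;
}
```

With the values as they appear in the quoted objdump output (previous RW LOAD
at vaddr 0x202e80 with filesz 0x180, next RW LOAD at vaddr 0x203000), the sum
is 0x203000 and the check holds.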

[valgrind] [Bug 473944] New: Handle mold linker split RW PT_LOAD segments correctly

2023-08-30 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=473944

Bug ID: 473944
   Summary: Handle mold linker split RW PT_LOAD segments correctly
Classification: Developer tools
   Product: valgrind
   Version: 3.21.0
  Platform: Other
OS: Linux
Status: REPORTED
  Severity: normal
  Priority: NOR
 Component: general
  Assignee: jsew...@acm.org
  Reporter: philippe.waroqui...@skynet.be
  Target Milestone: ---

Created attachment 161282
  --> https://bugs.kde.org/attachment.cgi?id=161282&action=edit
change the condition to detect 2 PT_LOADs will be merged

This is a follow-up to bug 452802: a similar problem to the one reported and
fixed there.

valgrind could not load the debug info of the main executable.
The problem is the same as in 452802. The fix for 452802 does not work here
because the logic that should detect that the 2 PT_LOAD segments are distinct
wrongly concludes that the 2 segments can be combined.

Here are the details (with some remarks/comments prefixed with #).
The attached patch solves the problem on my setup (RHEL 8.6, ld-2.28, mold
1.5.1), but as it completely changes the condition for considering the
segments mergeable, it would be nice to (re-)validate it, e.g. with lld
or other setups where split PT_LOAD segments are produced.

ELF Header:
  Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00 
  Class: ELF64
  Data:  2's complement, little endian
  Version:   1 (current)
  OS/ABI:UNIX - System V
  ABI Version:   0
  Type:  EXEC (Executable file)
  Machine:   Advanced Micro Devices X86-64
  Version:   0x1
  Entry point address:   0x25e000
  Start of program headers:  64 (bytes into file)
  Start of section headers:  214499808 (bytes into file)
  Flags: 0x0
  Size of this header:   64 (bytes)
  Size of program headers:   56 (bytes)
  Number of program headers: 13
  Size of section headers:   64 (bytes)
  Number of section headers: 64
  Section header string table index: 49

Section Headers:
  [Nr] Name  Type Address   Offset
   Size  EntSize  Flags  Link  Info  Align
  [ 0]   NULL   
        0 0 0
  [ 1] .interp   PROGBITS 00200318  0318
   001c     A   0 0 1
  [ 2] .note.gnu.build-i NOTE 00200334  0334
   0024     A   0 0 4
  [ 3] .note.ABI-tag NOTE 00200358  0358
   0020     A   0 0 4
  [ 4] .hash HASH 00200378  0378
   7f60  0004   A   6 0 4
  [ 5] .gnu.hash GNU_HASH 002082d8  82d8
   02b8     A   6 0 8
  [ 6] .dynsym   DYNSYM   00208590  8590
   00017e08  0018   A   7 1 8
  [ 7] .dynstr   STRTAB   00220398  00020398
   0001999a     A   0 0 1
  [ 8] .gnu.version  VERSYM   00239d32  00039d32
   1fd6  0002   A   6 0 2
  [ 9] .gnu.version_rVERNEED  0023bd08  0003bd08
   02c0     A   7 9 8
  [10] .rela.dyn RELA 0023bfc8  0003bfc8
   1188  0018   A   6 0 8
  [11] .rela.plt RELA 0023d150  0003d150
   00013110  0018   A   635 8
  [12] .plt  PROGBITS 00251000  00051000
   cb80    AX   0 0 16
  [13] .plt.got  PROGBITS 0025db80  0005db80
   0230    AX   0 0 16
  [14] .fini PROGBITS 0025ddb0  0005ddb0
   000d    AX   0 0 4
  [15] .init PROGBITS 0025ddc0  0005ddc0
   0020    AX   0 0 4
  [16] .text PROGBITS 0025e000  0005e000
   07bce0d4    AX   0 0 4096
  [17] google_malloc PROGBITS 07e2c100  07c2c100
   1f1a    AX   0 0 64
  [18] malloc_hook   PROGBITS 07e2e020  07c2e020
   0685    AX   

[valgrind] [Bug 444488] Use glibc.pthread.stack_cache_size tunable

2022-11-12 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=444488

--- Comment #7 from Philippe Waroquiers  ---
(In reply to Paul Floyd from comment #6)
> > I think we should check and use the existing hint. Current users of the 
> > hint 
> > will/should have the same behaviour whatever the glibc version.
> 
> There is gnu_get_libc_version(). Now, how to call that without breaking musl.
I suspect it might be problematic to call a glibc function (as a host function)
during client image initialisation.
So, we might have to always set the tunable env variable as part of the client
initialisation, and call gnu_get_libc_version before producing an error message
that the old hack did not work.
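
To make the version test concrete, here is a minimal sketch of comparing a
glibc version string (the "major.minor" form that gnu_get_libc_version()
returns, e.g. "2.36") against a threshold. It is standalone C using the
ordinary C library rather than valgrind-internal code; the function name
libc_version_at_least is hypothetical, and the threshold to use is left to the
caller because the thread does not state from which glibc release the old hack
stops working.

```c
#include <stdbool.h>
#include <stdio.h>

/* Parses a "major.minor" version string and reports whether it is at least
   min_major.min_minor. Returns false if the string cannot be parsed. */
bool libc_version_at_least(const char *version, int min_major, int min_minor)
{
    int major = 0, minor = 0;

    if (version == NULL || sscanf(version, "%d.%d", &major, &minor) != 2)
        return false;
    if (major != min_major)
        return major > min_major;
    return minor >= min_minor;
}
```

How the version string would be obtained without breaking musl, and at which
point of the initialisation, remains the open question raised above.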

[valgrind] [Bug 444488] Use glibc.pthread.stack_cache_size tunable

2022-11-12 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=444488

--- Comment #5 from Philippe Waroquiers  ---
(In reply to Paul Floyd from comment #4)
> Thanks for adding me to the CC Philippe.
> 
> If I do this:
> export GLIBC_TUNABLES="glibc.pthread.stack_cache_size=0"
> 
> Then helgrind/tests/tls_threads fails with just
> +--21937:0:   sched WARNING: pthread stack cache cannot be disabled!
> 
> Without the env var there are a load of 
> 
> +Possible data race during write of size 8 at 0x by thread #x
> 
> errors
> 
> Do we have a way of knowing that GLIBC_TUNABLES did something so that we
> don't need to twiddle with stack_cache_actsize?
If we can detect the glibc version, then we can avoid using the old
stack_cache_actsize hack.

> Also --sim-hints=no-nptl-pthread-stackcache isn't turned on by default. Do
> we want to check for it in setup_client_env ()  and only put GLIBC_TUNABLES
> in the environment if it is used? Or perhaps add a new simhint.
I think we should check and use the existing hint. Current users of the hint 
will/should have the same behaviour whatever the glibc version.

> 
> Bonus points for handling GLIBC_TUNABLES already set by the tuner, and add
> or replace glibc.pthread.stack_cache_size
> 
> This doesn't seem to help the example being discussed in the valgrind-users
> list.

[valgrind] [Bug 444488] Use glibc.pthread.stack_cache_size tunable

2022-11-12 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=444488

--- Comment #3 from Philippe Waroquiers  ---
(In reply to Mark Wielaard from comment #2)
> (In reply to Philippe Waroquiers from comment #1)
> > In the discussion on valgrind-users mailing list,
> > Paul reported that:
> >   'It looks like "stack_cache_actsize" in libc moved to be
> > _dl_stack_cache_actsize in ld-linux-x86-64.so.2'
> > 
> > Is there a way to modify the glibc glibc.pthread.stack_cache_size tunable
> > from valgrind ?
> 
> tunables are set by the GLIBC_TUNABLES environment variable
> https://www.gnu.org/software/libc/manual/html_node/Tunables.html
> 
> We can set/add to that GLIBC_TUNABLES environment variable in
> coregrind/m_initimg/initimg-linux.c setup_client_env () where we also set
> the LD_PRELOAD environment variable.

When running with a newer glibc, we should also avoid producing an error
message that the old way of disabling the stack cache is not working.
This likely implies detecting the version of glibc.
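
As an illustration of setting or adding to GLIBC_TUNABLES, the sketch below
builds a new value from an existing one. It relies on the documented format of
the variable (a colon-separated list of name=value pairs) and is standalone C
using the ordinary C library; code inside valgrind's core would need its own
string helpers instead. The function name tunables_set is hypothetical.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Builds a new GLIBC_TUNABLES value in out: every "name=value" entry of
   existing is copied except the ones for name, then "name=value" is
   appended. existing may be NULL or empty.
   Returns 0 on success, -1 if out is too small. */
int tunables_set(const char *existing, const char *name, const char *value,
                 char *out, size_t outsz)
{
    size_t used = 0;
    size_t namelen = strlen(name);
    const char *p = existing ? existing : "";

    if (outsz == 0)
        return -1;
    out[0] = '\0';

    while (*p != '\0') {
        size_t len = strcspn(p, ":");   /* length of the current entry */
        int same = len > namelen && p[namelen] == '='
                   && strncmp(p, name, namelen) == 0;

        if (len > 0 && !same) {
            /* keep this entry, preceded by ':' unless it is the first one */
            int n = snprintf(out + used, outsz - used, "%s%.*s",
                             used ? ":" : "", (int)len, p);
            if (n < 0 || (size_t)n >= outsz - used)
                return -1;
            used += (size_t)n;
        }
        p += len;
        if (*p == ':')
            p++;
    }

    int n = snprintf(out + used, outsz - used, "%s%s=%s",
                     used ? ":" : "", name, value);
    if (n < 0 || (size_t)n >= outsz - used)
        return -1;
    return 0;
}
```

Called with an existing value that already contains
glibc.pthread.stack_cache_size, the old entry is dropped and the new one is
appended, so other tunables chosen by the user are preserved.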

[valgrind] [Bug 444488] Use glibc.pthread.stack_cache_size tunable

2022-11-12 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=444488

Philippe Waroquiers  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |pjfl...@wanadoo.fr

[valgrind] [Bug 444488] Use glibc.pthread.stack_cache_size tunable

2022-11-12 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=444488

Philippe Waroquiers  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |philippe.waroquiers@skynet.be

--- Comment #1 from Philippe Waroquiers  ---
In the discussion on the valgrind-users mailing list,
Paul reported that:
  'It looks like "stack_cache_actsize" in libc moved to be
_dl_stack_cache_actsize in ld-linux-x86-64.so.2'

Is there a way to modify glibc's glibc.pthread.stack_cache_size tunable from
valgrind?
Or do we assume that the user has to tune this?
Or do we write an alternate implementation of the current valgrind hack using
_dl_stack_cache_actsize in ld-linux-x86-64.so.2?

[valgrind] [Bug 458915] syscall sometimes returns its number instead of return code when vgdb is attached

2022-10-15 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=458915

Philippe Waroquiers  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|REPORTED                    |RESOLVED

--- Comment #27 from Philippe Waroquiers  ---
(In reply to David Vasek from comment #26)
> I work on this issue together with Libor. If you checkout this branch:
> https://gitlab.nic.cz/knot/knot-dns/-/commits/valgrind_vgdb_bug , it
> includes a few changes that should help you with the debugging.
Thanks for the above.
With this, I was able to identify the likely culprit for the bug.
I have pushed a fix for this as 348775f34.
The valgrind regression tests were run on debian/amd64, centos/ppc64 and
ubuntu/arm64.
Also, the ctl/valgrind knot test has run 80 times without encountering the
abort (while before the fix, each run of ctl/valgrind triggered the bug).

It would be nice if you could check with the latest state of the git repository
that everything now works as expected.

Thanks for the clear and detailed instructions to reproduce the bug, which were
critically needed to reproduce this race condition.

[valgrind] [Bug 460142] Auxiliary stack traces

2022-10-11 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=460142

Philippe Waroquiers  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |philippe.waroquiers@skynet.be

--- Comment #1 from Philippe Waroquiers  ---
(In reply to Simon Richter from comment #0)
> The three stack traces I get (allocation, deallocation and use) show what is
> happening, but it is difficult for me to find the point where the string is
> given to Python -- the function to intern the string is called quite often,
> so I can't just easily break there.
Until valgrind provides some support for recording auxiliary stack traces as
you suggest, you could capture this stack trace yourself and print it together
with the pointer to the memory allocated and given to Python.
Then, when valgrind reports a 'use after free' error, you can search the
program output for the last stack trace that mentions this pointer.
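
A minimal sketch of such a logging helper, assuming glibc's <execinfo.h> is
available (it is not part of standard C, and other C libraries may lack it).
The function name log_handoff_site and the output format are hypothetical; the
idea is to call it at the point where the string is handed to Python, passing
the pointer that is handed over.

```c
#include <execinfo.h>
#include <stdio.h>
#include <unistd.h>

#define MAX_FRAMES 32

/* Prints the pointer and the current call stack to stderr, so the hand-off
   site can later be matched against an address in a valgrind report.
   Returns the number of frames captured. */
int log_handoff_site(void *ptr)
{
    void *frames[MAX_FRAMES];
    int n = backtrace(frames, MAX_FRAMES);

    fprintf(stderr, "handoff ptr=%p, %d frames:\n", ptr, n);
    fflush(stderr);
    /* writes one line per frame directly to the file descriptor */
    backtrace_symbols_fd(frames, n, STDERR_FILENO);
    return n;
}
```

Function names in the printed frames are usually only resolved when the
program is linked with -rdynamic; otherwise the raw addresses can be
translated afterwards, e.g. with addr2line.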

[valgrind] [Bug 458915] syscall sometimes returns its number instead of return code when vgdb is attached

2022-10-10 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=458915

--- Comment #25 from Philippe Waroquiers  ---
(In reply to Libor Peltan from comment #24)
> You will probably need to run the test several time until it reproduces. It
> may also happen on some machines that it does never reproduce. For me, it
> easily reproduces on my laptop, but hardly on a powerful server.
Thanks for the detailed instructions. I was able to reproduce the bug; it
happens in roughly 1 out of 5 trials.
I will work further on this, time permitting (valgrind is a week-end activity,
so only a few hours from time to time).

One question: the test outputs valgrind messages in the file 'valgrind' and
outputs the valgrind debug output on stderr.
This setup is very unusual; I would like to have both outputs in the same file.
I have not found the piece of code that actually launches valgrind and
produces these 2 separate files
(I do not even really understand how this is done, as valgrind messages and
debug output are normally both written to the same fd).

Is there an easy change to the test framework that would ensure all the output
goes either to stderr or to the valgrind file?

Thanks
Philippe

[valgrind] [Bug 458915] syscall sometimes returns its number instead of return code when vgdb is attached

2022-10-01 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=458915

--- Comment #23 from Philippe Waroquiers  ---
(In reply to Libor Peltan from comment #22)
> (In reply to Philippe Waroquiers from comment #21)
> > Valgrind should stop by itself when it finds an error (when using
> > --vgdb-error argument)
> 
> The error mentioned by me is an error in application logic. Valgrind has no
> way to detect it, no reason to stop in that case and it does not.
> 
> > If yes, as a bypass for this bug, you might try to have valgrind invoking
> > gdbserver and then launch gdb/vgdb, rather than having gdb/vgdb
> > 'interrupting' a (still) running process.
> 
> Yes, we can use a workaround that we don't invoke vgdb at all in such cases.
> But in any case, it would be nice to fix what's going wrong in Valgrind.
> 
> > But in any case, what you are doing should not cause a problem. When I have
> > a little bit of time, I will dig again in the vgdb logic 
> > and see if/where it could create such wrong interaction.
> 
> Thank you for continuous effort in this issue! I tried to create a tiny small
> program that only listens to incoming UDP packets while using poll or
> epoll_wait syscalls, and frequently attach vgdb to it. Unfortunately, the
> issue did not reproduce in this simple scenario. 
You might need a mixture of threads executing some cpu instructions, some
threads blocked in syscalls, and some signals such as SIGALRM.

> The only reproducer we have
> so far (at least pretty reliable) is the whole Knot DNS and its test-case.
> If you wish, we can give you instructions how to build it and launch the
> test-case, but it'll be several steps to do.
If the instructions are not too long to describe, it would not hurt to have
them.
If it is simple enough and building Knot DNS does not need too many
dependencies etc., I can give it a try.

[valgrind] [Bug 458915] syscall sometimes returns its number instead of return code when vgdb is attached

2022-09-30 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=458915

--- Comment #21 from Philippe Waroquiers  ---
(In reply to Libor Peltan from comment #20)
> Thank you for your observations! Based on this, we actually found out that
> the issue happens exactly (sometimes!) when we attach vgdb to the running
> process, like this:
> 
> ```
> /usr/bin/gdb -ex "set confirm off" -ex "target remote | /usr/bin/vgdb
> --pid=5944" -ex "info threads" -ex "thread apply all bt full" -ex q
> /home/peltan/master_knot/src/knotd
> ```
> 
> I apologize that we overlooked this important fact earlier. (Our test
> environment performs this automatically when a routine error occurs.)
> 
> We will continue working on minimizing the reproducer in following days.

That starts to clarify where the problem could originate from.
Valgrind should stop by itself when it finds an error (when using the
--vgdb-error argument) and invoke its gdbserver, waiting for gdb/vgdb to
connect.

Do you mean that the above command is launched (somewhat asynchronously) when
an error is detected by other means?

If yes, as a workaround for this bug, you might try to have valgrind invoke
gdbserver and then launch gdb/vgdb, rather than having gdb/vgdb
'interrupt' a (still) running process.

But in any case, what you are doing should not cause a problem. When I have a
little bit of time, I will dig again into the vgdb logic
and see if/where it could create such a wrong interaction.

[valgrind] [Bug 458915] syscall sometimes returns its number instead of return code

2022-09-26 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=458915

--- Comment #19 from Philippe Waroquiers  ---
I took a look at the attached logs.

A first observation:
* We have 2 groups of 3 threads that get the 0xe8 syscall return.
* For each of these 2 groups, shortly before these 0xe8 returns we see
  a connection to the embedded gdbserver of Valgrind.
Here are the line numbers and occurrences of the 0xe8 syscall return:
6 matches for "0xe8" in buffer: valgrind
  26734:SYSCALL[61639,20](232) ... [async] --> Success(0xe8) 
  26774:SYSCALL[61639,17](232) ... [async] --> Success(0xe8) 
  26789:SYSCALL[61639,15](232) ... [async] --> Success(0xe8) 
  31141:SYSCALL[61639,14](232) ... [async] --> Success(0xe8) 
  31176:SYSCALL[61639,16](232) ... [async] --> Success(0xe8) 
  31206:SYSCALL[61639,13](232) ... [async] --> Success(0xe8) 

And here are the 3 matches for the gdbserver:
3 matches for "TO DEBUG" in buffer: valgrind
    286:==61639== TO DEBUG THIS PROCESS USING GDB: start GDB like this
  26668:==61639== TO DEBUG THIS PROCESS USING GDB: start GDB like this
  31121:==61639== TO DEBUG THIS PROCESS USING GDB: start GDB like this

where the first one is the message produced at startup.
Maybe this is a modified executable that triggers a call to vgdb/gdb when it
encounters this syscall problem?
Or is there something that attaches to the valgrind gdbserver or sends a
command to it?
Because in this last case, we could possibly have an interaction between vgdb
and many threads blocked in syscalls.
We see the following in the stderr trace:
--61639:2:  gdbsrv   stored register 0 size 8 name rax value 0007
tid 1 status VgTs_WaitSys
--61639:2:  gdbsrv   stored register 0 size 8 name rax value 00e8
tid 15 status VgTs_WaitSys
--61639:2:  gdbsrv   stored register 0 size 8 name rax value 00e8
tid 17 status VgTs_WaitSys
--61639:2:  gdbsrv   stored register 0 size 8 name rax value 00e8
tid 20 status VgTs_WaitSys

--61639:1:  gdbsrv stop_pc 0x4CAC04E changed to be resume_pc 0x4C9CD7F: poll
(poll.c:29)
--61639:2:  gdbsrv   stored register 0 size 8 name rax value 0007
tid 1 status VgTs_WaitSys
--61639:2:  gdbsrv   stored register 0 size 8 name rax value 00e8
tid 13 status VgTs_WaitSys
--61639:2:  gdbsrv   stored register 0 size 8 name rax value 00e8
tid 14 status VgTs_WaitSys
--61639:2:  gdbsrv   stored register 0 size 8 name rax value 00e8
tid 16 status VgTs_WaitSys
--61639:1:  gdbsrv VG core calling VG_(gdbserver_report_signal) vki_nr 15
SIGTERM gdb_nr 15 SIGTERM tid 1
-
So, for the 2 groups of 3 threads that got the 0xe8 syscall return, we see that
the valgrind gdbserver was instructed to put 0xe8 in the rax register.

It is however difficult to relate the stderr output to the valgrind output.
If you could redo the trace with the none tool, but keep the stderr and the
valgrind output together
(i.e. let valgrind write its output to stderr together with its debug log) and
add --time-stamp=yes,
that might help to see what happens in which order.


I have to say that at this stage, I do not have much idea what else to look at.
To investigate further and possibly find the bug, we will likely need an (easy
to run) reproducer.

[valgrind] [Bug 459477] XERROR messages lacks ending '\n' in vgdb

2022-09-25 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=459477

Philippe Waroquiers  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|REPORTED                    |RESOLVED
         Resolution|---                         |FIXED
                 CC|                            |philippe.waroquiers@skynet.be

--- Comment #1 from Philippe Waroquiers  ---
Fixed in 3c5720453 (also fixes some occurrences of missing \n in ERROR calls)

Thanks for the report and the patch.

[valgrind] [Bug 458915] syscall sometimes returns its number instead of return code

2022-09-24 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=458915

--- Comment #16 from Philippe Waroquiers  ---
In one of the traces I see the output below. It looks like a SIGALRM signal
is delivered to the thread that encounters the futex 202 result.
--24048-- async signal handler: signal=14, vgtid=24051, tid=4, si_code=-6, exitreason VgSrc_None
--24048-- interrupted_syscall: tid=4, ip=0x580e687e, restart=False, sres.isErr=False, sres.val=202
--24048--   at syscall instr: returning EINTR
--24048-- delivering signal 14 (SIGALRM):-6 to thread 4
--24048-- push_signal_frame (thread 4): signal 14
==24048==    at 0x4C2C340: futex_wait (futex-internal.h:146)
==24048==    by 0x4C2C340: __lll_lock_wait (lowlevellock.c:49)
==24048==    by 0x4C32322: __pthread_mutex_cond_lock (pthread_mutex_lock.c:93)
==24048==    by 0x4C2E9B3: __pthread_cond_wait_common (pthread_cond_wait.c:616)
==24048==    by 0x4C2E9B3: pthread_cond_wait@@GLIBC_2.3.2 (pthread_cond_wait.c:627)
==24048==    by 0x14184D: worker_main (pool.c:70)
==24048==    by 0x1395B2: thread_ep (dthreads.c:146)
==24048==    by 0x4C2FB42: start_thread (pthread_create.c:442)
==24048==    by 0x4CC0BB3: clone (clone.S:100)
So, this is another indication that the problem is likely linked to 
VG_(fixup_guest_state_after_syscall_interrupted).
But it is not very clear what is special in your application.

Can you also reproduce the problem with the --tool=none tool ? Or does it
happen only with memcheck ?
Can you check if the problem goes away when using
--vex-iropt-register-updates=allregs-at-each-insn ?
If the problem cannot be reproduced with this setting, can you see if it
reproduces with --vex-iropt-register-updates=allregs-at-mem-access ?

[valgrind] [Bug 459031] Documentation of --error-exitcode is incomplete.

2022-09-17 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=459031

Philippe Waroquiers  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|REPORTED                    |RESOLVED
         Resolution|---                         |FIXED
                 CC|                            |philippe.waroquiers@skynet.be

--- Comment #1 from Philippe Waroquiers  ---
Fixed in e489f3197

[valgrind] [Bug 458915] syscall sometimes returns its number instead of return code

2022-09-17 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=458915

--- Comment #8 from Philippe Waroquiers  ---
I took a look at both logs.
First the epoll log.
(tid is a thread id number used internally in valgrind)

What we see is that tid 14 is just getting the result of a previous epoll
syscall, and then starts a new epoll syscall:

--111235--   SCHED[14]:  acquired lock (VG_(client_syscall)[async])
      <<<<< tid 14 acquires the valgrind lock as the epoll syscall ended
SYSCALL[111235,14](232) ... [async] --> Success(0x0)
      <<<<< and the tid 14 reports the success of this syscall
--111235--   SCHED[14]: TRC: SYSCALL
      <<<<< it launches a new epoll syscall
SYSCALL[111235,14](232) sys_epoll_wait ( 18, 0x2610f740, 1, 1000 ) --> [async] ...
--111235--   SCHED[14]: releasing lock (VG_(client_syscall)[async]) -> VgTs_WaitSys
      <<<<< and releases the lock waiting for the result
--111235--   SCHED[35]:  acquired lock (VG_(scheduler):timeslice)
      <<<<< and the tid 35 acquires the lock
.

Then later on, tid 18 calls tgkill, sending a (fatal) signal to itself (I
believe it is to itself; valgrind's tracing of the link
between the tid and the linux thread id is not very clear).
As this signal is fatal, all threads are killed by valgrind.
We see that, a little bit before the tgkill, tid 18 does a write on fd 2.
Possibly that is an indication of reporting an error/problem.

The problem with the futex has a similar pattern:
tid 6 starts a futex syscall and releases the valgrind lock.
Then sometime later, tid 11 does an mmap, and slightly after that
calls tgkill.
And similarly, tid 11 does a write on fd 2 a little bit before.

The processing of a fatal signal in valgrind is quite tricky: complex code,
with race conditions (see e.g. bug 409367).

This fatal signal has to get all the threads out of their syscalls.
For this, a kind of internal signal "vgkill" is sent by the valgrind scheduler
to all threads.
When the signal is received, valgrind detects that the thread was in a syscall
and that the thread
has to "interrupt" the syscall. For this, valgrind calls VG_(post_syscall).
This post_syscall assumes that the guest state
is correctly "fixed", but I do not see where this is done.

So, a hypothesis about what happens:
  * the application encounters an error condition (in tid 18 in the epoll case,
    in tid 11 in the futex case)
  * this application thread decides to call abort, generating a fatal signal
  * valgrind's handling of a fatal signal is for sure complex and might still be
    racy, and might not properly reset the guest state
    when the fatal signal has to be handled by a thread doing e.g. an epoll or
    futex syscall
As the guest state is not properly restored, when this thread "resumes" and/or
due to race conditions, instead of just dying, it continues
and then itself reports a strange state, as the guest thread state was not
properly restored using a call to
VG_(fixup_guest_state_after_syscall_interrupted).

To validate this hypothesis, maybe the following could be done:
* check what this "write on fd 2" is doing (maybe with strace?)
* in case the application encounters a problem, instead of calling abort, which
  sends a fatal signal, you might rather do e.g. sleep(10).
  If the hypothesis is correct, then the threads doing epoll or futex should
  just stay blocked in their syscalls,
  and the thread detecting the problem will sleep in this state.
  It might then be possible to attach using gdb+vgdb and investigate the state
  of the application and/or the valgrind threads.

There is a way to tell valgrind to launch gdbserver in case of an abnormal
valgrind exit using the option:
--vgdb-stop-at=event1,event2,... invoke gdbserver for given events [none]
 where event is one of:
   startup exit valgrindabexit all none

Looks like we might add abexit to ask valgrind to call gdbserver when a client
thread/process does an abnormal exit.

[valgrind] [Bug 458915] syscall sometimes returns its number instead of return code

2022-09-14 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=458915

Philippe Waroquiers  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |philippe.waroquiers@skynet.be

--- Comment #2 from Philippe Waroquiers  ---
Does strace show that some signals are being processed close (in time) to
the system call wrongly returning its syscall number?
If a signal happens while syscall-amd64-linux.S is being executed,
then VG_(fixup_guest_state_after_syscall_interrupted) has some complex logic
that interacts with the partially executed asm code.

Alternatively, having more valgrind tracing might give some hints.
You could try
valgrind -v -v -v -d -d -d --trace-syscalls=yes --trace-signals=yes
your_app

and if your application is multi-threaded (I guess it is), you might also use
--trace-sched=yes

With regards to "What intrigues me that both the syscall number and the return
value appear in the RAX register at some point."
If you speak about the "physical RAX register", then I think this is normal. To
execute a syscall, the RAX register must be set
to the syscall number before the syscall instruction, and on return of the
syscall instruction, the RAX register contains the syscall return value.

When this syscall instruction is finished, the syscall return code (stored by
the kernel in the physical register RAX) 
must be moved to the guest register RAX.
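
To illustrate the double role of the physical RAX register, here is a minimal
sketch for x86-64 Linux with GCC or Clang. It is not valgrind code; the function
name raw_syscall0 is made up for this example, and on other platforms it simply
falls back to the libc syscall() wrapper.

```c
#include <sys/syscall.h>
#include <unistd.h>

/* Executes a syscall that takes no argument. */
static long raw_syscall0(long nr)
{
#if defined(__x86_64__) && defined(__linux__)
    long rax = nr;                  /* before: RAX holds the syscall number */
    __asm__ volatile ("syscall"
                      : "+a"(rax)   /* "a" is RAX, used as input and output */
                      :
                      : "rcx", "r11", "memory");
    return rax;                     /* after: RAX holds the kernel's return value */
#else
    return syscall(nr);             /* fallback: let libc do the syscall */
#endif
}
```

For example, raw_syscall0(SYS_getpid) returns the same value as getpid().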


[valgrind] [Bug 441069] Process terminating with default action of signal 4 (SIGILL) Illegal opcode at address 0x580A3C2C at 0x4000B00: ??? (in /lib/ld-2.26.so)

2022-08-22 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=441069

Philippe Waroquiers  changed:

   What|Removed |Added

 CC||philippe.waroquiers@skynet.
   ||be

--- Comment #1 from Philippe Waroquiers  ---
As far as I can see, your test program is trivial.
If valgrind does not work at all on such a trivial program, it might be due to
your specific installation/version of the OS
(your program crashes in the dynamic loader).

So, the first thing to do is to try with the latest  valgrind version (3.19.0
or the git version)


[valgrind] [Bug 458118] Track deletions of objects from unloaded shared libraries

2022-08-22 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=458118

Philippe Waroquiers  changed:

   What|Removed |Added

 CC||philippe.waroquiers@skynet.
   ||be

--- Comment #3 from Philippe Waroquiers  ---
Valgrind keeps recently freed blocks in a list, which allows it to report where a
freed block was allocated. If the size of this list (controlled by the --freelist-vol
parameter) is big enough and you use --keep-debuginfo=yes, then I think
valgrind should be able to tell you the stack trace that allocated the
referenced freed block.

Now, if the segmentation violation happens because the destructor code has been
unloaded and this destructor code is not found anymore via a pointer in the
dispatch table, then valgrind cannot help: it does not track executable code
and/or dispatch tables.


[valgrind] [Bug 457898] Multiple threads: Assertion 'found' failed.

2022-08-15 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=457898

Philippe Waroquiers  changed:

   What|Removed |Added

 CC||philippe.waroquiers@skynet.
   ||be

--- Comment #1 from Philippe Waroquiers  ---
Works for me on debian 11.4

Can you re-run your test program with debug info under a valgrind itself
compiled with debug info and using more trace
 (such as -d -d -d --trace-sched=yes)  and attach the resulting log file ?

Thanks
Philippe


[valgrind] [Bug 457619] Instructions are not consistently executed after returning from a SIGSEGV signal handler

2022-08-13 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=457619

Philippe Waroquiers  changed:

   What|Removed |Added

 CC||philippe.waroquiers@skynet.
   ||be

--- Comment #3 from Philippe Waroquiers  ---
Using the below should solve the problem:
   valgrind --vex-iropt-register-updates=allregs-at-mem-access  ./test

See https://valgrind.org/docs/manual/manual-core.html#manual-core.signals for
more info.


[valgrind] [Bug 455826] Running Valgrind memcheck on a live process without exiting it reports LDL but on graceful exit it does not.

2022-06-26 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=455826

--- Comment #7 from Philippe Waroquiers  ---
On Fri, 2022-06-24 at 10:06 +, shapath wrote:
> https://bugs.kde.org/show_bug.cgi?id=455826
> 
> --- Comment #5 from shapath  ---
> (In reply to Philippe Waroquiers from comment #4)
> > (In reply to shapath from comment #3)
> > > 
> > > Valgrind report:-
> > > ==
> > > (gdb) monitor leak_check full reachable any
> > When compiling with gcc -g -O0 and doing the leak search,
> > I do not get any definitely or possibly leaked block. Leak search reports
> > 2 still reachable blocks.
> > 
> > You can use the following to see why a block is still reachable:
> > (where 0x4a330a0 is the addess of the strdup-ed "hello" 
> > (gdb) mo w 0x4a330a0
> > ==8392== Searching for pointers to 0x4a330a0
> > ==8392== tid 1 register R8 pointing at 0x4a330a0
> > (gdb) 
> > 
> > 
> > As you can see, in my case, the address of the just allocated name still
> > happens
> > to be in a register.
> > 
> > When I force main to return, then name is reported as definitely leaked
> > (as the register pointing to name is likely used for something else)
> 
> Tried the suggestion to compile with -O0.  Also modified program to print the
> address for strdup-ed "hello" before the program hits the infinite while loop.
> i see it reported as a definite leak.
> 
>  I tried "mo w 0x52050a0" which does not return any reference register where
> 0x52050a0 is the address.
Depending on the code generated by the compiler and the moment at which a leak
search
is done, a pointer might still be present (or not) in one register.

> 
> (sjohri/coding)$ gcc -g -O0 valgrind.c  -o val_exmple
> (sjohri/coding)$ valgrind --log-file=/var/tmp/_valgrind_%p
> --xml-file=/var/tmp/_valgrind_xml_%p  ./val_exmple
> 
>  "The strdup-ed address is 0x52050a0"
> 
> :(sjohri/coding)$ gdb val_exmple
> GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-120.0.1.el7
> Copyright (C) 2013 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
> and "show warranty" for details.
> This GDB was configured as "x86_64-redhat-linux-gnu".
> For bug reporting instructions, please see:
> <http://www.gnu.org/software/gdb/bugs/>...
> Reading symbols from /home/sjohri/coding/val_exmple...done.
> (gdb) target remote | vgdb
> Remote debugging using | vgdb
> relaying data between gdb and process 86547
> Reading symbols from
> /usr/libexec/valgrind/vgpreload_core-amd64-linux.so...done.
> Loaded symbols for /usr/libexec/valgrind/vgpreload_core-amd64-linux.so
> Reading symbols from
> /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so...done.
> Loaded symbols for /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so
> Reading symbols from /lib64/libc.so.6...(no debugging symbols found)...done.
> Loaded symbols for /lib64/libc.so.6
> Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging symbols
> found)...done.
> Loaded symbols for /lib64/ld-linux-x86-64.so.2
> 0x04efc9e0 in __nanosleep_nocancel () from /lib64/libc.so.6
> Missing separate debuginfos, use: debuginfo-install
> glibc-2.17-326.0.1.el7_9.x86_64
> (gdb) monitor leak_check full reachable any
> ==86547== 6 bytes in 1 blocks are definitely lost in loss record 1 of 2
> ==86547==at 0x4C29F73: malloc (vg_replace_malloc.c:309)
> ==86547==by 0x4EC3B89: strdup (in /usr/lib64/libc-2.17.so)
> ==86547==by 0x4006B2: my_malloc (valgrind.c:29)
> ==86547==by 0x40070E: main (valgrind.c:39)
As the struct component "name" is not aligned, the content of "char *name"
is not considered as a pointer, and so the strdup-ed string is considered 
as definitely lost in your case.
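
The struct from the test program is not shown in this mail, so the layout below
is hypothetical. It only sketches how a packed struct can put "char *name" at an
offset that is not pointer-aligned, assuming GCC or Clang for the packed
attribute.

```c
#include <stddef.h>

/* Hypothetical layouts, not the ones from the reported program. */
struct __attribute__((packed)) packed_rec {
    char  tag;
    char *name;     /* sits at offset 1: not pointer-aligned */
};

struct aligned_rec {
    char  tag;
    char *name;     /* the compiler pads so that name is pointer-aligned */
};

/* A leak search that only looks at aligned words can see a pointer stored at
   this offset only if the offset is a multiple of the pointer size (the
   start of a malloc-ed block is itself aligned). */
static int offset_is_pointer_aligned(size_t offset)
{
    return offset % sizeof(void *) == 0;
}
```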



Philippe


[valgrind] [Bug 455826] Running Valgrind memcheck on a live process without exiting it reports LDL but on graceful exit it does not.

2022-06-24 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=455826

--- Comment #4 from Philippe Waroquiers  ---
(In reply to shapath from comment #3)
>
> Valgrind report:-
> ==
> (gdb) monitor leak_check full reachable any
When compiling with gcc -g -O0 and doing the leak search,
I do not get any definitely or possibly leaked block. Leak search reports
2 still reachable blocks.

You can use the following to see why a block is still reachable:
(where 0x4a330a0 is the address of the strdup-ed "hello")
(gdb) mo w 0x4a330a0
==8392== Searching for pointers to 0x4a330a0
==8392== tid 1 register R8 pointing at 0x4a330a0
(gdb) 


As you can see, in my case, the address of the just allocated name still
happens
to be in a register.

When I force main to return, then name is reported as definitely leaked
(as the register pointing to name is likely used for something else)


[valgrind] [Bug 455826] Running Valgrind memcheck on a live process without exiting it reports LDL but on graceful exit it does not.

2022-06-23 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=455826

Philippe Waroquiers  changed:

   What|Removed |Added

 CC||philippe.waroquiers@skynet.
   ||be

--- Comment #2 from Philippe Waroquiers  ---
Note that the leak search algorithm is scanning the memory starting from "root"
memory zone (stacks, global variables, registers, ...).
During this scanning, any aligned piece of memory which happens to point at a
block will be considered as a pointer.
So, for example, if an integer variable happens to have the same bit
representation as the address of an allocated (but lost) block,
the leak search will not detect the lost block as a leak, because it has found
a "pointer" to this block.

So, possibly, depending on what the process does before exit, it might create
some bit patterns that look like a pointer.

The leak search algorithm might thus have false negatives: some real leaks might
not be detected.
I do not see how the leak search algorithm could create a false positive lost
block (ignoring the possibility that the algorithm
is buggy of course).
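
The scanning idea described above can be sketched as follows. This is a
simplified illustration, not memcheck's implementation: the function name is
made up, and it treats any address inside the block as a reference.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if any pointer-aligned word in [region, region+len) holds a value
   that falls inside [block, block+block_size). The scan cannot tell a real
   pointer from an integer that happens to have the same bit pattern. */
static int region_points_to_block(const void *region, size_t len,
                                  const void *block, size_t block_size)
{
    const uintptr_t word_size = sizeof(uintptr_t);
    uintptr_t lo  = (uintptr_t)block;
    uintptr_t hi  = lo + block_size;
    uintptr_t end = (uintptr_t)region + len;
    /* round the start of the region up to the next aligned word */
    uintptr_t a   = ((uintptr_t)region + word_size - 1) & ~(word_size - 1);

    for (; a + word_size <= end; a += word_size) {
        uintptr_t word = *(const uintptr_t *)a;
        if (word >= lo && word < hi)
            return 1;   /* looks like a pointer: block is not reported lost */
    }
    return 0;           /* nothing points at the block: reported as lost */
}
```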

Note also that monitor leak_check is just launching the same leak search
algorithms as used by client requests and used at exit.

As Paul said, more info (e.g. what does the leak stack trace look like ? Is
such a leak report plausible when it is detected) might clarify.


[valgrind] [Bug 452058] Generated suppressions contain a mix of mangled (physical) and demangled (inline) frames

2022-04-10 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=452058

Philippe Waroquiers  changed:

   What|Removed |Added

 CC||philippe.waroquiers@skynet.
   ||be

--- Comment #3 from Philippe Waroquiers  ---
Note that even when not using inline information, I think that suppression
entries will similarly not match
when the compiler makes different inlining decisions.

If we tell valgrind to not use the inline information (--read-inline-info=no), 
then generated suppression entries will only contain
the non inlined calls. But if the compiler decides to inline more (or less),
then there will be less (or more) entries in the suppression.

So, as suggested by Mark, assuming we can always use the mangled name and then
always use the inline info, we can then expect
to have a "more stable" number of stack frames put in the suppression entries
(and so not depend anymore on the inlining decisions).

If mangled names are not found in the debug info, then it looks like we
will/might need to make suppression matching
even more sophisticated (another name for complex :)).


[valgrind] [Bug 440765] Feature request: when a dynamically allocated variable is last read/written

2021-08-09 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=440765

Philippe Waroquiers  changed:

   What|Removed |Added

 CC||philippe.waroquiers@skynet.
   ||be

--- Comment #1 from Philippe Waroquiers  ---
Keeping a stack trace of all accesses is a very heavy functionality.
It can be implemented: e.g. helgrind --history-level=full provides
this history of past accesses (with the amount of history to keep
controlled by --conflict-cache-size=N).

An alternative might be to use valgrind+gdb/vgdb and use the gdb command watch
to watch accesses to a piece of memory.


[valgrind] [Bug 435493] Cannot create 'R_TempDir' under valgrind-3.17.0.GIT-lbmacos

2021-04-11 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=435493

Philippe Waroquiers  changed:

   What|Removed |Added

 CC||philippe.waroquiers@skynet.
   ||be

--- Comment #1 from Philippe Waroquiers  ---
Syscall 475 on darwin is the mkdirat syscall. I guess that when this syscall
fails, R then reports the fatal error that it cannot create R_Tempdir.


I do not have access to a macOS system, so I cannot fix this, but
you might try the following: it might be an easy change in the valgrind file
m_syswrap/priv_syswrap-darwin.h, to add something similar to e.g. the
readlinkat line a few lines above, and then add a PRE(sys_mkdirat) in
syswrap-darwin.c
somewhat similar to the PRE(sys_mkdirat) in syswrap-linux.c
(I guess only the syscall convention will have to be updated).


[valgrind] [Bug 434035] vgdb might crash if valgrind is killed

2021-03-09 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=434035

--- Comment #1 from Philippe Waroquiers  ---
Patch looks ok to me.


[valgrind] [Bug 338633] gdbserver_tests/nlcontrolc.vgtest hangs on arm64

2021-03-08 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=338633

Philippe Waroquiers  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|REPORTED|RESOLVED

--- Comment #6 from Philippe Waroquiers  ---
Fixed in c79180a3


[valgrind] [Bug 432870] gdbserver_tests:nlcontrolc hangs with newest glibc2.33 x86-64

2021-03-08 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=432870

Philippe Waroquiers  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|CONFIRMED   |RESOLVED

--- Comment #9 from Philippe Waroquiers  ---
Fixed in c79180a3


[valgrind] [Bug 432870] gdbserver_tests:nlcontrolc hangs with newest glibc2.33 x86-64

2021-03-07 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=432870

--- Comment #7 from Philippe Waroquiers  ---
Created attachment 136473
  --> https://bugs.kde.org/attachment.cgi?id=136473&action=edit
fix nlcontrolc.vgtest blocking on arm64 or newer glibc

The attached patch should fix the blockage. Tested on debian 10/amd64 and on an arm64
platform.


[valgrind] [Bug 434057] Add stdio mode to valgrind's gdbserver

2021-03-06 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=434057

Philippe Waroquiers  changed:

   What|Removed |Added

 CC||philippe.waroquiers@skynet.
   ||be

--- Comment #1 from Philippe Waroquiers  ---
Note that the valgrind gdbserver is very special: it is not a 'separate' process
that e.g. does ptrace system calls to debug an inferior.
The valgrind gdbserver is embedded in valgrind itself.
Due to that, when the valgrind guest process is blocked in a syscall,
the valgrind gdbserver cannot 'read' packets from gdb.
The intermediate vgdb process is reading the packets from gdb,
and then wakes up (in a very special way) the valgrind runtime to get
the valgrind guest process out of the blocking system call.
As such, a direct connection between gdb and the valgrind gdbserver
will not work properly.

A possible implementation might be a new vgdb option such as:
 --valgrind [VALGRIND_OPTIONS ...] PROGRAM_TO_RUN_UNDER_VALGRIND [PROGRAM_ARGS
...]

When vgdb gets this argument, it would launch valgrind itself and connect to
it.
That thus avoids having to launch valgrind in one window, and gdb/vgdb
in another window.


[valgrind] [Bug 432510] RFE: ENOMEM fault injection mode

2021-02-15 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=432510

Philippe Waroquiers  changed:

   What|Removed |Added

 CC||philippe.waroquiers@skynet.
   ||be

--- Comment #4 from Philippe Waroquiers  ---
To have a flexible way to specify when/where a memory allocation should fail,
we might use something that re-uses (part of) the suppression infrastructure:

The user would give a file with 'suppression-like' entries, but instead of
suppressing errors, these entries would put a limit (in nr of allocated blocks
and/or nr of allocated bytes) after which a malloc would return NULL.

That should be relatively cpu-cheap to implement, as the matching between
the alloc stacktrace and the 'heap-limit supp entries' has to be done
only the first time a new stacktrace is stored in the list of stack traces.
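
The limit part of this idea can be sketched in plain C as below. It is not
valgrind code and leaves out the matching against the allocation stack trace;
struct heap_limit and limited_malloc are hypothetical names, and one such limit
would be attached to each matching 'heap-limit supp entry'.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-entry limit; a limit of 0 means "no limit". */
struct heap_limit {
    size_t max_blocks, max_bytes;   /* limits given by the user */
    size_t blocks, bytes;           /* what has been allocated so far */
};

/* Allocates like malloc until one of the limits would be exceeded; from then
   on returns NULL with errno set to ENOMEM. */
static void *limited_malloc(struct heap_limit *lim, size_t size)
{
    if ((lim->max_blocks != 0 && lim->blocks + 1 > lim->max_blocks) ||
        (lim->max_bytes != 0 && lim->bytes + size > lim->max_bytes)) {
        errno = ENOMEM;             /* injected failure */
        return NULL;
    }
    void *p = malloc(size);
    if (p != NULL) {
        lim->blocks += 1;
        lim->bytes += size;
    }
    return p;
}
```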


[valgrind] [Bug 427510] Use of uninitialized value in callgrind_annotate.

2020-10-10 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=427510

Philippe Waroquiers  changed:

   What|Removed |Added

 CC||philippe.waroquiers@skynet.
   ||be

--- Comment #1 from Philippe Waroquiers  ---
Seems fixed in recent git version.
Can you try with the last 3.16 version (or the last GIT version), 
instead of a 3.16 GIT version ?


[valgrind] [Bug 426853] vagrind+massif analyze ceph-osd memory OOM

2020-09-26 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=426853

Philippe Waroquiers  changed:

   What|Removed |Added

 CC||philippe.waroquiers@skynet.
   ||be

--- Comment #1 from Philippe Waroquiers  ---
The message "Error: can not open xtree output file" is produced
when the open system call fails.
Such an "open" failing is likely caused by some protection problems
at operating system level.

I see e.g. that the program gets argument such as --setuser ceph.
Are you sure the user ceph can write in /root ?

You could use the valgrind arguments --massif-out-file=... to change the
location where the file is produced.


[valgrind] [Bug 424656] Uninitialised value was created by a heap allocation

2020-07-25 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=424656

Philippe Waroquiers  changed:

   What|Removed |Added

 Resolution|--- |NOT A BUG
 Status|REPORTED|RESOLVED

--- Comment #3 from Philippe Waroquiers  ---
Yes, you can suppress errors.

See user manual for more info:
https://www.valgrind.org/docs/manual/manual-core.html#manual-core.suppress

More generally, it is highly recommended to read or at least scan the user
manual.


[valgrind] [Bug 424656] Uninitialised value was created by a heap allocation

2020-07-25 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=424656

Philippe Waroquiers  changed:

   What|Removed |Added

 CC||philippe.waroquiers@skynet.
   ||be

--- Comment #1 from Philippe Waroquiers  ---
That looks like a real bug that valgrind detects.

The malloc allocates 32 bytes, the strcpy initialises 16 bytes
but the printf loop prints the 32 bytes, so effectively prints data not
initialised.
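
The reporter's program is not reproduced in this mail, so the following is only
a hypothetical reconstruction of the pattern, with made-up names. It shows which
part of the buffer is initialised and so safe to print.

```c
#include <stdlib.h>
#include <string.h>

#define BUF_SIZE 32

/* Allocates BUF_SIZE bytes, but only the 16 bytes written by strcpy
   (15 characters plus the terminating NUL) are initialised. */
static char *make_partly_initialised(void)
{
    char *buf = malloc(BUF_SIZE);
    if (buf != NULL)
        strcpy(buf, "0123456789abcde");
    return buf;
}

/* Number of bytes that can be read without touching uninitialised data. */
static size_t initialised_bytes(const char *buf)
{
    return strlen(buf) + 1;
}
```

A loop that prints BUF_SIZE bytes reads the 16 bytes that were never written; a
loop bounded by initialised_bytes(buf), or a buffer obtained with calloc, does
not.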


[valgrind] [Bug 417075] pwritev(vector[...]) suppression ignored

2020-04-24 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=417075

Philippe Waroquiers  changed:

   What|Removed |Added

 Status|REPORTED|RESOLVED
 Resolution|--- |INTENTIONAL

--- Comment #12 from Philippe Waroquiers  ---
After much discussion, we decided to keep the 3.15 behaviour
rather than rollback to 3.14 behaviour.
But in valgrind 3.16, the incompatible supp entry will be detected,
and a warning message will be given.
That has been pushed today as d9e714812
This is not really fixing the bug, so closing it as RESOLVED INTENTIONAL.

Sorry for the backward incompatible change pushed in 3.15,
we will try harder in the future to avoid such incompatible changes.


[valgrind] [Bug 417075] pwritev(vector[...]) suppression ignored

2020-04-23 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=417075

--- Comment #11 from Philippe Waroquiers  ---
Updated the warning message to be:
==3170== WARNING: preadv(vector[...]) is an obsolete suppression line not
supported anymore since valgrind 3.15.
==3170== You should replace [...] by a specific index such as [0] or [1] or [2]
or similar


[valgrind] [Bug 417075] pwritev(vector[...]) suppression ignored

2020-04-23 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=417075

--- Comment #10 from Philippe Waroquiers  ---
Some further notes:
I should re-update the warning to replace the final 'or ...'
by 'or [2]'.

And I sincerely hope that nobody is using preadv and pwritev wrongly
with huge vectors, as otherwise they might need to type a lot of supp
entries.
(that is in fact a main reason to avoid variable extra error lines ...)


[valgrind] [Bug 417075] pwritev(vector[...]) suppression ignored

2020-04-23 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=417075

--- Comment #9 from Philippe Waroquiers  ---
Created attachment 127809
  --> https://bugs.kde.org/attachment.cgi?id=127809&action=edit
not a fix, but detects the incompatible supp entry and produce a warning

The commit log explains in detail what we envisaged,
and why the decision is to keep what 3.15 accepts instead
of rolling back the change in 3.15 or making the error matching logic
even more complex.


[valgrind] [Bug 419562] PR_SET_PTRACER error with Ubuntu on WSL

2020-04-13 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=419562

--- Comment #3 from Philippe Waroquiers  ---
(In reply to Evan Hunter from comment #2)
> Thanks for the pointers on testing it on vgdb.
> It looks like it still hangs vgdb :-(
> I too am not sure what the prctl(PR_SET_PTRACER, 1, 0, 0, 0) call is trying
> to achieve. It seems to succeed but still leaves vgdb blocked.

Using the debug options -d -d -d of vgdb and -v -v -v -d -d -d of valgrind,
it should be possible to have an idea about what does not work.


[valgrind] [Bug 419562] PR_SET_PTRACER error with Ubuntu on WSL

2020-04-04 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=419562

Philippe Waroquiers  changed:

   What|Removed |Added

 CC||philippe.waroquiers@skynet.
   ||be

--- Comment #1 from Philippe Waroquiers  ---
Thanks for the proposed patch.

I do not remember the reason for the line
   ret = VG_(prctl) (PR_SET_PTRACER, 1, 0, 0, 0);

prctl PR_SET_PTRACER documentation indicates that the second argument
is either PR_SET_PTRACER_ANY (to allow any process to ptrace the caller),
or a pid (to allow pid to ptrace the caller)
or 0 (to not allow anymore a process to ptrace the caller).
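
As a small sketch of these three possibilities from user code (Linux only; the
wrapper name set_ptracer is made up for this example and is not the valgrind
code):

```c
#include <errno.h>
#include <sys/prctl.h>

/* who is PR_SET_PTRACER_ANY, a pid, or 0, as described above.
   Returns 0 on success, or a negative errno value on failure. The call may
   fail (for example with EINVAL) on kernels where the Yama security module
   is not enabled. */
static int set_ptracer(unsigned long who)
{
    if (prctl(PR_SET_PTRACER, who, 0, 0, 0) == -1)
        return -errno;
    return 0;
}
```

For example, set_ptracer(PR_SET_PTRACER_ANY) asks the kernel to let any process
ptrace the caller.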

So, the reason for the call with the pid 1 is not clear (anymore) to me.
I must have had a good reason at the time, but did not comment it :(.

That being said:
Does calling set_ptracer with value 1 effectively allow vgdb to
get a blocked valgrind process out of the syscall ?

In other words, before your patch:
   valgrind sleep 100
in another window:   vgdb help
and vgdb should block or give error msg or similar
 (you might use vgdb -d -d -d -d help   to get more info about what
  is going on)

and after your patch, vgdb -d -d -d -d help  should be able to wake up
valgrind and produce the help text.


[valgrind] [Bug 418840] SIG_IGN doesn't clear pending signal if SIG_IGN is already the handler

2020-03-14 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=418840

Philippe Waroquiers  changed:

   What|Removed |Added

 CC||philippe.waroquiers@skynet.
   ||be

--- Comment #1 from Philippe Waroquiers  ---
I think the bug originates from the lazy translation of the scss to skss,
in function static void handle_SCSS_change ( Bool force_update ).

When force_update is false (which is the case when handling a guest 
sigaction call), and the old setup is equal to the new setup of the
kernel signal state, then no kernel sigaction is executed.

So, when a signal is blocked+ignored then raised,
the signal is queued by the kernel till it is unblocked (and then
it will be ignored).

In this state, if a blocked signal is re-ignored, the kernel clears it.
But in the same circumstance, valgrind signal handling does not call sigaction:
valgrind does not know that there is a blocked ignored queued signal
in the kernel, it just sees that the signal handling is not changed,
and then it does not call the kernel to tell to re-ignore the signal.

Changing the call of handle_SCSS_change in sys_sigaction to always use
force_update solves the problem, but that means the lazy update of
the signal handling is removed.

It would be possible to detect this case in handle_SCSS_change:
if the current setup is blocked + ignored, then this function
could check if there is a signal pending, and then it should still
call the kernel to clear the blocked signal.
That means one more syscall to get the pending signals in case one or more
signals are blocked + ignored.


[valgrind] [Bug 417075] pwritev(vector[...]) suppression ignored

2020-02-05 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=417075

--- Comment #7 from Philippe Waroquiers  ---
(In reply to Mark Wielaard from comment #5)
> This is unfortunate and an unforeseen consequence of making the the error
> message more useful (it is useful to know which vector contained
> uninitialised bytes).
Yes, having more precise info in this area can be very useful.

To give more information about an error without making them 'different',
it might be more appropriate to put more information in the 'extra' part
of the error.
The logic to compare this "extra" part is in the tool, and so the tool
can consider that this additional info is not to be compared in the error
equality logic.

And of course, as usual, if you have an error, you can use gdb+vgdb
to get as much details as possible about the error being reported.


[valgrind] [Bug 417075] pwritev(vector[...]) suppression ignored

2020-02-05 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=417075

Philippe Waroquiers  changed:

   What|Removed |Added

 CC||jsew...@acm.org


[valgrind] [Bug 417075] pwritev(vector[...]) suppression ignored

2020-02-05 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=417075

--- Comment #6 from Philippe Waroquiers  ---
(In reply to Mark Wielaard from comment #5)
> This is unfortunate and an unforeseen consequence of making the the error
> message more useful (it is useful to know which vector contained
> uninitialised bytes).
> 
> Sadly we have had releases with both the old and new variant of the error
> message. So we would indeed break some existing suppressions picking either
> the old or new variant.
Yes, unfortunate.  I will add Julian in the cc list to have some more
opinions ...

> 
> I wonder if we can make [...] special so that it either matches the literal
> string [...] or [].
This can for sure be coded, but the error matching code/logic is IMO
quite complex already, and moreover, this logic will be tool and error
dependent, as the extra message matching is done by the tool, not by
the m_errormgr.c common core module.

> 
> Note: we do use the [...] variant also in [process_vm_](readv|writev) and
> vmsplice.


[valgrind] [Bug 417075] pwritev(vector[...]) suppression ignored

2020-02-04 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=417075

Philippe Waroquiers  changed:

   What|Removed |Added

 CC||ahajk...@redhat.com


[valgrind] [Bug 417075] pwritev(vector[...]) suppression ignored

2020-02-04 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=417075

Philippe Waroquiers  changed:

   What|Removed |Added

 CC||m...@klomp.org


[valgrind] [Bug 417075] pwritev(vector[...]) suppression ignored

2020-02-04 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=417075

--- Comment #4 from Philippe Waroquiers  ---
(In reply to Leonid Yuriev from comment #3)
> > What does `valgrind --gen-suppressions=all ...` show ?
> {
>
>Memcheck:Param
>pwritev(vector[0])
>fun:pwritev
>fun:mdbx_pwritev
>fun:mdbx_flush_iov
>fun:mdbx_page_flush
>fun:mdbx_txn_commit
>fun:_ZN8testcase16breakable_commitEv
>fun:_ZN8testcase39db_open__begin__table_create_open_cleanERj
>fun:_ZN15testcase_nested5setupEv
>fun:_Z12test_executeRK12actor_config
>fun:_Z16osal_actor_startRK12actor_configRi
>fun:main
> }
> 
> {
>
>Memcheck:Param
>pwritev(vector[1])
> 
> ...
>pwritev(vector[2])
> ...
>pwritev(vector[3]) and so on.
> 
> 
> > Does it suppress if you use the suppression exactly as generated by 
> > valgrind ?
> Suppressions works for an every explicitly `vector[N]`, but not for the
> `vector[...]`.

As I understand, you are expecting vector[...] in the line following
Memcheck:Param to match vector[1] or vector[2] or ...
There is no such logic.  This part of the error must match exactly.

Your suppression entry was working with 3.13 because in 3.13, the error
was generated with vector[...]
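
The exact matching can be pictured with the sketch below. It only illustrates
the rule and is not the actual valgrind code; the function name is made up.

```c
#include <string.h>

/* The line following Memcheck:Param in the suppression is compared as a whole
   string with the parameter string of the error: "[...]" is not a wildcard. */
static int param_supp_matches(const char *supp_param, const char *error_param)
{
    return strcmp(supp_param, error_param) == 0;
}
```

With this rule, "pwritev(vector[...])" matches an error generated as
"pwritev(vector[...])", but not one generated as "pwritev(vector[1])".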

Such errors for pwritev were changed from producing vector[...]
to vector[0], vector[1], etc as part of the commit:

commit b0861063a8d2a55bb7423e90d26806bab0f78a12
Author: Alexandra Hájková 
AuthorDate: Tue Jun 4 13:47:14 2019 +0200
Commit: Mark Wielaard 
CommitDate: Wed Jul 3 00:19:16 2019 +0200

As far as I can see, the fact that the new error message after this commit
contains a varying offset between brackets is what causes the problem:
this looks to me to be a backward incompatible change (as shown by your
supp that stopped working between 3.13 and 3.15) and does not match
the 'idea' of error parameters. Here are some comments extracted from
the description of void VG_(maybe_record_error):
   Note that `ekind' and `s' are also used to generate a suppression.
   `s' should therefore not contain data depending on the specific
   execution (such as addresses, values) but should rather contain
   e.g. a system call parameter symbolic name.
(where 's' is this vector[1] etc string).

Wondering how to fix this ...
If we go back to the behaviour before 3.15, we break the suppression
entries working for 3.15, and if we do not go back, we break the suppression
entries working for 3.14 and before.


[valgrind] [Bug 417075] pwritev(vector[...]) suppression ignored

2020-02-04 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=417075

Philippe Waroquiers  changed:

   What|Removed |Added

 CC||philippe.waroquiers@skynet.
   ||be

--- Comment #2 from Philippe Waroquiers  ---
What does
  valgrind --gen-suppressions=all ...
show ?

Does it suppress if you use the suppression exactly as generated by valgrind ?

Thanks


[valgrind] [Bug 413603] callgrind_annotate/cg_annotate truncate function names at '#'

2019-11-03 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=413603

Philippe Waroquiers  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|CONFIRMED   |RESOLVED

--- Comment #6 from Philippe Waroquiers  ---

Thanks for the analysis and patch, pushed in aaf64922a


[valgrind] [Bug 413603] callgrind_annotate/cg_annotate truncate function names at '#'

2019-11-03 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=413603

Philippe Waroquiers  changed:

   What|Removed |Added

Summary|callgrind_annotate  |callgrind_annotate/cg_annot
   |truncates function names at |ate truncate function names
   |'#' |at '#'


[valgrind] [Bug 413603] callgrind_annotate truncates function names at '#'

2019-10-31 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=413603

--- Comment #5 from Philippe Waroquiers  ---
(In reply to Andreas Arnez from comment #3)
> (In reply to Philippe Waroquiers from comment #2)
> > * I am wondering if we should not allow comment lines starting with
> >   zero or more space characters (like empty lines?) followed by # ?
> I wondered about that, too.  Actually my first version allowed whitespace
> before the comment marker.  But then I realized that callgrind_annotate
> doesn't skip spaces at the beginning of a line in other cases, either.  Thus
> I adjusted my patch to be consistent with the rest of the logic.
> Do you know of a case where it would be necessary to allow whitespace before
> the comment marker?
No, I had no specific example in mind, but as the callgrind format is
quite general, I was afraid we might have a tool not in the valgrind tree
producing such comment lines with spaces before #.
I however just tried with kcachegrind, and it reports a warning for
such # lines that have spaces before the #.

So, I think that what you have done is ok.
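
The rule settled on here (a line is a comment only when '#' is its very
first character, with no leading whitespace skipped) can be sketched in C as
follows. callgrind_annotate itself is a Perl script, so this is only an
illustration; is_comment_line is a hypothetical name.

```c
#include <stdbool.h>

/* Hypothetical illustration: a line counts as a comment only if '#' is in
   column 0. Leading spaces are not skipped, so "  # x" is not a comment,
   and a '#' inside a function name does not start a comment. */
static bool is_comment_line(const char *line)
{
    return line[0] == '#';
}
```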


[valgrind] [Bug 413603] callgrind_annotate truncates function names at '#'

2019-10-30 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=413603

--- Comment #2 from Philippe Waroquiers  ---
Thanks for the patch.

Two small comments:
* I am wondering if we should not allow comment lines starting with
  zero or more space characters (like empty lines?) followed by # ?
* cg_annotate seems to suffer from the same bug.


[valgrind] [Bug 410924] massif crashes running jetstream2 benchmark with webkit

2019-09-05 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=410924

--- Comment #3 from Philippe Waroquiers  ---
Any feedback ?


[valgrind] [Bug 411134] Allow the user to change a set of command line options during execution.

2019-08-31 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=411134

Philippe Waroquiers  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|REPORTED|RESOLVED

--- Comment #3 from Philippe Waroquiers  ---
Pushed as 3a803036f


[valgrind] [Bug 411134] Allow the user to change a set of command line options during execution.

2019-08-21 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=411134

Philippe Waroquiers  changed:

   What|Removed |Added

 CC||philippe.waroquiers@skynet.
   ||be

--- Comment #1 from Philippe Waroquiers  ---
For background info, see discussion with MariaDB developers
https://sourceforge.net/p/valgrind/mailman/message/36738630/
and a similar discussion on StackOverflow
https://stackoverflow.com/questions/57245062/suppress-leak-check-in-a-specific-forked-child


[valgrind] [Bug 411134] New: Allow the user to change a set of command line options during execution.

2019-08-21 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=411134

            Bug ID: 411134
           Summary: Allow the user to change a set of command line options
                    during execution.
           Product: valgrind
           Version: unspecified
          Platform: Other
                OS: Linux
            Status: REPORTED
          Severity: wishlist
          Priority: NOR
         Component: general
          Assignee: jsew...@acm.org
          Reporter: philippe.waroqui...@skynet.be
  Target Milestone: ---

Created attachment 122275
  --> https://bugs.kde.org/attachment.cgi?id=122275&action=edit
patch to implement dynamically changeable options

The attached patch changes the command line option framework and parsing
code to allow changing (some) command line options dynamically.

Here is a summary of the new functionality (extracted from NEWS):
* It is now possible to dynamically change the value of many command
  line options while your program (or its children) are running under
  Valgrind.
  To have the list of dynamically changeable options, run
 valgrind --help-dyn-options
  You can change the options from the shell by using vgdb to launch
  the monitor command "v.clo ...".
  The same monitor command can be used from a gdb connected
  to the valgrind gdbserver.
  Your program can also change the dynamically changeable options using
  the client request VALGRIND_CLO_CHANGE(option).
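
As a rough sketch of the client-request route, a program could issue the
request as shown below. The macro name VALGRIND_CLO_CHANGE comes from the
NEWS text above; the assumptions here are that it is provided by
valgrind/valgrind.h in a Valgrind build containing this patch, that it takes
the option as a string, and that --leak-check is among the options listed by
--help-dyn-options. The fallback definition and the helper name
disable_leak_check are hypothetical and only there so that the sketch
compiles without the header.

```c
#if defined(__has_include)
#  if __has_include(<valgrind/valgrind.h>)
#    include <valgrind/valgrind.h>
#  endif
#endif

/* Fallback (assumption): when the header or the macro is not available,
   make the request a no-op so that this sketch still compiles. */
#ifndef VALGRIND_CLO_CHANGE
#  define VALGRIND_CLO_CHANGE(option) ((void)(option))
#endif

/* Hypothetical helper: ask Valgrind to stop leak checking for this process,
   e.g. in a forked child. Outside Valgrind the request does nothing. */
static void disable_leak_check(void)
{
    VALGRIND_CLO_CHANGE("--leak-check=no");
}
```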


[valgrind] [Bug 400593] In Coregrind, use statx for some internal syscalls if [f]stat[64] fail

2019-08-17 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=400593

--- Comment #5 from Philippe Waroquiers  ---
(In reply to Petar Jovanovic from comment #4)
> This version looks better, thanks.
> I have just pushed it [1] after some testing, but I will leave the issue
> open so we can see over the weekend whether there have been regressions on
> other platforms or not.
> 
> [1]
> https://sourceware.org/git/?p=valgrind.git;a=commit;
> h=c6a6cf929f3e2a9bf5d7f09f334ed4d67f2d6e18

It would be good to add the bug number to the list of fixed bugs in NEWS.

Thanks


[valgrind] [Bug 410924] massif crashes running jetstream2 benchmark with webkit

2019-08-15 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=410924

Philippe Waroquiers  changed:

   What|Removed |Added

 CC||philippe.waroquiers@skynet.
   ||be

--- Comment #1 from Philippe Waroquiers  ---
Does it also crash with other tools,
in particular the memcheck and none tools ?

Just to be sure, does it also crash if you specify --smc-check=all ?

In the attached log, we see that thread 1 is busy executing some code,
but thread 2 is blocked in a madvise syscall.
Remaining threads are in a timed wait call.

It is slightly strange to have thread 2 blocked in madvise syscall.
Maybe the application is doing very special actions on mapped pages ?

You could also see if adding --px-default=allregs-at-mem-access
or --px-default=allregs-at-each-insn changes the behaviour.
(Possibly, the javascript engine is changing the protection of some pages,
and expects 'precise exception handling' for handling signals.)


[valgrind] [Bug 410599] Non-deterministic behaviour of pth_self_kill_15_other test

2019-08-12 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=410599

--- Comment #5 from Philippe Waroquiers  ---
(In reply to Stefan Maksimovic from comment #4)
> Created attachment 122077 [details]
> pth_self_kill.patch v2
> 
> Thanks Philippe, validating the test through memcheck slipped my mind.
> 
> I've updated the patch by initializing the variables reported by memcheck,
> it should be fine now.

The patch looks ok to me and can be pushed.
Thanks for the work


[valgrind] [Bug 410599] Non-deterministic behaviour of pth_self_kill_15_other test

2019-08-10 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=410599

--- Comment #3 from Philippe Waroquiers  ---
(In reply to Stefan Maksimovic from comment #2)
> If it's not too much trouble, I suggest you test it yourself just to make
> sure.
I tested, and the modified test still reproduces the bug with the old
release (and is working ok with the new release).

However, the test itself is now using uninitialised variables:

==32132== Syscall param rt_sigaction(act->sa_mask) points to uninitialised byte(s)
==32132==    at 0x487D800: __libc_sigaction (sigaction.c:58)
==32132==    by 0x109266: main (pth_self_kill.c:39)
==32132==  Address 0x1ffefffd08 is on thread 1's stack
==32132==  in frame #0, created by __libc_sigaction (sigaction.c:43)
==32132== 
==32132== Syscall param rt_sigaction(act->sa_flags) points to uninitialised byte(s)
==32132==    at 0x487D800: __libc_sigaction (sigaction.c:58)
==32132==    by 0x109266: main (pth_self_kill.c:39)
==32132==  Address 0x1ffefffcf8 is on thread 1's stack
==32132==  in frame #0, created by __libc_sigaction (sigaction.c:43)


This should be fixed to ensure we have a deterministic test :).
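
For reference, a minimal sketch of installing a handler with every field of
struct sigaction initialised is given below. It is not the code of the test
or of the patch; on_signal and install_handler are hypothetical names.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Hypothetical handler used only for this illustration. */
static void on_signal(int signo)
{
    (void)signo;
}

/* Install on_signal for signo with every field of struct sigaction set,
   so that sa_mask and sa_flags are not left uninitialised.
   Returns 0 on success, -1 on failure (as sigaction does). */
static int install_handler(int signo)
{
    struct sigaction act;

    memset(&act, 0, sizeof act);   /* clear all fields, including sa_flags */
    sigemptyset(&act.sa_mask);     /* explicitly block no extra signals */
    act.sa_handler = on_signal;
    return sigaction(signo, &act, NULL);
}
```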


[valgrind] [Bug 410599] Non-deterministic behaviour of pth_self_kill_15_other test

2019-08-07 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=410599

--- Comment #1 from Philippe Waroquiers  ---
(In reply to Stefan Maksimovic from comment #0)
> A recent commit
> https://sourceware.org/git/?p=valgrind.git;a=commit;
> h=63a9f0793113fd5d828ea7b6183812ad71f924f1
> has introduced a test which exhibits different behaviour on some platforms.
> 
> Namely, running the pth_self_kill_15_other test on these can end in either
> of the following:
> 1) the spawned thread finishes first
> 2) the main thread finishes first
> 
> Running the test multiple times in succession we observed that on x86 the
> test finishes as described in the 2) case
> whereas on others either of the two cases can be present.
> We have seen this behaviour on different arm and mips platforms.
> 
> In the 2) case the output we get corresponds with the .exp file while in the
> 1) case we get an extra 'Terminated' string from the kernel on stderr.
> 
> Moreover, we ran the test on arm/mips without the functionality the rest of
> that patch provides, to test whether it really hangs/loops on arm/mips or
> not.
> Interestingly the pth_self_kill_9 test behaves the same on arm/mips and x86
> whereas the pth_self_kill_15_other does finish on arm/mips
> (it prints the 'Terminated' message - the spawned thread finishes first).
> 
> A possible solution would be to make the test deterministic; one way would
> consist of inserting a pthread_join call.
> That would alter the test in terms of the output produced but we believe
> that the nature of the test itself would remain intact.
> Reading the commit message which introduced the tests, we gather that the
> purpose was to test two scenarios(loop/hang) which the
> commit was created to solve.
> In case the above suggested change would not disrupt the intended
> functionality of the test, would it be applicable?
> 
> What course of action would you recommend?

Thanks for looking at this (this part of the code and the related tests
 are very tricky).

I suggest you reproduce the bug by using the test program
and the previous version of Valgrind.

Then modify the test as you want to make it deterministic, but
verify that the test still triggers the bug with the old version
of Valgrind.
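
The pthread_join idea from the quoted report can be sketched as below. This
is not the actual pth_self_kill test; worker and run_ordered are hypothetical
names, and the sketch only shows how joining fixes the order in which the two
threads finish. It may need to be built with -pthread.

```c
#include <pthread.h>

/* Hypothetical worker: records that it ran. */
static void *worker(void *arg)
{
    int *done = arg;
    *done = 1;
    return NULL;
}

/* Start the worker and wait for it. Because of pthread_join, the spawned
   thread has always finished before this function returns.
   Returns the worker's flag (1) on success, -1 if a pthread call failed. */
static int run_ordered(void)
{
    pthread_t tid;
    int done = 0;

    if (pthread_create(&tid, NULL, worker, &done) != 0)
        return -1;
    if (pthread_join(tid, NULL) != 0)
        return -1;
    return done;
}
```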

Thanks


[valgrind] [Bug 409678] improvement suggestion for dhat

2019-07-11 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=409678

--- Comment #4 from Philippe Waroquiers  ---
(In reply to plasmahh from comment #3)
> > Seems there is no doc changed, no test changed.
> I wasn't aware that it's necessary to do all this. You can then close this
> ticket; I was just sharing our changes in the hope it might be useful for
> other people that want to know the same information.
Yes, of course, any contribution is welcome.

However, a change has a better chance of being integrated if it is clear what
it does and if the patch is complete, e.g. with the doc updated, with a test, ...


[valgrind] [Bug 409367] exit_group() after signal arriving to thread waiting in futex() causes hangs

2019-07-11 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=409367

--- Comment #6 from Philippe Waroquiers  ---
(In reply to Allison Karlitskaya from comment #5)
> (In reply to Philippe Waroquiers from comment #4)
> > Pushed as 63a9f0793
> 
> Thanks very much, Philippe.
> 
> A few questions, if you don't mind:
> 
> 1) is there any workaround to this problem that you can imagine (in terms of
> commandline flags, etc.) that would avoid the problem other than to update
> valgrind to a version that includes this patch?  Our current workaround is
> to add a sleep on the main thread before exit, and I'd like to remove that
> ASAP.
I do not see a workaround at valgrind command line level.

> 
> 2) when is this patch likely to appear in a release?  when is it likely to
> appear in a stable release?
We only have stable releases :).
Typically, there is a release every 6 months or so.
The last release was on the 12th of April.


> 
> 3) do you think this patch is suitable for backporting/vendor-patching for
> distro packages?
The patch can certainly be backported, if a distro wants to do it.


[valgrind] [Bug 409678] improvement suggestion for dhat

2019-07-10 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=409678

Philippe Waroquiers  changed:

   What|Removed |Added

 CC||philippe.waroquiers@skynet.
   ||be

--- Comment #1 from Philippe Waroquiers  ---
The attached patch is in an unusual format, e.g. contains various control
characters.

Also, what is the idea that the patch is implementing ?

Seems there is no doc changed, no test changed.

So, that makes it hard to see what you propose to add.


[valgrind] [Bug 409367] exit_group() after signal arriving to thread waiting in futex() causes hangs

2019-07-10 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=409367

Philippe Waroquiers  changed:

   What|Removed |Added

 Status|REPORTED|RESOLVED
 Resolution|--- |FIXED

--- Comment #4 from Philippe Waroquiers  ---
Pushed as 63a9f0793


[valgrind] [Bug 409141] Valgrind hangs when SIGKILLed

2019-07-10 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=409141

Philippe Waroquiers  changed:

   What|Removed |Added

 Status|REPORTED|RESOLVED
 Resolution|--- |FIXED

--- Comment #13 from Philippe Waroquiers  ---
Thanks for the review.
Pushed as 63a9f0793


[valgrind] [Bug 409429] False positives at unexpected location due to failure to recognize cmpeq as a dependency breaking idiom

2019-07-05 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=409429

Philippe Waroquiers  changed:

   What|Removed |Added

 CC||philippe.waroquiers@skynet.
   ||be

--- Comment #1 from Philippe Waroquiers  ---
For info, reproduced on Ubuntu 19.04 with g++ 8.3.0 and valgrind trunk.

-- 
You are receiving this mail because:
You are watching all bug changes.

[valgrind] [Bug 409141] Valgrind hangs when SIGKILLed

2019-07-02 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=409141

--- Comment #11 from Philippe Waroquiers  ---
A patch fixing this problem (and also bug 409367) has been
attached to bug 409367.

If there are no remarks on the approach, I will push in a few days.


[valgrind] [Bug 409367] exit_group() after signal arriving to thread waiting in futex() causes hangs

2019-07-02 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=409367

--- Comment #3 from Philippe Waroquiers  ---
Created attachment 121295
  --> https://bugs.kde.org/attachment.cgi?id=121295&action=edit
fix hangs and loops when process sends signal to itself

I have tested with the reproducer attached, and it works.
The test added by the patch is similar to this test.

If there are no remarks on the approach, I will push in a few days ...


[valgrind] [Bug 409367] exit_group() after signal arriving to thread waiting in futex() causes hangs

2019-07-01 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=409367

Philippe Waroquiers  changed:

   What|Removed |Added

   Assignee|jsew...@acm.org |philippe.waroquiers@skynet.
   ||be


[valgrind] [Bug 409367] exit_group() after signal arriving to thread waiting in futex() causes hangs

2019-07-01 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=409367

Philippe Waroquiers  changed:

   What|Removed |Added

 CC||philippe.waroquiers@skynet.
   ||be

--- Comment #2 from Philippe Waroquiers  ---
This looks very similar to a loop reproduced with 409141.


[valgrind] [Bug 409141] Valgrind hangs when SIGKILLed

2019-07-01 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=409141

Philippe Waroquiers  changed:

   What|Removed |Added

   Assignee|jsew...@acm.org |philippe.waroquiers@skynet.
   ||be


[valgrind] [Bug 255603] exp-sgcheck Assertion '!already_present' failed

2019-06-25 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=255603

Philippe Waroquiers  changed:

   What|Removed |Added

 CC||bugzilla@poradnik-webmaster
   ||a.com

--- Comment #10 from Philippe Waroquiers  ---
*** Bug 409162 has been marked as a duplicate of this bug. ***


[valgrind] [Bug 409162] exp-sgcheck: sg_main.c:559 (add_blocks_to_StackTree): Assertion '!already_present' failed.

2019-06-25 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=409162

Philippe Waroquiers  changed:

   What|Removed |Added

 Status|REPORTED|RESOLVED
 CC||philippe.waroquiers@skynet.
   ||be
 Resolution|--- |DUPLICATE

--- Comment #1 from Philippe Waroquiers  ---
See bug 255603.
This is (supposed to be) solved in valgrind >= 3.14.

Please upgrade your valgrind (it is easy to compile from source)
and verify it is working for you.

Thanks

*** This bug has been marked as a duplicate of bug 255603 ***


[valgrind] [Bug 409141] Valgrind hangs when SIGKILLed

2019-06-25 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=409141

--- Comment #9 from Philippe Waroquiers  ---
See also bug https://bugs.kde.org/show_bug.cgi?id=372600

This bug seems somewhat related/similar to the above.


[valgrind] [Bug 409141] Valgrind hangs when SIGKILLed

2019-06-25 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=409141

--- Comment #8 from Philippe Waroquiers  ---
Thanks for the small reproducer.

This small test case is revealing a bunch of problems related
to termination of a process when it kills itself,
and some problems in the gdbserver debug tracing.
This last thing was easily fixable (commit 90d831171).

Otherwise, it looks like there are at least 2 problems:
  * the hang reported here
  * a similar hang in case the main thread is sending the signal
    to the other thread.  We then seem to have a race condition between
    the main thread that exits, and the other thread that tries to kill
    the process.


[valgrind] [Bug 409141] Valgrind hangs when SIGKILLed

2019-06-24 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=409141

Philippe Waroquiers  changed:

   What|Removed |Added

 CC||philippe.waroquiers@skynet.
   ||be

--- Comment #5 from Philippe Waroquiers  ---
IMO, this is supposed to work:
If the application is sending SIGKILL to itself,
the syscall is intercepted, and some special handling is supposed
to happen to ensure the process dies.

See e.g. m_signals.c async_signalhandler
and/or syswrap-generic.c ML_(do_sigkill).

So, if it does not work, this looks to be a real bug.

Do you have a small compilable reproducer ?
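
For illustration only, the basic scenario discussed here (a process sending
SIGKILL to itself) can be sketched as follows. This is not the reporter's
reproducer, it is single-threaded, and it is not claimed to trigger the hang;
child_dies_from_self_sigkill is a hypothetical name. The self-kill happens in
a forked child so that the caller can observe the result.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that sends SIGKILL to itself and report how it ended.
   Returns 1 if the child was terminated by SIGKILL, 0 if it ended some
   other way, -1 if fork or waitpid failed. */
static int child_dies_from_self_sigkill(void)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        kill(getpid(), SIGKILL);  /* the process kills itself */
        _exit(1);                 /* not expected to be reached */
    }
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFSIGNALED(status) && WTERMSIG(status) == SIGKILL;
}
```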


[valgrind] [Bug 406434] valgrind is unable to intercept the malloc calls in statically linked executables

2019-04-13 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=406434

Philippe Waroquiers  changed:

   What|Removed |Added

 CC||philippe.waroquiers@skynet.
   ||be

--- Comment #3 from Philippe Waroquiers  ---
(In reply to Mark Wielaard from comment #2)
> (In reply to Tom Hughes from comment #1)
> > This is a fundamental, and I believe well documented, limitation.
> > 
> > Because valgrind relies on preloading a shared object to do function
> > interception it can't work for a program that doesn't use the dynamic
> > linker, and intercepting malloc calls (among other things) relies on the
> > function interception system.
> 
> I don't believe that this is well documented.
> It would at least be helpful if valgrind clearly warned about this.
> For example if there is no PT_INTERP for the executable, but the wrapper
> does use LD_PRELOAD for a vgpreload library.
> 
> valgrind doesn't have to do normal ELF symbol interposition. It can
> intercept anything with a symbol name/address. The documentation of
> --soname-synonyms even implies that this would work for statically linked
> code/executables:
> 
>·   Replacements in a statically linked library are done by using
>the NONE pattern. For example, if you link with libtcmalloc.a,
>and only want to intercept the malloc related functions in the
>executable (and standard libraries) themselves, but not any
>other shared libraries, you can give the option
>--soname-synonyms=somalloc=NONE. Note that a NONE pattern will
>match the main executable and any shared library having no
>soname.
> 
> But this doesn't work in this case because even if it can see and find the
> malloc related functions in the executable the vgpreload libraries with the
> replacement/interception functions isn't loaded.
> 
> In theory we can get the replacements wired in differently like we do with
> add_hardwired_spec () for ld.so. But that would require some way to compile
> in the replacement functions into the tools themselves instead of relying on
> LD_PRELOAD.

Some years ago, I looked at having replacement functions in the tool,
and not as LD_PRELOAD.  I think the (only?) (main?) reason why the replacement
functions are in a .so is that they must run in guest mode, and there is
some protection that forbids running 'valgrind tool' code in guest mode.
IIRC, if we could separate the 'pages' where we have the 'real valgrind'
code from the 'tool code that must run in client mode', we could have
the same protection but not depend on LD_PRELOAD.
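
The PT_INTERP check suggested in the quoted comment (warn when the executable
requests no dynamic linker) could be sketched as below. This is not Valgrind
code; has_pt_interp is a hypothetical name, and the sketch assumes a Linux
<elf.h>, a 64-bit ELF file in the host's byte order, and an ordinary number
of program headers.

```c
#include <elf.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 if the 64-bit ELF file at path has a PT_INTERP program header
   (i.e. it requests a dynamic linker), 0 if it has none, and -1 if the file
   cannot be read or is not a 64-bit ELF file. */
static int has_pt_interp(const char *path)
{
    Elf64_Ehdr ehdr;
    Elf64_Phdr phdr;
    int found = 0;
    FILE *f = fopen(path, "rb");

    if (f == NULL)
        return -1;
    if (fread(&ehdr, sizeof ehdr, 1, f) != 1
        || memcmp(ehdr.e_ident, ELFMAG, SELFMAG) != 0
        || ehdr.e_ident[EI_CLASS] != ELFCLASS64) {
        fclose(f);
        return -1;
    }
    /* Walk the program header table until a PT_INTERP entry is seen. */
    for (unsigned i = 0; i < ehdr.e_phnum && !found; i++) {
        long off = (long)(ehdr.e_phoff + (Elf64_Off)i * ehdr.e_phentsize);
        if (fseek(f, off, SEEK_SET) != 0
            || fread(&phdr, sizeof phdr, 1, f) != 1) {
            fclose(f);
            return -1;
        }
        found = (phdr.p_type == PT_INTERP);
    }
    fclose(f);
    return found;
}
```

A result of 0 for the client executable would be the situation in which
LD_PRELOAD of the vgpreload library cannot take effect.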


[valgrind] [Bug 399355] Add callgrind_diff

2019-04-11 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=399355

--- Comment #14 from Philippe Waroquiers  ---
Note that at work, I am discussing having someone work on this bug.
So, some progress might happen in the coming weeks (but not for 3.15).


[valgrind] [Bug 406260] valgrind memcheck receive SIGBUS on octeon II CPU

2019-04-06 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=406260

Philippe Waroquiers  changed:

   What|Removed |Added

 CC||philippe.waroquiers@skynet.
   ||be

--- Comment #1 from Philippe Waroquiers  ---
(In reply to Shouhua Yu from comment #0)

> gcc version is 4.7 glibc is 2.16 kernel version is 3.10 valgrind code is
> 3.14.0
The message below shows that the valgrind version is 3.12.0.
It would be good to try with the latest release (3.14.0) or
even the GIT version.

> ==7907== Memcheck, a memory error detector
> ==7907== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
> ==7907== Using Valgrind-3.12.0 and LibVEX; rerun with -h for copyright info


[valgrind] [Bug 405782] "VEX temporary storage exhausted" when attempting to debug slic3r-pe

2019-03-31 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=405782

--- Comment #11 from Philippe Waroquiers  ---
(In reply to wavexx from comment #10)
> Do you still think the buffer sizes should be hard-coded though?
> 
> I know you can recompile and all, and theoretically this should never
> happen, but I do expect debugging tools to never fail on crappy input ;)

There are advantages and disadvantages to the current approach:
As I understand, in terms of software layers, the VEX lib does not have any
dependencies on the valgrind memory management layer/address space manager.
Having memory sized at startup would break this.
Also, when these max are exceeded, this is really an (efficiency) bug.
Maybe it might also slightly impact the performance.

But for sure, it is not nice to have valgrind crashing on valid programs.
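
The trade-off can be pictured with a minimal fixed-size arena in C. This is
not LibVEX's allocator; TEMP_AREA_SIZE, temp_alloc and temp_reset are
hypothetical names and the size is arbitrary. The point is only that a
statically sized area needs no memory manager underneath, but has a hard
limit.

```c
#include <stddef.h>

#define TEMP_AREA_SIZE 1024  /* hypothetical, arbitrary hard-coded size */

static unsigned char temp_area[TEMP_AREA_SIZE];
static size_t temp_used = 0;

/* Bump-allocate nbytes (size rounded up to a multiple of 8) from the static
   area. Returns NULL when the area is exhausted; a real tool would have to
   report an error at that point. */
static void *temp_alloc(size_t nbytes)
{
    size_t rounded;
    void *p;

    if (nbytes > TEMP_AREA_SIZE)
        return NULL;
    rounded = (nbytes + 7u) & ~(size_t)7u;
    if (rounded > TEMP_AREA_SIZE - temp_used)
        return NULL;
    p = &temp_area[temp_used];
    temp_used += rounded;
    return p;
}

/* Release everything at once, e.g. between two translations. */
static void temp_reset(void)
{
    temp_used = 0;
}
```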


[valgrind] [Bug 405782] "VEX temporary storage exhausted" when attempting to debug slic3r-pe

2019-03-30 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=405782

Philippe Waroquiers  changed:

   What|Removed |Added

 Status|REPORTED|RESOLVED
 Resolution|--- |FIXED

--- Comment #9 from Philippe Waroquiers  ---
(In reply to wavexx from comment #7)
> Created attachment 119159 [details]
> valgrind trace (current master)

Thanks for the quick feedback.

Looking at the difference, the nr of front end temporaries has been divided
by 3 (from 1854 to 594).
After instrumentation, it is divided by >4 (from 6503 to 1495),
and later on, the generated code is also much smaller.

So, Julian did a very good job :).

-- 
You are receiving this mail because:
You are watching all bug changes.

[valgrind] [Bug 405782] "VEX temporary storage exhausted" when attempting to debug slic3r-pe

2019-03-30 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=405782

--- Comment #6 from Philippe Waroquiers  ---
(In reply to wavexx from comment #5)
> Indeed, the current master can run it through without any tweak.
That is good news.
> Is there anything you want me to try?
I think the problem should be properly solved.

But to grasp a little bit better how much this was improved,
if you are courageous, it would be nice to redo the tracing with master
of the block that was giving the crash, so that we can evaluate the code
improvement.

As the new version might not use exactly the same SB nr as the 3.14,
you should find the line that looks like:
 SB 97263 (evchecks 68367534) [tid 1] 0x541a124 (anonymous
namespace)::wxPNGImageData::DoLoadPNGFile(wxImage*, (anonymous
namespace)::wxPNGInfoStruct&) [clone .constprop.45]+2228
/usr/local/stow/wxWidgets-3.1.2/lib/libwx_gtk3u_core-3.1.so.2.0.0+0x519124

and then do the trace with --trace-notbelow=X --trace-notabove=Y
and use X and Y to have 1 or 2 SB before/after the
[clone .constprop.45]+2228 address giving the problem.

Thanks
Philippe


[valgrind] [Bug 405782] "VEX temporary storage exhausted" when attempting to debug slic3r-pe

2019-03-30 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=405782

--- Comment #4 from Philippe Waroquiers  ---
I have taken a quick look at the trace, and indeed,
the generated code is huge.
The code looks related to xmm/ymm registers and instructions.
In 3.15, Julian has made a bunch of improvements for the code
generation in this area.
See e.g. 
 git log
3af8e12b0d49dc87cd26258131ebd60c9b587c74..3b2f8bf69ea11f13357468d28cebc88d41be9199

Could you try to compile the latest GIT version and see if it works better ?

Thanks

Philippe


[valgrind] [Bug 405782] "VEX temporary storage exhausted" when attempting to debug slic3r-pe

2019-03-28 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=405782

Philippe Waroquiers  changed:

   What|Removed |Added

 CC||jsew...@acm.org


[valgrind] [Bug 405782] "VEX temporary storage exhausted" when attempting to debug slic3r-pe

2019-03-24 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=405782

Philippe Waroquiers  changed:

   What|Removed |Added

 CC||philippe.waroquiers@skynet.
   ||be

--- Comment #1 from Philippe Waroquiers  ---
Thanks for the bug.

Could you attach the VEX debug trace obtained doing the below ?
Thanks

---
Use the unpatched valgrind (so as to reproduce the problem/crash).
run a first time:
  valgrind --trace-flags= 

This will output a bunch of lines such as:
...
 SB 1789 (evchecks 8650) [tid 1] 0x4f833a7 free_mem+231 UNKNOWN_OBJECT+0x0
 SB 1790 (evchecks 8651) [tid 1] 0x4f832ae free_slotinfo+110
UNKNOWN_OBJECT+0x0
...

Then rerun with
valgrind --trace-flags= --trace-notbelow=X 
where X is one or two numbers before the SB that causes the crash.


[valgrind] [Bug 405516] Memcheck with Ruby produces numerous outputs

2019-03-18 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=405516

Philippe Waroquiers  changed:

   What|Removed |Added

 CC||philippe.waroquiers@skynet.
   ||be

--- Comment #3 from Philippe Waroquiers  ---
Looking at the trace file:
As explained by Tom, when you launch ruby, several forks
and at least one exec happens.

Right at the beginning, you see that valgrind is reading /usr/bin/ruby
and then goes on to read /usr/bin/bash.
So, it looks like /usr/bin/ruby is a bash script which then launches
various things via fork and some exec.
At the very end, it does an exec of /usr/bin/ruby-mri.
I assume that this is the real ruby interpreter.

The conclusion: if you want to analyse what the ruby interpreter does,
you have to give --trace-children=yes otherwise what you are valgrind-ing
will just be the wrapper around the real ruby thing.

You might eliminate some of the output by playing with 
--trace-children-skip=patt1,patt2,... 
and/or --trace-children-skip-by-arg=patt1,patt2,...
and/or --child-silent-after-fork=no|yes
but --trace-children=yes seems mandatory.


[valgrind] [Bug 404638] Add VG_(replaceIndexXA)

2019-03-16 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=404638

Philippe Waroquiers  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 CC||philippe.waroquiers@skynet.
   ||be
 Status|REPORTED|RESOLVED

--- Comment #6 from Philippe Waroquiers  ---
A slightly modified version of the patch was pushed as 081c34ea477.
(I have removed some of the asserts and the call to ensureSpace,
as the replace operation can never make the array grow.)
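
To illustrate why no space check is needed on replace, here is a minimal
sketch of a growable array. It is not Valgrind's XArray code: the names
IntArray, ensure_space, add_to and replace_index are hypothetical and only
mirror the idea. Appending may need more room, while replacing writes into
a slot that already exists.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical miniature growable array; NOT Valgrind's XArray. */
typedef struct {
    int   *elems;  /* backing store */
    size_t used;   /* number of valid elements */
    size_t cap;    /* number of allocated slots */
} IntArray;

/* Grow the backing store so that at least one more element fits. */
static void ensure_space(IntArray *xa)
{
    if (xa->used < xa->cap)
        return;
    size_t new_cap = xa->cap ? 2 * xa->cap : 4;
    int *p = realloc(xa->elems, new_cap * sizeof *p);
    assert(p != NULL);
    xa->elems = p;
    xa->cap = new_cap;
}

/* Appending can grow the array, so it must call ensure_space first. */
static size_t add_to(IntArray *xa, int v)
{
    ensure_space(xa);
    xa->elems[xa->used] = v;
    return xa->used++;
}

/* Replacing overwrites an existing slot: used and cap stay unchanged,
   so only a bounds check is needed, not a call to ensure_space. */
static void replace_index(IntArray *xa, size_t i, int v)
{
    assert(i < xa->used);
    xa->elems[i] = v;
}
```

Starting from a zero-initialised IntArray, any number of replace_index
calls leaves both used and cap exactly as the preceding add_to calls
left them.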

Thanks for the patch.

[valgrind] [Bug 402833] memcheck/tests/overlap testcase fails, memcpy seen as memmove

2019-03-10 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=402833

--- Comment #3 from Philippe Waroquiers  ---
(In reply to Julian Seward from comment #2)
> Is there any progress here?  How important will it be to fix this for 3.15.0?

I believe this will be a non-negligible change in the REDIR mechanism,
as the REDIR will have to be done at the ifunc level.

As far as I can see, not fixing this means some false negatives for memcpy:
overlapping args will not be detected.
So, not that critical IMO.
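
For readers unfamiliar with the check at stake, here is a minimal sketch
of an overlap test of the kind a memcpy wrapper can run before copying.
It is not memcheck's actual code: ranges_overlap and checked_memcpy are
hypothetical names. If the call is handled as memmove, no such test runs
(overlap is legal for memmove), hence the false negative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Return 1 if [dst, dst+len) and [src, src+len) share at least one byte.
   Hypothetical helper, written to avoid overflow in address arithmetic. */
static int ranges_overlap(const void *dst, const void *src, size_t len)
{
    uintptr_t d = (uintptr_t)dst;
    uintptr_t s = (uintptr_t)src;

    if (len == 0)
        return 0;              /* empty ranges never overlap */
    if (d < s)
        return s - d < len;    /* dst starts first and reaches into src */
    return d - s < len;        /* src starts first (or at the same address) */
}

/* memcpy with the overlap precondition checked. A real tool would
   report an error instead of aborting; assert keeps the sketch short. */
static void *checked_memcpy(void *dst, const void *src, size_t len)
{
    assert(!ranges_overlap(dst, src, len));
    return memcpy(dst, src, len);
}
```

With a 16-byte buffer buf, copying 4 bytes from buf to buf + 2 would be
flagged, while copying 4 bytes from buf to buf + 8 passes the check.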

[valgrind] [Bug 401454] Add a --show-percs option to cg_annotate and callgrind_annotate.

2019-01-27 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=401454

Philippe Waroquiers  changed:

   What|Removed |Added

 Status|REPORTED|RESOLVED
 Resolution|--- |FIXED
 CC||philippe.waroquiers@skynet.
   ||be

[valgrind] [Bug 402369] Overhaul DHAT

2019-01-23 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=402369

Philippe Waroquiers  changed:

   What|Removed |Added

 CC||jsew...@acm.org

--- Comment #8 from Philippe Waroquiers  ---
Note that if no effort is available to look at what is suggested
in comments 3 and 6, then it is maybe better to push the patch as is
(for sure, I do not want to block a clear functional improvement
for that reason).

Julian can maybe comment.

[valgrind] [Bug 402369] Overhaul DHAT

2019-01-23 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=402369

Philippe Waroquiers  changed:

   What|Removed |Added

  Component|callgrind   |dhat
   Assignee|josef.weidendor...@gmx.de   |jsew...@acm.org

[valgrind] [Bug 402369] Overhaul DHAT

2019-01-10 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=402369

--- Comment #6 from Philippe Waroquiers  ---
(In reply to Nick Nethercote from comment #5)
> > It might be interesting to replace the wordFM by an xtree,
> 
> It may. Nonetheless, I'd rather land the code as-is, because it's a major
> improvement over the existing DHAT. We can consider optimizations to the
> implementation later :)

Optimization (in terms of speed) is one (non-major) aspect
(I would guess that it is unlikely that dhat spends a lot of cpu
in this data structure).
The main aspect is to avoid adding another dhat-specific data structure,
instead of reusing (extending) the 'common coregrind data structure'
used by massif/memcheck/helgrind.
Or, put otherwise: avoid growing the code base unnecessarily.

But I have not looked in much detail at how easy it would be to extend
xtree to make it support dhat and output json.

[valgrind] [Bug 402833] memcheck/tests/overlap testcase fails, memcpy seen as memmove

2019-01-07 Thread Philippe Waroquiers
https://bugs.kde.org/show_bug.cgi?id=402833

Philippe Waroquiers  changed:

   What|Removed |Added

 CC||philippe.waroquiers@skynet.
   ||be

--- Comment #1 from Philippe Waroquiers  ---
It also started to fail for me some months ago.
Debian GLIBC 2.24-11+deb9u3
gcc version 6.3.0 20170516 (Debian 6.3.0-18+deb9u1)

See https://sourceforge.net/p/valgrind/mailman/message/36034448/

At that time, IIRC, I vaguely contemplated that this could be fixed by
changing the REDIR mechanism to make it better 'ifunc aware' and do the
redirection at an earlier stage.
