On 2017/02/18 0:57, John Reiser wrote:
> Hint #1.  Fix the first complaint.  Do not pass GO, do not collect $200.  FIX 
> the first complaint.
> You will get more sympathy and attention if the *first* significant event
> is the bug/error/mystery that is the focus of your inquiry.
>
>> ==3755== Mismatched free() / delete / delete []
>> ==3755==    at 0x4C2CD3A: free (vg_replace_malloc.c:530)
>> ==3755==    by 0x13EE71B3: bool
>> google::protobuf::InsertIfNotPresent...

Thank you. I had thought of investigating this myself.
(But my previous brief analysis came to a dead end: the allocation
was done inside libstdc++, AND the mozilla code seemed to pair
malloc/free, new/delete, and new[]/delete[] correctly, at least at
the superficial source code level :-( )

By running the latest thunderbird code under valgrind/memcheck
on linux kernel 3.19.5 (the latest kernel under which I could get
memcheck + thunderbird to work on Debian GNU/Linux), I collected
as many of the mismatched-free warnings as possible and tried to analyze them.

According to
https://bugzilla.mozilla.org/show_bug.cgi?id=1340576,
the prospect is grim.
See comment 5 there.
https://bugzilla.mozilla.org/show_bug.cgi?id=1340576#c5
--- begin quote ---
(In reply to ISHIKAWA, Chiaki from comment #4)
 > Julian, what course of action should I take from here?

The simple answer is, run with --show-mismatched-frees=no.  Most of
them are false positives caused by inconsistent inlining of malloc
into new vs free into delete.

The more complex answer is, we'd have to look at them on an
individual basis.  Bug 1325470 is an example which Mike Hommey
believes is a real bug.  But those are relatively rare.  Mostly
Valgrind is reporting false positives here.
--- end quote ---

So it seems that most of these are actually FALSE POSITIVEs caused by
inconsistent inlining by the compiler/headers/whatever [I am not a C++ guru].
If I pass --show-mismatched-frees=no, these reports won't show up, and since
they don't interfere with the operation of valgrind under kernel
3.19.5, they do seem to be false positives to me.
(That valgrind reports such false positives at all is in itself a big problem.
I suspect it comes from GCC6 and the libstdc++ code compiled by GCC6. I
am not sure whether they would go away if clang were used
to compile libstdc++ and mozilla thunderbird. But I digress.)
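
For anyone who wants to look at the reports "on an individual basis" rather
than switch them all off, here is a rough Python sketch that tallies the
mismatched-free reports in a saved memcheck log by their top two stack
frames. It assumes one frame per line in the "==PID==    at/by 0x...: ..."
layout seen in the excerpt quoted above, and it does not separate reports
by PID. The function name tally_mismatched is my own invention, not
anything that valgrind provides.

```python
import re
from collections import Counter

# Header line of a memcheck mismatched-free report, as seen in the quoted log.
_HEADER = re.compile(r"^==\d+== Mismatched free\(\) / delete / delete \[\]")
# A stack frame line: "==PID==    at 0xADDR: text" or "==PID==    by 0xADDR: text".
_FRAME = re.compile(r"^==\d+==\s+(?:at|by) 0x[0-9A-Fa-f]+: (.*)$")


def tally_mismatched(log_text):
    """Count mismatched-free reports, keyed by their first two frames.

    Hypothetical helper: the first frame is normally the deallocator
    (free / operator delete) and the second is its immediate caller.
    """
    counts = Counter()
    lines = log_text.splitlines()
    for i, line in enumerate(lines):
        if not _HEADER.match(line):
            continue
        frames = []
        # Collect the frame lines that directly follow the header.
        for follow in lines[i + 1:]:
            m = _FRAME.match(follow)
            if not m:
                break
            frames.append(m.group(1))
        if frames:
            counts[tuple(frames[:2])] += 1
    return counts
```

Calling tally_mismatched(open(path).read()).most_common(20) on a saved log
should show which deallocator/caller pairs dominate, which might help to
pick out the few reports worth a closer look.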

My original question was why the test setup works under the vanilla 3.19.5
linux kernel but not under the 4.8.y Debian GNU/Linux kernel, even though
the setup is otherwise the same.

>
> =====
>
>> ishikawa@ip030:/NREF-COMM-CENTRAL/comm-central$ gdb /usr/local/bin/valgrind
>    [[snip]]
>> Program received signal SIGSEGV, Segmentation fault.
>> 0x000000080470fdf8 in ?? ()
>> (gdb) where
>> #0  0x000000080470fdf8 in ?? ()
>> #1  0x0000000802e8df30 in ?? ()
>> #2  0x000000000010d76b in ?? ()
>> #3  0x0000000802008460 in ?? ()
>> #4  0x0000000802e8df30 in ?? ()
>> #5  0x0000000000001c00 in ?? ()
>> #6  0x0000000038c6bb00 in ?? ()
>> #7  0x0000000000000601 in ?? ()
>> #8  0x0000000000011af3 in ?? ()
>> #9  0x0000000000000000 in ?? ()
>> (gdb) quit
>> A debugging session is active.
>
> Hint #2.  Use gdb effectively.
>
> (gdb) info reg   ## show all registers
> (gdb) x/5i $pc   ## examine instruction stream
> (gdb) x/30i $pc-0x20   ## likely previous instruction stream (heuristic sync for variable-length instructions)
> (gdb) x/32xw $sp   ## examine memory at stack pointer
> (gdb) info proc   ## display the process ID
> (gdb) shell cat /proc/<PID>/maps   ## show memory mapping; <PID> is "process" from "info proc"
>
>
> Hint #3.  If child processes are involved, then apply the tool to them, too.
> $ valgrind --trace-children=yes ...

Oh, I thought I had passed "--trace-children=yes" to the valgrind
session(s) when I captured the latest log.
Hmm. All the valgrind runs logged in my last e-mail
had the --trace-children=yes option (not always at the beginning, though).
Aha, there seems to have been a copy&paste error when I wrote the
previous e-mail.

case 1. valgrind --trace-children=yes ...
case 2. (gdb) run
Starting program: /usr/local/bin/valgrind --verbose --trace-children=yes 
  --smc-check=all-non-file ...

(I am afraid there was a copy&paste error here. I ran
valgrind with the options echoed back below, but I may have erased the command
line after |run| by mistake. You can see that the option was passed
correctly from the following valgrind output as well.)

--3973-- Valgrind options:
--3973--    --verbose
--3973--    --trace-children=yes  <=== here
--3973--    --smc-check=all-non-file
--3973--    --gen-suppressions=all

case 3. valgrind  --vex-iropt-register-updates=allregs-at-mem-access 
--verbose --trace-children=yes ...

Next I will check what is mapped at 0xffeffbab8 (the address reported in
the strace output):

 > --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0xffeffbab8} ---
+++ killed by SIGSEGV +++

using

 > (gdb) shell cat /proc/<PID>/maps   ## show memory mapping; <PID> is "process" from "info proc"

and look at the memory area around that address.
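
To avoid eyeballing a long maps listing, a small Python sketch along these
lines could do the lookup. Since SEGV_MAPERR means that the faulting
address is not mapped at all, the mappings just below and just above it
are probably the interesting part (for example, whether the address sits
right under the stack). The function names are my own, and the code
assumes only the usual "start-end perms offset dev inode pathname" layout
of /proc/<PID>/maps.

```python
def parse_maps(text):
    """Parse /proc/<PID>/maps text into (start, end, perms, name) tuples."""
    regions = []
    for line in text.splitlines():
        # Fields: address-range perms offset dev inode [pathname]
        fields = line.split(None, 5)
        if len(fields) < 5:
            continue
        start_s, end_s = fields[0].split("-")
        name = fields[5].strip() if len(fields) > 5 else ""
        regions.append((int(start_s, 16), int(end_s, 16), fields[1], name))
    return regions


def find_region(regions, addr):
    """Return the mapping that contains addr, or None if it is unmapped."""
    for region in regions:
        if region[0] <= addr < region[1]:
            return region
    return None


def nearest_regions(regions, addr):
    """Return (closest mapping ending at or below addr, closest starting above)."""
    below = [r for r in regions if r[1] <= addr]
    above = [r for r in regions if r[0] > addr]
    return (
        max(below, key=lambda r: r[1]) if below else None,
        min(above, key=lambda r: r[0]) if above else None,
    )


def read_maps(pid):
    """Read the live mapping of a process (needs permission to read it)."""
    with open("/proc/%d/maps" % pid) as f:
        return parse_maps(f.read())
```

With the process still stopped in gdb, nearest_regions(read_maps(pid),
0xffeffbab8) would give the two neighbouring mappings, where pid is the
"process" number from "info proc".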

In the meantime, if anyone has run a large program under
valgrind/memcheck on a stock Debian GNU/Linux kernel, please let me
know your kernel version number.  Even if I can work out the state of the
mmap/stack/whatever regions by looking at the memory map around
the address reported with the SIGSEGV, that won't be
of much help to me unless I can figure out exactly WHAT KERNEL OPTION is
the culprit. :-(
If I know which KERNEL OPTION is the culprit, I can at least
rebuild the 4.8.y series kernel with it changed and try valgrind under it.
There are too many differences in kernel options between 3.19.5 and 4.8.y
for a fishing trip to find the culprit easily.
(I have been using Debian for close to 20 years now, but maybe I should
switch to Fedora/CentOS since that is what the Mozilla foundation's
compilation/test farm uses. Oh well, the compilation/test farm uses clang,
so there is also the GCC vs clang issue. I have been using GCC for 30
years, and have been comfortable with Debian GNU/Linux and GCC. Maybe it is
time for a change.)
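
To at least narrow the fishing trip, the two kernel configurations could
be compared mechanically. Below is a minimal Python sketch that lists the
options whose values differ between two kernel config files. It assumes
the usual "CONFIG_FOO=value" and "# CONFIG_FOO is not set" line formats;
the function names are mine. It only shows what changed, not which change
matters.

```python
import re

# "CONFIG_FOO=y", "CONFIG_FOO=m", "CONFIG_FOO=123", "CONFIG_FOO=\"str\""
_SET = re.compile(r"^(CONFIG_\w+)=(.*)$")
# "# CONFIG_FOO is not set"
_UNSET = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_config(text):
    """Parse kernel .config text into a dict of option name -> value."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        m = _SET.match(line)
        if m:
            options[m.group(1)] = m.group(2)
            continue
        m = _UNSET.match(line)
        if m:
            # Record explicitly disabled options as "n".
            options[m.group(1)] = "n"
    return options


def diff_configs(old, new):
    """Return sorted (name, old_value, new_value) for every differing option.

    Options that do not appear in one file at all are shown as "absent".
    """
    changes = []
    for name in sorted(set(old) | set(new)):
        a = old.get(name, "absent")
        b = new.get(name, "absent")
        if a != b:
            changes.append((name, a, b))
    return changes
```

Feeding it the text of the two config files (on Debian they are normally
installed as /boot/config-<version>) gives a list that can then be
filtered, for example for options related to memory layout, before
rebuilding 4.8.y with individual options flipped.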

TIA


_______________________________________________
Valgrind-users mailing list
Valgrind-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/valgrind-users
