Bug#870728: [Pkg-mc-devel] Bug#870728: mc: random crash on startup (SIGSEGV)

2017-08-07 Thread Yury V. Zaytsev

Hi Paul,

On Sun, 6 Aug 2017, Paul Wise wrote:


On Sun, 2017-08-06 at 11:51 +0200, Yury V. Zaytsev wrote:

We have a patch that you could try, it would be great if you could tell 
whether this fixes the problem for you:


http://midnight-commander.org/attachment/ticket/3846/3846_mcview_hook.patch


Thanks, that fixes the problem.


I appreciate you confirming this, just landed in master.

Denis, do you think you could find time to update the package to 4.8.19? I 
don't know when we'll make a 4.8.20 release, probably in the coming 
months, but it would be good to get bugfixes in 4.8.19 out...


--
Sincerely yours,
Yury V. Zaytsev



Bug#870728: [Pkg-mc-devel] Bug#870728: mc: random crash on startup (SIGSEGV)

2017-08-06 Thread Paul Wise
On Sun, 2017-08-06 at 11:51 +0200, Yury V. Zaytsev wrote:

> We have a patch that you could try, it would be great if you could tell 
> whether this fixes the problem for you:
> 
> http://midnight-commander.org/attachment/ticket/3846/3846_mcview_hook.patch

Thanks, that fixes the problem.

> The initial report lacked a critical piece of information, and 
> specifically that you have pressed "Ctrl+x q" during panel initialization 
> and this caused the crash, and not merely observed a "random crash on 
> startup".

Yeah, initially I thought it was a random crash. It was only after some
more investigation that I discovered it was actually reproducible.

> This is the reason why I originally assumed that either the pointer wasn't 
> set to NULL after getting freed, or something else overwrote it to point 
> to unitialized memory.

I see.

> Yes, this log has been really useful and now it is more or less clear 
> what's going on.

Good to hear, thanks for the info.

> I think that you are misinterpreting your valgrind report.

Yeah, that is clear from your description of the problem.

> If you run mc under valgrind ... which we routinely do
...
> we do use static analysis tools like cppcheck
...
> we have been using Coverity for the last 5 years or so

Good to hear :)

> as fuzzing is concerned ...

I see, thanks for the info.

-- 
bye,
pabs

https://wiki.debian.org/PaulWise


signature.asc
Description: This is a digitally signed message part


Bug#870728: [Pkg-mc-devel] Bug#870728: mc: random crash on startup (SIGSEGV)

2017-08-06 Thread Yury V. Zaytsev

Hi Paul,

We have a patch that you could try, it would be great if you could tell 
whether this fixes the problem for you:


http://midnight-commander.org/attachment/ticket/3846/3846_mcview_hook.patch

On Sat, 5 Aug 2017, Paul Wise wrote:


On Fri, 2017-08-04 at 23:20 +0200, Yury V. Zaytsev wrote:

The most obvious reason for the segfault would have been that somebody 
has freed `view->filename_vpath` without setting it to NULL, but a 
quick grep for `->filename_vpath` doesn't reveal anything suspicious

:-(


Discussion on IRC made us think that it was using memory that has been 
allocated but not initialised, because the MALLOC_PERTURB_=254 
MALLOC_CHECK_=2 combination causes glibc to poison new memory 
allocations with 0xFE and if you then see 0xFE anywhere, that means that 
the program is buggy and interacting with uninitialised memory.


The initial report lacked a critical piece of information, and 
specifically that you have pressed "Ctrl+x q" during panel initialization 
and this caused the crash, and not merely observed a "random crash on 
startup".


This is the reason why I originally assumed that either the pointer wasn't 
set to NULL after getting freed, or something else overwrote it to point 
to unitialized memory.


The issue is easily reproducible for me and I think you could reproduce 
it by introducing a 5 second sleep call just before the panel setup.


So far, I wasn't able reproduce it, at least not that easily.

PS: you might be interested in the valgrind output for one of the 
sessions where I was able to press Ctrl+x q early enough. The log 
clearly shows that valgrind found a number of invalid reads.


https://people.debian.org/~pabs/tmp/valgrind-mc.gz


Yes, this log has been really useful and now it is more or less clear 
what's going on. Specifically, a dialog started processing an event you 
triggered by pressing "Ctrl+x q" before it was fully initialized, and this 
lead to the crash you observed. Instead, events should be ignored during 
the initialization phase, or queued for later processing.


A similar issue with another dialog (much easier to reproduce) has been 
fixed in 4.8.19 as I said earlier. Unfortunately, such issues haven't been 
consistently considered by past developers, so many more of the same type 
of problems might be lurking around the widget system.


PS: I would suggest running the mc code through some static analysis 
tools, either via check-all-the-things or by manually running the tools 
listed in these two configuration files:


I think that you are misinterpreting your valgrind report. All of the very 
many invalid reads you observed are caused by the same underlying cause, 
and specifically you triggering an event processing before the dialog has 
been fully initialized. If you run mc under valgrind operating normally 
which we routinely do, the report should be clean.


And yes, we do use static analysis tools like cppcheck, experimental 
compiler warnings (which have been actually much more useful than the 
former), Google Sanitizers and the like.


If you are fine with using proprietary services to improve free code, 
you could look at Coverity. You could also get Google to do some fuzzing 
of the mc codebase.


Indeed, we have been using Coverity for the last 5 years or so. In as far 
as fuzzing is concerned, unfortunately, mc hasn't been programmed to be 
robust against invalid inputs to begin with and is written in a very 
unforgiving language, so a lot of work is still to be done for fuzzing to 
be actually useful instead of simply revealing the truth above.


--
Sincerely yours,
Yury V. Zaytsev



Bug#870728: [Pkg-mc-devel] Bug#870728: mc: random crash on startup (SIGSEGV)

2017-08-05 Thread Paul Wise
On Fri, 2017-08-04 at 23:20 +0200, Yury V. Zaytsev wrote:

> The most obvious reason for the segfault would have been that somebody has 
> freed `view->filename_vpath` without setting it to NULL, but a quick grep 
> for `->filename_vpath` doesn't reveal anything suspicious :-(

Discussion on IRC made us think that it was using memory that has been
allocated but not initialised, because the MALLOC_PERTURB_=254
MALLOC_CHECK_=2 combination causes glibc to poison new memory
allocations with 0xFE and if you then see 0xFE anywhere, that means
that the program is buggy and interacting with uninitialised memory.

https://www.gnu.org/software/libc/manual/html_node/Heap-Consistency-Checking.html
http://udrepper.livejournal.com/11429.html

> So this makes me think that maybe something else corrupted the memory 
> indirectly, but then I can't see how this can be reasonably investigated 
> without a sanitizer like report or even better a reproducer...

The issue is easily reproducible for me and I think you could reproduce
it by introducing a 5 second sleep call just before the panel setup.

PS: you might be interested in the valgrind output for one of the
sessions where I was able to press Ctrl+x q early enough. The log
clearly shows that valgrind found a number of invalid reads.

https://people.debian.org/~pabs/tmp/valgrind-mc.gz

PS: I would suggest running the mc code through some static analysis
tools, either via check-all-the-things or by manually running the tools
listed in these two configuration files:

https://anonscm.debian.org/cgit/collab-maint/check-all-the-things.git/tree/data/c.ini
https://anonscm.debian.org/cgit/collab-maint/check-all-the-things.git/tree/data/elf.ini

If you are fine with using proprietary services to improve free code,
you could look at Coverity. You could also get Google to do some
fuzzing of the mc codebase.

https://scan.coverity.com/
https://www.linux.com/blog/2017/7/googles-oss-fuzz-tool-helps-secure-open-source-projects

-- 
bye,
pabs

https://wiki.debian.org/PaulWise


signature.asc
Description: This is a digitally signed message part


Bug#870728: [Pkg-mc-devel] Bug#870728: mc: random crash on startup (SIGSEGV)

2017-08-05 Thread Denis Briand
Oops its ctrl+x q of course :)

  Message d'origine  
De: Yury V. Zaytsev
Envoyé: samedi 5 août 2017 10:09
À: Denis Briand
Cc: bug
Objet: Re: [Pkg-mc-devel] Bug#870728: mc: random crash on startup (SIGSEGV)

On Sat, 5 Aug 2017, Denis Briand wrote:

> On Sat, Aug 05, 2017 at 12:07:39AM +0200, Yury V. Zaytsev wrote:
>> On Fri, 4 Aug 2017, Denis Briand wrote:
>>
>>> No I can't reproduce it with git master master and the sid version,
>>> because the start is too quick... ^^ My valgrind report is clean. I
>>> attached the Paul valgrind report on the upstream bug ticket.
>>
>> Well, it claims that the memory that viewer filename is pointing to is freed
>> in set_display_type(), but I have no idea why this is happening, and why
>> viewer is even involved here in the first place. Also, do you have a clue as
>> to what Ctrl+x g is supposed to do?
>
> Yes Ctrl+x g open a quick view panel.

Hmmm, it's Ctrl+x q for me, but now at least it start making *some* sense. 
Are you sure it's really Ctrl+x g, maybe pabs is using some customized 
keymap layout?

Anyways, it sounds very much like

http://www.midnight-commander.org/ticket/3700

which has been fixed in 4.8.19, and as you can see, ASan traces a much 
nicer than what comes out of Valgrind, but whatever.

So, would be good to know if this still appears in 4.8.19 and/or git 
master, since it seems that pabs is able to reproduce it. If this doesn't 
appear anymore, then it's #3700 and we don't have to do anything.

Also, when forwarding bugs, please set version field correctly. By setting 
it to master, you are claiming that you could reproduce it on master, 
whereas, in fact, it's happening on a version that is 1 year old.

-- 
Sincerely yours,
Yury V. Zaytsev



Bug#870728: [Pkg-mc-devel] Bug#870728: mc: random crash on startup (SIGSEGV)

2017-08-05 Thread Yury V. Zaytsev

On Sat, 5 Aug 2017, Denis Briand wrote:


On Sat, Aug 05, 2017 at 12:07:39AM +0200, Yury V. Zaytsev wrote:

On Fri, 4 Aug 2017, Denis Briand wrote:


No I can't reproduce it with git master master and the sid version,
because the start is too quick... ^^ My valgrind report is clean. I
attached the Paul valgrind report on the upstream bug ticket.


Well, it claims that the memory that viewer filename is pointing to is freed
in set_display_type(), but I have no idea why this is happening, and why
viewer is even involved here in the first place. Also, do you have a clue as
to what Ctrl+x g is supposed to do?


Yes Ctrl+x g open a quick view panel.


Hmmm, it's Ctrl+x q for me, but now at least it start making *some* sense. 
Are you sure it's really Ctrl+x g, maybe pabs is using some customized 
keymap layout?


Anyways, it sounds very much like

http://www.midnight-commander.org/ticket/3700

which has been fixed in 4.8.19, and as you can see, ASan traces a much 
nicer than what comes out of Valgrind, but whatever.


So, would be good to know if this still appears in 4.8.19 and/or git 
master, since it seems that pabs is able to reproduce it. If this doesn't 
appear anymore, then it's #3700 and we don't have to do anything.


Also, when forwarding bugs, please set version field correctly. By setting 
it to master, you are claiming that you could reproduce it on master, 
whereas, in fact, it's happening on a version that is 1 year old.


--
Sincerely yours,
Yury V. Zaytsev



Bug#870728: [Pkg-mc-devel] Bug#870728: mc: random crash on startup (SIGSEGV)

2017-08-04 Thread Denis Briand
On Sat, Aug 05, 2017 at 12:07:39AM +0200, Yury V. Zaytsev wrote:
> On Fri, 4 Aug 2017, Denis Briand wrote:
> 
> >No I can't reproduce it with git master master and the sid version,
> >because the start is too quick... ^^ My valgrind report is clean. I
> >attached the Paul valgrind report on the upstream bug ticket.
> 
> Well, it claims that the memory that viewer filename is pointing to is freed
> in set_display_type(), but I have no idea why this is happening, and why
> viewer is even involved here in the first place. Also, do you have a clue as
> to what Ctrl+x g is supposed to do?

Yes Ctrl+x g open a quick view panel.


pgp4v2mRXtnDN.pgp
Description: PGP signature


Bug#870728: [Pkg-mc-devel] Bug#870728: mc: random crash on startup (SIGSEGV)

2017-08-04 Thread Yury V. Zaytsev

On Fri, 4 Aug 2017, Denis Briand wrote:

No I can't reproduce it with git master master and the sid version, 
because the start is too quick... ^^ My valgrind report is clean. I 
attached the Paul valgrind report on the upstream bug ticket.


Well, it claims that the memory that viewer filename is pointing to is 
freed in set_display_type(), but I have no idea why this is happening, and 
why viewer is even involved here in the first place. Also, do you have a 
clue as to what Ctrl+x g is supposed to do?


Under valgrind, mc starts rather slowly for me in /usr/bin, but I can't 
reproduce it on 4.8.19...


--
Sincerely yours,
Yury V. Zaytsev



Bug#870728: [Pkg-mc-devel] Bug#870728: mc: random crash on startup (SIGSEGV)

2017-08-04 Thread Denis Briand
On Fri, Aug 04, 2017 at 11:29:43PM +0200, Yury V. Zaytsev wrote:
> On Fri, 4 Aug 2017, Denis Briand wrote:
> 
> >Step to reproduce:
> >$ MALLOC_PERTURB_=254 MALLOC_CHECK_=2 mc
> >then during the startup: hit "Ctrl+x g" keys combo
> >
> >Important: you must hit "Ctrl+x g" keys before the panels are displayed.
> >To reproduce this long start, you have got 370 sub-directories in the
> >current directory.
> 
> Hi Denis,
> 
> Ah, so you've got a reproducer, great! Are you able to reproduce it on git
> master? Can you easily switch ASan on?
> 
> -- 
> Sincerely yours,
> Yury V. Zaytsev

Hi Yury,
No I can't reproduce it with git master master and the sid version,
because the start is too quick... ^^
My valgrind report is clean.
I attached the Paul valgrind report on the upstream bug ticket.

Denis


pgpl30bPOGwRP.pgp
Description: PGP signature


Bug#870728: [Pkg-mc-devel] Bug#870728: mc: random crash on startup (SIGSEGV)

2017-08-04 Thread Yury V. Zaytsev

On Fri, 4 Aug 2017, Denis Briand wrote:


Step to reproduce:
$ MALLOC_PERTURB_=254 MALLOC_CHECK_=2 mc
then during the startup: hit "Ctrl+x g" keys combo

Important: you must hit "Ctrl+x g" keys before the panels are displayed.
To reproduce this long start, you have got 370 sub-directories in the
current directory.


Hi Denis,

Ah, so you've got a reproducer, great! Are you able to reproduce it on git 
master? Can you easily switch ASan on?


--
Sincerely yours,
Yury V. Zaytsev



Bug#870728: [Pkg-mc-devel] Bug#870728: mc: random crash on startup (SIGSEGV)

2017-08-04 Thread Yury V. Zaytsev

Hi Paul,

Thanks for the report! To be very honest, I'm not sure how actionable it 
is though :-/


The most obvious reason for the segfault would have been that somebody has 
freed `view->filename_vpath` without setting it to NULL, but a quick grep 
for `->filename_vpath` doesn't reveal anything suspicious :-(


So this makes me think that maybe something else corrupted the memory 
indirectly, but then I can't see how this can be reasonably investigated 
without a sanitizer like report or even better a reproducer...


--
Sincerely yours,
Yury V. Zaytsev



Bug#870728: mc: random crash on startup (SIGSEGV)

2017-08-04 Thread Denis Briand
severity 870728 minor
forwarded 870728 https://midnight-commander.org/ticket/3846
thanks

Thank you Paul.
As I said on IRC, it is a very particular case.
So I switch this bug to minor and I forward it to upstream.

To summarize here our IRC chat:

Step to reproduce:
$ MALLOC_PERTURB_=254 MALLOC_CHECK_=2 mc
then during the startup: hit "Ctrl+x g" keys combo

Important: you must hit "Ctrl+x g" keys before the panels are displayed.
To reproduce this long start, you have got 370 sub-directories in the
current directory.

regards

Denis Briand


pgpveE8t5APL0.pgp
Description: PGP signature


Bug#870728: mc: random crash on startup (SIGSEGV)

2017-08-04 Thread Paul Wise
On Fri, 2017-08-04 at 12:51 -0400, Paul Wise wrote:

> I should note that I have these variables set. I expect the
> 0xfefefefefefefefe in the traceback is from the second one.

This is confirmed:

$ strings 
/var/crash/1000/5115-1000-1000-11-1501856299-chianamo--usr-bin-mc.core | grep 
MALLOC
MALLOC_PERTURB_=254
MALLOC_CHECK_=2

-- 
bye,
pabs

https://wiki.debian.org/PaulWise


signature.asc
Description: This is a digitally signed message part


Bug#870728: mc: random crash on startup (SIGSEGV)

2017-08-04 Thread Paul Wise
On Fri, 2017-08-04 at 10:53 -0400, Paul Wise wrote:

> I got a random crash when starting mc:

I should note that I have these variables set. I expect the
0xfefefefefefefefe in the traceback is from the second one.

# 
https://www.gnu.org/software/libc/manual/html_node/Heap-Consistency-Checking.html
export MALLOC_CHECK_=2
# http://udrepper.livejournal.com/11429.html
export MALLOC_PERTURB_=$((RANDOM % 255 + 1))

-- 
bye,
pabs

https://wiki.debian.org/PaulWise


signature.asc
Description: This is a digitally signed message part


Bug#870728: mc: random crash on startup (SIGSEGV)

2017-08-04 Thread Paul Wise
Package: mc
Version: 3:4.8.18-1
Severity: normal
Usertags: crash

I got a random crash when starting mc:

$ gdb -batch -n -ex 'set pagination off' -ex bt -ex 'thread apply all bt full' 
--core /var/crash/1000/5115-1000-1000-11-1501856299-chianamo--usr-bin-mc.core 
/usr/bin/mc
[New LWP 5115]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `mc -b'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x007962f9acd5 in vfs_path_elements_count 
(vpath=vpath@entry=0xfefefefefefefefe) at path.c:797
797 path.c: No such file or directory.
#0  0x007962f9acd5 in vfs_path_elements_count 
(vpath=vpath@entry=0xfefefefefefefefe) at path.c:797
#1  0x007962f9b1a5 in vfs_path_free (vpath=0xfefefefefefefefe) at path.c:954
#2  0x007963007e69 in mcview_done (view=view@entry=0x79634d3000) at 
lib.c:228
#3  0x007963001c7e in mcview_hook (v=0x79634d3000) at actions_cmd.c:211
#4  0x007962f7f7ae in execute_hooks (hook_list=) at hook.c:89
#5  0x007962f79735 in frontend_dlg_run (h=0x79634d2030) at dialog.c:525
#6  dlg_run (h=0x79634d2030) at dialog.c:1196
#7  0x007962f922cf in create_panels_and_run_mc () at midnight.c:952
#8  do_nc () at midnight.c:1768
#9  0x007962f6da80 in main (argc=, argv=) at 
main.c:403

Thread 1 (Thread 0x7f70c5dc1740 (LWP 5115)):
#0  0x007962f9acd5 in vfs_path_elements_count 
(vpath=vpath@entry=0xfefefefefefefefe) at path.c:797
No locals.
#1  0x007962f9b1a5 in vfs_path_free (vpath=0xfefefefefefefefe) at path.c:954
vpath_element_index = 0
#2  0x007963007e69 in mcview_done (view=view@entry=0x79634d3000) at 
lib.c:228
No locals.
#3  0x007963001c7e in mcview_hook (v=0x79634d3000) at actions_cmd.c:211
view = 0x79634d3000
panel = 0x79634da000
v = 0x79634d3000
view = 0x79634d3000
#4  0x007962f7f7ae in execute_hooks (hook_list=) at hook.c:89
new_hook = 0x79634f2690
p = 0x79634f2690
#5  0x007962f79735 in frontend_dlg_run (h=0x79634d2030) at dialog.c:525
d_key = 
wh = 0x79634d2030
event = {buttons = 0 '\000', modifiers = 0 '\000', vc = 0, dx = 0, dy = 
0, x = -1, y = 25335, type = (GPM_MOVE | GPM_UP | GPM_SINGLE | GPM_DOUBLE | 
GPM_TRIPLE), clicks = 1666068480, margin = (GPM_TOP | GPM_RGT | unknown: 112), 
wdx = -12544, wdy = 25380}
#6  dlg_run (h=0x79634d2030) at dialog.c:1196
No locals.
#7  0x007962f922cf in create_panels_and_run_mc () at midnight.c:952
No locals.
#8  do_nc () at midnight.c:1768
ret = 
#9  0x007962f6da80 in main (argc=, argv=) at 
main.c:403
mcerror = 0x0
config_migrated = 
config_migrate_msg = 0x7f70c461313c <_int_malloc+1228> 
"\351E\376\377\377\017\037\200"
exit_code = 1

-- System Information:
Debian Release: buster/sid
  APT prefers testing-debug
  APT policy: (900, 'testing-debug'), (900, 'testing'), (800, 
'unstable-debug'), (800, 'unstable'), (790, 'buildd-unstable'), (700, 
'experimental-debug'), (700, 'experimental'), (690, 'buildd-experimental')
Architecture: amd64 (x86_64)

Kernel: Linux 4.11.0-2-amd64 (SMP w/4 CPU cores)
Locale: LANG=en_AU.utf8, LC_CTYPE=en_AU.utf8 (charmap=UTF-8), 
LANGUAGE=en_AU.utf8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)

Versions of packages mc depends on:
ii  e2fslibs  1.43.4-2
ii  libc6 2.24-12
ii  libglib2.0-0  2.52.3-1
ii  libgpm2   1.20.4-6.2+b1
ii  libslang2 2.3.1a-1
ii  libssh2-1 1.8.0-1
ii  mc-data   3:4.8.18-1

Versions of packages mc recommends:
ii  mime-support  3.60
ii  perl  5.26.0-5
ii  unzip 6.0-21

Versions of packages mc suggests:
pn  arj  
ii  bzip21.0.6-8.1
pn  dbview   
pn  djvulibre-bin
ii  evince [pdf-viewer]  3.22.1-4
ii  file 1:5.30-1
ii  genisoimage  9:1.1.11-3+b2
pn  gv   
ii  imagemagick  8:6.9.7.4+dfsg-16
ii  imagemagick-6.q16 [imagemagick]  8:6.9.7.4+dfsg-16
pn  libaspell-dev
ii  lynx 2.8.9dev16-1
pn  odt2txt  
ii  poppler-utils0.48.0-2
ii  python   2.7.13-2
pn  python-boto  
ii  python-tz2017.2-2
ii  texlive-binaries 2017.20170613.44572-3
ii  w3m  0.5.3-34
ii  zip  3.0-11+b1

-- no debconf information

-- 
bye,
pabs

https://wiki.debian.org/PaulWise


signature.asc
Description: This is a digitally signed message part