Hi Paul,

We have a patch that you could try, it would be great if you could tell whether this fixes the problem for you:

http://midnight-commander.org/attachment/ticket/3846/3846_mcview_hook.patch

On Sat, 5 Aug 2017, Paul Wise wrote:

On Fri, 2017-08-04 at 23:20 +0200, Yury V. Zaytsev wrote:

The most obvious reason for the segfault would have been that somebody has freed `view->filename_vpath` without setting it to NULL, but a quick grep for `->filename_vpath` doesn't reveal anything suspicious
:-(

Discussion on IRC made us think that it was using memory that has been allocated but not initialised, because the MALLOC_PERTURB_=254 MALLOC_CHECK_=2 combination causes glibc to poison new memory allocations with 0xFE and if you then see 0xFE anywhere, that means that the program is buggy and interacting with uninitialised memory.

The initial report lacked a critical piece of information, and specifically that you have pressed "Ctrl+x q" during panel initialization and this caused the crash, and not merely observed a "random crash on startup".

This is the reason why I originally assumed that either the pointer wasn't set to NULL after getting freed, or something else overwrote it to point to unitialized memory.

The issue is easily reproducible for me and I think you could reproduce it by introducing a 5 second sleep call just before the panel setup.

So far, I wasn't able reproduce it, at least not that easily.

PS: you might be interested in the valgrind output for one of the sessions where I was able to press Ctrl+x q early enough. The log clearly shows that valgrind found a number of invalid reads.

https://people.debian.org/~pabs/tmp/valgrind-mc.gz

Yes, this log has been really useful and now it is more or less clear what's going on. Specifically, a dialog started processing an event you triggered by pressing "Ctrl+x q" before it was fully initialized, and this lead to the crash you observed. Instead, events should be ignored during the initialization phase, or queued for later processing.

A similar issue with another dialog (much easier to reproduce) has been fixed in 4.8.19 as I said earlier. Unfortunately, such issues haven't been consistently considered by past developers, so many more of the same type of problems might be lurking around the widget system.

PS: I would suggest running the mc code through some static analysis tools, either via check-all-the-things or by manually running the tools listed in these two configuration files:

I think that you are misinterpreting your valgrind report. All of the very many invalid reads you observed are caused by the same underlying cause, and specifically you triggering an event processing before the dialog has been fully initialized. If you run mc under valgrind operating normally which we routinely do, the report should be clean.

And yes, we do use static analysis tools like cppcheck, experimental compiler warnings (which have been actually much more useful than the former), Google Sanitizers and the like.

If you are fine with using proprietary services to improve free code, you could look at Coverity. You could also get Google to do some fuzzing of the mc codebase.

Indeed, we have been using Coverity for the last 5 years or so. In as far as fuzzing is concerned, unfortunately, mc hasn't been programmed to be robust against invalid inputs to begin with and is written in a very unforgiving language, so a lot of work is still to be done for fuzzing to be actually useful instead of simply revealing the truth above.

--
Sincerely yours,
Yury V. Zaytsev

Reply via email to