Hi,

Thank you again.

I will hopefully upload the requested info next week. Here is what I can write down today.
What would be an appropriate upload service? [The data would be too large to e-mail to the list.]

On 2017/02/19 7:32, John Reiser wrote:
> How many failures occur in 10 runs of thunderbird under valgrind?

10 times, i.e., every time, under Debian's stock newer kernel.

> How many failures occur in 10 runs if you reboot just before each run?

It never occurred to me to reboot the system before retrying. I will check this next week. (Given the tests I have done over the last few months by SWITCHING kernel versions, which meant rebooting into a different revision, I would say 10 times, i.e., every time, but again let me check.)

>
> Thunderbird is a user mail agent that uses interactive graphics.
> How many failures occur before the display window appears, and how many after?

There is one issue: I am seeing a failure of valgrind when I try to run the thunderbird test suite. The complicating factor is that, aside from the user interaction available through the GUI under X windows, during the execution of the |make mozmill| test suite there is a daemon that runs test scripts and talks to the main TB binary via a COM interface.
[I stay away from the keyboard and mouse during tests to avoid interfering with the test suite run. I do this by invoking a virtual X desktop using Xephyr: the test suite run under valgrind is done in that virtual desktop. If I wanted to, I COULD interact with thunderbird's GUI via the mouse explicitly. I did this a few times when a bug in thunderbird or the test scripts made the execution hang waiting for confirmation of a modal dialog, etc.]

From what I have seen, the crash always occurs before the display window of the tested thunderbird appears [that is, in every run where valgrind printed the mysterious segmentation error under the newer Debian kernel].

> Are the symptoms and frequency the same for a Radeon card as for NVidia?
> On the open-source NVidia driver versus the proprietary driver?
> In "dumb framebuffer" mode ("no" acceleration)?
> Please tell us which cards: "lspci -nn | grep VGA" or similar.

I am using Debian GNU/Linux inside VirtualBox installed under Windows 10 as a platform to develop and test thunderbird patches; Debian GNU/Linux is installed as the guest OS inside VirtualBox. So the video graphics driver relevant here is the VirtualBox video driver, I think. Is that correct? (But there was a puzzling message in Xorg.0.log. I will mention it in the answer to your second-to-last question.)

Under the 3.19.5 kernel, where the valgrind + thunderbird test suite works:

$ lspci -nn | grep VGA
00:02.0 VGA compatible controller [0300]: InnoTek Systemberatung GmbH VirtualBox Graphics Adapter [80ee:beef]
ishikawa@ip030:/KERNEL-SRC/kernel/linux-source-4.9$

(InnoTek is the name of the original VirtualBox developer.)

I am not sure if I can remove the above VirtualBox graphics adapter and revert to the plain VGA adapter emulation done by VirtualBox, but let me try.

> Are the symptoms and frequency the same for Firefox as for thunderbird?

I am not developing or creating patches for Firefox. Sorry.

> Are the symptoms and frequency the same for Chrome as for thunderbird?

Ditto.
Oh, do you mean to ask whether I can run a very simple "valgrind firefox-binary" (without any test harness involvement) under the new kernel and see whether it works? Then I can test it. As for Chrome, I have never even installed it.

> Please present a histogram of the {mapped file, pc offset, instruction stream}
> when the SIGSEGV happens. [You should have at least 70 runs by now: 10 each
> for thunderbird plain, with reboot, other graphics card, other NVidia driver,
> dumb framebuffer, Firefox, Chrome.]

OK, I will gather the data (I am not sure what you mean by "histogram", but I will gather what I think is relevant).

10 each

for thunderbird plain, with reboot [I will certainly reboot before each test run], x 10 times with the above InnoTek driver (built in for VirtualBox).
[I am not sure if SIGSEGV happens under this setup.]

for thunderbird + test suite hookup.
I am quite certain that SIGSEGV happens under this setup.

BTW, DOES ANYONE HAVE A GOOD IDEA ABOUT HOW TO CAPTURE the mapped file, etc. WHEN SIGSEGV happens? It is very dynamic, and by the time I am ready to type in shell commands, the child binary that experienced it may be gone. Indeed, I have not been able to figure out exactly which process under the test suite setup started by thunderbird (under valgrind) is experiencing the difficulty. I guess some clever hacking via gdb would get me started there?

BTW, valgrind's --gdb-* options are meant to debug the target running under valgrind, NOT a segfault of valgrind itself, correct?

[That the whole thing, including valgrind, works under kernel 3.19.5 but not under later kernels drives me crazy.]

> other graphics card, other NVidia driver,

These won't apply.

for thunderbird plain, dumb framebuffer [IF THIS SETUP IS FEASIBLE under VirtualBox.] after reboot

for thunderbird + test suite hookup, dumb framebuffer [IF THIS SETUP IS FEASIBLE under VirtualBox.] after reboot

> Firefox,

I think that, without any test suite hookup or anything, I can simply run the Firefox ESR now available from the Debian GNU/Linux repository. I suspect that without any test suite hookup, it will run. Anyway, I will try to compare the mmap status under firefox with the stock VirtualBox graphics driver, and the mmap status under firefox with the dumb framebuffer [IF THIS IS FEASIBLE.] after reboot.

> Chrome.

It looks like there is a package of Chrome for Ubuntu. Maybe I can install it under Debian. However, this can wait, I think. At the same time, it would be very instructive to compare the mmap status while chrome is running [AFTER REBOOT] with the mmap status while mozilla software {thunderbird, firefox} is running.

> thunderbird is not available from the Debian stable "jessie" repository
> (Debian 8.7.1, 2017-01-20.) Where did you get it?

Sorry I was not clear about it.
I have fetched the so-called comm-central thunderbird repository and have been building it locally [64-bit] for testing purposes, to fix some serious bugs I experienced.
The instructions for building thunderbird locally are at the following URL, and I have basically followed them.

https://developer.mozilla.org/en-US/docs/Mozilla/Developer_guide/Build_Instructions/Simple_Thunderbird_build

"Basically" means that I had to tweak the so-called "mozconfig" in many ways, especially to enable a valgrind-friendly build. A very brief explanation is at the following URL:

https://developer.mozilla.org/en-US/docs/Mozilla/Testing/Valgrind

The above refers to the |mochitest| test for firefox. Since thunderbird lives in a different source directory and uses a very different test suite setup based on mozmill, there are quirks and modifications one needs to add to the source files and scripts in order to run thunderbird under valgrind. It seems that, at one time, somebody hacked the thunderbird test suite to run valgrind/memcheck for thunderbird, but it was abandoned and nobody seems to recall exactly how it was done or how to update the scripts, etc.

So basically, what I do myself to run thunderbird under valgrind is
- renaming the original thunderbird binary to something else, and
- in its place, placing a binary that invokes the original thunderbird binary under valgrind/memcheck with the supplied parameters.

This trick has worked very well, and many bugs/issues were found over several years until 2015, when I first experienced the strange problem of valgrind failure. Back then, I realized it was related to the kernel version. The locally built kernel 3.19.5 saved the day. But the world has moved on to the 4.x series kernels since then, and when I updated the kernel last summer this problem reappeared. I have reverted the kernel to 3.19.5 for the moment, but I am not sure how long I can stick with the older kernel.

If you need a thunderbird binary to test on your end, I can certainly make it available.
Actually, I occasionally run the test (without valgrind) inside mozilla's compilation/testing farm. [This makes it possible for me to compile/test the OSX version and the Windows version. This is a necessary step before a patch is accepted into mozilla's source tree.] You can fetch the binary from there. Please let me know if this is the case.

> Which kernel modules have been loaded (lsmod)?

Under 3.19.5:

ishikawa@ip030:/KERNEL-SRC/kernel/linux-source-4.9$ uname -a
Linux ip030 3.19.5 #1 SMP Mon Apr 20 08:50:21 JST 2015 x86_64 GNU/Linux
ishikawa@ip030:/KERNEL-SRC/kernel/linux-source-4.9$ lsmod
Module                  Size  Used by
fuse                   72030  1
btrfs                 731518  0
xor                    21081  1 btrfs
raid6_pq               95431  1 btrfs
ufs                    59011  0
qnx4                   13100  0
hfsplus                81692  0
hfs                    45988  0
minix                  27622  0
ntfs                  160179  0
vfat                   17270  0
msdos                  17077  0
fat                    50634  2 vfat,msdos
jfs                   137440  0
xfs                   667205  0
libcrc32c              12426  1 xfs
ext3                  151975  0
jbd                    52800  1 ext3
ext2                   59160  0
dm_mod                 77808  0
vboxsf                 37355  1
mptctl                 29762  0
mptbase                56835  1 mptctl
binfmt_misc            12846  1
ghash_clmulni_intel    13019  0
aesni_intel           163983  0
ppdev                  12724  0
joydev                 17107  0
iTCO_wdt               12831  0
iTCO_vendor_support    12704  1 iTCO_wdt
aes_x86_64             16719  1 aesni_intel
ablk_helper            12572  1 aesni_intel
cryptd                 14600  3 ghash_clmulni_intel,aesni_intel,ablk_helper
lrw                    12871  1 aesni_intel
evdev                  17518  14
gf128mul               13047  1 lrw
glue_helper            12773  1 aesni_intel
microcode              30394  0
snd_intel8x0           30885  2
psmouse                83740  0
serio_raw              12894  0
pcspkr                 12595  0
snd_ac97_codec        102547  1 snd_intel8x0
snd_pcm                73065  2 snd_ac97_codec,snd_intel8x0
snd_timer              22641  1 snd_pcm
snd                    53213  8 snd_ac97_codec,snd_intel8x0,snd_timer,snd_pcm
soundcore              13031  1 snd
sg                     29968  0
ac97_bus               12510  1 snd_ac97_codec
processor              28021  0
lpc_ich                20905  0
mfd_core               12601  1 lpc_ich
video                  18144  0
rng_core               12880  0
vboxvideo              36417  2
vboxguest             181315  6 vboxsf,vboxvideo
thermal_sys            28310  2 video,processor
ttm                    61967  1 vboxvideo
drm_kms_helper         74527  1 vboxvideo
drm                   229484  5 ttm,drm_kms_helper,vboxvideo
i2c_piix4              12665  0
i2c_core               38003  3 drm,i2c_piix4,drm_kms_helper
syscopyarea            12350  1 vboxvideo
sysfillrect            12522  1 vboxvideo
sysimgblt              12351  1 vboxvideo
ac                     12715  0
battery                13356  0
parport_pc             22422  0
parport                31812  2 ppdev,parport_pc
button                 12988  0
sunrpc                192012  1
loop                   22596  0
ip_tables              22004  0
x_tables               19034  1 ip_tables
autofs4                27584  2
ext4                  403601  15
crc16                  12343  1 ext4
jbd2                   71809  1 ext4
mbcache                13488  3 ext2,ext3,ext4
sd_mod                 39859  26
sr_mod                 21993  0
cdrom                  27042  1 sr_mod
ata_generic            12490  0
hid_generic            12393  0
usbhid                 40671  0
hid                    90268  2 hid_generic,usbhid
ohci_pci               12808  0
ehci_pci               12472  0
ohci_hcd               30951  1 ohci_pci
ehci_hcd               40790  1 ehci_pci
crc32c_intel           21850  4
ahci                   29245  16
usbcore               151644  5 ohci_hcd,ohci_pci,ehci_hcd,ehci_pci,usbhid
libahci                23158  1 ahci
usb_common             12440  1 usbcore
ata_piix               29671  0
libata                145717  4 ahci,libahci,ata_generic,ata_piix
scsi_mod              172107  5 sg,libata,mptctl,sd_mod,sr_mod
e1000                  90595  0
ishikawa@ip030:/KERNEL-SRC/kernel/linux-source-4.9$

I did not realize there are so many vbox drivers.

> Which version(s) of the low-level X11 and display drivers (DRM: direct
> rendering manager) are in use?
Under 3.19.5,

egrep -i "(module|vbox|drm)" /var/log/Xorg.0.log

printed out:

[ 8.651] (==) ModulePath set to "/usr/lib/xorg/modules"
[ 8.651] (II) Module ABI versions:
[ 8.652] (II) xfree86: Adding drm device (/dev/dri/card0)
[ 8.655] (II) LoadModule: "glx"
[ 8.658] (II) Loading /usr/lib/xorg/modules/extensions/libglx.so
[ 8.716] (II) Module glx: vendor="X.Org Foundation"
[ 8.716] compiled for 1.19.1, module version = 1.0.0
[ 8.716] (==) Matched vboxvideo as autoconfigured driver 0
[ 8.716] (==) Matched vboxvideo as autoconfigured driver 1
[ 8.716] (II) LoadModule: "vboxvideo"
[ 8.716] (WW) Warning, couldn't open module vboxvideo
[ 8.716] (II) UnloadModule: "vboxvideo"
[ 8.716] (II) Unloading vboxvideo
[ 8.716] (EE) Failed to load module "vboxvideo" (module does not exist, 0)
[ 8.716] (II) LoadModule: "modesetting"
[ 8.716] (II) Loading /usr/lib/xorg/modules/drivers/modesetting_drv.so
[ 8.717] (II) Module modesetting: vendor="X.Org Foundation"
[ 8.717] compiled for 1.19.1, module version = 1.19.1
[ 8.717] Module class: X.Org Video Driver
[ 8.717] (II) LoadModule: "fbdev"
[ 8.717] (II) Loading /usr/lib/xorg/modules/drivers/fbdev_drv.so
[ 8.717] (II) Module fbdev: vendor="X.Org Foundation"
[ 8.717] compiled for 1.19.0, module version = 0.4.4
[ 8.717] Module class: X.Org Video Driver
[ 8.717] (II) LoadModule: "vesa"
[ 8.717] (II) Loading /usr/lib/xorg/modules/drivers/vesa_drv.so
[ 8.717] (II) Module vesa: vendor="X.Org Foundation"
[ 8.717] compiled for 1.19.0, module version = 2.3.4
[ 8.717] Module class: X.Org Video Driver
[ 8.721] (II) Loading sub module "fbdevhw"
[ 8.721] (II) LoadModule: "fbdevhw"
[ 8.721] (II) Loading /usr/lib/xorg/modules/libfbdevhw.so
[ 8.722] (II) Module fbdevhw: vendor="X.Org Foundation"
[ 8.722] compiled for 1.19.1, module version = 0.0.2
[ 8.722] (II) Loading sub module "glamoregl"
[ 8.722] (II) LoadModule: "glamoregl"
[ 8.722] (II) Loading /usr/lib/xorg/modules/libglamoregl.so
[ 8.733] (II) Module glamoregl: vendor="X.Org Foundation"
[ 8.733] compiled for 1.19.1, module version = 1.0.0
[ 8.838] EGL_MESA_drm_image required.
[ 8.839] (II) modeset(0): Monitor name: VBOX monitor
[ 8.840] (II) Loading sub module "fb"
[ 8.840] (II) LoadModule: "fb"
[ 8.840] (II) Loading /usr/lib/xorg/modules/libfb.so
[ 8.840] (II) Module fb: vendor="X.Org Foundation"
[ 8.840] compiled for 1.19.1, module version = 1.0.0
[ 8.840] (II) UnloadModule: "fbdev"
[ 8.840] (II) UnloadSubModule: "fbdevhw"
[ 8.840] (II) UnloadModule: "vesa"
[ 8.916] (II) LoadModule: "libinput"
[ 8.916] (II) Loading /usr/lib/xorg/modules/input/libinput_drv.so
[ 8.919] (II) Module libinput: vendor="X.Org Foundation"
[ 8.919] compiled for 1.19.0, module version = 0.23.0
[ 8.919] Module class: X.Org XInput Driver

I am a little surprised, but right now I may be using the glx driver, given that the "vboxvideo" module does not seem to be loaded and the other well-known modules (fbdev, vesa) get unloaded.
Yes, I found that glxinfo printed out rows of output including the following lines, and glxgears seems to run fine. I should have known.

Re: glx:

glxinfo | grep -i1 vmware
Extended renderer info (GLX_MESA_query_renderer):
    Vendor: VMware, Inc. (0xffffffff)
    Device: llvmpipe (LLVM 3.9, 256 bits) (0xffffffff)
--
Max GLES[23] profile version: 3.0
OpenGL vendor string: VMware, Inc.
OpenGL renderer string: Gallium 0.4 on llvmpipe (L

I will collect info on the 4.9.0-1 kernel (this is the latest test kernel, under which I could not run the thunderbird test suite since something dies during execution). It may take a little time to gather the data. (Since compiling/testing thunderbird requires a lot of resources, I have only one instance of the VM running on the PC. So I really have to reboot this VM to switch the kernel to obtain data.)

I wish someone with 64GB of memory could retry and reproduce the issue in their VirtualBox images on their hardware :-)
It would be very instructive to compare the mmap usage, etc. under different kernel revisions side by side (!)
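A side-by-side mmap comparison of the kind wished for above could be sketched roughly as follows. This is only a minimal sketch in Python, assuming the contents of /proc/<pid>/maps for the process of interest have been saved to a text file once under each kernel. The function names (parse_maps, mapped_bytes_by_path, compare) are hypothetical helpers made up for illustration and are not part of valgrind or thunderbird.

```python
from collections import Counter


def parse_maps(text):
    """Parse the text of a saved /proc/<pid>/maps into (start, end, perms, path) tuples."""
    regions = []
    for line in text.splitlines():
        # Fields: address range, perms, offset, dev, inode, optional pathname.
        fields = line.split(None, 5)
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        start, end = (int(x, 16) for x in fields[0].split("-"))
        # Anonymous mappings have no pathname; label them "[anon]" (our own label).
        path = fields[5].strip() if len(fields) == 6 else "[anon]"
        regions.append((start, end, fields[1], path))
    return regions


def mapped_bytes_by_path(regions):
    """Sum the mapped size in bytes per pathname."""
    totals = Counter()
    for start, end, _perms, path in regions:
        totals[path] += end - start
    return totals


def compare(old_text, new_text):
    """Return (paths only in old, paths only in new, {path: (old_bytes, new_bytes)})."""
    old = mapped_bytes_by_path(parse_maps(old_text))
    new = mapped_bytes_by_path(parse_maps(new_text))
    only_old = sorted(set(old) - set(new))
    only_new = sorted(set(new) - set(old))
    changed = {p: (old[p], new[p]) for p in set(old) & set(new) if old[p] != new[p]}
    return only_old, only_new, changed
```

For example, with two saved snapshots (the file names here are made up), one could call compare(open("maps-3.19.5.txt").read(), open("maps-4.9.0.txt").read()) and look at which mapped files appear under only one kernel and which changed in total size. It does not by itself solve the problem of capturing the maps at the moment of the SIGSEGV.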
TIA

PS: Just in case the HOST CPU/OS has something to do with the issue:

OS: Windows 10 Pro
CPU: Intel Xeon CPU E3-1240 V2
Graphics: Radeon 7700

But I am sure that VirtualBox has shielded the bare metal rather well.
Windows version of VirtualBox: 5.1.14 r112924 (Qt5.6.2)

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, SlashDot.org! http://sdm.link/slashdot
_______________________________________________
Valgrind-users mailing list
Valgrind-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/valgrind-users