Re: [question] valgrind stack trace under Native / i965

2018-03-12 Thread Jonas Ådahl
On Tue, Feb 20, 2018 at 08:39:56AM +0100, Cyrille Chépélov wrote:
> Hi,
> 
> While working to improve my reports on
> https://gitlab.gnome.org/GNOME/mutter/issues/18, 19 and 31, I've tried to
> run gnome-shell under valgrind.
> 
> Jonas Ådahl's suggestion in
> https://gitlab.gnome.org/GNOME/mutter/issues/31#note_56382 was good, and
> I've successfully reached a (quick but permanent) crash situation, even when
> running in non-hybrid GPU mode.
> 
> I've written a rather liberal suppressions file in order to ignore all
> writes made to memory mapped from the i965 GPU (attached here, fwiw).
> 
> The crash location is very reliably in "st_theme_get_custom_stylesheets
> (st-theme.c:311)" as called from JavaScript code. Unfortunately,
> dump_gjs_stack_on_signal_handler() dies trying to report the JS stack.
> 
> It seems that st_theme_get_custom_stylesheets() at
> https://gitlab.gnome.org/GNOME/gnome-shell/blob/master/src/st/st-theme.c#L311
> is attempting to check the type of an object which has already been
> freed (the actual place of allocation/deallocation varies from run to
> run, but here are a few samples). Upon closer inspection, it actually
> looks like the nodes of the themes->stylesheets g_hash_table got
> overwritten at some point.
> 
> My question is, what is the best course of action here?
> 
>  * non-trivially fixable valgrind false positive ⇒ drop?

Report bugs. But first, you could try with a newer gjs, and with the
gnome-shell commits in https://gitlab.gnome.org/GNOME/gnome-shell, as
they might fix some.

>  * mutter issue ⇒ report in the mutter project?

Yes.

>  * gnome-shell issue ⇒ report in the gnome-shell project?

Yes.

> 
> (my goal is to attempt to get meaningful clues to #18, #19 and possibly #31
> in order to help with the search for a solution)
> 
> Thanks in advance!

Thanks yourself,


Jonas

> 
>     -- Cyrille
> 
> ———
> 
> ==6345== Invalid read of size 8
> ==6345==    at 0x5662C07: g_type_check_instance_is_fundamentally_a (gtype.c:4023)
> ==6345==    by 0x5641A7D: g_object_ref (gobject.c:3204)
> ==6345==    by 0x7ED7EFC: *st_theme_get_custom_stylesheets (st-theme.c:311)*
> ==6345==    by 0xB62FFCD: ffi_call_unix64 (in /usr/lib/x86_64-linux-gnu/libffi.so.6.0.4)
> ==6345==    by 0xB62F93E: ffi_call (in /usr/lib/x86_64-linux-gnu/libffi.so.6.0.4)
> ==6345==    by 0x6917ED7: ??? (in /usr/lib/libgjs.so.0.0.0)
> ==6345==    by 0x69197B3: ??? (in /usr/lib/libgjs.so.0.0.0)
> ==6345==    by 0xEB8AFDB: ??? (in /usr/lib/x86_64-linux-gnu/libmozjs-52.so.0.0.0)
> ==6345==    by 0xEB7E086: ??? (in /usr/lib/x86_64-linux-gnu/libmozjs-52.so.0.0.0)
> ==6345==    by 0xEB8A845: ??? (in /usr/lib/x86_64-linux-gnu/libmozjs-52.so.0.0.0)
> ==6345==    by 0xEB8AE1E: ??? (in /usr/lib/x86_64-linux-gnu/libmozjs-52.so.0.0.0)
> ==6345==    by 0xEB8B0F8: ??? (in /usr/lib/x86_64-linux-gnu/libmozjs-52.so.0.0.0)
> ==6345==  Address *0x2215f0a0* is 0 bytes inside a block of size 32 free'd
> ==6345==    at 0x4C2E2BB: operator delete(void*) (vg_replace_malloc.c:576)
> ==6345==    by 0x6926891: ??? (in /usr/lib/libgjs.so.0.0.0)
> ==6345==    by 0x566577F: g_value_unset (gvalue.c:275)
> ==6345==    by 0x56441FB: g_object_new_valist (gobject.c:2123)
> ==6345==    by 0x5644798: g_object_new (gobject.c:1640)
> ==6345==    by 0x6C3C120: create_child_meta (clutter-container.c:933)
> ==6345==    by 0x6C22D64: clutter_actor_add_child_internal (clutter-actor.c:12889)
> ==6345==    by 0x6C22D64: clutter_actor_add_child (clutter-actor.c:13024)
> ==6345==    by 0xB62FFCD: ffi_call_unix64 (in /usr/lib/x86_64-linux-gnu/libffi.so.6.0.4)
> ==6345==    by 0xB62F93E: ffi_call (in /usr/lib/x86_64-linux-gnu/libffi.so.6.0.4)
> ==6345==    by 0x6917ED7: ??? (in /usr/lib/libgjs.so.0.0.0)
> ==6345==    by 0x69197B3: ??? (in /usr/lib/libgjs.so.0.0.0)
> ==6345==    by 0xEB8AFDB: ??? (in /usr/lib/x86_64-linux-gnu/libmozjs-52.so.0.0.0)
> ==6345==  Block was alloc'd at
> ==6345==    at 0x4C2D1FF: operator new(unsigned long) (vg_replace_malloc.c:334)
> ==6345==    by 0x6926D90: ??? (in /usr/lib/libgjs.so.0.0.0)
> ==6345==    by 0x5641AF6: g_object_ref (gobject.c:3210)
> ==6345==    by 0x5641BA7: g_value_object_collect_value (gobject.c:3832)
> ==6345==    by 0x56443CA: g_object_new_valist (gobject.c:2106)
> ==6345==    by 0x5644798: g_object_new (gobject.c:1640)
> ==6345==    by 0x6C3C120: create_child_meta (clutter-container.c:933)
> ==6345==    by 0x6C22D64: clutter_actor_add_child_internal (clutter-actor.c:12889)
> ==6345==    by 0x6C22D64: clutter_actor_add_child (clutter-actor.c:13024)
> ==6345==    by 0xB62FFCD: ffi_call_unix64 (in /usr/lib/x86_64-linux-gnu/libffi.so.6.0.4)
> ==6345==    by 0xB62F93E: ffi_call (in /usr/lib/x86_64-linux-gnu/libffi.so.6.0.4)
> ==6345==    by 0x6917ED7: ??? (in /usr/lib/libgjs.so.0.0.0)
> ==6345==    by 0x69197B3: ??? (in /usr/lib/libgjs.so.0.0.0)
> 
> 
> Another run:
> 
> ==7769== Invalid read of size 8
> ==7769==    at 0x5662C0F: g_type_chec

[question] valgrind stack trace under Native / i965

2018-02-19 Thread Cyrille Chépélov

Hi,

While working to improve my reports on 
https://gitlab.gnome.org/GNOME/mutter/issues/18, 19 and 31, I've tried 
to run gnome-shell under valgrind.


Jonas Ådahl's suggestion in 
https://gitlab.gnome.org/GNOME/mutter/issues/31#note_56382 was good, and 
I've successfully reached a (quick but permanent) crash situation, even 
when running in non-hybrid GPU mode.


I've written a rather liberal suppressions file in order to ignore all 
writes made to memory mapped from the i965 GPU (attached here, fwiw).
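
For readers without the attachment, an entry in such a suppressions file could look roughly like the sketch below. This is not the attached file: the entry name and the driver object name i965_dri.so are assumptions made for illustration, and Addr8 only covers invalid 8-byte accesses, so a real file would likely need one entry per access size.

```
{
   hypothetical_i965_mapped_access
   Memcheck:Addr8
   ...
   obj:*/i965_dri.so
}
```

The `...` line matches any number of frames, so the entry applies to any invalid 8-byte read or write whose stack passes through the named object. A file like this is passed to valgrind with --suppressions=<file>, and --gen-suppressions=all prints ready-made entries for each reported error.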


The crash location is very reliably in "st_theme_get_custom_stylesheets 
(st-theme.c:311)" as called from JavaScript code. Unfortunately, 
dump_gjs_stack_on_signal_handler() dies trying to report the JS stack.


It seems that st_theme_get_custom_stylesheets() at 
https://gitlab.gnome.org/GNOME/gnome-shell/blob/master/src/st/st-theme.c#L311 
is attempting to check the type of an object which has already been 
freed (the actual place of allocation/deallocation varies from run to 
run, but here are a few samples). Upon closer inspection, it actually 
looks like the nodes of the themes->stylesheets g_hash_table got 
overwritten at some point.
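
To make the suspected failure mode concrete: a container that keeps a pointer to a reference-counted object without holding its own reference is left with a dangling pointer once the last owner drops the object. The sketch below shows the usual safe pattern in plain C with a toy refcount. Sheet, Table and every function name here are hypothetical stand-ins, not GObject or st-theme.c code, and this is not a diagnosis of the actual bug.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a reference-counted object (not GObject). */
typedef struct {
    int refcount;
} Sheet;

static int live_sheets = 0;   /* number of Sheets not yet freed */

static Sheet *sheet_new(void)
{
    Sheet *s = malloc(sizeof *s);
    if (s == NULL)
        abort();
    s->refcount = 1;          /* the creator owns the first reference */
    live_sheets++;
    return s;
}

static Sheet *sheet_ref(Sheet *s)
{
    s->refcount++;
    return s;
}

static void sheet_unref(Sheet *s)
{
    if (--s->refcount == 0) {
        live_sheets--;
        free(s);
    }
}

/* Hypothetical one-slot "table" standing in for a hash table of sheets. */
typedef struct {
    Sheet *slot;
} Table;

/* The table takes its own reference, so its entry stays valid
 * no matter when the caller drops the reference it holds. */
static void table_store(Table *t, Sheet *s)
{
    t->slot = sheet_ref(s);
}

static void table_clear(Table *t)
{
    if (t->slot != NULL) {
        sheet_unref(t->slot);
        t->slot = NULL;
    }
}

/* Returns how many sheets are alive while the table still holds its
 * entry, after the creator has already dropped its own reference. */
static int demo_live_after_creator_unref(void)
{
    Table t = { NULL };
    Sheet *s = sheet_new();
    int alive;

    table_store(&t, s);
    sheet_unref(s);                 /* creator is done with the sheet */
    alive = live_sheets;            /* the table's reference keeps it alive */
    assert(t.slot->refcount == 1);
    table_clear(&t);                /* last reference dropped, sheet freed */
    return alive;
}
```

If table_store() kept the bare pointer instead of calling sheet_ref(), the sheet_unref() in the demo would free the object while the table still pointed at it, which is the kind of access memcheck reports as an invalid read inside a free'd block.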


My question is, what is the best course of action here?

 * non-trivially fixable valgrind false positive ⇒ drop?
 * mutter issue ⇒ report in the mutter project?
 * gnome-shell issue ⇒ report in the gnome-shell project?

(my goal is to attempt to get meaningful clues to #18, #19 and possibly 
#31 in order to help with the search for a solution)


Thanks in advance!

    -- Cyrille

———

==6345== Invalid read of size 8
==6345==    at 0x5662C07: g_type_check_instance_is_fundamentally_a (gtype.c:4023)
==6345==    by 0x5641A7D: g_object_ref (gobject.c:3204)
==6345==    by 0x7ED7EFC: *st_theme_get_custom_stylesheets (st-theme.c:311)*
==6345==    by 0xB62FFCD: ffi_call_unix64 (in /usr/lib/x86_64-linux-gnu/libffi.so.6.0.4)
==6345==    by 0xB62F93E: ffi_call (in /usr/lib/x86_64-linux-gnu/libffi.so.6.0.4)
==6345==    by 0x6917ED7: ??? (in /usr/lib/libgjs.so.0.0.0)
==6345==    by 0x69197B3: ??? (in /usr/lib/libgjs.so.0.0.0)
==6345==    by 0xEB8AFDB: ??? (in /usr/lib/x86_64-linux-gnu/libmozjs-52.so.0.0.0)
==6345==    by 0xEB7E086: ??? (in /usr/lib/x86_64-linux-gnu/libmozjs-52.so.0.0.0)
==6345==    by 0xEB8A845: ??? (in /usr/lib/x86_64-linux-gnu/libmozjs-52.so.0.0.0)
==6345==    by 0xEB8AE1E: ??? (in /usr/lib/x86_64-linux-gnu/libmozjs-52.so.0.0.0)
==6345==    by 0xEB8B0F8: ??? (in /usr/lib/x86_64-linux-gnu/libmozjs-52.so.0.0.0)
==6345==  Address *0x2215f0a0* is 0 bytes inside a block of size 32 free'd
==6345==    at 0x4C2E2BB: operator delete(void*) (vg_replace_malloc.c:576)
==6345==    by 0x6926891: ??? (in /usr/lib/libgjs.so.0.0.0)
==6345==    by 0x566577F: g_value_unset (gvalue.c:275)
==6345==    by 0x56441FB: g_object_new_valist (gobject.c:2123)
==6345==    by 0x5644798: g_object_new (gobject.c:1640)
==6345==    by 0x6C3C120: create_child_meta (clutter-container.c:933)
==6345==    by 0x6C22D64: clutter_actor_add_child_internal (clutter-actor.c:12889)
==6345==    by 0x6C22D64: clutter_actor_add_child (clutter-actor.c:13024)
==6345==    by 0xB62FFCD: ffi_call_unix64 (in /usr/lib/x86_64-linux-gnu/libffi.so.6.0.4)
==6345==    by 0xB62F93E: ffi_call (in /usr/lib/x86_64-linux-gnu/libffi.so.6.0.4)
==6345==    by 0x6917ED7: ??? (in /usr/lib/libgjs.so.0.0.0)
==6345==    by 0x69197B3: ??? (in /usr/lib/libgjs.so.0.0.0)
==6345==    by 0xEB8AFDB: ??? (in /usr/lib/x86_64-linux-gnu/libmozjs-52.so.0.0.0)
==6345==  Block was alloc'd at
==6345==    at 0x4C2D1FF: operator new(unsigned long) (vg_replace_malloc.c:334)
==6345==    by 0x6926D90: ??? (in /usr/lib/libgjs.so.0.0.0)
==6345==    by 0x5641AF6: g_object_ref (gobject.c:3210)
==6345==    by 0x5641BA7: g_value_object_collect_value (gobject.c:3832)
==6345==    by 0x56443CA: g_object_new_valist (gobject.c:2106)
==6345==    by 0x5644798: g_object_new (gobject.c:1640)
==6345==    by 0x6C3C120: create_child_meta (clutter-container.c:933)
==6345==    by 0x6C22D64: clutter_actor_add_child_internal (clutter-actor.c:12889)
==6345==    by 0x6C22D64: clutter_actor_add_child (clutter-actor.c:13024)
==6345==    by 0xB62FFCD: ffi_call_unix64 (in /usr/lib/x86_64-linux-gnu/libffi.so.6.0.4)
==6345==    by 0xB62F93E: ffi_call (in /usr/lib/x86_64-linux-gnu/libffi.so.6.0.4)
==6345==    by 0x6917ED7: ??? (in /usr/lib/libgjs.so.0.0.0)
==6345==    by 0x69197B3: ??? (in /usr/lib/libgjs.so.0.0.0)


Another run:

==7769== Invalid read of size 8
==7769==    at 0x5662C0F: g_type_check_instance_is_fundamentally_a (gtype.c:4025)
==7769==    by 0x5641A7D: g_object_ref (gobject.c:3204)
==7769==    by 0x7ED7EFC: *st_theme_get_custom_stylesheets (st-theme.c:311)*
==7769==    by 0xB62FFCD: ffi_call_unix64 (in /usr/lib/x86_64-linux-gnu/libffi.so.6.0.4)
==7769==    by 0xB62F93E: ffi_call (in /usr/lib/x86_64-linux-gnu/libffi.so.6.0.4)
==7769==    by 0x6917ED7: ??? (in /usr/lib/libgjs.so.0.0.0)
==7769==    by 0x69197B3: ??? (in /usr/lib/li