[Drizzle-discuss] Hints for some Valgrind issues

Kristian Nielsen Fri, 07 Aug 2009 02:02:54 -0700

Since I recently fixed some Valgrind errors in MariaDB similar to what Drizzle
is working on right now, I wanted to give a few hints as to what I did in case
it could be helpful. Sorry in advance that I don't have time to work on the
fixes myself...


-----------------------------------------------------------------------
First, the InnoDB ones:

==7096== 2,482,440 bytes in 128 blocks are still reachable in loss record 98 of 
99
==7096==    at 0x4C232CB: malloc (vg_replace_malloc.c:207)
==7096==    by 0x65DB3C: mem_heap_create_block (mem0mem.c:352)
==7096==    by 0x6DC461: hash0_create (mem0mem.ic:442)
==7096==    by 0x69DAB0: thr_local_init (thr0loc.c:241)
==7096==    by 0x6262DB: srv_boot (srv0srv.c:1011)
==7096==    by 0x6C746E: innobase_start_or_create_for_mysql (srv0start.c:1197)
==7096==    by 0x6186A3: innobase_init(PluginRegistry&) (ha_innodb.cc:1866)
==7096==    by 0x525810: plugin_initialize(drizzled::plugin::Handle*) 
(sql_plugin.cc:524)
==7096==    by 0x5287FA: plugin_init(int*, char**, int) (sql_plugin.cc:624)
==7096==    by 0x4114D2: init_server_components() (drizzled.cc:1356)
==7096==    by 0x4125CF: main (drizzled.cc:1551)

I think Drizzle is using the Oracle InnoDB plugin, right? Then it is probably
the same issue I had when I merged XtraDB (which is based on the Oracle
InnoDB plugin) into MariaDB.

The old InnoDB kept track of every allocation in some list or so, and freed
everything on that list at shutdown. There are lots of stuff allocated a
single time at startup that InnoDB doesn't bother to free explicitly at
shutdown.

Then the InnoDB plugin introduces --innodb-use-sys-malloc, which disables this
extra memory tracking. Since it is on by default, all the missing explicit
free() at exit comes to the surface.

I am not going to argue that the InnoDB behaviour is correct. The assumption
that everything is freed by the OS at shutdown fails now that storage engines
can be loaded and unloaded dynamically. And this can hide real memory leaks.

But I spent quite a bit of time trying to fix it before realising that this
was too big a task for me to take on. Would be nice if Oracle would fix it.

In the meantime, I concluded the best fix for MariaDB is to disable
--innodb-use-sys-malloc for Valgrind testing. The relevant commits are here:

    http://bazaar.launchpad.net/~maria-captains/maria/5.1/revision/2711.1.3
    http://bazaar.launchpad.net/~maria-captains/maria/5.1/revision/2711.1.4

Relevant patch snippits:

--- mysql-test/mysql-test-run.pl        2009-06-05 15:35:22 +0000
+++ mysql-test/mysql-test-run.pl        2009-06-18 12:39:21 +0000
@@ -1356,6 +1356,18 @@ sub command_line_setup {
               join(" ", @valgrind_args), "\"");
   }
 
+  # InnoDB does not bother to do individual de-allocations at exit. Instead it
+  # relies on a custom allocator to track every allocation, and frees all at
+  # once during exit.
+  # In XtraDB, an option use-sys-malloc is introduced (and on by default) to
+  # disable this (for performance). But this exposes Valgrind to all the
+  # missing de-allocations, so we need to disable it to at least get
+  # meaningful leak checking for the rest of the server.
+  if ($opt_valgrind_mysqld)
+  {
+    push(@opt_extra_mysqld_opt, "--loose-skip-innodb-use-sys-malloc");
+  }
+
   mtr_report("Checking supported features...");
 
   check_ndbcluster_support(\%mysqld_variables);

--- mysql-test/t/innodb-use-sys-malloc.test     2009-06-09 13:19:13 +0000
+++ mysql-test/t/innodb-use-sys-malloc.test     2009-06-18 12:39:21 +0000
@@ -1,4 +1,7 @@
 --source include/have_innodb.inc
+# XtraDB has lots of memory leak warnings at shutdown when
+# --innodb-use-sys-malloc
+--source include/not_valgrind.inc
 
 #display current value of innodb_use_sys_malloc
 SELECT @@GLOBAL.innodb_use_sys_malloc;

-----------------------------------------------------------------------

==7096== 416 bytes in 1 blocks are still reachable in loss record 95 of 99
==7096==    at 0x4C23B29: operator new(unsigned long) (vg_replace_malloc.c:230)
==7096==    by 0x4EA8A1B: 
google::protobuf::protobuf_BuildDesc_google_2fprotobuf_2fdescriptor_2eproto_AssignGlobalDescriptors(google::protobuf::FileDescriptor
 const*) (descriptor.pb.cc:91)
==7096==    by 0x4E8CC11: 
google::protobuf::DescriptorBuilder::BuildFile(google::protobuf::FileDescriptorProto
 const&, void (*)(google::protobuf::FileDescriptor const*)) (descriptor.cc:2374)
==7096==    by 0x4E8DB57: 
google::protobuf::DescriptorPool::InternalBuildGeneratedFile(void const*, int, 
void (*)(google::protobuf::FileDescriptor const*)) (descriptor.cc:1950)
==7096==    by 0x4E97C1E: __static_initialization_and_destruction_0(int, int) 
(descriptor.pb.cc:437)
==7096==    by 0x4EEB6F1: (within 
/home/knielsen.bak/devel/drizzle/install/lib/libprotobuf.so.2.0.0)
==7096==    by 0x4E68932: (within 
/home/knielsen.bak/devel/drizzle/install/lib/libprotobuf.so.2.0.0)
==7096==    by 0x7FEFFF79F: ???
==7096==    by 0x400E165: call_init (in /lib/ld-2.7.so)
==7096==    by 0x400E28D: _dl_init (in /lib/ld-2.7.so)

So I don't know much about the google protobuf stuff, but I can provide a bit
of background information on the "still reachable" Valgrind warnings:

By default, Valgrind does not report memory leaks for memory that is "still
reachable". This is actually a very good feature of Valgrind that most other
leak detectors cannot do. Many programs and libraries do one-time allocations
at startup and rely on the OS to free them at exit, and this generally works
quite well. Reporting still reachable memory as leaks would cause tons of
not-that-interesting warnings from these programs.

But in MySQL, we pass --show-reachable=yes to Valgrind in
mysql-test-run.pl. The reason for this is mostly that the SAFE_MALLOC stuff
also detects leaks using its own method that does not have any kind of "still
reachable" exceptions. So the code has to explicitly free everything at exit
anyway to avoid SAFE_MALLOC warnings, and therefore we might as well track it
with Valgrind as well.

Still, there is quite a bit of work done on the MySQL code to fix all of
these "still reachable" warnings.

I assume that Drizzle does not use SAFE_MALLOC anymore. So Drizzle might want
to consider whether it is worthwhile to keep tracking and fixing these kinds
of warnings. It might be that the effort could be spent better elsewhere. The
"still reachable" warnings have a very large percent that are really not
serious problems.

I'm not saying I think Drizzle should do this one way or the other, there are
also good points to tracking and fixing "still reachable" warnings, just
giving the background as something to consider.

-----------------------------------------------------------------------
==2500== Conditional jump or move depends on uninitialised value(s)
==2500==    at 0x596734D: deflate (deflate.c:818)
==2500==    by 0x5BE20C: do_flush(azio_stream*, int) (azio.cc:682)
==2500==    by 0x5BE6BB: azflush (azio.cc:745)
==2500==    by 0x5BB0B0: ha_archive::info(unsigned int) (ha_archive.cc:1314)
==2500==    by 0x4B6A10: JOIN::optimize() (join.cc:5621)
==2500==    by 0x53061E: mysql_select(Session*, Item***, TableList*, unsigned 
int, List<Item>&, Item*, unsigned int, order_st*, order_st*, Item*, unsigned 
long, select_result*, Select_Lex_Unit*, Select_Lex*) (sql_select.cc:404)
==2500==    by 0x531111: handle_select(Session*, LEX*, select_result*, unsigned 
long) (sql_select.cc:138)
==2500==    by 0x522A05: execute_sqlcom_select(Session*, TableList*) 
(sql_parse.cc:1338)
==2500==    by 0x40BBD3: drizzled::command::Select::execute() (select.cc:29)
==2500==    by 0x51FE59: mysql_execute_command(Session*) (sql_parse.cc:1261)
==2500==    by 0x521DEC: mysql_parse(Session*, char const*, unsigned int, char 
const**) (sql_parse.cc:1559)
==2500==    by 0x52219E: dispatch_command(enum_server_command, Session*, char*, 
unsigned int) (sql_parse.cc:215)
==2500==    by 0x4E4C16: Session::executeStatement() (session.cc:686)
==2500==    by 0x50AF21: handle_one_connection (sql_connect.cc:84)
==2500==    by 0x6FA03F6: start_thread (pthread_create.c:297)
==2500==    by 0x728DB3C: clone (in /usr/lib/debug/libc-2.7.so)

This is actually one of the rare cases of a false alarm in Valgrind. The zlib
code uses an unrolled loop for speed, and thus touches a few bytes after the
data to be (de-)compressed. But it is done in a safe way, and the undefined
results are discarded after the loop.

Traditionally at MySQL we handled this by adding suppressions in
mysql-test/valgrind.supp. The problem with this is that there is then a
constant need to add new suppressions as new versions of libz appear, or libz
compiled with different compilers etc., since this subtly changes the stack
traces.

In MariaDB I implemented what I think is a better fix:

    http://bazaar.launchpad.net/~maria-captains/maria/5.1/revision/2704

It passes a custom memory allocator to libz that clears any memory allocated
by libz. This makes sense since we are not really trying to find bugs in
libz. It also reduces the risk of the suppressions masking a real bug in the
use of libz, such as trying to compress or decompress undefined data in a
buffer passed into libz by the server code.

(It could be usefully extended by using VALGRIND_CHECK_MEM_IS_DEFINED() on the
passed-in buffers to libz).

-----------------------------------------------------------------------
==5461== Conditional jump or move depends on uninitialised value(s)
==5461==    at 0x496E94: Item_in_subselect::init_left_expr_cache() 
(subselect.cc:1834)
==5461==    by 0x497050: Item_in_subselect::exec() (subselect.cc:311)
==5461==    by 0x4933F6: Item_in_subselect::val_bool() (subselect.cc:880)
==5461==    by 0x40E2AC: Item::val_bool_result() (item.h:456)
==5461==    by 0x46CA27: Item_in_optimizer::val_int() (cmpfunc.cc:1580)
==5461==    by 0x4C24C7: SQL_SELECT::skip_record() (opt_range.cc:1099)
==5461==    by 0x42BC3A: filesort(Session*, Table*, st_sort_field*, unsigned 
int, SQL_SELECT*, unsigned long, bool, unsigned long*) (filesort.cc:551)
==5461==    by 0x533271: create_sort_index(Session*, JOIN*, order_st*, unsigned 
long, unsigned long, bool) (sql_select.cc:5934)
==5461==    by 0x4B3A1A: JOIN::exec() (join.cc:1613)
==5461==    by 0x5306DA: mysql_select(Session*, Item***, TableList*, unsigned 
int, List<Item>&, Item*, unsigned int, order_st*, order_st*, Item*, unsigned 
long, select_result*, Select_Lex_Unit*, Select_Lex*) (sql_select.cc:418)
==5461==    by 0x531111: handle_select(Session*, LEX*, select_result*, unsigned 
long) (sql_select.cc:138)
==5461==    by 0x522A05: execute_sqlcom_select(Session*, TableList*) 
(sql_parse.cc:1338)
==5461==    by 0x40BBD3: drizzled::command::Select::execute() (select.cc:29)
==5461==    by 0x51FE59: mysql_execute_command(Session*) (sql_parse.cc:1261)
==5461==    by 0x521DEC: mysql_parse(Session*, char const*, unsigned int, char 
const**) (sql_parse.cc:1559)
==5461==    by 0x52219E: dispatch_command(enum_server_command, Session*, char*, 
unsigned int) (sql_parse.cc:215)

I remember working on this last year back in November/December. Unfortunately,
I did not get very far with it, I remember it looked quite hard :-(. So I am
not much help here, sorry...

-----------------------------------------------------------------------

Hope this helps,

 - Kristian.

_______________________________________________
Mailing list: https://launchpad.net/~drizzle-discuss
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~drizzle-discuss
More help   : https://help.launchpad.net/ListHelp

[Drizzle-discuss] Hints for some Valgrind issues

Reply via email to