luben karavelov wrote:
> [email protected] wrote:
>>
>> Thanks for collecting this info.  The valgrind output could be of some
>> use, but unfortunately I don't have time right now to set up a working
>> RDBMS and extensively debug things.  I'll keep this on my todo list.
>>
>> You should please re-run valgrind with --num-callers=30 or more, because
>> in some cases errors are in too nested functions to get a clear idea of
>> whether the issue is caused by garbage fed by slapd/back-sql or by errors
>> inside the RDBMS/ODBC layers.  The fact that valgrind systematically
>> complains about internals of the RDBMS/ODBC reading past the end of
>> memory chunks malloc'ed by slapd could be related to passing some non-nul
>> terminated bervals that are dealt with as strings.  Having a longer call
>> stack could help tracking those occurrences.  However, those issues
>> should not be critical, since there's no invalid writes.
>>
>> Also, you should walk through the list of attributes being returned, to
>> provide a hint about whether back-sql is computing a screwed attrlist or
>> so.  Along the lines of your current gdb session, you should get to frame
>> #5, refresh_merge() in pcache.c, and print *e->e_attrs,
>> *e->e_attrs->a_desc, *e->e_attrs->a_vals[0]; then move to
>> e->e_attrs->a_next and repeat the prints to the end of the list.  The
>> fact you get a value of "a" equal to 0x500000000 looks definitely odd to
>> me, as that attr list should result from be_entry_get_rw(), which in turn
>> should collect it from the local database.  Unless valgrind reveals some
>> oddity in back-sql, the behavior you notice should not depend on the
>> specific remote database you're using, but rather from the local one.
>>
>> p.
>
> Hello,
> Tomorrow I will make a setup with pure sql process and a pure pcache
> daemon that reads from the first over unix domain socket. In this manner
> it will be clear if the crashing part is related to back-sql and the
> database drivers/ODBC manager or not.
>
> Meanwhile, you could find the requested debugging session here:
> http://purgatory.spnet.net/~karavelov/attr_list/gdb-1
>
> It seems that the "e" pointer is corrupted.

Good catch.

> Tomorrow I will start it
> through valgrind with more back-frames as requested

Another quick check you could probably do relatively quickly is zero out
that "e" pointer before calling be_entry_get_rw() within refresh_merge().

Thanks, p.
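
To make the zero-out check concrete, here is a rough, self-contained sketch of the idea. It is not the pcache.c code: the attr and entry types, fetch_entry() and lookup_and_count() are hypothetical stand-ins, and fetch_entry() only assumes that a failed lookup may leave the output pointer untouched, which is the situation the zeroing is meant to expose. The real be_entry_get_rw() signature and the slapd Entry/Attribute structures differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins; NOT the real slapd Attribute/Entry types. */
typedef struct attr {
    const char  *name;
    const char  *val;
    struct attr *next;
} attr;

typedef struct entry {
    const char *dn;
    attr       *attrs;
} entry;

/* Hypothetical lookup standing in for be_entry_get_rw().  Assumption:
 * on failure it returns non-zero and leaves *out untouched, so a caller
 * that did not initialize its pointer would see stack garbage. */
static int fetch_entry(int found, entry *stored, entry **out)
{
    assert(out != NULL);
    if (!found)
        return 1;
    *out = stored;
    return 0;
}

/* Walk the attribute list to its end, printing each attribute;
 * returns the number of attributes seen. */
static int walk_attrs(const entry *e)
{
    int n = 0;
    const attr *a;

    for (a = e->attrs; a != NULL; a = a->next) {
        printf("%s: %s\n", a->name, a->val);
        n++;
    }
    return n;
}

/* Returns the attribute count, or -1 if no entry was obtained. */
static int lookup_and_count(int found, entry *stored)
{
    entry *e = NULL;    /* zero out the pointer before the call */
    int rc = fetch_entry(found, stored, &e);

    /* With e zeroed, a failed or partial lookup shows up as NULL here
     * instead of a stale value being dereferenced later. */
    if (rc != 0 || e == NULL)
        return -1;
    return walk_attrs(e);
}
```

With the pointer zeroed, any later non-NULL garbage in "e" cannot be left-over stack contents; it has to have been written by the lookup itself or by something that clobbered the pointer afterwards.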
