On Mon, Nov 17, 2014 at 8:55 AM, Dilip kumar <dilip.ku...@huawei.com> wrote:
>
> On 13 November 2014 15:35 Amit Kapila Wrote,
> > As mentioned by you offlist, you are not able to reproduce this
> > issue. I have tried again, and what I observe is that I am able to
> > reproduce it only on *release* builds, and some cases work without
> > this issue as well, for example:
> >
> > ./vacuumdb --analyze-in-stages -t t1 -t t2 -t t3 -t t4 -t t5 -t t6 -t t7 -t t8 -j 8 -d postgres
> > Generating minimal optimizer statistics (1 target)
> > Generating medium optimizer statistics (10 targets)
> > Generating default (full) optimizer statistics
> >
> > So to me, it looks like this is a timing issue. Please notice
> > how the statement in the error looks like
> > "ANALYZE ng minimal optimizer statistics (1 target)"; this is
> > not a valid statement.
> > Let me know if you still cannot reproduce it.
>
> I will review the code and try to find the same; meanwhile, if you can
> find some time to debug this, it will be really helpful.

I think I have found the problem and a fix for the same. The stacktrace
of the crash is as below:

#0  0x00000080108cf3a4 in .strlen () from /lib64/libc.so.6
#1  0x00000080108925bc in ._IO_vfprintf () from /lib64/libc.so.6
#2  0x00000080108bc1e0 in .__GL__IO_vsnprintf_vsnprintf () from /lib64/libc.so.6
#3  0x00000fff7e581590 in .appendPQExpBufferVA () from /data/akapila/workspace/master/installation/lib/libpq.so.5
#4  0x00000fff7e581774 in .appendPQExpBuffer () from /data/akapila/workspace/master/installation/lib/libpq.so.5
#5  0x0000000010003748 in .run_parallel_vacuum ()
#6  0x0000000010003f60 in .vacuum_parallel ()
#7  0x0000000010002ae4 in .main ()
(gdb) f 5
#5  0x0000000010003748 in .run_parallel_vacuum ()

So the real reason here is that the list of tables passed to the
function is corrupted. The code below seems to be the real culprit:

vacuum_parallel()
{
..
    if (!tables || !tables->head)
    {
        SimpleStringList dbtables = {NULL, NULL};
        ...
..
        tables = &dbtables;
    }
..
}

In the above code, dbtables is local to the if block, and the code
assigns its address to tables, which is then used outside the if
block's scope; dbtables no longer exists at that point, so tables is a
dangling pointer. Moving the declaration to the outer scope fixes the
problem in my environment. Find the updated patch that fixes this
problem attached with this mail. Let me know your opinion about the
same.

While looking at this problem, I have noticed a couple of other
possible improvements:

a. In the prepare_command() function, the patch initializes the sql
buffer (initPQExpBuffer(sql);), which I think is not required, as both
callers of this function already do it. As it stands, I think this
will lead to a memory leak.

b. In the prepare_command() function, you can use
appendPQExpBufferStr() for fixed strings, which is what the original
code uses as well.

c.
run_parallel_vacuum()
{
..
    prepare_command(connSlot[free_slot].connection, full, verbose,
                    and_analyze, analyze_only, freeze, &sql);

    appendPQExpBuffer(&sql, " %s", cell->val);
..
}

I think it is better to end the command with ';' by using
appendPQExpBufferStr(&sql, ";"); in the above code.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
vacuumdb_parallel_v17.patch
Description: Binary data
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers