Re: [HACKERS] valgrind errors around dsa.c

2017-04-10 Thread Andres Freund
On 2017-04-08 14:46:04 +1200, Thomas Munro wrote:
> Fix attached.

Thanks. Pushed!

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] valgrind errors around dsa.c

2017-04-07 Thread Thomas Munro
On Sat, Apr 8, 2017 at 8:57 AM, Thomas Munro wrote:
> On Sat, Apr 8, 2017 at 4:49 AM, Andres Freund wrote:
>> Hi,
>>
>> newly added tests exercise parallel bitmap scans.  And they trigger
>> valgrind errors:
>> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2017-04-07%2007%3A10%3A01
>>
>> It could be that these are spurious due to shared memory - valgrind
>> doesn't track definedness across processes - but the fact that memory
>> allocated by palloc is the source of the undefined memory makes me doubt
>> that.
>
> Thanks.  Will post a fix for this later today.

Fix attached.

Explanation:  Whenever segments are destroyed because they no longer
contain any live blocks, the shared variable
control->freed_segment_counter advances.  Each attached backend has
its own local variable area->freed_segment_counter, and if it sees
that the shared counter differs from its local copy, it checks all
attached segments to see whether any need to be detached.  I failed to
initialise the backend-local version.  The consequence is that, if you
were very unlucky, your backend could fail to detach from a
no-longer-needed segment until another segment was eventually freed,
causing the shared counter to move again.  More likely, it would
notice that the two differ because one holds uninitialised junk,
perform a spurious scan for dead segments, and then bring them into
sync.
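
To make the counter comparison concrete, here is a minimal sketch in C.
It is a toy model, not the code in dsa.c and not the contents of the
attached patch: the toy_control and toy_area types, the scans field and
both functions are hypothetical names invented for illustration, and
starting the local counter from the shared value is an assumption about
one reasonable way to initialise it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the state shared by all backends. */
typedef struct toy_control
{
    size_t      freed_segment_counter;  /* advances when a segment is freed */
} toy_control;

/* Hypothetical stand-in for the backend-local area object. */
typedef struct toy_area
{
    toy_control *control;
    size_t      freed_segment_counter;  /* last shared value this backend saw */
    int         scans;                  /* scans for dead segments performed */
} toy_area;

/* Create the backend-local object, as a palloc'd struct would be. */
static toy_area *
toy_attach(toy_control *control)
{
    toy_area   *area;

    assert(control != NULL);
    area = malloc(sizeof(toy_area));
    if (area == NULL)
        return NULL;
    area->control = control;
    area->scans = 0;
    /* The step that was missing: give the local counter a defined value. */
    area->freed_segment_counter = control->freed_segment_counter;
    return area;
}

/* Compare local and shared counters; rescan only when they differ. */
static void
toy_check_for_freed_segments(toy_area *area)
{
    size_t      shared = area->control->freed_segment_counter;

    if (area->freed_segment_counter != shared)
    {
        /* Real code would walk the attached segments and detach dead ones. */
        area->scans++;
        area->freed_segment_counter = shared;
    }
}
```

With the assignment in toy_attach() in place, the first check does no
work, and a scan happens only after the shared counter has moved.
Without it, the comparison reads whatever malloc happened to return,
which is the kind of read valgrind reports.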

-- 
Thomas Munro
http://www.enterprisedb.com


initialise-freed-segment-counter.patch
Description: Binary data



Re: [HACKERS] valgrind errors around dsa.c

2017-04-07 Thread Thomas Munro
On Sat, Apr 8, 2017 at 4:49 AM, Andres Freund wrote:
> Hi,
>
> newly added tests exercise parallel bitmap scans.  And they trigger
> valgrind errors:
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2017-04-07%2007%3A10%3A01
>
> It could be that these are spurious due to shared memory - valgrind
> doesn't track definedness across processes - but the fact that memory
> allocated by palloc is the source of the undefined memory makes me doubt
> that.

Thanks.  Will post a fix for this later today.

-- 
Thomas Munro
http://www.enterprisedb.com




[HACKERS] valgrind errors around dsa.c

2017-04-07 Thread Andres Freund
Hi,

newly added tests exercise parallel bitmap scans.  And they trigger
valgrind errors:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2017-04-07%2007%3A10%3A01


==4567== VALGRINDERROR-BEGIN
==4567== Conditional jump or move depends on uninitialised value(s)
==4567==    at 0x5FD62A: check_for_freed_segments (dsa.c:2219)
==4567==    by 0x5FD97E: dsa_get_address (dsa.c:934)
==4567==    by 0x5FDA2A: init_span (dsa.c:1339)
==4567==    by 0x5FE6D1: ensure_active_superblock (dsa.c:1696)
==4567==    by 0x5FEBBD: alloc_object (dsa.c:1452)
==4567==    by 0x5FEBBD: dsa_allocate_extended (dsa.c:693)
==4567==    by 0x3C7A83: pagetable_allocate (tidbitmap.c:1536)
==4567==    by 0x3C7A83: pagetable_create (simplehash.h:342)
==4567==    by 0x3C7A83: tbm_create_pagetable (tidbitmap.c:323)
==4567==    by 0x3C8DAD: tbm_get_pageentry (tidbitmap.c:1246)
==4567==    by 0x3C98A1: tbm_add_tuples (tidbitmap.c:432)
==4567==    by 0x22510C: btgetbitmap (nbtree.c:460)
==4567==    by 0x21A8D1: index_getbitmap (indexam.c:726)
==4567==    by 0x38AD48: MultiExecBitmapIndexScan (nodeBitmapIndexscan.c:91)
==4567==    by 0x37D353: MultiExecProcNode (execProcnode.c:621)
==4567==  Uninitialised value was created by a heap allocation
==4567==    at 0x602FD5: palloc (mcxt.c:872)
==4567==    by 0x5FF73B: create_internal (dsa.c:1242)
==4567==    by 0x5FF8F5: dsa_create_in_place (dsa.c:473)
==4567==    by 0x37CA32: ExecInitParallelPlan (execParallel.c:532)
==4567==    by 0x38C324: ExecGather (nodeGather.c:152)
==4567==    by 0x37D247: ExecProcNode (execProcnode.c:551)
==4567==    by 0x39870F: ExecNestLoop (nodeNestloop.c:156)
==4567==    by 0x37D1B7: ExecProcNode (execProcnode.c:512)
==4567==    by 0x3849D4: fetch_input_tuple (nodeAgg.c:686)
==4567==    by 0x387764: agg_retrieve_direct (nodeAgg.c:2306)
==4567==    by 0x387A11: ExecAgg (nodeAgg.c:2117)
==4567==    by 0x37D217: ExecProcNode (execProcnode.c:539)
==4567==

It could be that these are spurious due to shared memory - valgrind
doesn't track definedness across processes - but the fact that memory
allocated by palloc is the source of the undefined memory makes me doubt
that.

Greetings,

Andres Freund

