Hi,

On 2022-04-02 16:06:40 +0700, John Naylor wrote:
> So far, kestrel and tamandua don't like this, and they both use
> "-fsanitize=undefined,alignment". I'll try to reproduce locally in a
> bit.

Just hit this in my development environment.

FWIW, I found it hard to work with ubsan without applying 0002 from
https://postgr.es/m/20220323173537.ll7klrglnp4gn2um%40alap3.anarazel.de
I plan to polish that to be able to commit it, but won't have cycles before
the CF ends.

With that applied,
CFLAGS="-fsanitize=alignment,undefined,address -fno-sanitize-recover=all"
and
export ASAN_OPTIONS="detect_leaks=0:disable_coredump=0:print_stacktrace=1:abort_on_error=1" \
       UBSAN_OPTIONS="disable_coredump=0:print_stacktrace=1:abort_on_error=1"

I get a backtrace to investigate. Unfortunately abort_on_error=1 also suppresses
the nicer error message :(, so I ran it both with and without.

It triggers reliably on
CLUSTER clstr_expression USING clstr_expression_minus_a;
for me.

/home/andres/src/postgresql/src/backend/utils/sort/tuplesort.c:723:23: runtime error: load of value 150, which is not a valid value for type '_Bool'

#0  __ubsan::__ubsan_handle_load_invalid_value_abort (Data=0x7ff65e8fada0, 
Val=70) at ../../../../src/libsanitizer/ubsan/ubsan_handlers.cpp:548
#1  0x00007ff65a649093 in qsort_tuple_int32_compare (a=0x62b000fc0240, 
b=0x62b000fc0258, state=0x625000066a20)
    at /home/andres/src/postgresql/src/backend/utils/sort/tuplesort.c:723
#2  0x00007ff65a64bc7b in qsort_tuple_int32 (data=0x62b000fc0240, n=133, 
arg=0x625000066a20)
    at /home/andres/src/postgresql/src/include/lib/sort_template.h:313
#3  0x00007ff65a672d59 in tuplesort_sort_memtuples (state=0x625000066a20) at 
/home/andres/src/postgresql/src/backend/utils/sort/tuplesort.c:3613
#4  0x00007ff65a6617ec in tuplesort_performsort (state=0x625000066a20) at 
/home/andres/src/postgresql/src/backend/utils/sort/tuplesort.c:2154
#5  0x00007ff6586965dd in heapam_relation_copy_for_cluster 
(OldHeap=0x7ff64d893308, NewHeap=0x7ff64d896fc0, OldIndex=0x7ff64d89c860, 
use_sort=true,
    OldestXmin=23706, xid_cutoff=0x7ffd4650a480, multi_cutoff=0x7ffd4650a490, 
num_tuples=0x7ffd4650a4a0, tups_vacuumed=0x7ffd4650a4c0,
    tups_recently_dead=0x7ffd4650a4e0) at 
/home/andres/src/postgresql/src/backend/access/heap/heapam_handler.c:955
#6  0x00007ff658d6c4b7 in table_relation_copy_for_cluster 
(OldTable=0x7ff64d893308, NewTable=0x7ff64d896fc0, OldIndex=0x7ff64d89c860, 
use_sort=true,
    OldestXmin=23706, xid_cutoff=0x7ffd4650a480, multi_cutoff=0x7ffd4650a490, 
num_tuples=0x7ffd4650a4a0, tups_vacuumed=0x7ffd4650a4c0,
    tups_recently_dead=0x7ffd4650a4e0) at 
/home/andres/src/postgresql/src/include/access/tableam.h:1658
#7  0x00007ff658d728aa in copy_table_data (OIDNewHeap=73728, OIDOldHeap=47052, 
OIDOldIndex=47078, verbose=false, pSwapToastByContent=0x7ffd4650a670,
    pFreezeXid=0x7ffd4650a680, pCutoffMulti=0x7ffd4650a690) at 
/home/andres/src/postgresql/src/backend/commands/cluster.c:913
#8  0x00007ff658d6ff9e in rebuild_relation (OldHeap=0x7ff64d893308, 
indexOid=47078, verbose=false)
    at /home/andres/src/postgresql/src/backend/commands/cluster.c:606
#9  0x00007ff658d6e622 in cluster_rel (tableOid=47052, indexOid=47078, 
params=0x7ffd4650a7b0)
    at /home/andres/src/postgresql/src/backend/commands/cluster.c:427
#10 0x00007ff658d6d774 in cluster (pstate=0x619000001aa0, stmt=0x625000005d00, 
isTopLevel=true)
    at /home/andres/src/postgresql/src/backend/commands/cluster.c:195
#11 0x00007ff659d425b1 in standard_ProcessUtility (pstmt=0x625000006010,
    queryString=0x625000005220 "CLUSTER clstr_expression USING 
clstr_expression_minus_a;", readOnlyTree=false, 
context=PROCESS_UTILITY_TOPLEVEL, params=0x0,
    queryEnv=0x0, dest=0x6250000060e0, qc=0x7ffd4650ae40) at 
/home/andres/src/postgresql/src/backend/tcop/utility.c:862
#12 0x00007ff659d40901 in ProcessUtility (pstmt=0x625000006010, 
queryString=0x625000005220 "CLUSTER clstr_expression USING 
clstr_expression_minus_a;",
    readOnlyTree=false, context=PROCESS_UTILITY_TOPLEVEL, params=0x0, 
queryEnv=0x0, dest=0x6250000060e0, qc=0x7ffd4650ae40)

(rr) p/x *b
$4 = {tuple = 0x62500006ba78, datum1 = 0x7ff659cc7e9e, isnull1 = 0x46, srctape = 0x7ff6}

There's definitely something borked - looks like this is ending up with bogus
pointers? Using rr to set a watchpoint on isnull1, and continuing backward I
see the memory written to with the following stack:

(rr) watch -l b->isnull1
Hardware watchpoint 3: -location b->isnull1

(rr) reverse-continue
Continuing.

Hardware watchpoint 3: -location b->isnull1

Old value = 70
New value = 190
0x00007ff65a65fbe4 in puttuple_common (state=0x625000066a20, 
tuple=0x7ffd4650a150) at 
/home/andres/src/postgresql/src/backend/utils/sort/tuplesort.c:2000
2000                            state->memtuples[state->memtupcount++] = *tuple;
(rr) bt
#0  0x00007ff65a65fbe4 in puttuple_common (state=0x625000066a20, 
tuple=0x7ffd4650a150) at 
/home/andres/src/postgresql/src/backend/utils/sort/tuplesort.c:2000
#1  0x00007ff65a65a478 in tuplesort_putheaptuple (state=0x625000066a20, 
tup=0x625000071d28)
    at /home/andres/src/postgresql/src/backend/utils/sort/tuplesort.c:1810
#2  0x00007ff658696300 in heapam_relation_copy_for_cluster 
(OldHeap=0x7ff64d893308, NewHeap=0x7ff64d896fc0, OldIndex=0x7ff64d89c860, 
use_sort=true,
    OldestXmin=23706, xid_cutoff=0x7ffd4650a480, multi_cutoff=0x7ffd4650a490, 
num_tuples=0x7ffd4650a4a0, tups_vacuumed=0x7ffd4650a4c0,

(note new/old are kind of switched around due to continuing in reverse)

and then

Hardware watchpoint 3: -location b->isnull1

Old value = 190
New value = false
__memset_evex_erms () at 
../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S:143
143     ../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S: No such file or directory.
(rr) bt
#0  __memset_evex_erms () at 
../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S:143
#1  0x00007ff654d2c3c4 in __asan::Allocator::Allocate 
(this=this@entry=0x7ff654e3ce00 <__asan::instance>, size=<optimized out>, 
size@entry=24640,
    alignment=<optimized out>, alignment@entry=8, 
stack=stack@entry=0x7ffd465092c0, 
alloc_type=alloc_type@entry=__asan::FROM_MALLOC,
    can_fill=can_fill@entry=true) at 
../../../../src/libsanitizer/asan/asan_allocator.cpp:598
#2  0x00007ff654d28a9b in __asan::asan_malloc (size=size@entry=24640, 
stack=stack@entry=0x7ffd465092c0)
    at ../../../../src/libsanitizer/asan/asan_allocator.cpp:964
#3  0x00007ff654db998f in __interceptor_malloc (size=24640) at 
../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:70
#4  0x00007ff65a5c4f37 in AllocSetAlloc (context=0x625000066900, size=24576) at 
/home/andres/src/postgresql/src/backend/utils/mmgr/aset.c:740
#5  0x00007ff65a60e385 in palloc (size=24576) at 
/home/andres/src/postgresql/src/backend/utils/mmgr/mcxt.c:1082
#6  0x00007ff65a6513dc in tuplesort_begin_batch (state=0x625000066a20) at 
/home/andres/src/postgresql/src/backend/utils/sort/tuplesort.c:963
#7  0x00007ff65a6501d9 in tuplesort_begin_common (workMem=65536, 
coordinate=0x0, randomAccess=false)
    at /home/andres/src/postgresql/src/backend/utils/sort/tuplesort.c:878
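
To make the failure mode concrete, here is a minimal standalone sketch (not
PostgreSQL code) of what the watchpoint output seems to suggest: a tuple struct
whose bool field is never set by the caller gets copied into the array as-is,
and the leftover byte only becomes a problem when it is later loaded as a bool.
All names (DemoSortTuple, fill_partially, fill_fully, raw_isnull1,
is_valid_bool_byte, run_demo) are hypothetical; the field layout just mirrors
the gdb print above, and 0x46 is the isnull1 value shown there. That a caller
leaving isnull1 unset is the actual cause is an assumption, not something the
traces prove. The sketch also assumes sizeof(bool) == 1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in; fields mirror the gdb print of *b above. */
typedef struct DemoSortTuple
{
    void       *tuple;
    uintptr_t   datum1;
    bool        isnull1;
    int         srctape;
} DemoSortTuple;

/*
 * Read the first byte of isnull1's storage as an unsigned char. Byte access
 * is always defined, unlike loading a non-0/1 value through a bool lvalue.
 */
static unsigned char
raw_isnull1(const DemoSortTuple *t)
{
    unsigned char byte;

    memcpy(&byte,
           (const unsigned char *) t + offsetof(DemoSortTuple, isnull1), 1);
    return byte;
}

/* A bool object should only ever hold the byte values 0 or 1. */
static int
is_valid_bool_byte(unsigned char byte)
{
    return byte == 0 || byte == 1;
}

/* Simulate a caller that sets only ->tuple; 'garbage' stands in for
 * whatever happened to be in that memory before. */
static void
fill_partially(DemoSortTuple *t, void *payload, unsigned char garbage)
{
    memset(t, garbage, sizeof(*t));
    t->tuple = payload;
}

/* The fix-shaped variant: every field gets a defined value. */
static void
fill_fully(DemoSortTuple *t, void *payload)
{
    memset(t, 0, sizeof(*t));
    t->tuple = payload;
    t->datum1 = 0;
    t->isnull1 = true;
}

/* Copy both variants into an array and inspect the stored bytes.
 * Returns the number of tuples stored. */
int
run_demo(void)
{
    DemoSortTuple memtuples[2];
    DemoSortTuple stup;
    int         payload = 0;
    int         n = 0;

    fill_partially(&stup, &payload, 0x46);
    memtuples[n++] = stup;      /* same shape as the store at line 2000 */
    assert(!is_valid_bool_byte(raw_isnull1(&memtuples[0])));

    fill_fully(&stup, &payload);
    memtuples[n++] = stup;
    assert(is_valid_bool_byte(raw_isnull1(&memtuples[1])));

    return n;
}
```

The sketch deliberately never reads the bad byte through a bool lvalue, so it
stays within defined behavior; doing `if (memtuples[0].isnull1)` instead is the
kind of load the ubsan report at tuplesort.c:723 complains about.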


Greetings,

Andres Freund

