Thanks for your clues.
The system I have used to debug the code is x86 64bit based, Ubuntu 1404
and postgres 9.3.13, I have revised the code and it looks like as following:
Datum vquery(PG_FUNCTION_ARGS) {
int array_len = PG_GETARG_INT32(0);
int64 * node_ids;
ArrayType * retarr;
Datum * vals ;
SPI_connect();
//some code to retrieve data from various tables
// node_ids are allocated and filled up
vals = SPI_palloc(array_len * sizeof(Datum));
memset (vals, 0, array_len * sizeof(Datum));
// fill the vals up
for (i = 0 ; i < array_len ; i++)
vals[i] = Int64GetDatum((node_ids[i]));
retarr = construct_array(vals, retcnt, INT8OID, sizeof(int64), true, 'd');
SPI_finish();
PG_RETURN_ARRAYTYPE_P(retarr);
}
It seems to solve the problem, I have tested the code for a while and no
more segmentation faults are reported.
I have built Postgresql with --enable-debug and --enable-cassert, but use
the binary with gdb and get no symbol file loaded. I will take further
researches and use it to facilitate debug. Thanks.
On Tue, Feb 28, 2017 at 12:54 PM, Tom Lane <[email protected]> wrote:
> =?UTF-8?B?6ZKx5paw5p6X?= <[email protected]> writes:
> > I have written an extension to manage openstreetmap data. There is a C
> > function to perform spatial top k query on several tables and return an
> > array of int8 type as result. The code skeleton of this function is as
> > follows:
>
> There are a remarkable lot of bugs in this code fragment. Many of them
> would not bite you as long as you are running on 64-bit Intel hardware,
> but that doesn't make them not bugs.
>
> > Datum vquery(PG_FUNCTION_ARGS) {
>
> > int array_len = PG_GETARG_INT32(0);
> > long * node_ids;
>
> > SPI_connect();
>
> > //some code to retrieve data from various tables
> > // node_ids are allocated and filled up
>
> > ArrayType * retarr;
> > Datum * vals ;
>
> > vals = palloc0(array_len * sizeof(long));
>
> Datum is not necessarily the same as "long".
>
> > // fill the vals up
> > for (i = 0 ; i < array_len ; i++)
> > vals[i] = Int64GetDatum((node_ids[i]));
>
> int64 is not necessarily the same as "long", either.
>
> > retarr = construct_array(vals, retcnt, INT8OID, sizeof(long), true, 'i');
>
> Again, INT8 is not the same size as "long", and it's not necessarily
> pass-by-val, and it's *certainly* not integer alignment.
>
> > SPI_finish();
>
> > PG_RETURN_ARRAYTYPE_P(retarr);
>
> But I think what's really biting you, probably, is that construct_array()
> made the array in CurrentMemoryContext which at that point was the SPI
> execution context; which would be deleted by SPI_finish. So you're
> returning a dangling pointer. You need to do something to either copy
> the array value out to the caller's context, or build it there in the
> first place.
>
> BTW, this failure would be a lot less intermittent if you were testing
> in a CLOBBER_FREED_MEMORY build. I would go so far as to say you should
> *never* develop or test C code for the Postgres backend without using
> the --enable-cassert configure option for your build. You're simply
> tossing away a whole lot of debug support if you don't.
>
> regards, tom lane
>