Hello, Hackers
I recently discovered that some my tables with the functional indices
have the wrong pg_stats records.
wsdb=# \d q3c
Table "public.q3c"
Column | Type | Modifiers
--------+------------------+-----------
ipix | bigint |
ra | double precision |
dec | double precision |
Indexes:
"ipix_idx" btree (ipix)
"q3c_func_idx" btree (q3c_ang2ipix(ra, "dec")) CLUSTER
wsdb=# select * from pg_stats where tablename ~'q3c' and attname ~'pg';
schemaname | tablename | attname | null_frac | avg_width | n_distinct | most_common_vals | most_common_freqs | histogram_bounds | correlation
------------+--------------------+-----------------+-----------+-----------+------------+-----------------------+-------------------+------------------+-------------
public | q3c_func_idx | pg_expression_1 | 0 | 8 |
1 | {6000} | {1} | |
1
You see that the most_common_freqs=1 which is certainly not true.
Starting to dig into the problem and checking my C function I saw that when
my C function (returning bigint) use the return PointerGetDatum((&ipix_buf));
instead of PG_RETURN_INT64 ,the ANALYZE command produces the wrong statistics.
But I think that's wrong, isn't it ?
So the short toy example:
CREATE OR REPLACE FUNCTION q3c_ang2ipix(double precision, double precision)
RETURNS bigint
AS '$libdir/q3c', 'pgq3c_ang2ipix'
LANGUAGE C IMMUTABLE STRICT;
C function definition:
PG_FUNCTION_INFO_V1(pgq3c_ang2ipix);
Datum pgq3c_ang2ipix(PG_FUNCTION_ARGS)
{
static q3c_ipix_t ipix_buf;
ipix_buf ++;
elog(WARNING,"%lld",ipix_buf);
return PointerGetDatum((&ipix_buf));
}
And I have the table q3c
wsdb=# \d q3c
Table "public.q3c"
Column | Type | Modifiers
--------+------------------+-----------
ipix | bigint |
ra | double precision |
dec | double precision |
Indexes:
"ipix_idx" btree (ipix)
"q3c_func_idx" btree (q3c_ang2ipix(ra, "dec")) CLUSTER
Now I do
wsdb=# ANALYZE q3c;
WARNING: 1
WARNING: 2
WARNING: 3
WARNING: 4
....
WARNING: 2998
WARNING: 2999
WARNING: 3000
ANALYZE
wsdb=# select * from pg_stats where tablename ~'q3c_f' and attname ~'pg';
schemaname | tablename | attname | null_frac | avg_width | n_distinct | most_common_vals | most_common_freqs | histogram_bounds | correlation
------------+--------------+-----------------+-----------+-----------+------------+------------------+-------------------+------------------+-------------
public | q3c_func_idx | pg_expression_1 | 0 | 8 |
1 | {3000} | {1} | | 1
(1 row)
So the values of most_common_vals, most_common_freqs are wrong
But if I replace the return PointerGetDatum((&ipix_buf));
by the PG_RETURN_INT64(ipix_buf)
the analyze works fine
wsdb=# select * from pg_stats where tablename ~'q3c_f' and attname ~'pg';
schemaname | tablename | attname | null_frac | avg_width | n_distinct | most_common_vals | most_common_freqs | histogram_bounds | correlation
------------+--------------+-----------------+-----------+-----------+------------+------------------+-------------------+----------------------------------------------------+-------------
public | q3c_func_idx | pg_expression_1 | 0 | 8 |
-1 | | |
{1,300,600,900,1200,1500,1800,2100,2400,2700,3000} | 1
(1 row)
Is it incorrect to use PointerGetDatum in those circumstances ?
Because I always used it in that function to return bigints (to not have
the additional overhead of PG_RETURN_INT64), and that's the first time a
see the bug due to that.
Thank you in advance,
Regards,
Sergey
*******************************************************************
Sergey E. Koposov
Max Planck Institute for Astronomy/Sternberg Astronomical Institute
Tel: +49-6221-528-349
Web: http://lnfm1.sai.msu.ru/~math
E-mail: [EMAIL PROTECTED]
---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match