Re: [HACKERS] ANALYZE versus expression indexes with nondefault opckeytype

2010-08-01 Thread Robert Haas
On Sat, Jul 31, 2010 at 11:15 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Now, of the above the only cases where we'd be likely to be able to do
 anything very useful with stats on the expression value are the name
 case, which isn't that exciting in practice, and the tsvector cases.
 For tsvector it was only with 8.4 that we had non-toy stats code, so
 while the limitation is ancient it's only recently that it started to be
 meaningful.

 I don't think this can be claimed to be a corner case.  If you set up
 an FTS index according to the first alternative offered in

 http://developer.postgresql.org/pgdocs/postgres/textsearch-tables.html#TEXTSEARCH-TABLES-INDEX

 you will find that the system fails to collect stats for it and so you
 get stupid default estimates for your FTS queries.  If this were a
 documented limitation I'd expect to see a big red warning there to
 *not* do it that way.  The only way that you actually get usable
 tsvector stats at the moment is to explicitly store the tsvector as an
 ordinary column, as in the second approach offered in the above
 documentation section.

Yeah, maybe you're right.  But I'd still prefer to see us break the
ABI and do this just in 9.0 rather than changing 8.4.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ANALYZE versus expression indexes with nondefault opckeytype

2010-08-01 Thread Kevin Grittner
Tom Lane  wrote:
Robert Haas  writes:
 
 I guess I'd appreciate it if someone could explain in more detail
 in what cases we fail to collect stats.
 
 [detailed description]
 
 I don't think this can be claimed to be a corner case. If you set
 up an FTS index according to the first alternative offered in
 

http://developer.postgresql.org/pgdocs/postgres/textsearch-tables.html#TEXTSEARCH-TABLES-INDEX
 
 you will find that the system fails to collect stats for it and so
 you get stupid default estimates for your FTS queries.
 
Objection to a fix in 9.0 withdrawn.  No opinion on backpatching.
 
-Kevin



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ANALYZE versus expression indexes with nondefault opckeytype

2010-08-01 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Sat, Jul 31, 2010 at 11:15 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I don't think this can be claimed to be a corner case.  If you set up
 an FTS index according to the first alternative offered in
 
 http://developer.postgresql.org/pgdocs/postgres/textsearch-tables.html#TEXTSEARCH-TABLES-INDEX
 
 you will find that the system fails to collect stats for it and so you
 get stupid default estimates for your FTS queries.

 Yeah, maybe you're right.  But I'd still prefer to see us break the
 ABI and do this just in 9.0 rather than changing 8.4.

OK, I can live with that.  I'll take a look at it shortly.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ANALYZE versus expression indexes with nondefault opckeytype

2010-08-01 Thread Tom Lane
I wrote:
 Robert Haas robertmh...@gmail.com writes:
 Yeah, maybe you're right.  But I'd still prefer to see us break the
 ABI and do this just in 9.0 rather than changing 8.4.

 OK, I can live with that.  I'll take a look at it shortly.

Proposed patch attached (compiles, untested as yet).

regards, tom lane

Index: src/backend/commands/analyze.c
===
RCS file: /cvsroot/pgsql/src/backend/commands/analyze.c,v
retrieving revision 1.152
diff -c -r1.152 analyze.c
*** src/backend/commands/analyze.c	26 Feb 2010 02:00:37 -	1.152
--- src/backend/commands/analyze.c	1 Aug 2010 19:56:12 -
***
*** 92,98 
  	AnlIndexData *indexdata, int nindexes,
  	HeapTuple *rows, int numrows,
  	MemoryContext col_context);
! static VacAttrStats *examine_attribute(Relation onerel, int attnum);
  static int acquire_sample_rows(Relation onerel, HeapTuple *rows,
  	int targrows, double *totalrows, double *totaldeadrows);
  static double random_fract(void);
--- 92,99 
  	AnlIndexData *indexdata, int nindexes,
  	HeapTuple *rows, int numrows,
  	MemoryContext col_context);
! static VacAttrStats *examine_attribute(Relation onerel, int attnum,
! 	   Node *index_expr);
  static int acquire_sample_rows(Relation onerel, HeapTuple *rows,
  	int targrows, double *totalrows, double *totaldeadrows);
  static double random_fract(void);
***
*** 339,345 
  		(errcode(ERRCODE_UNDEFINED_COLUMN),
  	errmsg(column \%s\ of relation \%s\ does not exist,
  		   col, RelationGetRelationName(onerel;
! 			vacattrstats[tcnt] = examine_attribute(onerel, i);
  			if (vacattrstats[tcnt] != NULL)
  tcnt++;
  		}
--- 340,346 
  		(errcode(ERRCODE_UNDEFINED_COLUMN),
  	errmsg(column \%s\ of relation \%s\ does not exist,
  		   col, RelationGetRelationName(onerel;
! 			vacattrstats[tcnt] = examine_attribute(onerel, i, NULL);
  			if (vacattrstats[tcnt] != NULL)
  tcnt++;
  		}
***
*** 353,359 
  		tcnt = 0;
  		for (i = 1; i = attr_cnt; i++)
  		{
! 			vacattrstats[tcnt] = examine_attribute(onerel, i);
  			if (vacattrstats[tcnt] != NULL)
  tcnt++;
  		}
--- 354,360 
  		tcnt = 0;
  		for (i = 1; i = attr_cnt; i++)
  		{
! 			vacattrstats[tcnt] = examine_attribute(onerel, i, NULL);
  			if (vacattrstats[tcnt] != NULL)
  tcnt++;
  		}
***
*** 407,427 
  			elog(ERROR, too few entries in indexprs list);
  		indexkey = (Node *) lfirst(indexpr_item);
  		indexpr_item = lnext(indexpr_item);
- 
- 		/*
- 		 * Can't analyze if the opclass uses a storage type
- 		 * different from the expression result type. We'd get
- 		 * confused because the type shown in pg_attribute for
- 		 * the index column doesn't match what we are getting
- 		 * from the expression. Perhaps this can be fixed
- 		 * someday, but for now, punt.
- 		 */
- 		if (exprType(indexkey) !=
- 			Irel[ind]-rd_att-attrs[i]-atttypid)
- 			continue;
- 
  		thisdata-vacattrstats[tcnt] =
! 			examine_attribute(Irel[ind], i + 1);
  		if (thisdata-vacattrstats[tcnt] != NULL)
  		{
  			tcnt++;
--- 408,415 
  			elog(ERROR, too few entries in indexprs list);
  		indexkey = (Node *) lfirst(indexpr_item);
  		indexpr_item = lnext(indexpr_item);
  		thisdata-vacattrstats[tcnt] =
! 			examine_attribute(Irel[ind], i + 1, indexkey);
  		if (thisdata-vacattrstats[tcnt] != NULL)
  		{
  			tcnt++;
***
*** 802,810 
   *
   * Determine whether the column is analyzable; if so, create and initialize
   * a VacAttrStats struct for it.  If not, return NULL.
   */
  static VacAttrStats *
! examine_attribute(Relation onerel, int attnum)
  {
  	Form_pg_attribute attr = onerel-rd_att-attrs[attnum - 1];
  	HeapTuple	typtuple;
--- 790,801 
   *
   * Determine whether the column is analyzable; if so, create and initialize
   * a VacAttrStats struct for it.  If not, return NULL.
+  *
+  * If index_expr isn't NULL, then we're trying to analyze an expression index,
+  * and index_expr is the expression tree representing the column's data.
   */
  static VacAttrStats *
! examine_attribute(Relation onerel, int attnum, Node *index_expr)
  {
  	Form_pg_attribute attr = onerel-rd_att-attrs[attnum - 1];
  	HeapTuple	typtuple;
***
*** 827,835 
  	stats = (VacAttrStats *) palloc0(sizeof(VacAttrStats));
  	stats-attr = (Form_pg_attribute) palloc(ATTRIBUTE_FIXED_PART_SIZE);
  	memcpy(stats-attr, attr, ATTRIBUTE_FIXED_PART_SIZE);
! 	typtuple = SearchSysCache1(TYPEOID, ObjectIdGetDatum(attr-atttypid));
  	if (!HeapTupleIsValid(typtuple))
! 		elog(ERROR, cache lookup failed for type %u, attr-atttypid);
  	stats-attrtype = (Form_pg_type) palloc(sizeof(FormData_pg_type));
  	memcpy(stats-attrtype, GETSTRUCT(typtuple), sizeof(FormData_pg_type));
  	

Re: [HACKERS] ANALYZE versus expression indexes with nondefault opckeytype

2010-07-31 Thread Stephen Frost
* Tom Lane (t...@sss.pgh.pa.us) wrote:
 After a bit of study of the code, it appears to me that it's not too
 difficult to fix: we just have to use the expression's result type
 rather than the index column's atttypid in the subsequent processing.
 ANALYZE never actually looks at the index column contents (indeed
 cannot, since our index AMs provide no API for extracting the contents),

I don't think it'll be an issue, but just in case- is there any reason
this will cause trouble for the possible index-only quals/scans work?

 So that seems to make it not practical to back-patch.

But we could get it into 9.0..

 I thought of an ugly hack that would avoid an API/ABI break: since
 VacAttrStats.attr is a copy of the pg_attribute data, we could scribble
 on it, changing atttypid, attlen, attbyval, etc to match the index
 expression result type.  This seems pretty grotty, but it would allow
 the fix to be back-patched into existing branches.

Yeah, that's rather nasty. :/

 Comments?  I'm not sure which way to jump here.

For my 2c- let's get it fixed for 9.0 cleanly and encourage people who
run into this to upgrade to that once it's released.  Perhaps I've
missed it, but I don't recall seeing many complaints about this.

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] ANALYZE versus expression indexes with nondefault opckeytype

2010-07-31 Thread Tom Lane
Stephen Frost sfr...@snowman.net writes:
 * Tom Lane (t...@sss.pgh.pa.us) wrote:
 After a bit of study of the code, it appears to me that it's not too
 difficult to fix: we just have to use the expression's result type
 rather than the index column's atttypid in the subsequent processing.
 ANALYZE never actually looks at the index column contents (indeed
 cannot, since our index AMs provide no API for extracting the contents),

 I don't think it'll be an issue, but just in case- is there any reason
 this will cause trouble for the possible index-only quals/scans work?

Not that I can see.  What ANALYZE and the planner want to do is provide
stats on the behavior of the indexed expression.  The fact that certain
index opclasses don't actually store that value is of no interest.  Even
if it did become of interest later on, we're under no compulsion to not
redefine the contents of pg_statistic when we need to.

 So that seems to make it not practical to back-patch.

 But we could get it into 9.0..

Hmm.  I was thinking it was a bit late for 9.0 too, since it'd
invalidate any testing anyone's done of their custom typanalyze
functions.  However, since we're still only in beta, maybe that's
acceptable.

 Comments?  I'm not sure which way to jump here.

 For my 2c- let's get it fixed for 9.0 cleanly and encourage people who
 run into this to upgrade to that once it's released.  Perhaps I've
 missed it, but I don't recall seeing many complaints about this.

It's only been very recently that we had any useful stats capability
that could apply in such situations (in fact I think we still haven't
actually shipped a non-bogus version of tsvector typanalyze :-().
So I'm not sure anyone would have realized they were missing something.
But it would be nice to have this working in 8.4, and I'll be quite
unhappy if we don't have it in 9.0.

If people feel it's not too late for an ABI break in 9.0 then I'm
willing to compromise on backpatching the clean fix.  I'm not entirely
convinced it's that much cleaner than the hack solution though.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ANALYZE versus expression indexes with nondefault opckeytype

2010-07-31 Thread Robert Haas
On Jul 31, 2010, at 11:00 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 It's only been very recently that we had any useful stats capability
 that could apply in such situations (in fact I think we still haven't
 actually shipped a non-bogus version of tsvector typanalyze :-().
 So I'm not sure anyone would have realized they were missing something.
 But it would be nice to have this working in 8.4, and I'll be quite
 unhappy if we don't have it in 9.0.
 
 If people feel it's not too late for an ABI break in 9.0 then I'm
 willing to compromise on backpatching the clean fix.  I'm not entirely
 convinced it's that much cleaner than the hack solution though.

I think this whole discussion is starting with the wrong premise. This is not a 
bug fix; therefore, it's 9.1 material.  If anyone else had suggested slipping 
this into 9.0, let alone 8.4, we would have punted it in a heartbeat.

...Robert


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ANALYZE versus expression indexes with nondefault opckeytype

2010-07-31 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 I think this whole discussion is starting with the wrong premise. This
 is not a bug fix; therefore, it's 9.1 material.

Failing to store stats isn't a bug?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ANALYZE versus expression indexes with nondefault opckeytype

2010-07-31 Thread Robert Haas
On Sat, Jul 31, 2010 at 12:06 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 I think this whole discussion is starting with the wrong premise. This
 is not a bug fix; therefore, it's 9.1 material.

 Failing to store stats isn't a bug?

Well, it kind of sounds more like you're removing a known limitation
than fixing a bug.  It's not as if the behavior fails to match the
comment.  I'm pretty hesitant to see us making any changes to 9.0 that
aren't necessary to fix existing bugs or new regressions.  What I want
to do with 9.0 is get it stable and ship it.  I'm not really terribly
concerned about the possibility of an ABI break even at this late
date, but I *am* concerned about the possibility either of (1)
unforeseen consequences necessitating further patching or (2) getting
distracted from the business of getting the release out the door.
We've been in feature freeze for more than five months, so I think
it's certainly time try to reduce to an absolute minimum the number of
changes that need to be made before release.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ANALYZE versus expression indexes with nondefault opckeytype

2010-07-31 Thread Kevin Grittner
Robert Haas  07/31/10 12:33 PM 
 Tom Lane  wrote:
 Robert Haas  writes:
 I think this whole discussion is starting with the wrong premise.
 This is not a bug fix; therefore, it's 9.1 material.

 Failing to store stats isn't a bug?
 
 Well, it kind of sounds more like you're removing a known
 limitation than fixing a bug.
 
It's operating as designed and documented.  There is room for
enhancement, but the only thing which could possibly justify this as
9.0 material is if there was a demonstrated performance regression in
9.0 for which this was the safest cure.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ANALYZE versus expression indexes with nondefault opckeytype

2010-07-31 Thread Stephen Frost
* Kevin Grittner (kevin.gritt...@wicourts.gov) wrote:
 Robert Haas  07/31/10 12:33 PM 
  Tom Lane  wrote:
  Failing to store stats isn't a bug?
  
  Well, it kind of sounds more like you're removing a known
  limitation than fixing a bug.
  
 It's operating as designed and documented.  There is room for
 enhancement, but the only thing which could possibly justify this as
 9.0 material is if there was a demonstrated performance regression in
 9.0 for which this was the safest cure.

I have to disagree with this, to be honest.  The fact that we've
documented what is completely unexpected and frustrating behaviour
doesn't mean we get to say it's not a bug.  Not collecting stats, at
all, is a pretty bad bug, in my view.  Stats are an important part of
the system which needs to work at least decently.  Perhaps before it was
pretty rare that we'd have the situation described (before we brought in
tsearch2), but it's not any longer and we need to support it as we would
the other types.  The only reason I'm against backpatching it to the
beginning is that it's either an ABI change or some rather grotty code,
and even then it wouldn't be hard to push me to accepting the grotty
code if we make the cleaner change for 9.0 and going forward, especially
as we have people in the wild being affected by it.

Certain other databases have done a very good job of documenting their
bugs and in some cases even calling them features.  I'd rather we not go
down that path.  I don't see the lack of stats collecting to be a simple
'limitation'.

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] ANALYZE versus expression indexes with nondefault opckeytype

2010-07-31 Thread Robert Haas
On Sat, Jul 31, 2010 at 9:16 PM, Stephen Frost sfr...@snowman.net wrote:
 * Kevin Grittner (kevin.gritt...@wicourts.gov) wrote:
 Robert Haas  07/31/10 12:33 PM 
  Tom Lane  wrote:
  Failing to store stats isn't a bug?
 
  Well, it kind of sounds more like you're removing a known
  limitation than fixing a bug.

 It's operating as designed and documented.  There is room for
 enhancement, but the only thing which could possibly justify this as
 9.0 material is if there was a demonstrated performance regression in
 9.0 for which this was the safest cure.

 I have to disagree with this, to be honest.  The fact that we've
 documented what is completely unexpected and frustrating behaviour
 doesn't mean we get to say it's not a bug.  Not collecting stats, at
 all, is a pretty bad bug, in my view.

I guess I'd appreciate it if someone could explain in more detail in
what cases we fail to collect stats.  Do we have a typanalyze function
here that can't possibly work for anything, ever?  Or is it just some
subset of the cases?

(Apologies if this has been discussed on the original thread; I was
unable to find it in the archives.)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ANALYZE versus expression indexes with nondefault opckeytype

2010-07-31 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Sat, Jul 31, 2010 at 9:16 PM, Stephen Frost sfr...@snowman.net wrote:
 * Kevin Grittner (kevin.gritt...@wicourts.gov) wrote:
 Robert Haas  07/31/10 12:33 PM 
 Tom Lane  wrote:
 Failing to store stats isn't a bug?

 Well, it kind of sounds more like you're removing a known
 limitation than fixing a bug.

 It's operating as designed and documented.

 I have to disagree with this, to be honest.  The fact that we've
 documented what is completely unexpected and frustrating behaviour
 doesn't mean we get to say it's not a bug.  Not collecting stats, at
 all, is a pretty bad bug, in my view.

I'm a bit bemused by the claim that this behavior is documented.  One
comment buried deep in the bowels of the source is not user-visible
documentation in my book.

 I guess I'd appreciate it if someone could explain in more detail in
 what cases we fail to collect stats.  Do we have a typanalyze function
 here that can't possibly work for anything, ever?  Or is it just some
 subset of the cases?

ANALYZE normally collects stats for any expression that there is an
expression index for.  However, it will punt and fail to collect stats
if the expression index uses an opclass whose opckeytype (ie, storage
datatype) is different from the actual expression datatype.  A quick
look into the system catalogs shows that that applies to these opclasses:

 amname | opcname  |   opcintype   | opckeytype 
 
+--+---+-
 btree  | name_ops | name  | cstring
 gist   | point_ops| point | box
 gist   | poly_ops | polygon   | box
 gist   | circle_ops   | circle| box
 gin| _int4_ops| integer[] | integer
 gin| _text_ops| text[]| text
 gin| _abstime_ops | abstime[] | abstime
 gin| _bit_ops | bit[] | bit
 gin| _bool_ops| boolean[] | boolean
 gin| _bpchar_ops  | character[]   | character
 gin| _bytea_ops   | bytea[]   | bytea
 gin| _char_ops| char[]  | char
 gin| _cidr_ops| cidr[]| cidr
 gin| _date_ops| date[]| date
 gin| _float4_ops  | real[]| real
 gin| _float8_ops  | double precision[]| double precision
 gin| _inet_ops| inet[]| inet
 gin| _int2_ops| smallint[]| smallint
 gin| _int8_ops| bigint[]  | bigint
 gin| _interval_ops| interval[]| interval
 gin| _macaddr_ops | macaddr[] | macaddr
 gin| _name_ops| name[]| name
 gin| _numeric_ops | numeric[] | numeric
 gin| _oid_ops | oid[] | oid
 gin| _oidvector_ops   | oidvector[]   | oidvector
 gin| _time_ops| time without time zone[]  | time without time 
zone
 gin| _timestamptz_ops | timestamp with time zone[]| timestamp with 
time zone
 gin| _timetz_ops  | time with time zone[] | time with time zone
 gin| _varbit_ops  | bit varying[] | bit varying
 gin| _varchar_ops | character varying[]   | character varying
 gin| _timestamp_ops   | timestamp without time zone[] | timestamp without 
time zone
 gin| _money_ops   | money[]   | money
 gin| _reltime_ops | reltime[] | reltime
 gin| _tinterval_ops   | tinterval[]   | tinterval
 gist   | tsvector_ops | tsvector  | gtsvector
 gin| tsvector_ops | tsvector  | text
 gist   | tsquery_ops  | tsquery   | bigint
(37 rows)

Now, of the above the only cases where we'd be likely to be able to do
anything very useful with stats on the expression value are the name
case, which isn't that exciting in practice, and the tsvector cases.
For tsvector it was only with 8.4 that we had non-toy stats code, so
while the limitation is ancient it's only recently that it started to be
meaningful.

I don't think this can be claimed to be a corner case.  If you set up
an FTS index according to the first alternative offered in

http://developer.postgresql.org/pgdocs/postgres/textsearch-tables.html#TEXTSEARCH-TABLES-INDEX

you will find that the system fails to collect stats for it and so you
get stupid default estimates for your FTS queries.  If this were a
documented limitation I'd expect to see a big red warning there to
*not* do it that 

[HACKERS] ANALYZE versus expression indexes with nondefault opckeytype

2010-07-30 Thread Tom Lane
I've been poking at the FTS problem example recently submitted by
Artur Dabrowski.  It contains an index declared like so:

CREATE INDEX idx_keywords_ger ON search_tab
USING gin (to_tsvector('german'::regconfig, keywords));

which can be queried like so

select * from search_tab where
  to_tsvector('german', keywords) @@ to_tsquery('german', 'whatever');

which is all fine and matches suggestions in our documentation.  But I
was dismayed to discover that ANALYZE wasn't storing any statistics
for this expression index, which rather hobbles the planner's ability
to produce useful estimates for the selectivity of such WHERE clauses.

The reason for that turns out to be this bit in analyze.c:

/*
 * Can't analyze if the opclass uses a storage type
 * different from the expression result type. We'd get
 * confused because the type shown in pg_attribute for
 * the index column doesn't match what we are getting
 * from the expression. Perhaps this can be fixed
 * someday, but for now, punt.
 */
if (exprType(indexkey) !=
Irel[ind]-rd_att-attrs[i]-atttypid)
continue;

Since a tsvector index does have a storage type different from tsvector
(in fact, it's shown as TEXT), this code decides to punt and not analyze
the column.  I think it's past time we did something about this.

After a bit of study of the code, it appears to me that it's not too
difficult to fix: we just have to use the expression's result type
rather than the index column's atttypid in the subsequent processing.
ANALYZE never actually looks at the index column contents (indeed
cannot, since our index AMs provide no API for extracting the contents),
so atttypid isn't really all that relevant to it.  However, this'll
imply an API/ABI break for custom typanalyze functions: we'll need to
store the actual type OID in struct VacAttrStats, and use that rather
than looking to attr-atttypid or any of the other type-dependent
fields of the Form_pg_attribute.  So that seems to make it not practical
to back-patch.

I thought of an ugly hack that would avoid an API/ABI break: since
VacAttrStats.attr is a copy of the pg_attribute data, we could scribble
on it, changing atttypid, attlen, attbyval, etc to match the index
expression result type.  This seems pretty grotty, but it would allow
the fix to be back-patched into existing branches.

Comments?  I'm not sure which way to jump here.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers