Currently, functional indexes can be used by a query that explicitly mentions the exact phrasing of the functional index within the WHERE clause.
IMHO it is feasible to extend the range of WHERE clauses for which functional indexes can be used by using implication, much in the same way that we use transitive closure in join queries now. This has some powerful and important uses, just one of which is index prefix compression (so read on). If we have a subclause within a WHERE clause of the form col1 = k1 AND col2 = k2 AND ... colN = kN this implies that the following clause is also true: col1 = k1 AND col2 = k2 AND ... colN = kN AND F(col1, col2 ... colN) = F(k1, k2 ... kN) iff: - the function F is IMMUTABLE, since that definition implies that the output is always the same for any set of constant inputs. - the operator that connects each col1,k1 pair must be defined as true equality (note that in the above example what looks like the same equality operator is in fact context dependent on the datatypes) - the datatypes of each col1, k1 pair must match If we have a Functional Index F(col1, col2...colN) then we can use the second implied form of the subclause to recognise that the index can be useful for the query. This would then allow a query plan that uses an IndexScan on the functional index with a re-check filter of the original query clause. An example might be a query using the clause surname = 'RIGGS' would allow us to use an index defined as metaphone(surname) even though the original application was written before we added the index. Note that the index would be significantly smaller than an index on the full surname, as well as avoiding some of the hotspots caused by certain most-frequent-values in the data. An alternative example might be to define an index on substr(text_col, 1, 100) which still allows searching for longer strings without the need to store the complete string in the index. This is effectively index prefix compression, which is not directly possible with pgsql because of the requirements of datatype encapsulation. (Note that this avoids the need to have very long index keys also). One difficulty to this is defining which operators represent true-equality. There isn't a definition of this currently for operators and we cannot assume that an operator called "=" has the required properties, even if that is true for most built-in types. An example of true-equality and its difficulties is for the FLOAT type minus-zero (-0) and plus-zero (+0) have different byte representations yet when compared with the standard FLOAT comparison operator +0 and -0 would be considered equal. If we were to put each value through a hash-like function such as md5() then we would clearly get different answers. To take advantage of this, I propose to - add code to be called from allpaths.c: when we check functional indexes, if they exist and yet are not usable because of a lack of explicit clauses we will check to see if there any clauses that can be used to imply the correct use of the functional index. This is in many ways similar to existing constraint exclusion code. - add a new boolean to pg_operator to allow us to define which operators offer true equality - add a new keyword EQUALITY to CREATE/ALTER OPERATOR (which I think implies HASHES and MERGES if they are not mentioned explicitly). No promises for 8.2, but does anyone have further input? -- Simon Riggs EnterpriseDB http://www.enterprisedb.com ---------------------------(end of broadcast)--------------------------- TIP 4: Have you searched our list archives? http://archives.postgresql.org