[ repost, as original message of 28-Feb seems to have gotten lost ]

I said:
> I was able to reproduce a problem as follows: run the tsearch regression
> test, then do "cluster wowidx on test_txtidx".  This appears to lose
> one row:

Ahh ... it took me way too long to realize what was happening.  The
problem is simply that GiST indexes do not index nulls (at least not
in the first column of an index).  So if you CLUSTER, you lose any rows
that contain NULLs in the indexed column: they're not in the index, so
they're not seen by the indexscan that copies the data over to the new
table.

Having CLUSTER lose data is obviously not acceptable :-(.  I can see two
possible solutions:

* Make CLUSTER error out if the target index is not of an 'amindexnulls'
  index AM.  This would amount to restricting CLUSTER to b-trees, which
  is annoying.

* If the index is not amindexnulls and the first target column is not
  marked attnotnull, make an extra seqscan pass over the source table to
  look for rows containing nulls.  Copy these rows separately.  This
  would work but adds a good deal of overhead.

Approach #2 is even worse for functional indexes, where attnotnull is
not helpful.  We'd have to actually evaluate the function at every
single row to see if it yields NULL there.  Yech.

It occurs to me also that the same kind of pitfall exists for partial
indexes: cluster on a partial index, you lose.  However, I don't have
a problem with simply refusing to cluster on partial indexes.

Comments?  Any other ideas out there?

			regards, tom lane
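
To make the failure mode and approach #2 concrete, here is a toy model
in plain Python.  It is only a sketch and not PostgreSQL code: the table
is a list of dicts, the "index" is a sorted list that skips NULL keys,
and every name in it (build_index, cluster_via_index,
cluster_with_null_pass) is hypothetical and chosen for illustration.

```python
# Toy model, NOT PostgreSQL internals: all names here are hypothetical.
# A "table" is a list of rows; the "index" stores (key, position) pairs
# only for rows whose indexed column is not None (i.e. not NULL).

def build_index(table, col):
    """Build a sorted index that omits rows with a NULL key."""
    entries = [(row[col], pos) for pos, row in enumerate(table)
               if row[col] is not None]
    return sorted(entries)

def cluster_via_index(table, index):
    """Copy rows in index order only; rows missing from the index are lost."""
    return [table[pos] for _, pos in index]

def cluster_with_null_pass(table, index, col):
    """Approach #2: index-ordered copy, then an extra sequential pass
    that copies the rows whose indexed column is NULL."""
    clustered = cluster_via_index(table, index)
    clustered.extend(row for row in table if row[col] is None)
    return clustered

table = [
    {"id": 1, "key": "b"},
    {"id": 2, "key": None},   # never enters the index
    {"id": 3, "key": "a"},
]

index = build_index(table, "key")
lossy = cluster_via_index(table, index)               # drops the NULL row
safe = cluster_with_null_pass(table, index, "key")    # keeps every row
```

In this model the lossy copy has one row fewer than the source table,
while the second variant keeps all rows at the cost of a full extra pass
over the table, which is the overhead approach #2 would incur.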