> There's a definitional issue here, which is what does it mean to be
> counting index tuples.  I think GIN could bypass the VACUUM error check
> by always returning the heap tuple count as its index tuple count.  This

One problem: ambulkdelete doesn't have any access to the heap or to the
heap's statistics (num_tuples in scan_index() and vacuum_index() in
vacuum.c). So ambulkdelete can't set stats->num_index_tuples equal to
num_tuples. With a partial index the problem only gets worse...
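
For illustration, a rough sketch of what ambulkdelete is handed (not the
actual GIN code, and the exact fmgr signature may differ a bit from the
current tree): only the index relation and a per-tuple callback come in,
no heap tuple count.

    Datum
    ginbulkdelete(PG_FUNCTION_ARGS)
    {
        Relation    index = (Relation) PG_GETARG_POINTER(0);
        IndexBulkDeleteCallback callback = (IndexBulkDeleteCallback) PG_GETARG_POINTER(1);
        void       *callback_state = (void *) PG_GETARG_POINTER(2);
        IndexBulkDeleteResult *stats;

        stats = (IndexBulkDeleteResult *) palloc0(sizeof(IndexBulkDeleteResult));

        /* walk the index, asking the callback which heap TIDs are dead */
        /* ... callback(&itup->t_tid, callback_state) ... */

        /* nothing passed in here carries the heap's num_tuples */
        PG_RETURN_POINTER(stats);
    }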

After looking into vacuum.c, I found the following ways to skip the check:
1) Simplest: just return NULL from ginvacuumcleanup (a minimal sketch follows
   this list). Disadvantage: it drops all statistics.
2) Quick hack in vacuum.c, to be fixed in the future:
        if (indrel->rd_rel->relam == GIN_AM_OID)
                stats->num_index_tuples = num_tuples;   /* trust the heap count */
        else if (stats->num_index_tuples != num_tuples)
        {
                /* the existing mismatch check, as now */
        }
3) Add a column to pg_am that requests the scan_index/vacuum_index behaviour
   shown above. I don't think such a column would be used often - only by
   inverted indexes.
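
A minimal sketch of option (1), just to show how little is involved (the
signature is hedged to the style of the other AMs, not copied from real
code):

    Datum
    ginvacuumcleanup(PG_FUNCTION_ARGS)
    {
        /* ... release any working state left over from ginbulkdelete ... */

        /*
         * Returning NULL gives vacuum.c no stats to cross-check against
         * num_tuples, but it also throws the statistics away.
         */
        PG_RETURN_POINTER(NULL);
    }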


If there are no objections, we will add quick hack (2) on Tuesday and commit
GIN. After that our plan is:
1) add opclasses for the other array types
2) set indisclustered=true for all GIN indexes via changes in
   UpdateIndexRelation() and mark_index_clustered(). The issue is: can a
   table currently be clustered on several indexes? Because GIN is always
   'clustered', a table could end up clustered on several GIN indexes plus
   one other index. The CLUSTER command on a GIN index should do nothing
   (a rough sketch follows this list). Maybe it would be cleaner to add an
   indclustered column to pg_am.
3) return to the WAL problem with GiST
4) work on gincostestimate and, possibly, tweak the cost estimates of GIN's
   opclasses... including the num_tuples issue
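
For plan item 2, the "CLUSTER on a GIN index should do nothing" part could
look roughly like this (OldIndex and the exact place in the CLUSTER code
path are only illustrative, not taken from cluster.c):

    if (OldIndex->rd_rel->relam == GIN_AM_OID)
    {
        ereport(NOTICE,
                (errmsg("CLUSTER has no effect on GIN index \"%s\"",
                        RelationGetRelationName(OldIndex))));
        return;
    }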


--
Teodor Sigaev                                   E-mail: [EMAIL PROTECTED]
                                                   WWW: http://www.sigaev.ru/

