Tom Lane wrote:
> Richard Huxton <[EMAIL PROTECTED]> writes:
>> Anyone got anything more elegant?
>
> Seems to me that no document should have an empty dup_set. If it's not
> a match to any existing document, then immediately assign a new dup_set
> number to it.

That was my initial thought too, but it means when I actually find a duplicate I have to decide which "direction" to renumber them in. It also means probably keeping a summary table with counts to show which are duplicates, since the duplicates table is now the same size as the documents table.

--
Richard Huxton
Archonet Ltd

--
Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql
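
For illustration only, here is a rough sketch of how the "every document gets a dup_set" scheme discussed above might look. It is not anyone's actual schema: the documents(id, dup_set) layout, the function names, and the rule "always renumber toward the lower dup_set" are all assumptions, and SQLite (via Python's sqlite3 module) stands in for PostgreSQL so the example runs on its own. In PostgreSQL a sequence would be the more natural way to hand out new dup_set numbers than MAX() + 1.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a hypothetical documents table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE documents (id INTEGER PRIMARY KEY, dup_set INTEGER NOT NULL)"
    )
    return conn


def add_document(conn, doc_id):
    """Insert a document and immediately give it its own new dup_set number."""
    # Assumption: MAX() + 1 is good enough here; a sequence would be safer
    # under concurrent inserts.
    (next_set,) = conn.execute(
        "SELECT COALESCE(MAX(dup_set), 0) + 1 FROM documents"
    ).fetchone()
    conn.execute(
        "INSERT INTO documents (id, dup_set) VALUES (?, ?)", (doc_id, next_set)
    )
    return next_set


def mark_duplicates(conn, id_a, id_b):
    """Merge the dup_sets of two documents into one.

    The "direction" question is settled by an arbitrary rule: the lower
    dup_set number survives and the other set is renumbered to match.
    """
    rows = conn.execute(
        "SELECT DISTINCT dup_set FROM documents WHERE id IN (?, ?)", (id_a, id_b)
    ).fetchall()
    sets = {row[0] for row in rows}
    keep = min(sets)
    for drop in sets - {keep}:
        conn.execute(
            "UPDATE documents SET dup_set = ? WHERE dup_set = ?", (keep, drop)
        )
    return keep


def duplicate_sets(conn):
    """Return (dup_set, count) for sets holding more than one document."""
    return conn.execute(
        "SELECT dup_set, COUNT(*) FROM documents "
        "GROUP BY dup_set HAVING COUNT(*) > 1 ORDER BY dup_set"
    ).fetchall()


if __name__ == "__main__":
    conn = make_db()
    for doc_id in (1, 2, 3, 4):
        add_document(conn, doc_id)
    mark_duplicates(conn, 2, 4)   # documents 2 and 4 now share a set
    mark_duplicates(conn, 1, 4)   # document 1 joins them; lower number wins
    print(duplicate_sets(conn))
```

With a fixed renumbering rule the "direction" decision goes away, and the GROUP BY ... HAVING query can report which sets are real duplicates without a separate summary table, though a maintained summary may still be worth it if that query turns out to be too slow on a large documents table.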