On Thu., December 5, 2019 at 5:45 PM, Tomas Vondra wrote:
> At first I thought maybe this might be due to collations
> changing and breaking the index silently. What collation are you using?

We're using en_US.utf8. To my knowledge, we have not made any collation changes.

> 1) When you do the queries, do they use index scan or sequential scan?
> Perhaps it does sequential scan, and if you force index scan (e.g. by
> rewriting the query) it'll only find one of those rows.

By default it used an index scan. When I re-ran the query today (and confirmed
that it used an index-only scan) I did not see any duplicates. If I force a
sequential scan by setting both enable_indexscan and enable_indexonlyscan to
off, the duplicates reappear.
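
For reference, a minimal sketch of that comparison. The table "t" and the
uniquely indexed column "id" are hypothetical stand-ins, not the real schema,
and turning off enable_bitmapscan is an extra precaution I am assuming here so
the planner cannot fall back to the index through a bitmap scan:

```sql
-- Hypothetical names: table "t", uniquely indexed column "id".

-- 1) Read the heap directly, bypassing the index.
BEGIN;
SET LOCAL enable_indexscan = off;
SET LOCAL enable_indexonlyscan = off;
SET LOCAL enable_bitmapscan = off;   -- assumption: also rule out bitmap index scans
EXPLAIN SELECT id, count(*) FROM t GROUP BY id HAVING count(*) > 1;  -- confirm a Seq Scan is planned
SELECT id, count(*) FROM t GROUP BY id HAVING count(*) > 1;
ROLLBACK;

-- 2) Steer the planner toward the index instead.
BEGIN;
SET LOCAL enable_seqscan = off;
EXPLAIN SELECT id, count(*) FROM t GROUP BY id HAVING count(*) > 1;  -- confirm an index scan is planned
SELECT id, count(*) FROM t GROUP BY id HAVING count(*) > 1;
ROLLBACK;
```

SET LOCAL keeps the planner settings confined to each transaction, so the
session returns to its defaults after the ROLLBACK.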

However, using a backup from a week ago I see duplicates both in the query that
uses an index-only scan and in the query that uses the sequential scan. So
somehow over the past week the index changed so that it no longer returns the
duplicates.

> 2) Can you check in backups if this data corruption was present in the
> PG10 cluster, before running pg_upgrade? 

Sure. I just checked and did not see any corruption in the PG10 pre-upgrade 
backup. I also re-upgraded that PG10 backup to PG12, and right after the 
upgrade I did not see any corruption either. I checked using both index scans 
and sequential scans.

Alex
