> On Sep 23, 2021, at 8:55 PM, Michael Lewis <mle...@entrata.com> wrote: > > It’s only when I add in the AND data.channels=channels.channel that the query > time blows up to 5+ minutes. I personally don’t understand why there would be > such a large difference between the two queries - something wrong with my > indexes? > > Did you do the manual vacuum as suggested by Tom?
I ran a VACUUM ANALYZE, yes. > Maybe at least check pg_stat_user_tables for the last vacuum or autovacuum > timestamp, and how many records have changed since then. volcano_seismology=# SELECT seq_scan,seq_tup_read,idx_scan,idx_tup_fetch,n_tup_ins,n_tup_upd,n_tup_del,n_tup_hot_upd,n_live_tup,n_dead_tup,n_mod_since_analyze,n_ins_since_vacuum,last_vacuum,last_autovacuum,last_analyze,last_autoanalyze,vacuum_count,autovacuum_count,analyze_count,autoanalyze_count FROM pg_stat_user_tables WHERE relname='data'; -[ RECORD 1 ]-------+------------------------------ seq_scan | 95 seq_tup_read | 25899340540 idx_scan | 728372 idx_tup_fetch | 51600217033 n_tup_ins | 840283699 n_tup_upd | 66120702 n_tup_del | 2375651 n_tup_hot_upd | 0 n_live_tup | 839266956 n_dead_tup | 66585751 n_mod_since_analyze | 58896052 n_ins_since_vacuum | 24890460 last_vacuum | 2021-09-22 21:32:11.367855+00 last_autovacuum | 2021-09-14 07:13:23.745862+00 last_analyze | 2021-09-22 21:32:21.071092+00 last_autoanalyze | 2021-09-21 11:54:36.924762+00 vacuum_count | 1 autovacuum_count | 1 analyze_count | 1 autoanalyze_count | 2 Note that the update count was due to a (hopefully) one-time process where I had to change the value of a bunch of records. Generally this *should be* an insert-once-read-many database. > The system is deciding on an index scan because that should be very fast, but > if it needs to look at the actual table table to determine if the process > executing that query should still be allowed to see that tuple (row version) > then the index only scan slows down a lot I believe. The huge number of heap > fetches that Tom pointed out mean that the system doesn't know that ALL > processes should be able to see those tuples. As someone else suggested > lowering the max freeze age, you might just do a manual "vacuum freeze" as > well to allow future auto vacuum processes to quickly skip over those pages > as well. Ok, ran a VACUUM (ANALYZE, FREEZE), and am now showing this from pg_stat_user_tables: volcano_seismology=# SELECT seq_scan,seq_tup_read,idx_scan,idx_tup_fetch,n_tup_ins,n_tup_upd,n_tup_del,n_tup_hot_upd,n_live_tup,n_dead_tup,n_mod_since_analyze,n_ins_since_vacuum,last_vacuum,last_autovacuum,last_analyze,last_autoanalyze,vacuum_count,autovacuum_count,analyze_count,autoanalyze_count FROM pg_stat_user_tables WHERE relname='data'; -[ RECORD 1 ]-------+------------------------------ seq_scan | 96 seq_tup_read | 26737263238 idx_scan | 732396 idx_tup_fetch | 52571927369 n_tup_ins | 841017819 n_tup_upd | 66120702 n_tup_del | 2388723 n_tup_hot_upd | 0 n_live_tup | 840198830 n_dead_tup | 10173 n_mod_since_analyze | 0 n_ins_since_vacuum | 0 last_vacuum | 2021-09-24 17:18:18.34282+00 last_autovacuum | 2021-09-14 07:13:23.745862+00 last_analyze | 2021-09-24 17:18:31.576238+00 last_autoanalyze | 2021-09-21 11:54:36.924762+00 vacuum_count | 2 autovacuum_count | 1 analyze_count | 2 autoanalyze_count | 2 However, adding the AND data.channels=channels.channel to the query still makes it take around 5 minutes (https://explain.depesz.com/s/7hb1 <https://explain.depesz.com/s/7hb1>). So, again, running VACUUM didn’t appear to help any. Also perhaps interestingly, if I again modify the query to only match on channel, not station, it is again fast (though not quite as fast): https://explain.depesz.com/s/HLb8 <https://explain.depesz.com/s/HLb8> So, basically, I can quickly get a list of all channels for which I have data, or all stations for which I have data, but getting a list of all channels for each station is slow. > I've heard of the loose indexscan before mentioned on this thread, but I'm > not seeing how to implement that for multiple columns. Anyone have an example > or pseudo code perhaps? This is my stumbling block to implementing this option as well. That said, with the lookup table in place, perhaps all the rest of this is a moot point? --- Israel Brewster Software Engineer Alaska Volcano Observatory Geophysical Institute - UAF 2156 Koyukuk Drive Fairbanks AK 99775-7320 Work: 907-474-5172 cell: 907-328-9145