Re: Index bloat and REINDEX/VACUUM optimization for partial index

2023-09-06 Thread jayaprabhakar k
Thanks Maxim and Jeff. 1. Do you have any pointers to the killbits issue on hot standby slaves? We do use a hot standby instance for many queries. So I want to learn more about it. 2. I am now considering partitioning the table. I am curious if we can set up partitions by mutable columns. More

Re: Index bloat and REINDEX/VACUUM optimization for partial index

2023-09-01 Thread Maxim Boguk
> > But anyway, PostgreSQL has features to prevent the index bloat from > becoming too severe of a problem, and you should figure out why they are > not working for you. The most common ones I know of are 1) long open > snapshots preventing clean up, 2) all index scans being bitmap index scans, >

Re: Index bloat and REINDEX/VACUUM optimization for partial index

2023-08-31 Thread Jeff Janes
On Thu, Aug 31, 2023 at 11:06 AM Maxim Boguk wrote: > With the default value of autovacuum_vacuum_scale_factor (The default is > 0.2 (20% of table size).) index will collect like 100M outdated/dead index > entries before autovacuum kicks in and cleans them all (in a worst case), > and of course

Re: Index bloat and REINDEX/VACUUM optimization for partial index

2023-08-31 Thread Jeff Janes
On Wed, Aug 30, 2023 at 8:43 PM jayaprabhakar k wrote: > > > On Tue, Aug 29, 2023, 12:43 PM Jeff Janes wrote: > >> On Mon, Aug 28, 2023 at 8:33 PM jayaprabhakar k >> wrote: >> >>> >>> Since we are only interested in the pending tasks, I created a partial >>> index >>>

Re: Index bloat and REINDEX/VACUUM optimization for partial index

2023-08-31 Thread jayaprabhakar k
Thanks Maxim, that's something we are considering now - keep the in progress tasks in one table and periodically move the old and completed tasks to an archive table. We could use a view that unions them for most queries. I'm not sure if that's the best alternative though, and we want to know if

Re: Index bloat and REINDEX/VACUUM optimization for partial index

2023-08-31 Thread Maxim Boguk
> At any moment, there are *around 1000-1500 tasks in pending statuses* > (Init + InProgress) out of around 500 million tasks. > > Now, we have a task monitoring query that will look for all pending tasks > that have not received any update in the last n minutes. > > ``` > SELECT [columns list] >

Re: Index bloat and REINDEX/VACUUM optimization for partial index

2023-08-30 Thread jayaprabhakar k
On Tue, Aug 29, 2023, 12:43 PM Jeff Janes wrote: > On Mon, Aug 28, 2023 at 8:33 PM jayaprabhakar k > wrote: > >> Hi, >> >> TL;DR: >> Observations: >> >>1. REINDEX requires a full table scan >> - Roughly create a new index, rename index, drop old index. >> - REINDEX is not

Re: Index bloat and REINDEX/VACUUM optimization for partial index

2023-08-29 Thread Jeff Janes
On Mon, Aug 28, 2023 at 8:33 PM jayaprabhakar k wrote: > Hi, > > TL;DR: > Observations: > >1. REINDEX requires a full table scan > - Roughly create a new index, rename index, drop old index. > - REINDEX is not incremental. running reindex frequently does not > reduce the

Re: Index bloat and REINDEX/VACUUM optimization for partial index

2023-08-29 Thread jayaprabhakar k
Thanks Peter. It is *14.4*, But on AWS RDS Aurora instance. I am trying to read the links you shared - B-Tree Deletion and deduplication, etc. I still don't fully understand what I need to do. In the BTree documentation, > The average and worst-case number of versions per logical row can be kept

Re: Index bloat and REINDEX/VACUUM optimization for partial index

2023-08-28 Thread Peter Geoghegan
On Mon, Aug 28, 2023 at 5:33 PM jayaprabhakar k wrote: > REINDEX requires a full table scan > > Roughly create a new index, rename index, drop old index. > REINDEX is not incremental. running reindex frequently does not reduce the > future reindex time. You didn't say which Postgres version