Re: time sorted UUIDs
Hi Tim -- I am looking at the issue of random IDs (ie, UUIDs) as well. Did you have a chance to try time sorted UUIDs as was suggested in one of the responses?

On Mon, Apr 17, 2023 at 5:23 PM Tim Jones wrote:
> Hi,
>
> could someone please comment on this article
> https://vladmihalcea.com/uuid-database-primary-key/ specifically re the
> comments (copied below) in regards to a Postgres database.
>
> ...
>
> But, using a random UUID as a database table Primary Key is a bad idea for
> multiple reasons.
>
> First, the UUID is huge. Every single record will need 16 bytes for the
> database identifier, and this impacts all associated Foreign Key columns as
> well.
>
> Second, the Primary Key column usually has an associated B+Tree index to
> speed up lookups or joins, and B+Tree indexes store data in sorted order.
>
> However, indexing random values using B+Tree causes a lot of problems:
>
> - Index pages will have a very low fill factor because the values come
>   randomly. So, a page of 8kB will end up storing just a few elements,
>   therefore wasting a lot of space, both on the disk and in the database
>   memory, as index pages could be cached in the Buffer Pool.
> - Because the B+Tree index needs to rebalance itself in order to
>   maintain its equidistant tree structure, the random key values will cause
>   more index page splits and merges as there is no pre-determined order of
>   filling the tree structure.
>
> ...
>
> Any other general comments about time sorted UUIDs would be welcome.
>
> Thanks,
>
> Tim Jones
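For anyone who wants to experiment with time-sorted UUIDs before a native generator is available, below is a minimal sketch along the lines of the widely circulated "overlay a millisecond Unix timestamp onto a random UUID" approach. The function name is illustrative, it assumes gen_random_uuid() is available (in core since PostgreSQL 13, via pgcrypto before that), and it is only a sketch of a UUIDv7-style value, not a spec-complete implementation:

-- Sketch only: a time-ordered, UUIDv7-style value built from a random v4 UUID.
CREATE OR REPLACE FUNCTION uuid_generate_v7_sketch() RETURNS uuid
LANGUAGE sql VOLATILE AS $$
  SELECT encode(
           set_bit(
             set_bit(
               -- overlay the 48-bit millisecond Unix timestamp onto bytes 1-6
               overlay(uuid_send(gen_random_uuid())
                       PLACING substring(int8send(floor(extract(epoch FROM clock_timestamp()) * 1000)::bigint) FROM 3)
                       FROM 1 FOR 6),
               52, 1),   -- flip two bits so the version nibble reads 7 instead of 4
             53, 1),
           'hex')::uuid;
$$;

If adopted, it could be wired in as a column default, e.g. ALTER TABLE some_table ALTER COLUMN id SET DEFAULT uuid_generate_v7_sketch(); (table name hypothetical). Because consecutive values share a time prefix, inserts land on nearby index pages instead of random ones.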
High QPS, random index writes and vacuum
Hi all

The company I work for has a large (50+ instances, 2-4 TB each) Postgres install. One of the key problems we are facing in vanilla Postgres is vacuum behavior under high QPS (20K writes/s) with random index access on UUIDs.

In one case the table is 50 GB and has 3 indexes which are also 50 GB each. It takes 20 hours to vacuum the entire thing, where the bulk of the time is spent doing 'index vacuuming'. The table is then instantly vacuumed again. I increased work_mem to 2 GB, decreased the sleep threshold to 2 ms and increased the IO limit to 2000. I also changed the autovacuum thresholds for this table.

I understand that doing random index writes is not a good strategy, but 20 hours to vacuum 200 GB is excessive.

My question is: what is the recommended strategy to deal with such cases in Postgres?

Thanks very much!!
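For reference, here is a sketch of the kinds of knobs being described above. The parameter names are real; the values and the table name are purely illustrative, not recommendations, and on a managed service such as Azure Single Server these are typically set through the provider's configuration interface rather than ALTER SYSTEM. Note that it is maintenance_work_mem / autovacuum_work_mem (not work_mem) that bounds dead-tuple storage during vacuum, which comes up in the next reply:

-- Illustrative values only; 'big_uuid_table' is a hypothetical name.
ALTER SYSTEM SET maintenance_work_mem = '1GB';         -- dead-tuple storage for manual VACUUM
ALTER SYSTEM SET autovacuum_work_mem = '1GB';          -- same, for autovacuum workers
ALTER SYSTEM SET autovacuum_vacuum_cost_delay = 2;     -- the "sleep threshold", in ms
ALTER SYSTEM SET autovacuum_vacuum_cost_limit = 2000;  -- the "IO limit"
SELECT pg_reload_conf();

-- Per-table thresholds so the large table is triggered by an absolute row
-- count instead of a fraction of its 50 GB heap.
ALTER TABLE big_uuid_table SET (
  autovacuum_vacuum_scale_factor = 0,
  autovacuum_vacuum_threshold    = 1000000
);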
Re: High QPS, random index writes and vacuum
Thank you Tom.

Version: I sheepishly admit it's 9.6, 10 and 11 (it's Azure Single Server, that's another story).

I am definitely looking at redoing the way we do UUIDs... but that's not a trivial change given the volume of data we have + 24/7 workload. I was trying to understand whether there are any known workarounds for random access + index vacuums. Are my vacuum times 'normal'?

On Mon, Apr 17, 2023 at 7:01 PM Tom Lane wrote:
> peter plachta writes:
> > The company I work for has a large (50+ instances, 2-4 TB each) Postgres
> > install. One of the key problems we are facing in vanilla Postgres is
> > vacuum behavior on high QPS (20K writes/s), random index access on UUIDs.
>
> Indexing on a UUID column is an antipattern, because you're pretty much
> guaranteed the worst-case random access patterns for both lookups and
> insert/delete/maintenance cases. Can you switch to timestamps or
> the like?
>
> There are proposals out there for more database-friendly ways of
> generating UUIDs than the traditional ones, but nobody's gotten
> around to implementing that in Postgres AFAIK.
>
> regards, tom lane
Re: High QPS, random index writes and vacuum
Thank you David -- I increased this to 1GB as well (seeing as that was the max). We are doing mostly single passes now.

On Mon, Apr 17, 2023 at 7:40 PM David Rowley wrote:
> On Tue, 18 Apr 2023 at 12:35, peter plachta wrote:
> > I increased work_mem to 2Gb
>
> maintenance_work_mem is the configuration option that vacuum uses to
> control how much memory it'll make available for storage of dead
> tuples. I believe 1GB would allow 178,956,970 tuples to be stored
> before multiple passes would be required. The chunk of memory for dead
> tuple storage is capped at 1GB.
>
> David
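For completeness, the 178,956,970 figure follows from the 6-byte dead-tuple pointer (ItemPointerData) that vacuum stores per dead tuple in these versions:

SELECT (1024 * 1024 * 1024) / 6 AS max_dead_tuples;  -- 178956970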
Results of experiments with UUIDv7, UUIDv8
Hi all

My colleague and I did some experiments to see what effect using UUIDs as secondary indexes has on index IO. The context is that by default ORM frameworks will use UUIDs as index keys, which I found to be a major factor in the performance issues at Celonis. I suspect this isn't specific to Celonis. The secondary factor is that random IO on Azure Single Server can be slow as a dog -- so for large enough indexes that aren't cached, and workloads doing insert/delete at a high enough QPS, this really hurts.

We found that using UUID v7 (which has a longer time-based prefix than v8) gave 30% in IO savings on index access, and roughly the same savings in index size after an insert/delete workload. v8 was ~24%. We simulated slow, random IO by running this on a USB key, which seemed to match Azure performance pretty well. SSD was maybe 2x better. This is relative to UUID v3, which is essentially random (actually, pretty good random distribution on a 500 GB table).

This isn't as much as I expected but, again, for large indexes with slow IO it was significant.

peter
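The post doesn't say how index IO was measured; one plausible way to compare runs (an assumption on my part, not necessarily what was used) is the cumulative counters in pg_statio_user_indexes, resetting them between workloads:

-- 'test_table' is a hypothetical name for the table under test.
SELECT indexrelname,
       idx_blks_read,                                 -- blocks read from disk / OS cache
       idx_blks_hit,                                  -- blocks found in shared_buffers
       pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_statio_user_indexes
WHERE relname = 'test_table';

SELECT pg_stat_reset();  -- zero the counters before the next run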
Table copy with SERIALIZABLE is incredibly slow
Hi all

Background: we're trying a pg_repack-like procedure to compact a 500 GB table (with two 145 GB indexes) from which we deleted 80% of the rows. Offline is not an option. The table has a moderate (let's say 100 QPS) insert/delete workload running.

The typical procedure for this type of thing is basically CDC (a bare-bones sketch follows after this message):

1. create a 'log' table and a trigger that captures changes
2. under SERIALIZABLE: select * from current_table, insert into new_table

What we're finding is that for the first 30 mins the rate is 10Gb/s, then it drops to 1Mb/s and stays there, and 22 hours later the copy is still going -- and now the log table is huge, so we know the replay will also take a very long time.

===

Q: what are some ways in which we could optimize the copy?

Btw this is Postgres 9.6. (We tried an unlogged table -- that did nothing; we tried creating the indexes afterwards -- that helped; we're experimenting with RRI.)

Thanks!
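A bare-bones sketch of the procedure described above, with hypothetical object names throughout; delete capture and the replay step are omitted for brevity, and whether SERIALIZABLE (as used here) or plain REPEATABLE READ is the right isolation level for the snapshot copy is left to the thread:

-- 1. log table + trigger to capture concurrent changes
CREATE TABLE big_table_log (LIKE big_table);

CREATE FUNCTION big_table_capture() RETURNS trigger
LANGUAGE plpgsql AS $$
BEGIN
  INSERT INTO big_table_log SELECT NEW.*;
  RETURN NEW;
END;
$$;

CREATE TRIGGER big_table_cdc
AFTER INSERT OR UPDATE ON big_table
FOR EACH ROW EXECUTE PROCEDURE big_table_capture();   -- 9.6 syntax

-- 2. snapshot copy into the new, compacted table
BEGIN ISOLATION LEVEL SERIALIZABLE;
INSERT INTO big_table_new SELECT * FROM big_table;
COMMIT;

-- 3. build indexes after the copy (which the poster found helped)
CREATE INDEX CONCURRENTLY big_table_new_idx1 ON big_table_new (some_uuid_col);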
Re: Any risk or overhead considerations for frequently executing queries against catalog tables?
DataDog -- which implements such metrics for Postgres -- has run into multiple issues doing this type of thing. You may be able to search their bugs / repo to see what they were. I just can't remember them off hand, it's been a while.

Sent from my iPhone

> On Jan 25, 2025, at 12:01 PM, Frits Hoogland wrote:
>
> I am looking at whether sampling key database catalog information per
> second would have any drawback whatsoever. I think you're saying that you
> think that isn't the case, except maybe for pg_database, and I figure that
> is because of the frozen and multixact fields per database.
>
> If the database client application is too unpredictable to know what SQL
> it will produce, then having runtime data available at that granularity,
> so that what is going on can be reasonably reconstructed, is very
> convenient and allows tremendous insight. It would also allow usage of the
> wait events to spot any weird behavior, such as short-lived peaks
> (pg_stat_statements can do that on a busy database, for example).
>
> And if there is no known drawback, and such a low interval can be
> organized: why not? I am not saying you are doing it wrong; this is about
> trying to figure out what the borders are of what would be technically
> possible without unreasonably affecting the database, a thought
> experiment. Of course the gathered data needs to be organized so that you
> don't swamp in it, and it shouldn't lead to the monitoring data swamping
> the system, either in memory or on disk, but that is a given.
>
> Why would per second be too much for locks? Is there overhead to select
> from pg_locks, or pg_blocking_pids()?
>
> Again, please realise I am happy and appreciative of the time you take; I
> am toying with the above described idea.
>
> Frits Hoogland
>
>> On 25 Jan 2025, at 19:18, Pavel Stehule wrote:
>>
>> Hi
>>
>> On Sat, Jan 25, 2025 at 18:00, Frits Hoogland wrote:
>>> Thank you Pavel, that is really useful. I can imagine other people
>>> thinking about getting fine-grained data from Postgres might wonder the
>>> same as I do about this. And really, from a computer's perspective, I
>>> would say that once a second isn't really a high frequency?
>>
>> I usually work with minute sampling and usually it is good enough
>> (statistics are cumulative, so you can lose the timestamp, but you never
>> lose data). Only when we try to investigate some special case, then I use
>> second sampling. When you investigate lock issues, then seconds are too
>> much.
>>
>> Regards
>>
>> Pavel
>>
>>> If I time the amount of time that these queries take, it's around 20 ms
>>> (local connection), so there is a relatively long time during which all
>>> of the objects, including pg_database, are not actively queried.
>>>
>>> I git grepped the source code; it seems that there is a RowExclusiveLock
>>> for pg_database manipulation in case of addition, removal and change of
>>> a database in dbcommands.c, but I do think your reasoning is based on
>>> the columns datfrozenxid and datminmxid? There is a lock for updating
>>> the frozenxid and mxid for a database (vacuum.c:LockDatabaseFrozenIds,
>>> ExclusiveLock), but it seems a select should play nice with that?
>>>
>>> Btw, it's interesting to see that both datfrozenxid and datminmxid are
>>> updated in place, with no read consistency provided.
>>>
>>> Frits Hoogland
>>>
>>>> On 25 Jan 2025, at 14:32, Pavel Stehule wrote:
>>>>
>>>> Hi
>>>>
>>>> On Sat, Jan 25, 2025 at 12:23, Frits Hoogland wrote:
>>>>> For monitoring database behaviour and trying to build a history of
>>>>> activity, if I would create an application that creates a single
>>>>> connection and executes something like:
>>>>>
>>>>> select * from pg_stat_activity;
>>>>> select * from pg_stat_database;
>>>>> select * from pg_stat_bgwriter;
>>>>> select * from pg_stat_wal;
>>>>> select * from pg_settings;
>>>>> select * from pg_database;
>>>>>
>>>>> for which the query is prepared, and execute that every 1 second,
>>>>> would there be any realistic danger or overhead that should be
>>>>> considered? My thinking is that the data for these catalogs is all in
>>>>> shared memory and, when executed serially, does not cause any
>>>>> significant resources to be taken?
>>>>
>>>> The queries to all tables excluding pg_database every 1 sec will have
>>>> probably zero impact on performance. I am not sure about pg_database --
>>>> it is a very important table, and your query can block operations that
>>>> need an exclusive lock on this table. So theoretically, there can be
>>>> some impact on performance.
>>>>
>>>> Regards
>>>>
>>>> Pavel
>>>>
>>>>> Thanks,
>>>>> Frits Hoogland
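As an aside on the pg_database concern: a plain SELECT takes only an AccessShareLock, which conflicts only with AccessExclusiveLock, so one way to see whether the sampling query is ever in anyone's way is to watch pg_locks while it runs. A sketch:

-- Who currently holds (or waits for) locks on pg_database, and what are they running?
SELECT l.pid, l.mode, l.granted, a.query
FROM pg_locks l
JOIN pg_stat_activity a ON a.pid = l.pid
WHERE l.relation = 'pg_database'::regclass;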
Re: Read-only connections optimizations
You can still block vacuum from running if you have long running (or very aggressive) read transactions. I don’t think they are very helpful or performant from a Postgres engine perspective. They can be helpful in application development because they will fail if devs attempt any mutations inside read only (from what I recall).

Sent from my iPhone

> On Jan 25, 2025, at 10:01 AM, Laurenz Albe wrote:
>
> On Sat, 2025-01-25 at 14:55 +, Edson Richter wrote:
>> - Connections are established using the jdbc "readonly" attribute.
>>
>> Does PostgreSQL perform any optimization on queries in this scenario to
>> avoid establishing locks? Or are these queries treated like any other?
>
> The only difference that I am aware of is that read-only transactions at
> the SERIALIZABLE isolation level can release predicate locks earlier,
> which can benefit performance.
>
> But I don't think that you need to worry: reading transactions only take
> an ACCESS SHARE lock on tables, which won't conflict with data
> modifications.
>
> Yours,
> Laurenz Albe
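A quick illustration of the "mutations fail" behavior mentioned above (table name is hypothetical):

BEGIN TRANSACTION READ ONLY;
SELECT count(*) FROM accounts;                            -- fine
UPDATE accounts SET balance = balance + 1 WHERE id = 1;   -- ERROR: cannot execute UPDATE in a read-only transaction
ROLLBACK;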