Re: [PERFORM] select with max functions
On 03/10/17 04:29, Tom Lane wrote:

> Mariel Cherkassky writes:
>
>     explain analyze SELECT Ma.User_Id, COUNT(*) COUNT
>     FROM Manuim Ma
>     WHERE Ma.Bb_Open_Date = (SELECT Bb_Open_Date FROM Manuim Man
>                              WHERE Man.User_Id = Ma.User_Id
>                              ORDER BY bb_open_date DESC LIMIT 1)
>     GROUP BY Ma.User_Id
>     HAVING COUNT(*) > 1;
>
> The core problem with this query is that the sub-select has to be done over again for each row of the outer table, since it's a correlated sub-select (ie, it refers to Ma.User_Id from the outer table). Replacing a max() call with handmade logic doesn't do anything to help that. I'd try refactoring it so that you calculate the max Bb_Open_Date just once for each user id, perhaps along the lines of
>
>     SELECT Ma.User_Id, COUNT(*) COUNT
>     FROM Manuim Ma,
>          (SELECT User_Id, max(Bb_Open_Date) AS max
>           FROM Manuim Man GROUP BY User_Id) ss
>     WHERE Ma.User_Id = ss.User_Id AND Ma.Bb_Open_Date = ss.max
>     GROUP BY Ma.User_Id
>     HAVING COUNT(*) > 1;
>
> This is still not going to be instantaneous, but it might be better. It's possible that an index on (User_Id, Bb_Open_Date) would help, but I'm not sure.
>
> regards, tom lane

Further ideas based on Tom's rewrite: if that MAX is still expensive it might be worth breaking

    SELECT User_Id, max(Bb_Open_Date) AS max FROM Manuim Man GROUP BY User_Id

out into a VIEW, and considering making it MATERIALIZED, or creating an equivalent trigger-based summary table (there are examples in the docs of how to do this).

Cheers
Mark

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
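[Editor's note: a minimal sketch of the materialized-view idea, using the table and column names from the thread (Manuim, User_Id, Bb_Open_Date); the view and index names are made up for illustration:]

```sql
-- Precompute the per-user maximum once, instead of per outer row:
CREATE MATERIALIZED VIEW manuim_max_open AS
    SELECT User_Id, max(Bb_Open_Date) AS max_open
    FROM Manuim
    GROUP BY User_Id;

CREATE INDEX ON manuim_max_open (User_Id, max_open);

-- The outer query then joins against the precomputed maxima:
SELECT Ma.User_Id, COUNT(*)
FROM Manuim Ma
JOIN manuim_max_open mm
  ON Ma.User_Id = mm.User_Id AND Ma.Bb_Open_Date = mm.max_open
GROUP BY Ma.User_Id
HAVING COUNT(*) > 1;

-- The view is a snapshot: it must be refreshed after Manuim changes.
REFRESH MATERIALIZED VIEW manuim_max_open;
```

The trade-off is staleness between refreshes; a trigger-based summary table keeps the maxima current at the cost of a small write overhead per insert.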
Re: [PERFORM] Odd sudden performance degradation related to temp object churn
On 19/08/17 13:49, Mark Kirkwood wrote:

> On 19/08/17 02:21, Jeremy Finzel wrote:
>
>> On Tue, Aug 15, 2017 at 12:07 PM, Scott Marlowe <scott.marl...@gmail.com> wrote: So do iostat or iotop show you if / where your disks are working hardest? Or is this CPU overhead that's killing performance?
>>
>> Sorry for the delayed reply. I took a look in more detail at the query plans from our problem query during this incident. There are actually 6 plans, because there were 6 unique queries. I traced one query through our logs, and found something really interesting: all of the first 5 queries are creating temp tables, and all of them took upwards of 500ms each to run. The final query, however, is a simple select from the last temp table, and that query took 0.035ms! This really confirms that somehow the issue had to do with *writing* to the SAN, I think. Of course this doesn't answer a whole lot, because we had no other apparent issues with write performance at all. I also provide some graphs below.
>
> Hi, graphs for latency (or await etc) might be worth looking at too - sometimes the troughs between the IO spikes are actually when the disks have been overwhelmed with queued up pending IOs...

Sorry - I see you *did* actually have iowait in there under your CPU graph... which doesn't look to be showing a lot of waiting. However it still might be well worth getting graphs showing per-device waits and utilizations.

regards
Mark
Re: [PERFORM] Very poor read performance, query independent
Nice! Pleased that the general idea worked well for you!

I'm also relieved that you did not follow my recommendation exactly - I've been trialling a Samsung 960 Evo (256GB) and Intel 600p (256GB) and I've stumbled across the serious disadvantage of (consumer) M.2 drives using TLC NAND - terrible sustained write performance! While these guys can happily do ~2GB/s reads, their write performance is only 'burst capable'. They have small SLC NAND 'write caches' that do ~1GB/s for a *limited time* (10-20s) and after that you get ~200MB/s! Ouch - my old Crucial 550 can do 350MB/s sustained writes (so two of them in RAID0 are doing 700MB/s for hours). Bigger capacity drives can do better - but overall I'm not that impressed with the current trend of using TLC NAND.

regards
Mark

On 21/07/17 00:50, Charles Nadeau wrote:

> Mark, I received yesterday a second server having 16 drive bays. Just for a quick trial, I used 2 old 60GB SSDs (a Kingston V300 and an ADATA SP900) to build a RAID0. To my surprise, my very picky RAID controller (HP P410i) recognised them without a fuss (although as SATA II drives at 3Gb/s). A quick fio benchmark gives me 22000 random 4k read IOPS, more than my 5 146GB 10k SAS disks in RAID0. I moved my most frequently used index to this array and will try to do some benchmarks. Knowing that SSDs based on the SandForce-2281 controller are recognised by my server, I may buy a pair of bigger/newer ones to put my tables on. Thanks!
>
> Charles
>
> On Sat, Jul 15, 2017 at 1:57 AM, Mark Kirkwood <mark.kirkw...@catalyst.net.nz> wrote:
>
>> Thinking about this a bit more - if somewhat more blazing performance is needed, then this could be achieved via losing the RAID card and spinning disks altogether and buying 1 of the NVME or SATA solid state products: e.g
>>
>> - Samsung 960 Pro or Evo 2 TB (approx 1 or 2 GB/s seq scan speeds and 200K IOPS)
>> - Intel S3610 or similar 1.2 TB (500 MB/s seq scan and 30K IOPS)
>>
>> The Samsung needs an M.2 port on the mobo (but most should have 'em - and if not PCIe X4 adapter cards are quite cheap). The Intel is a bit more expensive compared to the Samsung, and is slower but has a longer lifetime. However for your workload the Sammy is probably fine.
>>
>> regards
>> Mark
>>
>> On 15/07/17 11:09, Mark Kirkwood wrote:
>>
>>> Ah yes - that seems more sensible (but still slower than I would expect for 5 disks RAID 0).

--
Charles Nadeau Ph.D.
http://charlesnadeau.blogspot.com/
Re: [PERFORM] Odd sudden performance degradation related to temp object churn
On 19/08/17 02:21, Jeremy Finzel wrote:

> On Tue, Aug 15, 2017 at 12:07 PM, Scott Marlowe <scott.marl...@gmail.com> wrote: So do iostat or iotop show you if / where your disks are working hardest? Or is this CPU overhead that's killing performance?
>
> Sorry for the delayed reply. I took a look in more detail at the query plans from our problem query during this incident. There are actually 6 plans, because there were 6 unique queries. I traced one query through our logs, and found something really interesting: all of the first 5 queries are creating temp tables, and all of them took upwards of 500ms each to run. The final query, however, is a simple select from the last temp table, and that query took 0.035ms! This really confirms that somehow the issue had to do with *writing* to the SAN, I think. Of course this doesn't answer a whole lot, because we had no other apparent issues with write performance at all. I also provide some graphs below.

Hi, graphs for latency (or await etc) might be worth looking at too - sometimes the troughs between the IO spikes are actually when the disks have been overwhelmed with queued up pending IOs...

Also SANs are notorious for this sort of thing - typically they have a big RAM cache that you are actually writing to, and everything is nice and fast until your workload (along with everyone else's) fills up the cache, and then performance drops off a cliff for a while (I've seen SAN disks with iostat utilizations of 105% <-- Lol... and await numbers that scroll off the page in that scenario)!

regards
Mark
Re: [PERFORM] Very poor read performance, query independent
Right, that is a bit of a show stopper for those SSDs (the Intel needs SATA 6Gb/s and the Sammys need PCIe 3.0 to perform to their rated specs).

regards
Mark

On 16/07/17 04:12, Charles Nadeau wrote:

> Mark, The server is a . It doesn't really work with SATA drives. And when you find one that is compatible, it is only used at 3Gb/s with a maximum of 5 IOPS (a well-known characteristic of the HP P410i SAS RAID controller). I am looking at getting a Kingston Digital HyperX Predator that I could use in one of the PCIe 2.0 x4 slots. However I am worried about "thermal runaway", i.e. when the server can't get a temperature reading from a PCIe card, it spins the fans at full speed to protect the server against high temperature. The machine being next to my desk, I worry about the deafening noise it will create. Thanks! Charles
>
> On Sat, Jul 15, 2017 at 1:57 AM, Mark Kirkwood <mark.kirkw...@catalyst.net.nz> wrote:
>
>> Thinking about this a bit more - if somewhat more blazing performance is needed, then this could be achieved via losing the RAID card and spinning disks altogether and buying 1 of the NVME or SATA solid state products: e.g
>>
>> - Samsung 960 Pro or Evo 2 TB (approx 1 or 2 GB/s seq scan speeds and 200K IOPS)
>> - Intel S3610 or similar 1.2 TB (500 MB/s seq scan and 30K IOPS)
>>
>> The Samsung needs an M.2 port on the mobo (but most should have 'em - and if not PCIe X4 adapter cards are quite cheap). The Intel is a bit more expensive compared to the Samsung, and is slower but has a longer lifetime. However for your workload the Sammy is probably fine.
>>
>> regards
>> Mark
>>
>> On 15/07/17 11:09, Mark Kirkwood wrote:
>>
>>> Ah yes - that seems more sensible (but still slower than I would expect for 5 disks RAID 0).
--
Charles Nadeau Ph.D.
http://charlesnadeau.blogspot.com/
Re: [PERFORM] Very poor read performance, query independent
Thinking about this a bit more - if somewhat more blazing performance is needed, then this could be achieved via losing the RAID card and spinning disks altogether and buying 1 of the NVME or SATA solid state products: e.g

- Samsung 960 Pro or Evo 2 TB (approx 1 or 2 GB/s seq scan speeds and 200K IOPS)
- Intel S3610 or similar 1.2 TB (500 MB/s seq scan and 30K IOPS)

The Samsung needs an M.2 port on the mobo (but most should have 'em - and if not PCIe X4 adapter cards are quite cheap). The Intel is a bit more expensive compared to the Samsung, and is slower but has a longer lifetime. However for your workload the Sammy is probably fine.

regards
Mark

On 15/07/17 11:09, Mark Kirkwood wrote:

> Ah yes - that seems more sensible (but still slower than I would expect for 5 disks RAID 0).
Re: [PERFORM] Very poor read performance, query independent
Ah yes - that seems more sensible (but still slower than I would expect for 5 disks RAID 0). You should be able to get something like 5 * (single disk speed), i.e. about 500MB/s.

Might be worth increasing device read ahead (more than you have already). Some of these so-called 'smart' RAID cards need to be hit over the head before they will perform. E.g: I believe you have it set to 128 - I'd try 4096 or even 16384 (in the past I've used those settings on some extremely stupid cards that refused to max out their disks' known speeds).

Also worth investigating is RAID stripe size - for DW work it makes sense for it to be reasonably big (256K to 1M), which again will help speed in sequential scans.

Cheers
Mark

On 15/07/17 02:09, Charles Nadeau wrote:

> Mark, First I must say that I changed my disk configuration from 4 disks in RAID 10 to 5 disks in RAID 0 because I almost ran out of disk space during the last ingest of data. Here is the result of the test you asked for. It was done with a cold cache:
>
>     flows=# \timing
>     Timing is on.
>     flows=# explain select count(*) from flows;
>                                        QUERY PLAN
>     ---------------------------------------------------------------------------------------
>      Finalize Aggregate  (cost=17214914.09..17214914.09 rows=1 width=8)
>        ->  Gather  (cost=17214914.07..17214914.09 rows=1 width=8)
>              Workers Planned: 1
>              ->  Partial Aggregate  (cost=17213914.07..17213914.07 rows=1 width=8)
>                    ->  Parallel Seq Scan on flows  (cost=0.00..17019464.49 rows=388899162 width=0)
>     (5 rows)
>     Time: 171.835 ms
>
>     flows=# select pg_relation_size('flows');
>      pg_relation_size
>     ------------------
>          129865867264
>     (1 row)
>     Time: 57.157 ms
>
>     flows=# select count(*) from flows;
>     LOG:  duration: 625546.522 ms  statement: select count(*) from flows;
>        count
>     -----------
>      589831190
>     (1 row)
>     Time: 625546.662 ms
>
> The throughput reported by PostgreSQL is almost 198MB/s, and the throughput as measured by dstat during the query execution was between 25 and 299MB/s. It is much better than what I had before! The i/o wait was about 12% all through the query.
>
> One thing I noticed is the discrepancy between the read throughput reported by pg_activity and the one reported by dstat: pg_activity always reports a value lower than dstat.
>
> Besides the change of disk configuration, here is what contributed the most to the improvement of the performance so far:
>
> - Using huge pages
> - Increasing effective_io_concurrency to 256
> - Reducing random_page_cost from 22 to 4
> - Reducing min_parallel_relation_size to 512kB to have more workers when doing sequential parallel scans of my biggest table
>
> Thanks for recommending this test, I now know what the real throughput should be!
>
> Charles
>
> On Wed, Jul 12, 2017 at 4:11 AM, Mark Kirkwood <mark.kirkw...@catalyst.net.nz> wrote:
>
>> Hmm - how are you measuring that sequential scan speed of 4MB/s? I'd recommend doing a very simple test e.g, here's one on my workstation - 13 GB single table on 1 SATA drive - cold cache after reboot, sequential scan using Postgres 9.6.2:
>>
>>     bench=# EXPLAIN SELECT count(*) FROM pgbench_accounts;
>>                                    QUERY PLAN
>>     ------------------------------------------------------------------------
>>      Aggregate  (cost=2889345.00..2889345.01 rows=1 width=8)
>>        ->  Seq Scan on pgbench_accounts  (cost=0.00..2639345.00 rows=1 width=0)
>>     (2 rows)
>>
>>     bench=# SELECT pg_relation_size('pgbench_accounts');
>>      pg_relation_size
>>     ------------------
>>           13429514240
>>     (1 row)
>>
>>     bench=# SELECT count(*) FROM pgbench_accounts;
>>      count
>>     -------
>>          1
>>     (1 row)
>>     Time: 118884.277 ms
>>
>> So doing the math, seq read speed is about 110MB/s (i.e. 13 GB in 120 sec). Sure enough, while I was running the query iostat showed:
>>
>>     Device: rrqm/s wrqm/s    r/s  w/s  rMB/s  wMB/s avgrq-sz avgqu-sz await r_await w_await svctm  %util
>>     sda       0.00   0.00 926.00 0.00 114.89   0.00   254.10     1.90  2.03    2.03    0.00   1.08 100.00
>>
>> So might be useful for us to see something like that from your system - note you need to check you really have flushed the cache, and that no other apps are using the db.
>>
>> regards
>> Mark
>>
>> On 12/07/17 00:46, Charles Nadeau wrote:
>>
>>> After reducing random_page_cost to 4 and testing more, I can report that the aggregate read throughput for parallel sequential scan is about 90MB/s. However the throughput for sequential scan is still around 4MB/s.

--
Charles Nadeau Ph.D.
http://charlesnadeau.blogspot.com/
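[Editor's note: Mark's read-ahead suggestion can be applied with `blockdev`; a sketch under the assumption that the array appears as `/dev/sda` (substitute the real block device). Values are in 512-byte sectors, so 16384 sectors = 8MB of read-ahead; run as root and make the setting persistent (e.g. via a udev rule), as it does not survive a reboot:]

```shell
blockdev --getra /dev/sda        # show the current read-ahead (e.g. 128)
blockdev --setra 16384 /dev/sda  # raise it to 8MB for big sequential scans
```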
Re: [PERFORM] Very poor read performance, query independent
Hmm - how are you measuring that sequential scan speed of 4MB/s? I'd recommend doing a very simple test e.g, here's one on my workstation - 13 GB single table on 1 SATA drive - cold cache after reboot, sequential scan using Postgres 9.6.2:

    bench=# EXPLAIN SELECT count(*) FROM pgbench_accounts;
                                   QUERY PLAN
    ------------------------------------------------------------------------
     Aggregate  (cost=2889345.00..2889345.01 rows=1 width=8)
       ->  Seq Scan on pgbench_accounts  (cost=0.00..2639345.00 rows=1 width=0)
    (2 rows)

    bench=# SELECT pg_relation_size('pgbench_accounts');
     pg_relation_size
    ------------------
          13429514240
    (1 row)

    bench=# SELECT count(*) FROM pgbench_accounts;
     count
    -------
         1
    (1 row)
    Time: 118884.277 ms

So doing the math, seq read speed is about 110MB/s (i.e. 13 GB in 120 sec). Sure enough, while I was running the query iostat showed:

    Device: rrqm/s wrqm/s    r/s  w/s  rMB/s  wMB/s avgrq-sz avgqu-sz await r_await w_await svctm  %util
    sda       0.00   0.00 926.00 0.00 114.89   0.00   254.10     1.90  2.03    2.03    0.00   1.08 100.00

So might be useful for us to see something like that from your system - note you need to check you really have flushed the cache, and that no other apps are using the db.

regards
Mark

On 12/07/17 00:46, Charles Nadeau wrote:

> After reducing random_page_cost to 4 and testing more, I can report that the aggregate read throughput for parallel sequential scan is about 90MB/s. However the throughput for sequential scan is still around 4MB/s.
Re: [PERFORM] Can postgresql plan a query using multiple CPU cores?
On 23/05/17 08:21, Clemens Eisserer wrote:

> Hi,
>
> I have a latency-sensitive legacy application, where the time consumed by query planning was always causing some headaches. Currently it is running on postgresql-8.4 - will postgresql-10 support generating plans using multiple CPU cores to reduce the time required to generate a single plan?
>
> Thank you in advance and best regards, Clemens

Hi,

Might be worthwhile posting an example (query + EXPLAIN ANALYZE etc), so we can see what type of queries are resulting in long plan times.

Cheers
Mark
Re: [PERFORM] Size of Temporary tablespace is increasing very much in postgresql 9.1.
On 15/12/16 23:28, Dinesh Chandra 12108 wrote:

> Dear expert, In postgreSQL-9.1, the size of *pgsql_tmp* inside the temp tablespace increased by 544G in one day. However, the DB size is as usual but the tablespace size keeps increasing. Could you please suggest why it is happening?

That is due to queries doing sorts or (hash) joins. You can log which queries are doing this with the log_temp_files parameter. Now it might be that this is just normal/expected (e.g. complex data warehouse style workload), but it could also be many small queries that might benefit from some additional indexes (logging the queries will help you decide what, if anything, needs to be done).

regards
Mark
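[Editor's note: a sketch of the logging suggested above; the threshold value is an arbitrary example, and the sample log lines are approximate:]

```
# postgresql.conf -- log_temp_files takes a size in kB:
#   0  logs every temporary file
#   >0 logs only files at least that large at removal time
log_temp_files = 10240        # log temp files >= 10MB

# Resulting log entries look roughly like:
#   LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp12345.0", size 415061321
#   STATEMENT:  SELECT ... ORDER BY ...
```

Pairing this with log_min_duration_statement makes it easier to match the offending sorts and hash joins to their queries.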
Re: [PERFORM] Postgresql 8.4 optimize for heavy select load
On 19/09/16 19:40, Job wrote:

> Hello, I would please like to have some suggestions to optimize Postgres 8.4 for a very heavy number of select (with join) queries. The queries read data; very rarely do they write.

We probably need to see schema and query examples to help you (with EXPLAIN ANALYZE output). Also - err, 8.4 - I (and others probably) will recommend you upgrade to a more recent (and supported, for that matter) version - currently 9.5/9.6 - lots of performance improvements you are missing out on!

Best wishes
Mark
Re: [PERFORM] Possible to find disk IOs for a Query?
On 01/09/16 17:56, Mark Kirkwood wrote:

> the other way to attack this is to trace your backend postgres process (err perfmon... no idea how to do this on windows...)

No idea why I thought you were on windows (maybe was reading another message just before yours) - sorry!
Re: [PERFORM] Possible to find disk IOs for a Query?
On 01/09/16 10:01, Bobby Mozumder wrote:

> Is it possible to find the number of disk IOs performed for a query? EXPLAIN ANALYZE looks like it shows the number of sequential rows scanned, but not the number of IOs. My database is on an NVMe SSD, and I am trying to cut microseconds of disk IO per query by possibly denormalizing.

Try EXPLAIN (ANALYZE, BUFFERS) e.g:

    bench=# EXPLAIN (ANALYZE, BUFFERS) SELECT count(*) FROM pgbench_accounts WHERE bid=1;
                                       QUERY PLAN
    ------------------------------------------------------------------------
     Finalize Aggregate  (cost=217118.90..217118.91 rows=1 width=8) (actual time=259.723..259.723 rows=1 loops=1)
       Buffers: shared hit=2370 read=161727
       ->  Gather  (cost=217118.68..217118.89 rows=2 width=8) (actual time=259.686..259.720 rows=3 loops=1)
             Workers Planned: 2
             Workers Launched: 2
             Buffers: shared hit=2370 read=161727
             ->  Partial Aggregate  (cost=216118.68..216118.69 rows=1 width=8) (actual time=258.473..258.473 rows=1 loops=3)
                   Buffers: shared hit=2208 read=161727
                   ->  Parallel Seq Scan on pgbench_accounts  (cost=0.00..216018.33 rows=40139 width=0) (actual time=0.014..256.820 rows=3 loops=3)
                         Filter: (bid = 1)
                         Rows Removed by Filter: 330
                         Buffers: shared hit=2208 read=161727
     Planning time: 0.044 ms
     Execution time: 260.357 ms
    (14 rows)

...shows the number of (8k unless you've changed it) pages read from disk or cache. Now this might not be exactly what you are after - the other way to attack this is to trace your backend postgres process (err perfmon... no idea how to do this on windows...) and count read and write calls.

regards
Mark
Re: [PERFORM] Performance problems with 9.2.15
On 22/07/16 13:07, Johan Fredriksson wrote:

> And by the way, I have also tried to upgrade to Postgresql 9.4.8 (the latest version in postgresql.org's own repository) without improvement.

Not sure what repo you are using, but 9.5.3 and 9.6 Beta are the *actual* latest versions. Now I'm not sure they will actually help your particular query, but they are probably worth a try!

regards
Mark
Re: [PERFORM] Tuning guidelines for server with 256GB of RAM and SSDs?
On 08/07/16 02:09, Wes Vaske (wvaske) wrote:

> The Crucial drive does not have power loss protection. The Samsung drive does. (The Crucial M550 has capacitors to protect data that's already been written to the device but not the entire cache. For instance, if data is read from the device during a garbage collection operation, the M550 will protect that data instead of introducing corruption of old data. This is listed as "power loss protection" on the spec sheet but it's not the level of protection that people on this list would expect from a drive.)

Yes - the MX200 board (see):

http://www.anandtech.com/show/9258/crucial-mx200-250gb-500gb-1tb-ssd-review

looks to have the same sort of capacitors that the M550 uses, so not ideal for db or transaction logs!

Cheers
Mark
Re: [PERFORM] Tuning guidelines for server with 256GB of RAM and SSDs?
On 06/07/16 07:17, Mkrtchyan, Tigran wrote:

> Hi, We had a similar situation and the best performance was with 64MB background_bytes and 512MB dirty_bytes. Tigran.
>
> On Jul 5, 2016 16:51, Kaixi Luo wrote: Here are my server specs: RAID1 - 2x480GB Samsung SSD with power loss protection (will be used to store the PostgreSQL database). RAID1 - 2x240GB Crucial SSD with power loss protection (will be used to store PostgreSQL transaction logs).

Can you tell us the exact model numbers for the Samsung and Crucial SSDs? It typically matters! E.g. I have some Crucial M550s that have capacitors and (originally) claimed to be power off safe, but with testing have been shown to be not really power off safe at all. I'd be dubious about Samsungs too. The Intel Datacenter range (S3700 and similar) are known to have power off safety that does work.

regards
Mark
Re: [PERFORM] How we made Postgres upserts 2-3* quicker than MongoDB
On 08/01/16 19:07, Nicolas Paris wrote:

> Hello Mark, As far as I know, MongoDB is able to get better write performance thanks to scaling (easy-to-manage sharding). Postgresql cannot (it is not designed for that - complicated). Why compare postgresql & mongoDB performance on a standalone instance, since mongoDB is not really designed for that?

Yes, you can get better performance with mongo via the sharding route, but there are a number of quite bad downsides to mongo sharding - limited ability to perform aggregation, loss of unique key constraints other than the shard key, requires at minimum 4-6* the hardware (2 replicas for each block = 4, + 2 * mongos gateway servers)... Actually pretty similar to the issues you see when trying to scale a RDBMS via sharding... We tried doing some mongo sharding and the result was a massive drop in write performance, so we gave up pretty quickly...

Mark
[PERFORM] How we made Postgres upserts 2-3* quicker than MongoDB
Hi all, I just wrote an article about the postgres performance optimizations I've been working on recently, especially compared to our old MongoDB platform:

https://mark.zealey.org/2016/01/08/how-we-tweaked-postgres-upsert-performance-to-be-2-3-faster-than-mongodb
Re: [PERFORM] Proposal for unlogged tables
On 04/01/16 18:12, Andres Freund wrote:

> Pages containing data of unlogged tables aren't ever flushed to disk unless a) a shutdown checkpoint is performed, b) a buffer containing data from an unlogged table is used for something else, or c) the database being copied is the source of a CREATE DATABASE .. TEMPLATE. Hence, if there's an unclean shutdown, there's absolutely no guarantee about the on-disk state of unlogged tables. Even if they haven't been modified in ages - there could have been many many dirty pages in shared buffers when crashing. Always flushing dirty pages of unlogged tables at checkpoint would greatly increase the overhead for memory resident, write heavy workloads that use unlogged tables.

If there was a command to flush a specific unlogged table to disk it would work around all these issues, no? Perhaps if you marked the table as read only at the same time, it would flush it to disk and ensure no more data could be written to it (e.g. ALTER TABLE ... SET READ ONLY on an unlogged table would flush + not truncate after crash). In our case this would be great, as we want to use these as permanent tables for speed; but after an initial data dump we don't change the data again, so we could just do this at the end of the import process.

Mark
Re: [PERFORM] Proposal for unlogged tables
On 04/01/16 16:27, Yves Dorfsman wrote:

> I haven't tried, but won't converting an unlogged table into a logged table write all the inserts at once instead of once per insert? Or are you wanting to do more bulk inserts into that table later? Are you trying to avoid running a CHECKPOINT? Are you afraid the activity on the other tables will create too much I/O?

Setting a table to logged still pushes all the inserts into the WAL, which we don't need and which causes a lot of extra IO. It also takes quite a long time, as it is basically rewriting the table and all indexes (e.g. 60 seconds for 2m rows on one of my test tables). We can do this, but a) it generates lots of additional IO which isn't really required for us, and b) it acquires an exclusive lock on the table, which is also not nice for us.

If the last data modification statement was run more than e.g. 30 seconds or 1 minute before an unclean shutdown (or the data was otherwise flushed to disk and there was no IO since then), can we not assume that the data is not corrupted, and hence not truncate the unlogged tables?

I have to admit that I have been surprised by this; it feels like unlogged tables are never written properly unless you do an explicit CHECKPOINT. I don't know how the internals work, but unlogged tables definitely are flushed to disk and persist through normal server restarts. It is just that, according to the docs, if the server ever has an unclean shutdown the tables are truncated even if they have not been updated in a year. I can't understand why it has to be like this, and it seems that it would be much nicer to not automatically truncate if it doesn't have to. This would be great in the situation where you can tolerate a low chance of data loss but want very quick upserts.

Mark
[PERFORM] Proposal for unlogged tables
I've recently been doing some performance testing with unlogged tables vs logged tables on 9.5-rc1. Basically we're trying to do big loads of data into the database periodically. If on the very rare occasion the server crashes half way through the import it's no big deal, so I've been looking specifically at unlogged tables with transactions having synchronous_commit set to OFF.

When we do the inserts on a logged table with default WAL configuration settings we get a *lot* of disk IO generated (500mb/sec of pretty random IO - we have a nice ssd raid array but even so this maxes it out). Tweaking WAL settings (commit_delay, max_wal_size, min_wal_size) improves the situation quite a bit (50-100mb/sec of writes), but still we have no need to log the inserts into the WAL at the moment.

Doing the imports to unlogged tables we get virtually no IO until the insert process has finished, when the table gets flushed to disk, which is great for us. However I read in the manuals that if the server ever has an unclean shutdown all unlogged tables will be truncated. Obviously with 9.5 we can now alter tables to be logged/unlogged after insert, but this will still write all the inserts into the WAL.

I can understand the requirement to truncate tables with active IO at the point of unclean shutdown, where you may get corrupted data; but I'm interested to find out how easy it would be to not perform the truncate for historical unlogged tables. If the last data modification statement was run more than e.g. 30 seconds or 1 minute before an unclean shutdown (or the data was otherwise flushed to disk and there was no IO since then), can we not assume that the data is not corrupted and hence not truncate the unlogged tables?

Thanks
Mark
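[Editor's note: a minimal sketch of the load pattern described above; the table, columns, and file path are made up for illustration, and ALTER TABLE ... SET LOGGED is the 9.5+ conversion the post refers to:]

```sql
BEGIN;
SET LOCAL synchronous_commit = off;   -- don't wait for WAL flush on commit

-- Unlogged: inserts bypass the WAL entirely, but the table is
-- truncated after any unclean shutdown.
CREATE UNLOGGED TABLE import_batch (
    id      bigint PRIMARY KEY,
    payload text
);

COPY import_batch FROM '/path/to/batch.csv' WITH (FORMAT csv);
COMMIT;

-- Durable from here on - but, as noted above, this rewrites the table
-- and its indexes into the WAL and holds an exclusive lock meanwhile:
ALTER TABLE import_batch SET LOGGED;
```

This is exactly the trade-off the post complains about: the final SET LOGGED re-generates much of the IO the unlogged load avoided.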
Re: [PERFORM] hyperthreadin low performance
On 23/07/15 23:37, domenico febbo wrote:

> is the problem also in PostgreSQL 9.4.x? I'm going to buy a production server with 4 sockets E7-4850 12 cores, so 12*4 = 48 cores (and 96 threads using HT). What do you suggest? Using HT or not?

From my experience 9.4 is considerably better (we are using it on the 60 core box mentioned previously). 48 cores should be fine; enabling HT and asking Postgres to effectively handle 96 could provoke issues. However it is reasonably easy to test - tune shared_buffers and checkpoint segments sensibly and run pgbench for a steadily increasing number of clients. With 48 cores you should (hopefully) see a tps curve that increases and then gently flattens off somewhere. If 96 cores are "too many" then you will see a tps curve that initially increases then sharply drops.

Cheers
Mark
Re: [PERFORM] hyperthreadin low performance
On 21/07/15 20:04, David Rowley wrote: On 21 July 2015 at 14:59, Jeison Bedoya Delgado wrote: hi everyone, Recently we updated a database to a machine with RHEL7, but I see that the performance is better if the hyperthreading technology is deactivated and only 32 cores are used. Is it normal that the machine's performance is better with 32 cores than with 64? You might be interested in http://www.postgresql.org/message-id/53f4f36e.6050...@agliodbs.com However I do wonder if we have been misinterpreting these tests. We tend to assume the position of "see, hyperthreading is bad, switch it off". The linked post under the one above: http://www.postgresql.org/message-id/53ed371d@catalyst.net.nz shows that 60 core (no hyperthreading) performance is also pessimal, leading me to conclude that *perhaps* it is simply the number of cores that is the problem - particularly as benchmark results for single socket cpus clearly show hyperthreading helps performance... Regards Mark
Re: [PERFORM] Slow query (planner insisting on using 'external merge' sort type)
On 24/06/15 09:05, Jim Nasby wrote: > On 6/19/15 9:57 AM, Ian Pushee wrote: >> >> >> On 6/19/2015 10:47 AM, Andreas Kretschmer wrote: >>>> Explain Analyze outputs (links as requested): >>>> Default plan: http://explain.depesz.com/s/ib3k >>>> Forced index (random_page_cost=1.0): http://explain.depesz.com/s/lYaP >>>> >>>> Software/Hardware: PGSql 9.2.1, Windows 8.1, 8GB RAM >>>> All pgsql settings are at their defaults. >>> increase work_mem. per session via set work_mem = 'xxxMB'; or in >>> postgresql.conf, reload. >>> >>> >> >> Hi Andreas, >> >> The number of rows in the events table isn't constrained, so >> unfortunately it isn't feasible to set work_mem high enough to allow an >> in-memory sort. Forcing the planner to use the index works to produce a >> fast query, so I'm wondering if there is a more general way to getting >> the planner to take into account that work_mem isn't big enough to fit >> the query which will result in a MUCH more costly external merge. > > What Andreas is saying is the reason the sort is so expensive is because > it spilled to disk. If you don't have enough memory to do the sort > in-memory, then you probably don't have enough memory to buffer the > table either, which means the index scan is going to be a LOT more > expensive than a sort. > > That said, the better your IO system is the lower you need to set > random_page_cost. With a good raid setup 2.0 is a good starting point, > and I've run as low as 1.1. I've never run a system on all SSD, but I've > heard others recommend setting it as low as 1.0 on an all SSD setup. > > It's also worth noting that there's some consensus that the optimizer is > generally too eager to switch from an index scan to a seqscan. Mind you, this eagerness could be caused by the OP having effective_cache_size set to the default. This should be changed (set to a few GB...)! 
Cheers Mark -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
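The settings mentioned in this thread (work_mem, random_page_cost, effective_cache_size) can all be tried per-session before touching postgresql.conf - a sketch, with values that are illustrative rather than recommendations:

```sql
-- Session-only overrides: no restart, no effect on other backends.
SET work_mem = '256MB';             -- lets more sorts complete in memory
SET random_page_cost = 1.1;         -- closer to 1.0 suits SSD storage
SET effective_cache_size = '6GB';   -- roughly shared_buffers + OS cache

-- Re-run the problem query and compare the chosen plan and timings.
EXPLAIN ANALYZE
SELECT ... ;                        -- the problem query goes here

-- If the plan improves, persist the values in postgresql.conf and reload.
```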
Re: [PERFORM] How to reduce writing on disk ? (90 gb on pgsql_tmp)
On 04/06/15 12:58, Scott Marlowe wrote: On Wed, Jun 3, 2015 at 6:53 PM, Scott Marlowe wrote: On Wed, Jun 3, 2015 at 4:29 PM, Joshua D. Drake wrote: On 06/03/2015 03:16 PM, Tomas Vondra wrote: What is more important, though, is the amount of memory. OP reported the query writes ~95GB of temp files (and dies because of full disk, so there may be more). The on-disk format is usually more compact than the in-memory representation - for example on-disk sort often needs 3x less space than in-memory qsort. So we can assume the query needs >95GB of data. Can you explain how that's going to fit into the 64GB RAM? Cache is free memory. If you think of it any other way when you're looking at memory usage and pressure on things like swap you're gonna make some bad decisions. Cache is not free memory - it's there for a purpose and usually plays a significant role in performance. Sure, it may be freed and used for other purposes, but that has consequences - e.g. it impacts performance of other queries etc. You generally don't want to do that on production. Exactly. If your cache is reduced your performance is reduced because fewer things are in cache. It is not free memory. Also the command "free" is not useful in this scenario. It is almost always better to use sar so you can see where the data points are that free is using. But if that WAS happening he wouldn't still HAVE 60G of cache! That's my whole point. He's NOT running out of memory. He's not even having to dump cache right now. Further, if he started using a few gig here for this one it wouldn't have a big impact on cache (60G-1G etc) but might make it much faster, as spilling to disk is a lot less intrusive when you've got a bigger chunk of ram to work in. OTOH doing something like setting work_mem to 60G would likely be fatal. But he's not down to 3GB of memory by any kind of imagination. Any working machine will slowly, certainly fill its caches since it's not using the memory for anything else. That's normal. 
As long as you're not blowing out the cache you're fine. I agree with Scott's analysis here. It seems to me that the issue is the query(s) using too much disk space. As others have said, it may not be practical to up work_mem to the point where is all happens in memory...so probably need to: - get more disk or, - tweak postgres params to get a less disk hungry plan (need to see that explain analyze)! Cheers Mark -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
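To quantify how much the queries in question are actually spilling, the cumulative statistics views can help (a sketch; needs appropriate privileges, counters are since the last stats reset):

```sql
-- Temp file usage per database: how often and how much has spilled to disk.
SELECT datname,
       temp_files,
       pg_size_pretty(temp_bytes) AS temp_written
FROM   pg_stat_database
ORDER  BY temp_bytes DESC;

-- For an individual query, EXPLAIN ANALYZE reports a line such as
--   Sort Method: external merge  Disk: 99614720kB
-- whenever the sort exceeds work_mem and spills.
```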
Re: [PERFORM] Survey: Max TPS you've ever seen
On 13/02/15 00:20, Gudmundsson Martin (mg) wrote: Hi all! - checkpoint_segments 1000 - checkpoint_completion_target 0.9 - wal_buffers 256MB - shared_buffers 31GB - max_connections 500 I see that some of you are using wal_buffers = 256MB. I was under the impression that Postgres will not benefit from a higher value than the segment size, i.e. 16MB. More than that will not do/help anything. What's the reasoning behind setting it higher than 16MB? Do I have old information? Best regards, Martin There was some discussion a while ago in which 32MB and 8MB both demonstrated better performance than 16MB (probably related to the fact that the default wal file size is 16MB). We just experimented further with bigger values, and saw some improvement. Cheers Mark
Re: [PERFORM] Survey: Max TPS you've ever seen
On 10/02/15 10:29, Gavin Flower wrote: On 10/02/15 08:30, Luis Antonio Dias de Sá Junior wrote: Hi, A survey: with pgbench using TPC-B, what is the maximum TPS you've ever seen? For me: 12000 TPS. -- Luis Antonio Dias de Sá Junior Important to specify:
1. O/S
2. version of PostgreSQL
3. PostgreSQL configuration
4. hardware configuration
5. anything else that might affect performance
I suspect that Linux will outperform Microsoft on the same hardware, with optimum configuration for both O/S's... Yes, exactly - and also the pgbench parameters:
- scale
- number of clients
- number of threads
- statement options (prepared or simple etc)
- length of test
We've managed to get 40,000 to 60,000 TPS on some pretty serious hardware:
- 60 cores, 1TB RAM
- 16 SSD + 4 PCIe SSD storage
- Ubuntu 14.04
- Postgres 9.4 (beta and rc)
...with Postgres parameters customized:
- checkpoint_segments 1920
- checkpoint_completion_target 0.8
- wal_buffers 256MB
- wal_sync_method open_datasync
- shared_buffers 10GB
- max_connections 600
- effective_io_concurrency 10
...and finally pgbench parameters:
- scale 2000
- clients 32, 64, 128, 256 (best results at 32 and 64 generally)
- threads = 1/2 client number
- prepared option
- 10 minute test run time
Points to note: we did *not* disable fsync or prevent buffers being actually written (common dirty tricks in benchmarks). However, as others have remarked - raw numbers mean little. Pgbench is very useful for testing how tuning configurations are helping (or not) for a particular hardware and software setup, but is less useful for answering the question "how many TPS can postgres do"... Regards Mark
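The pgbench runs described above correspond roughly to invocations like the following (a sketch; the database name "bench" is an assumption):

```shell
#!/bin/sh
# Initialize at scale 2000, then run 10-minute prepared-statement tests
# at increasing client counts, with threads = half the clients.
pgbench -i -s 2000 bench
for c in 32 64 128 256; do
    pgbench -c "$c" -j "$((c / 2))" -M prepared -T 600 bench
done
```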
Re: [PERFORM] Strange choice of general index over partial index
On 16/01/15 16:28, Josh Berkus wrote: On 01/16/2015 04:17 PM, Mark Kirkwood wrote: On 16/01/15 16:06, Mark Kirkwood wrote: A bit more poking about shows that the major factor (with this fake dataset anyway) is the default for effective_cache_size (changed from 128MB to 4GB in 9.4). Increasing this makes 9.2 start using the files_in_flight index in a plain index scan too. Arrg - misread the planner output: in 9.2 what changes is a plan that uses an index scan on the *file_state* index (not files_in_flight)...which appears much faster than the bitmap scan on file_state. Apologies for the confusion. I'm thinking that I'm seeing the effect Tom has just mentioned. It's not using a bitmapscan in either case; it's a straight indexscan. Right, I suspect that bloating is possibly the significant factor then - can you REINDEX? Cheers Mark
Re: [PERFORM] Strange choice of general index over partial index
On 16/01/15 16:06, Mark Kirkwood wrote: A bit more poking about shows that the major factor (with this fake dataset anyway) is the default for effective_cache_size (changed from 128MB to 4GB in 9.4). Increasing this makes 9.2 start using the files_in_flight index in a plain index scan too. Arrg - misread the planner output: in 9.2 what changes is a plan that uses an index scan on the *file_state* index (not files_in_flight)...which appears much faster than the bitmap scan on file_state. Apologies for the confusion. I'm thinking that I'm seeing the effect Tom has just mentioned. regards Mark
Re: [PERFORM] Strange choice of general index over partial index
On 16/01/15 15:32, Mark Kirkwood wrote: On 16/01/15 13:37, Mark Kirkwood wrote: On 16/01/15 11:30, Josh Berkus wrote: This is an obfuscation and mock up, but: table files ( id serial pk, filename text not null, state varchar(20) not null ... 18 more columns ) index file_state on (state) (35GB in size) index file_in_flight_state (state) where state in ( 'waiting','assigning', 'processing' ) (600MB in size) ... 10 more indexes More important facts: * state = 'done' 95% of the time. thereform the partial index represents only 5% of the table * all indexes and the table are very bloated * server has 128GB RAM * Version 9.2. Given this setup, I would expect the planner to *always* choose file_in_flight_state over file_state for this query: SELECT id, filename FROM files WHERE state = 'waiting'; ... and yet it keeps selecting file_state based on extremely small changes to the stats. This is important because the same query, using file_state, is 20X to 50X slower, because that index frequently gets pushed out of memory. What am I missing? Or is this potentially a planner bug for costing? Are you seeing a bitmapscan access plan? If so see if disabling it gets you a plan on the files_in_flight index. I'm seeing this scenario with a fake/generated dataset a bit like yours in 9.2 (9.5 uses the files_in_flight w/o any coercing). 
FWIW: For me 9.2 and 9.3 (default config) generate plans like:

state=# EXPLAIN ANALYZE SELECT id, filename FROM files WHERE state = 'processing';
                                  QUERY PLAN
------------------------------------------------------------------------------
 Bitmap Heap Scan on files  (cost=3102.02..89228.68 rows=164333 width=15) (actual time=26.629..803.507 rows=166696 loops=1)
   Recheck Cond: ((state)::text = 'processing'::text)
   Rows Removed by Index Recheck: 7714304
   ->  Bitmap Index Scan on file_state  (cost=0.00..3060.93 rows=164333 width=0) (actual time=25.682..25.682 rows=166696 loops=1)
         Index Cond: ((state)::text = 'processing'::text)
 Total runtime: 808.662 ms
(6 rows)

whereas 9.4 and 9.5 get:

                                  QUERY PLAN
------------------------------------------------------------------------------
 Index Scan using file_in_flight on files  (cost=0.42..62857.39 rows=158330 width=15) (actual time=0.055..202.732 rows=166696 loops=1)
   Index Cond: ((state)::text = 'processing'::text)
 Planning time: 24.203 ms
 Execution time: 208.926 ms
(4 rows)

This is with each version loading exactly the same dataset (generated by the attached script). Obviously this is a vast simplification of what Josh is looking at - but it is (hopefully) interesting that these later versions are doing so much better... A bit more poking about shows that the major factor (with this fake dataset anyway) is the default for effective_cache_size (changed from 128MB to 4GB in 9.4). Increasing this makes 9.2 start using the files_in_flight index in a plain index scan too. Josh - might be worth experimenting with this parameter. regards Mark
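The parameter can be checked and changed per session, with no restart - for instance (the 4GB figure mirrors the 9.4+ default mentioned above, and the query is from the mocked-up schema in this thread):

```sql
SHOW effective_cache_size;          -- 128MB is the pre-9.4 default

SET effective_cache_size = '4GB';   -- the 9.4+ default
EXPLAIN SELECT id, filename FROM files WHERE state = 'waiting';
-- Compare the chosen index/plan against what you get under the old setting.
```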
Re: [PERFORM] Strange choice of general index over partial index
On 16/01/15 13:37, Mark Kirkwood wrote: On 16/01/15 11:30, Josh Berkus wrote: This is an obfuscation and mock up, but: table files ( id serial pk, filename text not null, state varchar(20) not null ... 18 more columns ) index file_state on (state) (35GB in size) index file_in_flight_state (state) where state in ( 'waiting','assigning', 'processing' ) (600MB in size) ... 10 more indexes More important facts: * state = 'done' 95% of the time. thereform the partial index represents only 5% of the table * all indexes and the table are very bloated * server has 128GB RAM * Version 9.2. Given this setup, I would expect the planner to *always* choose file_in_flight_state over file_state for this query: SELECT id, filename FROM files WHERE state = 'waiting'; ... and yet it keeps selecting file_state based on extremely small changes to the stats. This is important because the same query, using file_state, is 20X to 50X slower, because that index frequently gets pushed out of memory. What am I missing? Or is this potentially a planner bug for costing? Are you seeing a bitmapscan access plan? If so see if disabling it gets you a plan on the files_in_flight index. I'm seeing this scenario with a fake/generated dataset a bit like yours in 9.2 (9.5 uses the files_in_flight w/o any coercing). 
FWIW: For me 9.2 and 9.3 (default config) generate plans like:

state=# EXPLAIN ANALYZE SELECT id, filename FROM files WHERE state = 'processing';
                                  QUERY PLAN
------------------------------------------------------------------------------
 Bitmap Heap Scan on files  (cost=3102.02..89228.68 rows=164333 width=15) (actual time=26.629..803.507 rows=166696 loops=1)
   Recheck Cond: ((state)::text = 'processing'::text)
   Rows Removed by Index Recheck: 7714304
   ->  Bitmap Index Scan on file_state  (cost=0.00..3060.93 rows=164333 width=0) (actual time=25.682..25.682 rows=166696 loops=1)
         Index Cond: ((state)::text = 'processing'::text)
 Total runtime: 808.662 ms
(6 rows)

whereas 9.4 and 9.5 get:

                                  QUERY PLAN
------------------------------------------------------------------------------
 Index Scan using file_in_flight on files  (cost=0.42..62857.39 rows=158330 width=15) (actual time=0.055..202.732 rows=166696 loops=1)
   Index Cond: ((state)::text = 'processing'::text)
 Planning time: 24.203 ms
 Execution time: 208.926 ms
(4 rows)

This is with each version loading exactly the same dataset (generated by the attached script). Obviously this is a vast simplification of what Josh is looking at - but it is (hopefully) interesting that these later versions are doing so much better... Cheers Mark

[Attachment: gendata.pl - Perl script]
Re: [PERFORM] Strange choice of general index over partial index
On 16/01/15 11:30, Josh Berkus wrote: This is an obfuscation and mock up, but: table files ( id serial pk, filename text not null, state varchar(20) not null ... 18 more columns ) index file_state on (state) (35GB in size) index file_in_flight_state (state) where state in ( 'waiting','assigning', 'processing' ) (600MB in size) ... 10 more indexes More important facts: * state = 'done' 95% of the time. thereform the partial index represents only 5% of the table * all indexes and the table are very bloated * server has 128GB RAM * Version 9.2. Given this setup, I would expect the planner to *always* choose file_in_flight_state over file_state for this query: SELECT id, filename FROM files WHERE state = 'waiting'; ... and yet it keeps selecting file_state based on extremely small changes to the stats. This is important because the same query, using file_state, is 20X to 50X slower, because that index frequently gets pushed out of memory. What am I missing? Or is this potentially a planner bug for costing? Are you seeing a bitmapscan access plan? If so see if disabling it gets you a plan on the files_in_flight index. I'm seeing this scenario with a fake/generated dataset a bit like yours in 9.2 (9.5 uses the files_in_flight w/o any coercing). regards Mark -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Tuning the configuration
On 12/12/14 11:36, Eric Pierce wrote: From: pgsql-performance-ow...@postgresql.org on behalf of Evgeniy Shishkin Sent: Thursday, December 11, 2014 7:11 AM To: Andrea Suisani Cc: mfatticci...@mbigroup.it; pgsql-performance@postgresql.org Subject: Re: [PERFORM] Tuning the configuration On 11 Dec 2014, at 15:02, Andrea Suisani wrote: On 12/10/2014 11:44 AM, Maila Fatticcioni wrote: 2- I would like to use the two SSDs to store the wal files. Do you think that is useful, or how should I use them? I would definitely give it a try. I don't understand the logic behind using drives, which are best for random IO, for sequential IO workloads. Money-wise, better to use 10k SAS with a BBU RAID for wal. Very much agree with this. Just because SSD is fast doesn't make it suited for certain things, and a streaming sequential 100% write workload is one of them. I've worked with everything from local disk to high-end SAN, and even at the high end we've always put any DB logs on spinning disk. RAID1 is generally sufficient. SSD is king for read-heavy random IO workloads. Mind you, wal is a little different - the limiting factor is (usually) not raw sequential speed but fsync latency. These days a modern SSD has fsync response pretty much equal to that of a card with BBU + spinners - and has "more" high speed storage available (cards usually have only 1G or so of RAM on them). regards Mark
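The fsync-latency point is easy to measure directly with the pg_test_fsync utility that ships with PostgreSQL - run it once against each candidate WAL device and compare (a sketch; the mount points are assumptions):

```shell
# Compare commit latency of the SSD against the BBU-backed spinning array.
pg_test_fsync -f /mnt/ssd/fsync_test.out -s 5
pg_test_fsync -f /mnt/raid_bbu/fsync_test.out -s 5
# Lower usecs/op for your chosen wal_sync_method means lower commit latency.
```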
Re: [PERFORM] 8xIntel S3500 SSD in RAID10 on Dell H710p
On 10/12/14 21:30, Strahinja Kustudić wrote: On Wed, Dec 10, 2014 at 4:55 AM, Mark Kirkwood < mark.kirkw...@catalyst.net.nz> wrote: That is interesting: I've done some testing on this type of card with 16 (slightly faster Hitachi) SSD attached. Setting WT and NORA should enable the so-called 'fastpath' mode for the card [1]. We saw performance improve markedly (300MB/s random write go to 1300MB/s). This *might* be related to the fact that 16 SSD can put out more IOPS than the card can actually handle - whereas your 8 S3500 is probably the perfect number (e.g 8*11000 = 88000 which the card can handle ok). [1] If you make the change while there are no outstanding background operations (array rebuild etc) in progress (see http://www.flagshiptech.com/eBay/Dell/poweredgeh310h710h810UsersGuide.pdf ). I read that guide too, which is the reason why I tried with WT/NORA, but the document also states: "NOTE: RAID 10, RAID 50, and RAID 60 virtual disks cannot use FastPath." Which is a little odd, since usually if you want performance with reliability, you go RAID10. Do you have any suggestions what I could try to tweak to get more performance? We are using these configured as *individual* drives on RAID0 that are then md raided in a (software) RAID 10 array. Maybe try that out (as fastpath only cares about the HW RAID setup). Interestingly we were also seeing better performance on a fully HW RAID 10 array with WT/NORA...so (I guess) our Hitachi SSD probably have lower latency than the S3500 does. Cheers Mark -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] 8xIntel S3500 SSD in RAID10 on Dell H710p
On 10/12/14 12:28, Strahinja Kustudić wrote: * These tests are with the H710p controller set to write-back (WB) and with adaptive read ahead (ADRA). I ran a few tests with write-through (WT) and no read ahead (NORA), but the results were worse. That is interesting: I've done some testing on this type of card with 16 (slightly faster Hitachi) SSD attached. Setting WT and NORA should enable the so-called 'fastpath' mode for the card [1]. We saw performance improve markedly (300MB/s random write go to 1300MB/s). This *might* be related to the fact that 16 SSD can put out more IOPS than the card can actually handle - whereas your 8 S3500 is probably the perfect number (e.g 8*11000 = 88000 which the card can handle ok). [1] If you make the change while there are no outstanding background operations (array rebuild etc) in progress (see http://www.flagshiptech.com/eBay/Dell/poweredgeh310h710h810UsersGuide.pdf). Cheers Mark -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] pgtune + configurations with 9.3
On 15/11/14 15:08, Jim Nasby wrote: On 11/14/14, 5:00 PM, Mark Kirkwood wrote: as the 'rule of thumb' for setting shared_buffers. However I was recently benchmarking a machine with a lot of ram (1TB) and entirely SSD storage [1], and that seemed quite happy with 50GB of shared buffers (better performance than with 8GB). Now shared_buffers was not the variable we were concentrating on so I didn't get too carried away and try much bigger than about 100GB - but this seems like a good thing to come out with some numbers for i.e pgbench read write and read only tps vs shared_buffers 1 -> 100 GB in size. What PG version? One of the huge issues with large shared_buffers is the immense overhead you end up with for running the clock sweep, and on most systems that overhead is born by every backend individually. You will only see that overhead if your database is larger than shared bufers, because you only pay it when you need to evict a buffer. I suspect you'd actually need a database at least 2x > shared_buffers for it to really start showing up. That was 9.4 beta1 and2. A variety of db sizes were tried, some just fitting inside shared_buffers and some a bit over 2x larger, and one variant where we sized the db to 600GB, and used 4,8 and 50GB shared_buffers (50 was the best by a small margin...and certainly no worse). Now we were mainly looking at 60 core performance issues (see thread "60 core performance with 9.3"), and possibly some detrimental effects of larger shared_buffers may have been masked by this - but performance was certainly not hurt with larger shared_buffers. regards Mark -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] pgtune + configurations with 9.3
On 15/11/14 06:06, Shaun Thomas wrote: Alexey, The issue is not that 8GB is the maximum. You *can* set it higher. What I'm saying, and I'm not alone in this, is that setting it higher can actually decrease performance for various reasons. Setting it to 25% of memory on a system with 512GB of RAM for instance, would be tantamount to disaster. A checkpoint with a setting that high could overwhelm pretty much any disk controller and end up completely ruining DB performance. And that's just *one* of the drawbacks. It is probably time to revisit this 8GB limit with some benchmarking. We don't really have a hard and fast rule that is known to be correct, and that makes Alexey's job really difficult. Informally folk (including myself at times) have suggested: min(ram/4, 8GB) as the 'rule of thumb' for setting shared_buffers. However I was recently benchmarking a machine with a lot of ram (1TB) and entirely SSD storage [1], and that seemed quite happy with 50GB of shared buffers (better performance than with 8GB). Now shared_buffers was not the variable we were concentrating on so I didn't get too carried away and try much bigger than about 100GB - but this seems like a good thing to come out with some numbers for i.e pgbench read write and read only tps vs shared_buffers 1 -> 100 GB in size. Cheers Mark [1] I may be in a position to benchmark the machines these replaced at some not to distant time. These are the previous generation (0.5TB ram, 32 cores and all SSD storage) but probably still good for this test. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] postgres 9.3 vs. 9.4
On 25/09/14 01:03, Mkrtchyan, Tigran wrote: With pg_test_timing I can see that the overhead is 48 nsec on my server and 32 nsec on the laptop. What makes this difference, and does it have any influence on the overall performance? Hmm - 22 nsec for my workstation, so while it could be a factor, your laptop and my workstation performed the pgbench about the same, so I'd look elsewhere - in particular sync IO performance:

$ cd
$ pg_test_fsync
5 seconds per test
O_DIRECT supported on this platform for open_datasync and open_sync.

Compare file sync methods using one 8kB write:
(in wal_sync_method preference order, except fdatasync is Linux's default)
        open_datasync             140.231 ops/sec    7131 usecs/op
        fdatasync                 138.159 ops/sec    7238 usecs/op
        fsync                     137.680 ops/sec    7263 usecs/op
        fsync_writethrough                    n/a
        open_sync                 137.202 ops/sec    7289 usecs/op

Compare file sync methods using two 8kB writes:
(in wal_sync_method preference order, except fdatasync is Linux's default)
        open_datasync              68.832 ops/sec   14528 usecs/op
        fdatasync                 135.994 ops/sec    7353 usecs/op
        fsync                     137.454 ops/sec    7275 usecs/op
        fsync_writethrough                    n/a
        open_sync                  69.092 ops/sec   14473 usecs/op

Compare open_sync with different write sizes:
(This is designed to compare the cost of writing 16kB in different write
open_sync sizes.)
        1 * 16kB open_sync write  136.904 ops/sec    7304 usecs/op
        2 *  8kB open_sync writes  68.857 ops/sec   14523 usecs/op
        4 *  4kB open_sync writes  34.744 ops/sec   28782 usecs/op
        8 *  2kB open_sync writes  write failed: Invalid argument
Re: [PERFORM] postgres 9.3 vs. 9.4
On 24/09/14 21:23, Mkrtchyan, Tigran wrote: Hi Merlin et al. After building postgres 9.4 myself from sources I get the same performance as with 9.3. The difference was in the value of the debug_assertions setting. Now the next step: why does my three-year-old laptop get 1.8x more tps than my one-month-old server? And Mark Kirkwood's desktop gets 2x more tps as well? Is there some special optimization for i7 which does not work with an Intel(R) Xeon(R) CPU E5-2660? Yes - firstly, nicely done re finding the assertions (my 9.4 beta2 was built from src - never thought to mention it, sorry)! I'd guess that you are seeing some BIOS setting re the p320 SSD - it *should* be seriously fast...but does not seem to be. You could try running some pure IO benchmarks to confirm this (e.g. fio). Also see if the manual for however it is attached to the system allows for some optimized-for-SSD settings that tend to work better (although these usually imply the drive is plugged into an adapter card of some kind - mind you, your p320 *does* use a custom connector that does 2.5" SATA to PCIe style interconnect, so I'd look to debug that first). Cheers Mark
Re: [PERFORM] postgres 9.3 vs. 9.4
On 19/09/14 19:24, Mkrtchyan, Tigran wrote: - Original Message - From: "Mark Kirkwood" To: "Tigran Mkrtchyan" Cc: "Merlin Moncure" , "postgres performance list" Sent: Friday, September 19, 2014 8:26:27 AM Subject: Re: [PERFORM] postgres 9.3 vs. 9.4 On 19/09/14 17:53, Mkrtchyan, Tigran wrote: - Original Message - From: "Mark Kirkwood" Further to the confusion, here's my 9.3 vs 9.4 on two M550 (one for 9.3, one for 9.4), see below for results. I'm running xfs on them with trim/discard enabled: $ mount|grep pg /dev/sdd4 on /mnt/pg94 type xfs (rw,discard) /dev/sdc4 on /mnt/pg93 type xfs (rw,discard) I'm *not* seeing any significant difference between 9.3 and 9.4, and the numbers are both about 2x your best number, which is food for thought (those P320's should toast my M550 for write performance...). cool! any details on OS and other options? I still get the same numbers as before. Sorry, Ubuntu 14.04 on a single socket i7 3.4GHz, 16G (i.e. my workstation). I saw the suggestion that Didier made to run 9.3 on the SSD that you were using for 9.4, and see if it suddenly goes slow - then we'd know it's something about the disk (or filesystem/mount options). Can you test this? swapping the disks did not change the results. Do you mean that 9.3 was still faster using the disk that 9.4 had used? If so, that strongly suggests that there is something you have configured differently in the 9.4 installation [1]. Not wanting to sound mean - but it is really easy to accidentally connect to the wrong instance when there are two on the same box (ahem, yes, done it myself). So perhaps take another look at the 9.4 vs 9.3 setup (or even post the config files postgresql.conf + postgresql.auto.conf for 9.4 here). Regards Mark [1] In light of my previous test with (essentially) your config, plus the fact that numerous other folk have been benchmarking 9.4. 
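On the connect-to-the-wrong-instance point above, a quick sanity check from psql costs nothing (a sketch):

```sql
-- Confirm which server this session is actually talking to.
SELECT version();       -- should say 9.4beta2 (or 9.3.x) as expected
SHOW port;              -- must match the port of the intended cluster
SHOW data_directory;    -- must match the intended cluster's PGDATA
```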
Re: [PERFORM] postgres 9.3 vs. 9.4
On 19/09/14 19:24, Mkrtchyan, Tigran wrote: - Original Message - From: "Mark Kirkwood" To: "Tigran Mkrtchyan" Cc: "Merlin Moncure" , "postgres performance list" Sent: Friday, September 19, 2014 8:26:27 AM Subject: Re: [PERFORM] postgres 9.3 vs. 9.4 On 19/09/14 17:53, Mkrtchyan, Tigran wrote: - Original Message - From: "Mark Kirkwood" Further to the confusion, here's my 9.3 vs 9.4 on two M550 (one for 9.3 one for 9.4), see below for results. I'm running xfs on them with trim/discard enabled: $ mount|grep pg /dev/sdd4 on /mnt/pg94 type xfs (rw,discard) /dev/sdc4 on /mnt/pg93 type xfs (rw,discard) I'm *not* seeing any significant difference between 9.3 and 9.4, and the numbers are both about 2x your best number, which is food for thought (those P320's should toast my M550 for write performance...). cool! any details on OS and other options? I still get the same numbers as before. Sorry, Ubuntu 14.04 on a single socket i7 3.4 Ghz, 16G (i.e my workstation). I saw the suggestion that Didier made to run 9.3 on the SSD that you were using for 9.4, and see if it suddenly goes slow - then we'd know it's something about the disk (or filesystem/mount options). Can you test this? swapping the disks did not change the results. Nevertheless, I run the same test on my fedora20 laptop 8GB RAM, i7 2.2GHz and got 2600tps! I am totally confused now! Is it kernel version? libc? Well, that's progress anyway! I guess you could try fedora 20 on the Dell server and see if that makes any difference. But yes, confusing. Having been dealing with a high end Dell server myself recently (R920), some re-reading of any manuals you can find might be useful, we were continually surprised how easy it was to have everything configured *slow*... and the detail in the manuals...could be better! Cheers Mark -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] postgres 9.3 vs. 9.4
On 19/09/14 17:53, Mkrtchyan, Tigran wrote: - Original Message - From: "Mark Kirkwood" Further to the confusion, here's my 9.3 vs 9.4 on two M550 (one for 9.3 one for 9.4), see below for results. I'm running xfs on them with trim/discard enabled: $ mount|grep pg /dev/sdd4 on /mnt/pg94 type xfs (rw,discard) /dev/sdc4 on /mnt/pg93 type xfs (rw,discard) I'm *not* seeing any significant difference between 9.3 and 9.4, and the numbers are both about 2x your best number, which is food for thought (those P320's should toast my M550 for write performance...). cool! any details on OS and other options? I still get the same numbers as before. Sorry, Ubuntu 14.04 on a single socket i7 3.4 Ghz, 16G (i.e my workstation). I saw the suggestion that Didier made to run 9.3 on the SSD that you were using for 9.4, and see if it suddenly goes slow - then we'd know it's something about the disk (or filesystem/mount options). Can you test this? Cheers Mark -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] postgres 9.3 vs. 9.4
On 19/09/14 10:16, Mark Kirkwood wrote:

On 19/09/14 09:10, Mkrtchyan, Tigran wrote:

----- Original Message -----
From: "Mark Kirkwood"
To: "Merlin Moncure", "Tigran Mkrtchyan"
Cc: "postgres performance list"
Sent: Thursday, September 18, 2014 10:56:36 PM
Subject: Re: [PERFORM] postgres 9.3 vs. 9.4

On 19/09/14 08:32, Merlin Moncure wrote:

On Thu, Sep 18, 2014 at 4:58 AM, Mkrtchyan, Tigran wrote:

9.3.5:    0.035940  END;
9.4beta2: 0.957854  END;

Time being spent on 'END' is definitely suggesting i/o related issues. This is making me very skeptical that postgres is the source of the problem. I also think synchronous_commit is not set properly on the new instance (or possibly there is a bug or some such). Can you verify via:

select * from pg_settings where name = 'synchronous_commit';

on both servers?

Yes, does look suspicious. It *could* be that the 9.4 case is getting unlucky and checkpointing just before the end of the 60s run, and 9.3 isn't.

A 10 minute run had the same results. Is there some kind of statistics that can tell where the time is spent? Or is the only way to run on Solaris with dtrace? For me it's more important to find out why I get only 1500tps with 9.3. The test with 9.4 was just a hope for a magic code change that would give me better performance.

Interesting. With respect to dtrace, you can use systemtap on Linux to achieve similar things. However, before getting too carried away with that - we already *know* that 9.4 is spending longer in END (i.e. commit) than 9.3 is. I'd recommend you see what wal_sync_method is set to on both systems. If it is the same, then my suspicion is that one of the SSDs needs to be trimmed [1]. You can do this by running:

$ fstrim /mountpoint

Also - are you using the same filesystem and mount options on each SSD?

Cheers

Mark

[1] In fact, for the paranoid - I usually secure erase any SSD before performance testing, and then check the SMART counters too...
Further to the confusion, here's my 9.3 vs 9.4 on two M550 (one for 9.3, one for 9.4), see below for results. I'm running xfs on them with trim/discard enabled:

$ mount|grep pg
/dev/sdd4 on /mnt/pg94 type xfs (rw,discard)
/dev/sdc4 on /mnt/pg93 type xfs (rw,discard)

I'm *not* seeing any significant difference between 9.3 and 9.4, and the numbers are both about 2x your best number, which is food for thought (those P320's should toast my M550 for write performance...).

9.3:

$ pgbench -r -j 1 -c 1 -T 60 bench
starting vacuum...end.
transaction type: TPC-B (sort of)
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
duration: 60 s
number of transactions actually processed: 194615
tps = 3243.567115 (including connections establishing)
tps = 3243.771688 (excluding connections establishing)
statement latencies in milliseconds:
    0.000798  \set nbranches 1 * :scale
    0.000302  \set ntellers 10 * :scale
    0.000276  \set naccounts 10 * :scale
    0.000330  \setrandom aid 1 :naccounts
    0.000265  \setrandom bid 1 :nbranches
    0.000278  \setrandom tid 1 :ntellers
    0.000298  \setrandom delta -5000 5000
    0.012818  BEGIN;
    0.065403  UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;
    0.048516  SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
    0.058343  UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;
    0.057763  UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid;
    0.043293  INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);
    0.017087  END;

9.4:

$ pgbench -r -j 1 -c 1 -T 60 bench
starting vacuum...end.
transaction type: TPC-B (sort of)
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
duration: 60 s
number of transactions actually processed: 194130
latency average: 0.309 ms
tps = 3235.488190 (including connections establishing)
tps = 3235.560235 (excluding connections establishing)
statement latencies in milliseconds:
    0.000460  \set nbranches 1 * :scale
    0.000231  \set ntellers 10 * :scale
    0.000224  \set naccounts 10 * :scale
    0.000258  \setrandom aid 1 :naccounts
    0.000252  \setrandom bid 1 :nbranches
    0.000266  \setrandom tid 1 :ntellers
    0.000272  \setrandom delta -5000 5000
    0.011724  BEGIN;
    0.083750  UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;
    0.045553  SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
    0.054412  UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;
    0.053371  UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid =
Re: [PERFORM] postgres 9.3 vs. 9.4
On 19/09/14 09:10, Mkrtchyan, Tigran wrote:

----- Original Message -----
From: "Mark Kirkwood"
To: "Merlin Moncure", "Tigran Mkrtchyan"
Cc: "postgres performance list"
Sent: Thursday, September 18, 2014 10:56:36 PM
Subject: Re: [PERFORM] postgres 9.3 vs. 9.4

On 19/09/14 08:32, Merlin Moncure wrote:

On Thu, Sep 18, 2014 at 4:58 AM, Mkrtchyan, Tigran wrote:

9.3.5:    0.035940  END;
9.4beta2: 0.957854  END;

Time being spent on 'END' is definitely suggesting i/o related issues. This is making me very skeptical that postgres is the source of the problem. I also think synchronous_commit is not set properly on the new instance (or possibly there is a bug or some such). Can you verify via:

select * from pg_settings where name = 'synchronous_commit';

on both servers?

Yes, does look suspicious. It *could* be that the 9.4 case is getting unlucky and checkpointing just before the end of the 60s run, and 9.3 isn't.

A 10 minute run had the same results. Is there some kind of statistics that can tell where the time is spent? Or is the only way to run on Solaris with dtrace? For me it's more important to find out why I get only 1500tps with 9.3. The test with 9.4 was just a hope for a magic code change that would give me better performance.

Interesting. With respect to dtrace, you can use systemtap on Linux to achieve similar things. However, before getting too carried away with that - we already *know* that 9.4 is spending longer in END (i.e. commit) than 9.3 is. I'd recommend you see what wal_sync_method is set to on both systems. If it is the same, then my suspicion is that one of the SSDs needs to be trimmed [1]. You can do this by running:

$ fstrim /mountpoint

Also - are you using the same filesystem and mount options on each SSD?

Cheers

Mark

[1] In fact, for the paranoid - I usually secure erase any SSD before performance testing, and then check the SMART counters too...
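The two checks suggested here (comparing wal_sync_method across the clusters, then trimming the SSD filesystems) can be sketched as a small script. The ports and mountpoints are assumptions for illustration, and each step degrades to a report when run without a live cluster or root:

```shell
#!/bin/sh
# Sketch only: port numbers and mountpoints below are assumptions.
check_cluster() {
  printf 'port %s wal_sync_method: ' "$1"
  psql -p "$1" -Atc "SHOW wal_sync_method;" 2>/dev/null || echo "(unreachable)"
}
trim_fs() {
  printf 'trim %s: ' "$1"
  fstrim -v "$1" 2>/dev/null || echo "(skipped: needs root and a real mountpoint)"
}
check_cluster 5432   # assumed port for the 9.3 cluster
check_cluster 5433   # assumed port for the 9.4 cluster
trim_fs /mnt/pg93
trim_fs /mnt/pg94
```

If the sync methods differ, that alone can explain a large commit-latency gap before TRIM even enters the picture.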
Re: [PERFORM] postgres 9.3 vs. 9.4
On 19/09/14 08:32, Merlin Moncure wrote:

On Thu, Sep 18, 2014 at 4:58 AM, Mkrtchyan, Tigran wrote:

9.3.5:    0.035940  END;
9.4beta2: 0.957854  END;

Time being spent on 'END' is definitely suggesting i/o related issues. This is making me very skeptical that postgres is the source of the problem. I also think synchronous_commit is not set properly on the new instance (or possibly there is a bug or some such). Can you verify via:

select * from pg_settings where name = 'synchronous_commit';

on both servers?

Yes, does look suspicious. It *could* be that the 9.4 case is getting unlucky and checkpointing just before the end of the 60s run, and 9.3 isn't.

What is iowait? For pci-e SSD, these drives don't seem very fast...

These look like rebranded Micron P320's and should be extremely fast... However I note that my Crucial/Micron M550's are very fast for most writes *but* are much slower for sync writes (and fsync) that happen at commit...

Cheers

Mark
Re: [PERFORM] postgres 9.3 vs. 9.4
On 18/09/14 21:58, Mkrtchyan, Tigran wrote:

Hi Folk,

I am trying to investigate some performance issues which we have with postgres (a different topic by itself) and tried postgres 9.4beta2, with a hope that it performs better. It turned out that 9.4 is 2x slower than 9.3.5 on the same hardware. Some technical details:

Host: rhel 6.5 2.6.32-431.23.3.el6.x86_64
256 GB RAM, 40 cores, Intel(R) Xeon(R) CPU E5-2660 v2 @ 2.20GHz
2x160GB PCIe SSD DELL_P320h-MTFDGAL175SAH (on one 9.3, on the other one 9.4)

postgres tweaks:

default_statistics_target = 100
wal_writer_delay = 10s
vacuum_cost_delay = 50
synchronous_commit = off
maintenance_work_mem = 2GB
checkpoint_completion_target = 0.9
effective_cache_size = 94GB
work_mem = 402MB
wal_buffers = 16MB
checkpoint_segments = 64
shared_buffers = 8GB
max_connections = 100
random_page_cost = 1.5

# other goodies
log_line_prefix = '%m <%d %u %r> %%'
log_temp_files = 0
log_min_duration_statement = 5

In both cases the databases are fresh - no data. Here are the results with pgbench.

9.3.5:

# /usr/pgsql-9.3/bin/pgbench -r -j 1 -c 1 -T 60
starting vacuum...end.
transaction type: TPC-B (sort of)
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
duration: 60 s
number of transactions actually processed: 96361
tps = 1605.972262 (including connections establishing)
tps = 1606.064501 (excluding connections establishing)
statement latencies in milliseconds:
    0.001391  \set nbranches 1 * :scale
    0.000473  \set ntellers 10 * :scale
    0.000430  \set naccounts 10 * :scale
    0.000533  \setrandom aid 1 :naccounts
    0.000393  \setrandom bid 1 :nbranches
    0.000468  \setrandom tid 1 :ntellers
    0.000447  \setrandom delta -5000 5000
    0.025161  BEGIN;
    0.131317  UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;
    0.100211  SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
    0.117406  UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;
    0.114332  UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid;
    0.086660  INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);
    0.035940  END;

9.4beta2:

# /usr/pgsql-9.3/bin/pgbench -r -j 1 -c 1 -T 60
starting vacuum...end.
transaction type: TPC-B (sort of)
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
duration: 60 s
number of transactions actually processed: 34017
tps = 566.948384 (including connections establishing)
tps = 567.008666 (excluding connections establishing)
statement latencies in milliseconds:
    0.001879  \set nbranches 1 * :scale
    0.000526  \set ntellers 10 * :scale
    0.000490  \set naccounts 10 * :scale
    0.000595  \setrandom aid 1 :naccounts
    0.000421  \setrandom bid 1 :nbranches
    0.000480  \setrandom tid 1 :ntellers
    0.000484  \setrandom delta -5000 5000
    0.055047  BEGIN;
    0.172179  UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;
    0.135392  SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
    0.157224  UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;
    0.147969  UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid;
    0.123001  INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);
    0.957854  END;

any ideas?

Hi Tigran,

Some ideas: 60s is too short for reliable results (the default checkpoint timeout is 300s, so 600s is the typical elapsed time needed to get reasonably repeatable numbers - it ensures you get about one checkpoint in your run). In addition I usually issue a CHECKPOINT via psql before each run, so that I have some confidence that we are starting from approximately the same state each time (and getting hopefully only *one* checkpoint per run)!

Cheers

Mark
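The advice above can be sketched as a small benchmark loop: force a CHECKPOINT so every run starts from roughly the same state, then run long enough to span about one checkpoint cycle. The 'bench' database name is an assumption, and each step degrades to a note when no server is reachable:

```shell
#!/bin/sh
# Sketch only: adjust database name/connection options to suit.
run_bench() {
  runs=$1; secs=$2
  i=1
  while [ "$i" -le "$runs" ]; do
    echo "run $i: checkpoint + pgbench -T $secs"
    psql -d bench -c "CHECKPOINT;" 2>/dev/null || echo "  (no server: checkpoint skipped)"
    pgbench -r -j 1 -c 1 -T "$secs" bench 2>/dev/null || echo "  (no server: pgbench skipped)"
    i=$((i + 1))
  done
}
run_bench 2 5   # use -T 600 for real measurements, per the advice above
```

Averaging several such runs (rather than one 60s run) also smooths out the bad luck of a checkpoint landing near the end of the window.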
Re: [PERFORM] Turn off Hyperthreading! WAS: 60 core performance with 9.3
On 26/08/14 10:13, Josh Berkus wrote:

On 08/22/2014 07:02 AM, Andres Freund wrote:

On 2014-08-21 14:02:26 -0700, Josh Berkus wrote:

On 08/20/2014 07:40 PM, Bruce Momjian wrote:

Not sure how you can make such a blanket statement when so many people have tested and shown the benefits of hyper-threading.

Actually, I don't know that anyone has posted the benefits of HT. Link?

There's definitely cases where it can help. But it's highly workload *and* hardware dependent.

The only cases I've seen where HT can be beneficial is when you have large numbers of idle connections. Then the idle connections can be "parked" on the HT virtual cores. However, even in this case I haven't seen a head-to-head performance comparison.

I've just had a pair of Crucial m550's arrive, so a bit of benchmarking is in order. The results (below) seem to suggest that HT enabled is certainly not inhibiting scaling performance for single socket i7's. I performed several runs (typical results shown below).

Intel i7-4770 3.4 Ghz, 16G
2x Crucial m550
Ubuntu 14.04
Postgres 9.4 beta2

logging_collector = on
max_connections = 600
shared_buffers = 1GB
wal_buffers = 32MB
checkpoint_segments = 128
effective_cache_size = 10GB

pgbench scale = 300
test duration (each) = 600s
db on 1x m550, xlog on 1x m550

clients | tps (HT) | tps (no HT)
--------+----------+------------
      4 |      517 |         520
      8 |     1013 |         999
     16 |     1938 |        1913
     32 |     3574 |        3560
     64 |     5873 |        5412
    128 |     8351 |        7450
    256 |     9426 |        7840
    512 |     9357 |        7288

Regards

Mark
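The table can be restated as the relative gain from HT at each client count (a throwaway sh/awk helper; the numbers are copied from the results shown):

```shell
# percentage tps gain from enabling HT: 100 * (ht - no_ht) / no_ht
ht_gain() {
  awk -v c="$1" -v ht="$2" -v no="$3" \
    'BEGIN { printf "%3d clients: %+.1f%%\n", c, 100 * (ht - no) / no }'
}
ht_gain   4  517  520
ht_gain  32 3574 3560
ht_gain  64 5873 5412
ht_gain 128 8351 7450
ht_gain 256 9426 7840
ht_gain 512 9357 7288
```

The gain grows with client count on this single-socket box, which is consistent with the conclusion that HT is not inhibiting scaling here.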
Re: [PERFORM] tuning postgresql 9.3.5 and multiple cores
On 26/08/14 06:47, Jeison Bedoya Delgado wrote:

Hi, recently I changed the hardware of my database, 32 cores up to 64 cores and 128GB RAM, but the performance is the same. Perhaps I have to change a parameter in postgresql.conf?

In addition to the points that others have made, even if you do have > 32 active sessions it is not clear that 64 cores will automagically get you twice (or in fact any) better performance than 32. We are seeing exactly this effect with a (60 core) machine that gets pretty much the same performance as an older generation 32 core one.

Interestingly, while this is *likely* a software issue, it is not immediately obvious where it lies - we tested Postgres (9.3/9.4/9.5) and Mysql (5.5/5.6/5.7), *all* of which exhibited the lack of improvement with more cores. Profiling suggested numa effects - but trying to eliminate these seemed to simply throw up new factors that inhibit performance.

My *guess* (and it is a guess) is that we are seeing 2 (perhaps more) performance bottlenecks very close to each other: numa and spinlock contention at least.

Regards

Mark
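When chasing this kind of flat scaling, two quick things to inspect are the NUMA topology and whether the kernel's automatic NUMA balancing is active. A minimal sketch (standard Linux paths; each command falls back to a note when the tool or knob is absent):

```shell
# report NUMA node layout and the automatic balancing setting
numa_report() {
  numactl --hardware 2>/dev/null || echo "numactl: not installed"
  printf 'kernel.numa_balancing: '
  cat /proc/sys/kernel/numa_balancing 2>/dev/null || echo "(knob absent: kernel < 3.13?)"
}
numa_report
```

If memory is heavily skewed toward one node, or balancing is migrating pages under load, that alone can hide any benefit from the extra cores.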
Re: [PERFORM] Turn off Hyperthreading! WAS: 60 core performance with 9.3
On 26/08/14 10:13, Josh Berkus wrote: On 08/22/2014 07:02 AM, Andres Freund wrote: On 2014-08-21 14:02:26 -0700, Josh Berkus wrote: On 08/20/2014 07:40 PM, Bruce Momjian wrote: Not sure how you can make such a blanket statement when so many people have tested and shown the benefits of hyper-threading. Actually, I don't know that anyone has posted the benefits of HT. Link? There's definitely cases where it can help. But it's highly workload *and* hardware dependent. The only cases I've seen where HT can be beneficial is when you have large numbers of idle connections. Then the idle connections can be "parked" on the HT virtual cores. However, even in this case I haven't seen a head-to-head performance comparison. I recall HT beneficial on a single socket (i3 or i7), using pgbench as the measuring tool. However I didn't save the results at the time. I've just got some new ssd's to play with so might run some pgbench tests on my home machine (Haswell i7) with HT on and off. Regards Mark -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Turn off Hyperthreading! WAS: 60 core performance with 9.3
On 22/08/14 11:29, Josh Berkus wrote: On 08/21/2014 04:08 PM, Steve Crawford wrote: On 08/21/2014 03:51 PM, Josh Berkus wrote: On 08/21/2014 02:26 PM, Scott Marlowe wrote: I'm running almost the exact same setup in production as a spare. It has 4 of those CPUs, 256G RAM, and is currently set to use HT. Since it's a spare node I might be able to do some testing on it as well. It's running a 3.2 kernel right now. I could probably get a later model kernel on it even. You know about the IO performance issues with 3.2, yes? Were those 3.2 only and since fixed or are there issues persisting in 3.2+? The 12.04 LTS release of Ubuntu Server was 3.2 but the 14.04 is 3.13. The issues I know of were fixed in 3.9. There is a 3.11 kernel series for Ubuntu 12.04 Precise. Regards Mark -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Turn off Hyperthreading! WAS: 60 core performance with 9.3
On 21/08/14 11:14, Mark Kirkwood wrote: You didn't mention what cpu this is for (or how many sockets etc), would be useful to know. Just to clarify - while you mentioned that the production system was 40 cores, it wasn't immediately obvious that the same system was the source of the measurements you posted...sorry if I'm being a mixture of pedantic and dense - just trying to make sure it is clear what systems/cpus etc we are talking about (with this in mind it never hurts to quote cpu and mobo model numbers)! Cheers Mark -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Turn off Hyperthreading! WAS: 60 core performance with 9.3
On 21/08/14 07:13, Josh Berkus wrote:

Mark, all: So, this is pretty damning:

Read-only test with HT ON:

[pgtest@db ~]$ pgbench -c 20 -j 4 -T 600 -S bench
starting vacuum...end.
transaction type: SELECT only
scaling factor: 30
query mode: simple
number of clients: 20
number of threads: 4
duration: 600 s
number of transactions actually processed: 47167533
tps = 78612.471802 (including connections establishing)
tps = 78614.604352 (excluding connections establishing)

Read-only test with HT Off:

[pgtest@db ~]$ pgbench -c 20 -j 4 -T 600 -S bench
starting vacuum...end.
transaction type: SELECT only
scaling factor: 30
query mode: simple
number of clients: 20
number of threads: 4
duration: 600 s
number of transactions actually processed: 82457739
tps = 137429.508196 (including connections establishing)
tps = 137432.893796 (excluding connections establishing)

On a read-write test, it's 10% faster with HT off as well. Further, from their production machine we've seen that having HT on causes the machine to slow down by 5X whenever you get more than 40 cores (as in 100% of real cores or 50% of HT cores) worth of activity. So we're definitely back to "If you're using PostgreSQL, turn off Hyperthreading".

Hmm - that is interesting - I don't think we compared read only scaling for hyperthreading on and off (only read write). You didn't mention what cpu this is for (or how many sockets etc), would be useful to know.

Notwithstanding the above results, my workmate Matt made an interesting observation: the scaling graph for (our) 60 core box (HT off) looks just like the one for our 32 core box with HT *on*. We are wondering if a lot of the previous analysis of HT performance regressions should actually be reevaluated in the light of... err, is it just that we have a lot more cores...? [1]

Regards

Mark

[1] Particularly as in *some* cases (single socket i7 for instance) HT on seems to scale fine.
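For reference, the two read-only runs reduce to a single ratio (tps excluding connection establishment, copied from the results in the post):

```shell
# ratio of HT-off throughput to HT-on throughput
speedup() {
  awk -v on="$1" -v off="$2" 'BEGIN { printf "HT off / HT on = %.2fx\n", off / on }'
}
speedup 78614.604352 137432.893796
```

That is, on this machine disabling HT nearly doubled read-only throughput at 20 clients.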
Re: [PERFORM] 60 core performance with 9.3
On 15/08/14 06:18, Josh Berkus wrote:

Mark,

Is the 60-core machine using some of the Intel chips which have 20 hyperthreaded virtual cores? If so, I've been seeing some performance issues on these processors. I'm currently doing a side-by-side hyperthreading on/off test.

Hi Josh,

The board has 4 sockets with E7-4890 v2 cpus. They have 15 cores/30 threads. We're running with hyperthreading off (we noticed the usual steep/sudden scaling dropoff with it on).

What model are your 20-core cpus?

Cheers

Mark
Re: [PERFORM] 60 core performance with 9.3
| 391212 | 6963 | 0.524354 | 1.225602
WALBufMappingLock | 484693 | 83 | 0.649650 | 0.014609

So we're seeing delay coming fairly equally from 5 lwlocks. Thanks again - any other suggestions welcome!

Cheers

Mark
Re: [PERFORM] 60 core performance with 9.3
On 31/07/14 00:47, Tomas Vondra wrote:

On 30 Červenec 2014, 14:39, Tom Lane wrote:

"Tomas Vondra" writes:

On 30 Červenec 2014, 3:44, Mark Kirkwood wrote:

While these numbers look great in the middle range (12-96 clients), the benefit looks to be tailing off as client numbers increase. Also running with no stats (and hence no auto vacuum or analyze) is way too scary!

By disabling the statistics collector you lose runtime counters - number of sequential/index scans on a table, tuples read from a relation, etc. But it does not influence VACUUM or planning at all.

It does break autovacuum.

Of course, you're right. It throws away info about how much data was modified and when the table was last (auto)vacuumed. This is clear proof that I really need to drink at least one cup of coffee before doing anything in the morning.

Lol - thanks for taking a look anyway. Yes, coffee is often an important part of the exercise.

Regards

Mark
Re: [PERFORM] 60 core performance with 9.3
Hi Tomas,

Unfortunately I think you are mistaken - disabling the stats collector (i.e. track_counts = off) means that autovacuum has no idea about when/if it needs to start a worker (as it uses those counts to decide), and hence you lose all automatic vacuum and analyze as a result.

With respect to comments like "it shouldn't make a difference" etc, well the profile suggests otherwise, and the change in tps numbers supports the observation.

regards

Mark

On 30/07/14 20:42, Tomas Vondra wrote:

On 30 Červenec 2014, 3:44, Mark Kirkwood wrote:

While these numbers look great in the middle range (12-96 clients), the benefit looks to be tailing off as client numbers increase. Also running with no stats (and hence no auto vacuum or analyze) is way too scary!

I assume you've disabled the statistics collector, which has nothing to do with vacuum or analyze. There are two kinds of statistics in PostgreSQL - data distribution statistics (which is collected by ANALYZE and stored in actual tables within the database) and runtime statistics (which is collected by the stats collector and stored in a file somewhere on the disk).

By disabling the statistics collector you lose runtime counters - number of sequential/index scans on a table, tuples read from a relation, etc. But it does not influence VACUUM or planning at all. Also, it's mostly async (send over UDP and you're done) and shouldn't make much difference unless you have a large number of objects. There are ways to improve this (e.g. by placing the stat files into a tmpfs).

Tomas
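Tomas's tmpfs suggestion can be expressed as a postgresql.conf sketch - the path below is an assumption (any RAM-backed mount works). This keeps track_counts on, so autovacuum still functions, while moving the stats collector's temporary file off disk:

```
track_counts = on                               # needed by autovacuum
stats_temp_directory = '/run/shm/pg_stats_tmp'  # tmpfs mount (assumed path)
```

This trades the collector's write churn for a little RAM, without giving up the runtime counters.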
Re: [PERFORM] 60 core performance with 9.3
On 17/07/14 11:58, Mark Kirkwood wrote: Trying out with numa_balancing=0 seemed to get essentially the same performance. Similarly wrapping postgres startup with --interleave. All this made me want to try with numa *really* disabled. So rebooted the box with "numa=off" appended to the kernel cmdline. Somewhat surprisingly (to me anyway), the numbers were essentially identical. The profile, however is quite different: A little more tweaking got some further improvement: rwlocks patch as before wal_buffers = 256MB checkpoint_segments = 1920 wal_sync_method = open_datasync LSI RAID adaptor disable read ahead and write cache for SSD fast path mode numa_balancing = 0 Pgbench scale 2000 again: clients | tps (prev) | tps (tweaked config) -++- 6| 8175 | 8281 12 | 14409 | 15896 24 | 17191 | 19522 48 | 23122 | 29776 96 | 22308 | 32352 192 | 23109 | 28804 Now recall we were seeing no actual tps changes with numa_balancing=0 or 1 (so the improvement above is from the other changes), but figured it might be informative to try to track down what the non-numa bottlenecks looked like. We tried profiling the entire 10 minute run which showed the stats collector as a possible source of contention: 3.86%postgres [kernel.kallsyms][k] _raw_spin_lock_bh | --- _raw_spin_lock_bh | |--95.78%-- lock_sock_nested | udpv6_sendmsg | inet_sendmsg | sock_sendmsg | SYSC_sendto | sys_sendto | tracesys | __libc_send | | | |--99.17%-- pgstat_report_stat | | PostgresMain | | ServerLoop | | PostmasterMain | | main | | __libc_start_main | | | |--0.77%-- pgstat_send_bgwriter | | BackgroundWriterMain | | AuxiliaryProcessMain | | 0x7f08efe8d453 | | reaper | | __restore_rt | | PostmasterMain | | main | | __libc_start_main | --0.07%-- [...] 
| |--2.54%-- __lock_sock | | | |--91.95%-- lock_sock_nested | | udpv6_sendmsg | | inet_sendmsg | | sock_sendmsg | | SYSC_sendto | | sys_sendto | | tracesys | | __libc_send | | | | | |--99.73%-- pgstat_report_stat | | | PostgresMain | | | ServerLoop Disabling track_counts and rerunning pgbench: clients | tps (no counts) -+ 6|9806 12 | 18000 24 | 29281 48 | 43703 96 | 54539 192 | 36114 While these numbers look great in the middle range (12-96 clients), then benefit looks to be tailing off as client numbers increase. Also running with no stats (and hence no auto vacuum or analyze) is way too scary! Trying out less write heavy workloads shows that the stats overhead does not appear to be significant for *read* heavy cases, so this result above is perhaps more of a curiosity than anything (given that read heavy is more typical...and our real workload is more similar to read heavy). The profile for counts off looks like: 4.79% swapper [kernel.kallsyms][k] read_hpet | --- read_hpet | |--97.10%-- ktime_get | | | |--35.24%-- clockevents_program_event | | tick_program_event | | | | | |--56.59%-- __hrtimer_start_range_ns | | | | | | |
Re: [PERFORM] 60 core performance with 9.3
On 12/07/14 01:19, Kevin Grittner wrote: It might be worth a test using a cpuset to interleave OS cache and the NUMA patch I submitted to the current CF to see whether this is getting into territory where the patch makes a bigger difference. I would expect it to do much better than using numactl --interleave because work_mem and other process-local memory would be allocated in "near" memory for each process. http://www.postgresql.org/message-id/1402267501.4.yahoomail...@web122304.mail.ne1.yahoo.com Thanks Kevin - I did try this out - seemed slightly better than using --interleave, but almost identical to the results posted previously. However looking at my postgres binary with ldd, I'm not seeing any link to libnuma (despite it demanding the library whilst building), so I wonder if my package build has somehow vanilla-ified the result :-( Also I am guessing that with 60 cores I do: $ sudo /bin/bash -c "echo 0-59 >/dev/cpuset/postgres/cpus" i.e cpus are cores not packages...? If I've stuffed it up I'll redo! Cheers Mark -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
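On the cpuset question: the cpus file takes logical CPU ids (the same numbering as /proc/cpuinfo), so with HT off, 0-59 does address the 60 cores. A quick sanity check, sketched against the legacy cpuset filesystem path used above (falls back to a note where it isn't mounted):

```shell
# report what the 'postgres' cpuset currently contains
cpuset_cpus() {
  printf 'postgres cpuset cpus: '
  cat /dev/cpuset/postgres/cpus 2>/dev/null || echo "(cpuset fs not mounted here)"
}
cpuset_cpus
```

With HT on, the logical ids would cover threads rather than cores, so the range would need to include the sibling threads too.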
Re: [PERFORM] 60 core performance with 9.3
On 11/07/14 20:22, Andres Freund wrote: On 2014-07-11 12:40:15 +1200, Mark Kirkwood wrote: Full report http://paste.ubuntu.com/886/ # 8.82%postgres [kernel.kallsyms][k] _raw_spin_lock_irqsave | --- _raw_spin_lock_irqsave | |--75.69%-- pagevec_lru_move_fn | __lru_cache_add | lru_cache_add | putback_lru_page | migrate_pages | migrate_misplaced_page | do_numa_page | handle_mm_fault | __do_page_fault | do_page_fault | page_fault So, the majority of the time is spent in numa page migration. Can you disable numa_balancing? I'm not sure if your kernel version does that at runtime or whether you need to reboot. The kernel.numa_balancing sysctl might work. Otherwise you probably need to boot with numa_balancing=0. It'd also be worthwhile to test this with numactl --interleave. Trying out with numa_balancing=0 seemed to get essentially the same performance. Similarly wrapping postgres startup with --interleave. All this made me want to try with numa *really* disabled. So rebooted the box with "numa=off" appended to the kernel cmdline. Somewhat surprisingly (to me anyway), the numbers were essentially identical. The profile, however is quite different: Full report at http://paste.ubuntu.com/7806285/ 4.56% postgres [kernel.kallsyms] [k] _raw_spin_lock_irqsave | --- _raw_spin_lock_irqsave | |--41.89%-- try_to_wake_up | | | |--96.12%-- default_wake_function | | | | | |--99.96%-- pollwake | | | __wake_up_common | | | __wake_up_sync_key | | | sock_def_readable | | | | | | | |--99.94%-- unix_stream_sendmsg | | | | sock_sendmsg | | | | SYSC_sendto | | | | sys_sendto | | | | tracesys | | | | __libc_send | | | | pq_flush | | | | ReadyForQuery | | | | PostgresMain | | | | ServerLoop | | | | PostmasterMain | | | | main | | | | __libc_start_main | | | --0.06%-- [...] | | --0.04%-- [...] 
| | | |--2.87%-- wake_up_process | | | | | |--95.71%-- wake_up_sem_queue_do | | | SYSC_semtimedop | | | sys_semop | | | tracesys | | | __GI___semop | | | | | | | |--99.75%-- LWLockRelease | | | | | | | | | |--25.09%-- RecordTransactionCommit | | | | | CommitTransaction | | | | | CommitTransactionCommand | | | | | finish_xact_command.part.4 | | | | | PostgresMain | | | | | ServerLoop | | | | | PostmasterMain | | | | | main | | | | | __libc_start_ma
Re: [PERFORM] 60 core performance with 9.3
On 11/07/14 20:22, Andres Freund wrote: On 2014-07-11 12:40:15 +1200, Mark Kirkwood wrote: Postgres 9.4 beta rwlock patch pgbench scale = 2000 On that scale - that's bigger than shared_buffers IIRC - I'd not expect the patch to make much of a difference. Right - we did test with it bigger (can't recall exactly how big), but will retry again after setting the numa parameters below. # 8.82%postgres [kernel.kallsyms][k] _raw_spin_lock_irqsave | --- _raw_spin_lock_irqsave | |--75.69%-- pagevec_lru_move_fn | __lru_cache_add | lru_cache_add | putback_lru_page | migrate_pages | migrate_misplaced_page | do_numa_page | handle_mm_fault | __do_page_fault | do_page_fault | page_fault So, the majority of the time is spent in numa page migration. Can you disable numa_balancing? I'm not sure if your kernel version does that at runtime or whether you need to reboot. The kernel.numa_balancing sysctl might work. Otherwise you probably need to boot with numa_balancing=0. It'd also be worthwhile to test this with numactl --interleave. That was my feeling too - but I had no idea what the magic switch was to tame it (appears to be in 3.13 kernels), will experiment and report back. Thanks again! Mark -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
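The runtime switch being discussed can be sketched as follows (kernel 3.13+; needs root, so it degrades to a note otherwise):

```shell
# disable (0) or enable (1) automatic NUMA page migration at runtime
set_numa_balancing() {
  sysctl -w kernel.numa_balancing="$1" 2>/dev/null \
    || echo "kernel.numa_balancing=$1 not applied (need root, or boot with numa_balancing=$1)"
}
set_numa_balancing 0
```

On kernels without the sysctl, the boot-time numa_balancing=0 parameter is the fallback, as noted above.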
Re: [PERFORM] 60 core performance with 9.3
On 01/07/14 22:13, Andres Freund wrote: On 2014-07-01 21:48:35 +1200, Mark Kirkwood wrote: - cherry picking the last 5 commits into the 9.4 branch, building a package from that, and retesting:

Clients | 9.4 tps 60 cores (rwlock)
--------+--------------------------
      6 |  70189
     12 | 128894
     24 | 233542
     48 | 422754
     96 | 590796
    192 | 630672

Wow - that is more like it! Andres, that is some nice work - we definitely owe you some beers for that :-) I am aware that I need to retest with an unpatched 9.4 src, as it is not clear from this data how much is due to Andres's patches and how much to the steady stream of 9.4 development. I'll post an update on that later, but figured this was interesting enough to note for now. Cool. That's what I like (and expect) to see :). I don't think unpatched 9.4 will show significantly different results than 9.3, but it'd be good to validate that. If you do so, could you post the results in the -hackers thread I just CCed you on? That'll help the work to get into 9.5. So we seem to have nailed read-only performance. Going back and revisiting read-write performance finds:

Postgres 9.4 beta, rwlock patch, pgbench scale = 2000

max_connections = 200;
shared_buffers = "10GB";
maintenance_work_mem = "1GB";
effective_io_concurrency = 10;
wal_buffers = "32MB";
checkpoint_segments = 192;
checkpoint_completion_target = 0.8;

clients | tps (32 cores) | tps (60 cores)
--------+----------------+---------------
      6 |           8313 |   8175
     12 |          11012 |  14409
     24 |          16151 |  17191
     48 |          21153 |  23122
     96 |          21977 |  22308
    192 |          22917 |  23109

So we are back to not doing significantly better than 32 cores. Hmmm.
Doing quite a few more tweaks gets some better numbers:

kernel.sched_autogroup_enabled=0
kernel.sched_migration_cost_ns=500
net.core.somaxconn=1024
/sys/kernel/mm/transparent_hugepage/enabled [never]
+checkpoint_segments = 1920
+wal_buffers = "256MB";

clients | tps
--------+------
      6 |  8366
     12 | 15988
     24 | 19828
     48 | 30315
     96 | 31649
    192 | 29497

One more:

+wal_sync_method = "open_datasync"

clients | tps
--------+------
      6 |  9566
     12 | 17129
     24 | 22962
     48 | 34564
     96 | 32584
    192 | 28367

So this looks better - however I suspect 32 core performance would improve with these as well! The problem does *not* look to be connected with IO (I will include some iostat below). So time to get the profiler out (192 clients for 1 minute):

Full report http://paste.ubuntu.com/886/

# captured on: Fri Jul 11 03:09:06 2014
# hostname : ncel-prod-db3
# os release : 3.13.0-24-generic
# perf version : 3.13.9
# arch : x86_64
# nrcpus online : 60
# nrcpus avail : 60
# cpudesc : Intel(R) Xeon(R) CPU E7-4890 v2 @ 2.80GHz
# cpuid : GenuineIntel,6,62,7
# total memory : 1056692116 kB
# cmdline : /usr/lib/linux-tools-3.13.0-24/perf record -ag
# event : name = cycles, type = 0, config = 0x0, config1 = 0x0, config2 = 0x0, excl_usr = 0, excl_kern = 0, excl_host = 0, excl_guest = 1, precise_ip = 0, attr_mmap2 = 0, attr_mmap = 1, attr_mmap_data = 0
# HEADER_CPU_TOPOLOGY info available, use -I to display
# HEADER_NUMA_TOPOLOGY info available, use -I to display
# pmu mappings: cpu = 4, uncore_cbox_10 = 17, uncore_cbox_11 = 18, uncore_cbox_12 = 19, uncore_cbox_13 = 20, uncore_cbox_14 = 21, software = 1, uncore_irp = 33, uncore_pcu = 22, tracepoint = 2, uncore_imc_0 = 25, uncore_imc_1 = 26, uncore_imc_2 = 27, uncore_imc_3 = 28, uncore_imc_4 = 29, uncore_imc_5 = 30, uncore_imc_6 = 31, uncore_imc_7 = 32, uncore_qpi_0 = 34, uncore_qpi_1 = 35, uncore_qpi_2 = 36, uncore_cbox_0 = 7, uncore_cbox_1 = 8, uncore_cbox_2 = 9, uncore_cbox_3 = 10, uncore_cbox_4 = 11, uncore_cbox_5 = 12, uncore_cbox_6 = 13, uncore_cbox_7 = 14, uncore_cbox_8 = 15, uncore_cbox_9 = 16, uncore_r2pcie = 37, uncore_r3qpi_0 = 38, uncore_r3qpi_1 = 39, breakpoint = 5, uncore_ha_0 = 23, uncore_ha_1 = 24, uncore_ubox = 6
#
# Samples: 1M of event 'cycles'
# Event count (approx.): 359906321606
#
# Overhead  Command   Shared Object      Symbol
# ........  ........  .................  ......
#
    8.82%  postgres  [kernel.kallsyms]  [k] _raw_spin_lock_irqsave
           |
           --- _raw_spin_lock_irqsave
              |
              |--75.69%-- pagevec_lru_move_fn
              |          __lru_cache_add
              |          lru_cache_add
              |          putback_lru_page
              |          migrate_pages
              |          migrate_misplaced_page
              |          do_numa_page
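For anyone wanting to reproduce this kind of capture: the header above shows it came from perf record -ag. A minimal recipe (run as root while the benchmark is active; exact flags beyond those in the header are a sketch) looks like:

```shell
# System-wide ('-a') call-graph ('-g') capture for one minute:
perf record -a -g -- sleep 60

# Render the hierarchical report shown above:
perf report --stdio
```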
Re: [PERFORM] 60 core performance with 9.3
On 01/07/14 21:48, Mark Kirkwood wrote: [1] from git://git.postgresql.org/git/users/andresfreund/postgres.git, commits: 4b82477dcaf81ad7b0c102f4b66e479a5eb9504a 10d72b97f108b6002210ea97a414076a62302d4e 67ffebe50111743975d54782a3a94b15ac4e755f fe686ed18fe132021ee5e557c67cc4d7c50a1ada f2378dc2fa5b73c688f696704976980bab90c611

Hmmm - that should read "the last 5 commits in the 'rwlock-contention' branch", and I had pasted the commit nos from my tree, not Andres's. Sorry - here are the right ones:

472c87400377a7dc418d8b77e47ba08f5c89b1bb
e1e549a8e42b753cc7ac60e914a3939584cb1c56
65c2174469d2e0e7c2894202dc63b8fa6f8d2a7f
959aa6e0084d1264e5b228e5a055d66e5173db7d
a5c3ddaef0ee679cf5e8e10d59e0a1fe9f0f1893

-- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] 60 core performance with 9.3
On 27/06/14 21:19, Andres Freund wrote: On 2014-06-27 14:28:20 +1200, Mark Kirkwood wrote: My feeling is spinlock or similar, 'perf top' shows kernel find_busiest_group and kernel _raw_spin_lock as the top time users. Those don't tell that much by themselves, could you do a hierarchical profile? I.e. perf record -ga? That'll at least give the callers for kernel level stuff. For more information compile postgres with -fno-omit-frame-pointer. Unfortunately this did not help - there were lots of unknown symbols from postgres in the profile - I'm guessing the Ubuntu postgresql-9.3 package needs either the -dev package or to be rebuilt with the enable profile option (debug and no-omit-frame-pointer seem to be there already). However further investigation did uncover *very* interesting things. Firstly, I had previously said that read only performance looked ok... this was wrong, purely based on comparison to Robert's blog post. Rebooting the 60 core box with 32 cores enabled showed that we got *better* scaling performance in the read only case, which illustrated we were hitting a serious regression with more cores. At this point data is needed:

Test: pgbench
Options: scale 500, read only
Os: Ubuntu 14.04
Pg: 9.3.4
Pg Options:
  max_connections = 200
  shared_buffers = 10GB
  maintenance_work_mem = 1GB
  effective_io_concurrency = 10
  wal_buffers = 32MB
  checkpoint_segments = 192
  checkpoint_completion_target = 0.8

Results:

Clients | 9.3 tps 32 cores | 9.3 tps 60 cores
--------+------------------+-----------------
      6 |            70400 |  71028
     12 |            98918 | 129140
     24 |           230345 | 240631
     48 |           324042 | 409510
     96 |           346929 | 120464
    192 |           312621 |  92663

So we have anti-scaling with 60 cores as we increase the client connections. Ouch! A level of urgency led to trying out Andres's 'rwlock' 9.4 branch [1] - cherry picking the last 5 commits into the 9.4 branch, building a package from that, and retesting:

Clients | 9.4 tps 60 cores (rwlock)
--------+--------------------------
      6 |  70189
     12 | 128894
     24 | 233542
     48 | 422754
     96 | 590796
    192 | 630672

Wow - that is more like it!
Andres, that is some nice work - we definitely owe you some beers for that :-) I am aware that I need to retest with an unpatched 9.4 src, as it is not clear from this data how much is due to Andres's patches and how much to the steady stream of 9.4 development. I'll post an update on that later, but figured this was interesting enough to note for now. Regards Mark

[1] from git://git.postgresql.org/git/users/andresfreund/postgres.git, commits:
4b82477dcaf81ad7b0c102f4b66e479a5eb9504a
10d72b97f108b6002210ea97a414076a62302d4e
67ffebe50111743975d54782a3a94b15ac4e755f
fe686ed18fe132021ee5e557c67cc4d7c50a1ada
f2378dc2fa5b73c688f696704976980bab90c611

-- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
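As a reference point, the runs discussed in this thread can be approximated with stock pgbench along these lines (scale and client counts are from the posts above; the database name and exact thread counts are assumptions):

```shell
# One-off: initialize a dataset (the thread used scale 500 for the
# read-only tests and scale 2000 for read-write):
pgbench -i -s 500 bench

# Read-only (SELECT-only) run, e.g. 48 clients / 48 threads for 5 minutes:
pgbench -S -c 48 -j 48 -T 300 bench

# Read-write (default TPC-B-like) run with the same client count:
pgbench -c 48 -j 48 -T 300 bench
```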
Re: [PERFORM] 60 core performance with 9.3
On 27/06/14 21:19, Andres Freund wrote: On 2014-06-27 14:28:20 +1200, Mark Kirkwood wrote: My feeling is spinlock or similar, 'perf top' shows kernel find_busiest_group kernel _raw_spin_lock as the top time users. Those don't tell that much by themselves, could you do a hierarchical profile? I.e. perf record -ga? That'll at least give the callers for kernel level stuff. For more information compile postgres with -fno-omit-frame-pointer. Excellent suggestion, will do next week! regards Mark -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] 60 core performance with 9.3
On 27/06/14 14:01, Scott Marlowe wrote: On Thu, Jun 26, 2014 at 5:49 PM, Mark Kirkwood wrote: I have a nice toy to play with: Dell R920 with 60 cores and 1TB ram [1]. The context is the current machine in use by the customer is a 32 core one, and due to growth we are looking at something larger (hence 60 cores). Some initial tests show similar pgbench read only performance to what Robert found here http://rhaas.blogspot.co.nz/2012/04/did-i-say-32-cores-how-about-64.html (actually a bit quicker around 40 tps). However doing a mixed read-write workload is getting results the same or only marginally quicker than the 32 core machine - particularly at higher number of clients (e.g 200 - 500). I have yet to break out the perf toolset, but I'm wondering if any folk has compared 32 and 60 (or 64) core read write pgbench performance? My guess is that the read only test is CPU / memory bandwidth limited, but the mixed test is IO bound. What's your iostat / vmstat / iotop etc look like when you're doing both read only and read/write mixed? 
That was what I would have thought too, but it does not appear to be the case; here is a typical iostat:

Device:  rrqm/s  wrqm/s  r/s       w/s     rMB/s  wMB/s  avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util
sda        0.00    0.00  0.00      0.00     0.00   0.00      0.00      0.00   0.00     0.00     0.00   0.00   0.00
nvme0n1    0.00    0.00  0.00   4448.00     0.00  41.47     19.10      0.14   0.03     0.00     0.03   0.03  14.40
nvme1n1    0.00    0.00  0.00   4448.00     0.00  41.47     19.10      0.15   0.03     0.00     0.03   0.03  15.20
nvme2n1    0.00    0.00  0.00   4549.00     0.00  42.20     19.00      0.15   0.03     0.00     0.03   0.03  15.20
nvme3n1    0.00    0.00  0.00   4548.00     0.00  42.19     19.00      0.16   0.04     0.00     0.04   0.04  16.00
dm-0       0.00    0.00  0.00      0.00     0.00   0.00      0.00      0.00   0.00     0.00     0.00   0.00   0.00
md0        0.00    0.00  0.00  17961.00     0.00  83.67      9.54      0.00   0.00     0.00     0.00   0.00   0.00
dm-1       0.00    0.00  0.00      0.00     0.00   0.00      0.00      0.00   0.00     0.00     0.00   0.00   0.00
dm-2       0.00    0.00  0.00      0.00     0.00   0.00      0.00      0.00   0.00     0.00     0.00   0.00   0.00
dm-3       0.00    0.00  0.00      0.00     0.00   0.00      0.00      0.00   0.00     0.00     0.00   0.00   0.00
dm-4       0.00    0.00  0.00      0.00     0.00   0.00      0.00      0.00   0.00     0.00     0.00   0.00   0.00

My feeling is spinlock or similar; 'perf top' shows

  kernel  find_busiest_group
  kernel  _raw_spin_lock

as the top time users. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
[PERFORM] 60 core performance with 9.3
I have a nice toy to play with: a Dell R920 with 60 cores and 1TB ram [1]. The context is that the current machine in use by the customer is a 32 core one, and due to growth we are looking at something larger (hence 60 cores). Some initial tests show similar pgbench read only performance to what Robert found here http://rhaas.blogspot.co.nz/2012/04/did-i-say-32-cores-how-about-64.html (actually a bit quicker around 40 tps). However doing a mixed read-write workload is getting results the same as, or only marginally quicker than, the 32 core machine - particularly at higher numbers of clients (e.g. 200 - 500). I have yet to break out the perf toolset, but I'm wondering if anyone has compared 32 and 60 (or 64) core read-write pgbench performance? regards Mark

[1] Details:
4x E7-4890, 15 cores each
1 TB ram
16x Toshiba PX02SS SATA SSD
4x Samsung NVMe XS1715 PCIe SSD
Ubuntu 14.04 (Linux 3.13)

-- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Increasing query time after updates
On 21/01/14 21:45, Mark Kirkwood wrote: On 21/01/14 21:37, Katharina Koobs wrote: Dear Heikki, thank you for your valuable feedback. Regarding your questions: It gets gradually slower every day. The database size is increasing only slightly over time. I will try your hint regarding CLUSTERING. The difference in effect of VACUUM FULL in version 9.0 sounds very interesting. I will discuss the update to version 9.0 with my colleague. Any further idea or feedback is much appreciated. Index bloat could be a factor too - performing a regular REINDEX on the relevant tables could be worth a try. Sorry - I missed that you had tried reindex already. regards Mark -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Increasing query time after updates
On 21/01/14 21:37, Katharina Koobs wrote: Dear Heikki, thank you for your valuable feedback. Regarding your questions: It gets gradually slower every day. The database size is increasing only slightly over time. I will try your hint regarding CLUSTERING. The difference in effect of VACUUM FULL in version 9.0 sounds very interesting. I will discuss the update to version 9.0 with my colleague. Any further idea or feedback is much appreciated. Index bloat could be a factor too - performing a regular REINDEX on the relevant tables could be worth a try. Regards Mark -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
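A sketch of the suggested maintenance (the table and index names here are hypothetical placeholders, and note that REINDEX takes an exclusive lock on the table while it runs):

```sql
-- Rebuild all indexes on a suspected-bloated table:
REINDEX TABLE student_records;

-- Or rebuild just one suspect index:
REINDEX INDEX student_records_pkey;
```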
Re: [PERFORM] Are there some additional postgres tuning to improve performance in multi tenant system
On 28/12/13 18:19, ankush upadhyay wrote: Hello All, I am using a multi tenant system and doing performance testing of a multi tenant application. In the single tenant case it works fine, but once I enable multiple tenants, the database servers sometimes stop responding. Any clue? It is a bit tricky to tell without any relevant information (e.g. schema description), but a likely culprit would be a missing index on the relevant 'tenant_id' type field in each table that you are using to distinguish the various tenant datasets. Regards Mark -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
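To illustrate the suggestion - assuming a discriminator column named tenant_id on a hypothetical orders table (names are examples, not from the post):

```sql
-- Index the tenant discriminator; CONCURRENTLY avoids blocking writes
-- while the index builds:
CREATE INDEX CONCURRENTLY orders_tenant_id_idx ON orders (tenant_id);

-- Composite indexes that lead with tenant_id also serve tenant-scoped
-- lookups on the second column:
CREATE INDEX CONCURRENTLY orders_tenant_created_idx ON orders (tenant_id, created_at);
```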
Re: [PERFORM] query not using index
On 23/12/13 21:58, Johann Spies wrote: On 19 December 2013 16:48, Tom Lane wrote: Johann Spies writes: > I would appreciate some help optimising the following query: It's a mistake to imagine that indexes are going to help much with a join of this size. Hash or merge join is going to be a lot better than nestloop. What you need to do is make sure those will perform as well as possible, and to that end, it'd likely help to raise work_mem. I'm not sure if you can sanely put it high enough to make the query operate totally in memory --- it looks like you'd need work_mem of 500MB or more to prevent any of the sorts or hashes from spilling to disk, and keep in mind that this query is going to use several times work_mem because there are multiple sorts/hashes going on. But if you can transiently dedicate a lot of RAM to this query, that should help some. I'd suggest increasing work_mem via a SET command in the particular session running this query --- you don't want such a high value to be the global default. Thanks Tom. Raising work_mem from 384MB to 512MB made a significant difference. You said "hash or merge join is going to be a lot better than nestloop". Is that purely in the hands of the query planner, or what can I do to get the planner to use those options apart from raising the work_mem? You can disable the hash and merge join options by doing:

SET enable_hashjoin = off;
SET enable_mergejoin = off;

before running the query again. Timing it (or EXPLAIN ANALYZE) should demonstrate if the planner made the right call by choosing hash or merge in the first place. regards Mark -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
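Putting Tom's work_mem advice and the planner toggles together, a session-local experiment might look like this (the SELECT is a placeholder for the query under test; all settings revert when the session ends):

```sql
SET work_mem = '512MB';        -- session-local, per Tom's suggestion
EXPLAIN ANALYZE SELECT ...;    -- baseline plan (hash/merge join expected)

SET enable_hashjoin = off;
SET enable_mergejoin = off;
EXPLAIN ANALYZE SELECT ...;    -- forced nestloop, for comparison

RESET enable_hashjoin;
RESET enable_mergejoin;
```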
Re: [PERFORM] Optimizing a query
On 20/12/13 06:53, Shaun Thomas wrote: On 12/17/2013 08:48 PM, Kai Sellgren wrote: This is your select:

SELECT *
FROM "Log"
LEFT JOIN "NewsArticle" ON "NewsArticle".id = "Log"."targetId" AND "Log"."targetType" = 'NewsArticle'
ORDER BY "Log"."createdAt" DESC
LIMIT 10

This is your index:

CREATE INDEX "Log_targetId_targetType_idx" ON "Log" USING btree ("targetId", "targetType" COLLATE pg_catalog."default");

Unfortunately, this won't help you. You are not matching on any IDs you indexed, aside from joining against the article table. You have no WHERE clause to restrict the data set, so it absolutely must read the entire table to find the most recent records. Without an index on "createdAt", how is it supposed to know what the ten most recent records are? Add an index to the createdAt column:

CREATE INDEX idx_log_createdat ON "Log" (createdAt DESC);

Using that, it should get the ten most recent Log records almost immediately, including associated article content. Also, it might be worth creating an index on NewsArticle(id) so that the join to this table does not require a full table scan:

CREATE INDEX newsarticle_id_idx ON "NewsArticle" (id);

(probably not a problem when you only have a few articles - but it will be as the volume increases over time). Regards Mark -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] WAL + SSD = slow inserts?
On 06/12/13 05:13, Skarsol wrote: On Thu, Dec 5, 2013 at 9:50 AM, Scott Marlowe wrote: On Thu, Dec 5, 2013 at 8:16 AM, Skarsol wrote: psql (PostgreSQL) 9.2.5, Red Hat Enterprise Linux Server release 6.4 (Santiago), Linux 2.6.32-358.6.1.el6.x86_64 #1 SMP Fri Mar 29 16:51:51 EDT 2013 x86_64 x86_64 x86_64 GNU/Linux. All relevant filesystems are ext4. Changes from defaults:

max_connections = 500
shared_buffers = 32000MB
temp_buffers = 24MB
work_mem = 1GB
maintenance_work_mem = 5GB
wal_level = archive
wal_buffers = 16MB
checkpoint_completion_target = 0.9
archive_mode = on
archive_command = 'test ! -f /databases/pg_archive/db/%f && cp %p /databases/pg_archive/db/%f'
effective_cache_size = 64000MB
default_statistics_target = 5000
log_checkpoints = on
stats_temp_directory = '/tmp/pgstat'

OK, I'd make the following changes.
1: Drop shared_buffers to something like 1000MB.
2: Drop work_mem to 16MB or so. 1GB is pathological, as it can make the machine run out of memory quite fast.
3: Drop max_connections to 100 or so. If you really need 500 conns, then work_mem of 1GB is that much worse.

Next, move pg_xlog OFF the SSDs and back onto spinning media and put your data/base dir on the SSDs. SSDs aren't much faster, if at all, for pg_xlog, but are much much faster for data/base files. Also try changing the io scheduler for the SSDs to noop: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/Documentation/block/switching-sched.txt?id=HEAD Changing the scheduler to noop seems to have had a decent effect. I've made the other recommended changes, other than the connections, as we do need that many currently. We're looking to implement pg_bouncer which should help with that. What model SSD are you using? Some can work better with deadline than noop (as their own scheduling firmware may be pretty poor). Also, check if there are updates for the SSD firmware. I have a couple of Crucial M4s that changed from being fairly average to very fast indeed after getting later firmware...
Cheers Mark -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
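The scheduler switch discussed above can be made at runtime, per the linked kernel documentation - a sketch only (the device name is an example, root is required, and the setting does not survive a reboot):

```shell
# Show the available schedulers; the active one is in brackets:
cat /sys/block/sda/queue/scheduler

# Switch to noop (or deadline, per the SSD discussion above):
echo noop > /sys/block/sda/queue/scheduler
```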
Re: [PERFORM] Postgresql in a Virtual Machine
On 26/11/13 09:28, Andrew Dunstan wrote: On 11/25/2013 03:19 PM, Heikki Linnakangas wrote: On 25.11.2013 22:01, Lee Nguyen wrote: Hi, Having attended a few PGCons, I've always heard the remark from a few presenters and attendees that Postgres shouldn't be run inside a VM. That bare metal is the only way to go. Here at work we were entertaining the idea of running our Postgres database on our VM farm alongside our application vm's. We are planning to run a few Postgres synchronous replication nodes. Why shouldn't we run Postgres in a VM? What are the downsides? Does anyone have any metrics or benchmarks with the latest Postgres? I've also heard people say that they've seen PostgreSQL perform worse in a VM. In the performance testing that we've done in VMware, though, we haven't seen any big impact. So I guess the answer is that it depends on the specific configuration of CPU, memory, disks and the software. Synchronous replication is likely going to be the biggest bottleneck by far, unless it's mostly read-only. I don't know if virtualization will have a measurable impact on network latency, which is what matters for synchronous replication. So, I'd suggest that you try it yourself, and see how it performs. And please report back to the list, I'd also love to see some numbers! Yeah, and there are large numbers of public and/or private cloud-based offerings out there (from Amazon RDS, Heroku, EnterpriseDB and VMware among others.) Pretty much all of these are VM based, and can be suitable for many workloads. Maybe the advice is a bit out of date. Agreed. Possibly years ago the maturity of various virtualization layers was such that the advice was sound. But these days it seems that provided some reading is done (so you understand for instance how to make writes go to the hosting hardware), it should be fine. We make use of many KVM guest VMs, usually on Ubuntu, and the IO performance is pretty indistinguishable from bare metal.
In some tests we did notice that VMs with >8 cpus tended to stop scaling so we are using more smaller VMs rather than fewer big ones [1]. regards Mark [1] This was with Pgbench. Note this was over a year ago, so this effect may be not present (different kernels and kvm versions), or the magic number may be higher than 8 now... -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] View with and without ::text casting performs differently.
Good Afternoon, I came across this too. The issue goes away if you keep your join columns the same data type on both tables. The nested loop happens when the join columns are not the same data type. Hope this helps. Best -Mark On Fri, Sep 6, 2013 at 2:35 PM, Tom Lane wrote: > Brian Fehrle writes: > > On 09/05/2013 05:50 PM, Tom Lane wrote: > >> I rather doubt that the now-explicit-instead-of-implicit casts have much > >> to do with that. It seems more likely that you forgot to re-ANALYZE in > >> the new database, or there are some different planner settings, or > >> something along that line. > > > I have two versions of the view in place on the same server, one with > > the typecasting and one without, and this is where I see the differences > > (no ::text runs in 0.5ms and with ::text runs in 13 or so minutes with > > nested loop), so it's all running off the same statistics on the data. > > Hm. Can you provide a self-contained example? > > regards, tom lane > > > -- > Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-performance >
Re: [PERFORM] View with and without ::text casting performs differently.
*Sorry, correction: I meant that the Materialize disappears when the join columns are the same data type. On Fri, Sep 6, 2013 at 3:46 PM, Mark Mayo wrote: > Good Afternoon, > > I also came across this too. > The issue goes away if you keep your join columns the same data type on > both tables. > The nested loop happens when the join columns are not the same data type. > Hope this helps. > > Best > -Mark > > > On Fri, Sep 6, 2013 at 2:35 PM, Tom Lane wrote: > >> Brian Fehrle writes: >> > On 09/05/2013 05:50 PM, Tom Lane wrote: >> >> I rather doubt that the now-explicit-instead-of-implicit casts have >> much >> >> to do with that. It seems more likely that you forgot to re-ANALYZE in >> >> the new database, or there are some different planner settings, or >> >> something along that line. >> >> > I have two versions of the view in place on the same server, one with >> > the typecasting and one without, and this is where I see the differences >> > (no ::text runs in 0.5ms and with ::text runs in 13 or so minutes with >> > nested loop), so it's all running off the same statistics on the data. >> >> Hm. Can you provide a self-contained example? >> >> regards, tom lane >> >> >> -- >> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org >> ) >> To make changes to your subscription: >> http://www.postgresql.org/mailpref/pgsql-performance >> > >
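To illustrate the fix being described - the table and column names here are hypothetical - either align the column types permanently, or cast so both sides of the join agree:

```sql
-- Option 1: make the join columns the same type for good:
ALTER TABLE log ALTER COLUMN target_id TYPE integer USING target_id::integer;

-- Option 2: cast explicitly in the view/query so both join keys match:
-- ... FROM log l JOIN news_article n ON n.id = l.target_id::integer ...
```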
Re: [PERFORM] ORDER BY, LIMIT and indexes
On 06/08/13 22:46, Ivan Voras wrote: Here are two more unexpected results. Same test table (1 mil. records, "id" is SERIAL PRIMARY KEY, PostgreSQL 9.1, VACUUM ANALYZE performed before the experiments):

ivoras=# explain analyze select * from lt where id > 90 limit 10;
                                                      QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..1.71 rows=10 width=9) (actual time=142.669..142.680 rows=10 loops=1)
   ->  Seq Scan on lt  (cost=0.00..17402.00 rows=101630 width=9) (actual time=142.665..142.672 rows=10 loops=1)
         Filter: (id > 90)
 Total runtime: 142.735 ms
(4 rows)

Note the Seq Scan.

ivoras=# explain analyze select * from lt where id > 90;
                                                      QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on lt  (cost=1683.97..7856.35 rows=101630 width=9) (actual time=38.462..85.780 rows=10 loops=1)
   Recheck Cond: (id > 90)
   ->  Bitmap Index Scan on lt_pkey  (cost=0.00..1658.56 rows=101630 width=0) (actual time=38.310..38.310 rows=10 loops=1)
         Index Cond: (id > 90)
 Total runtime: 115.674 ms
(5 rows)

This somewhat explains the above case - we are simply fetching 100,000 records here, and it's slow enough even with the index scan, so the planner skips the index in the former case. BUT, if it did use the index, it would have been expectedly fast:

ivoras=# set enable_seqscan to off;
SET
ivoras=# explain analyze select * from lt where id > 90 limit 10;
                                                      QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..1.74 rows=10 width=9) (actual time=0.081..0.112 rows=10 loops=1)
   ->  Index Scan using lt_pkey on lt  (cost=0.00..17644.17 rows=101630 width=9) (actual time=0.078..0.100 rows=10 loops=1)
         Index Cond: (id > 90)
 Total runtime: 0.175 ms
(4 rows)

It looks like the problem is in the difference between what the planner expects and what the Filter or Index operations deliver: (cost=0.00..17402.00 rows=101630 width=9) (actual time=142.665..142.672 rows=10 loops=1).

Hmm - I wonder if the lack of ORDER BY is part of the problem here.
Consider a similar query on pgbench_accounts:

bench=# explain analyze select aid from pgbench_accounts where aid > 10 limit 20;
                                                            QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..0.91 rows=20 width=4) (actual time=0.005..0.464 rows=20 loops=1)
   ->  Seq Scan on pgbench_accounts  (cost=0.00..499187.31 rows=10994846 width=4) (actual time=0.005..0.463 rows=20 loops=1)
         Filter: (aid > 10)
 Total runtime: 0.474 ms
(4 rows)

bench=# explain analyze select aid from pgbench_accounts where aid > 1000 limit 20;
                                                            QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..2.25 rows=20 width=4) (actual time=0.014..0.018 rows=20 loops=1)
   ->  Index Scan using pgbench_accounts_pkey on pgbench_accounts  (cost=0.00..207204.06 rows=1844004 width=4) (actual time=0.014..0.017 rows=20 loops=1)
         Index Cond: (aid > 1000)
 Total runtime: 0.030 ms
(4 rows)

So at some point you get index scans. Now add an ORDER BY:

bench=# explain analyze select aid from pgbench_accounts where aid > 10 order by aid limit 20;
                                                            QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..2.25 rows=20 width=4) (actual time=0.008..0.012 rows=20 loops=1)
   ->  Index Scan using pgbench_accounts_pkey on pgbench_accounts  (cost=0.00..1235355.34 rows=10994846 width=4) (actual time=0.008..0.011 rows=20 loops=1)
         Index Cond: (aid > 10)
 Total runtime: 0.023 ms
(4 rows)

bench=# explain analyze select aid from pgbench_accounts where aid > 1000 order by aid limit 20;
                                                            QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..2.25 rows=20 width=4) (actual time=0.014..0.018 rows=20 loops=1)
   ->  Index Scan using pgbench_accounts_pkey on pgbench_accounts  (cost=0.00..207204.06 rows=1844004 width=4) (actual time=0.014..0.016 rows=20 loops=1)
         Index Cond: (aid > 1000)
 Total runtime: 0.029 ms
(4 rows)

...and we have index scans for both cases. Cheers Mark -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Reliability with RAID 10 SSD and Streaming Replication
On 23/05/13 14:26, Mark Kirkwood wrote: On 23/05/13 14:22, Greg Smith wrote: On 5/22/13 10:04 PM, Mark Kirkwood wrote: Make that quite a few capacitors (top right corner): http://regmedia.co.uk/2013/05/07/m500_4.jpg There are some more shots and descriptions of the internals in the excellent review at http://techreport.com/review/24666/crucial-m500-ssd-reviewed That also highlights the big problem with this drive that's kept me from buying one so far: "Unlike rivals Intel and Samsung, Crucial doesn't provide utility software with a built-in health indicator. The M500's payload of SMART attributes doesn't contain any references to flash wear or bytes written, either. Several of the SMART attributes are labeled "Vendor-specific," but you'll need to guess what they track and read the associated values using third-party software." That's a serious problem for most business use of this sort of drive. Agreed - I was thinking the same thing! Having said that, there does seem to be a wear leveling counter in its SMART attributes - but, yes - I'd like to see indicators more similar the level of detail that Intel provides. Cheers Mark -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Reliability with RAID 10 SSD and Streaming Replication
On 23/05/13 14:22, Greg Smith wrote: On 5/22/13 10:04 PM, Mark Kirkwood wrote: Make that quite a few capacitors (top right corner): http://regmedia.co.uk/2013/05/07/m500_4.jpg There are some more shots and descriptions of the internals in the excellent review at http://techreport.com/review/24666/crucial-m500-ssd-reviewed That also highlights the big problem with this drive that's kept me from buying one so far: "Unlike rivals Intel and Samsung, Crucial doesn't provide utility software with a built-in health indicator. The M500's payload of SMART attributes doesn't contain any references to flash wear or bytes written, either. Several of the SMART attributes are labeled "Vendor-specific," but you'll need to guess what they track and read the associated values using third-party software." That's a serious problem for most business use of this sort of drive. Agreed - I was thinking the same thing! Cheers Mark -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Reliability with RAID 10 SSD and Streaming Replication
On 23/05/13 13:32, Mark Kirkwood wrote: On 23/05/13 13:01, Joshua D. Drake wrote: On 05/22/2013 04:37 PM, Merlin Moncure wrote: On Wed, May 22, 2013 at 5:42 PM, Joshua D. Drake wrote: I am curious how the 710 or S3700 stacks up against the new M500 from Crucial? I know Intel is kind of the go-to for these things but the m500 is power off protected and rated at: Endurance: 72TB total bytes written (TBW), equal to 40GB per day for 5 years. I don't think the m500 is power safe (nor is any drive at the <1$/gb price point). According to the data sheet it is power safe. http://investors.micron.com/releasedetail.cfm?ReleaseID=732650 http://www.micron.com/products/solid-state-storage/client-ssd/m500-ssd Yeah - they apparently have a capacitor on board. Make that quite a few capacitors (top right corner): http://regmedia.co.uk/2013/05/07/m500_4.jpg -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Reliability with RAID 10 SSD and Streaming Replication
On 23/05/13 13:01, Joshua D. Drake wrote: On 05/22/2013 04:37 PM, Merlin Moncure wrote: On Wed, May 22, 2013 at 5:42 PM, Joshua D. Drake wrote: I am curious how the 710 or S3700 stacks up against the new M500 from Crucial? I know Intel is kind of the goto for these things but the m500 is power off protected and rated at: Endurance: 72TB total bytes written (TBW), equal to 40GB per day for 5 years. I don't think the m500 is power safe (nor is any drive at the <1$/gb price point). According to the data sheet it is power safe. http://investors.micron.com/releasedetail.cfm?ReleaseID=732650 http://www.micron.com/products/solid-state-storage/client-ssd/m500-ssd Yeah - they apparently have a capacitor on board. Their write endurance is where they don't compare so favorably to the S3700 (they are *much* cheaper, mind you): - M500 120GB drive: 40GB per day for 5 years - S3700 100GB drive: 1000GB per day for 5 years But great to see more reasonably priced SSDs with power off protection. Cheers Mark
Re: [PERFORM] Reliability with RAID 10 SSD and Streaming Replication
On 17/05/13 12:06, Tomas Vondra wrote: Hi, On 16.5.2013 16:46, Cuong Hoang wrote: Pro for the master server. I'm aware of the write cache issue on SSDs in case of power loss. However, our hosting provider doesn't offer any other choices of SSD drives with supercapacitor. To minimise risk, we will also set up another RAID 10 SAS in streaming replication mode. For our application, a few seconds of data loss is acceptable. Streaming replication allows zero data loss if used in synchronous mode. I'm not sure synchronous replication is really an option here as it will slow the master down to spinning disk io speeds, unless the standby is configured with SSDs as well - which probably defeats the purpose of this setup. On the other hand, if the system is so loaded that a pure SAS (spinning drive) solution can't keep up, then the standby lag may get to be way more than a few seconds...which means look out for huge data loss. I'd be inclined to apply more leverage to your hosting provider to source SSDs suitable for your needs, or change hosting providers. Regards Mark
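For reference, the synchronous mode mentioned above is enabled on the master roughly like this (a minimal postgresql.conf sketch; the standby name 'standby1' is an assumption):

```
# postgresql.conf on the master -- 'standby1' must match the
# application_name the standby sets in its primary_conninfo
synchronous_standby_names = 'standby1'
synchronous_commit = on    # each COMMIT waits for the standby to confirm
```

Note this is exactly the configuration that ties master commit latency to standby I/O speed, which is why it is questioned above.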
Re: [PERFORM] statistics target for columns in unique constraint?
On 14/05/13 10:10, Marti Raudsepp wrote: On Mon, May 13, 2013 at 6:01 PM, ach wrote: what I'm wondering is, since the unique constraint already covers the whole table and all rows in entirety, is it really necessary for statistics to be set that high on those? AFAIK if there are exact-matching unique constraints/indexes for a query's WHERE clause, the planner will deduce that the query only returns 1 row and won't consult statistics at all. Or does that only serve to slow down inserts to that table? It doesn't slow down inserts directly. Tables are analyzed in the background by autovacuum. However, I/O traffic from autovacuum analyze may slow down inserts running concurrently. A higher number in stats target means larger stats structures - which in turn means that the planning stage of *all* queries may be impacted - e.g. it takes up more memory and is slightly slower as these larger structures are read, iterated over, freed etc. So if your only access is via a defined unique key, then (as Marti suggests) a large setting for stats target would seem to be unnecessary. If you have access to a test environment I'd recommend you model the effect of reducing stats target down (back to the default of 100 or even to the old version default of 10). A little paranoia: maybe switch on statement logging and ensure that there are no *other* ways this table is accessed...the fact that the number was cranked up from the default is a little suspicious! Regards Mark
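A sketch of that experiment (the table and column names here are hypothetical):

```sql
-- Reduce the per-column statistics target back to the default and
-- rebuild the stats, then compare planning behaviour and ANALYZE cost.
ALTER TABLE my_table ALTER COLUMN my_key SET STATISTICS 100;
ANALYZE my_table;

-- Inspect the resulting statistics for the table:
SELECT attname, n_distinct, null_frac
FROM pg_stats
WHERE tablename = 'my_table';
```

Setting the target to -1 instead reverts the column to whatever default_statistics_target specifies.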
Re: [PERFORM] RT3.4 query needed a lot more tuning with 9.2 than it did with 8.1
On Tue, 30 Apr 2013 06:20:55 -0500, Christoph Berg wrote: Hi, this is more of a report than a question, because we thought this would be interesting to share. We recently (finally) migrated an Request Tracker 3.4 database running on 8.1.19 to 9.2.4. The queries used by rt3.4 are sometimes weird, but 8.1 coped without too much tuning. The schema looks like this: What version of DBIx-SearchBuilder do you have on that server? The RT guys usually recommend you have the latest possible so RT is performing the most sane/optimized queries possible for your database. I honestly don't know if it will make a difference for you, but it's worth a shot.
Re: [PERFORM] In progress INSERT wrecks plans on table
On 11/05/13 01:30, Tom Lane wrote: Mark Kirkwood writes: Unfortunately a trigger will not really do the job - analyze ignores in progress rows (unless they were added by the current transaction), and then the changes made by analyze are not seen by any other sessions. So no changes to plans until the entire INSERT is complete and COMMIT happens (which could be a while - too long in our case). I'm not sure I believe the thesis that plans won't change at all. The planner will notice that the physical size of the table is growing. That may not be enough, if the table-contents statistics are missing or completely unreflective of reality, but it's something. It is true that *already cached* plans won't change until after an ANALYZE is done (the key point there being that ANALYZE sends out a shared-inval message to force replanning of plans for the table). Conceivably you could issue concurrent ANALYZEs occasionally while the INSERT is running, not so much to update the stats --- because they wouldn't --- as to force cached-plan invalidation. Yeah - true. I was focusing on the particular type of query illustrated in the test case - one that depends almost entirely on updated selectivity stats for a column, and those unfortunately wouldn't change. Cheers Mark
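Tom's suggestion amounts to something like this, run from a separate session while the bulk INSERT is still in flight (a sketch using the test table from this thread; per his caveat it only forces re-planning, it does not make the stats reflect the uncommitted rows):

```sql
-- Run periodically in another session during the long INSERT.
-- The in-progress rows are ignored by ANALYZE, but the shared-inval
-- message it sends makes other backends discard their cached plans
-- for "plan" and re-plan on next execution.
ANALYZE plan;
```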
Re: [PERFORM] In progress INSERT wrecks plans on table
(See below for the reply) On 10/05/13 22:48, Vitalii Tymchyshyn wrote: Well, could you write a trigger that would do what you need? AFAIR analyze data is stored regardless of transaction boundaries. You could store some counters in session vars and issue an explicit analyze when enough rows were added. On 7 May 2013 08:33, "Mark Kirkwood" (mark.kirkw...@catalyst.net.nz) wrote: On 07/05/13 18:10, Simon Riggs wrote: On 7 May 2013 01:23, mark.kirkw...@catalyst.net.nz wrote: I'm thinking that a variant of (2) might be simpler to implement: (I think Matt C essentially beat me to this suggestion - he originally discovered this issue). It is probably good enough for only *new* plans to react to the increased/increasing number of in progress rows. So this would require backends doing significant numbers of row changes to either directly update pg_statistic or report their in progress numbers to the stats collector. The key change here is the partial execution numbers would need to be sent. Clearly one would need to avoid doing this too often (!) - possibly only when number of changed rows > autovacuum_analyze_scale_factor proportion of the relation concerned or similar. Are you loading using COPY? Why not break down the load into chunks? INSERT - but we could maybe work around it by chunking the INSERT. However that *really* breaks the idea that in SQL you just say what you want, not how the database engine should do it! And more practically means that the most obvious and clear way to add your new data has nasty side effects, and you have to tiptoe around muttering secret incantations to make things work well :-) I'm still thinking that making postgres smarter about having current stats for getting the actual optimal plan is the best solution. Unfortunately a trigger will not really do the job - analyze ignores in progress rows (unless they were added by the current transaction), and then the changes made by analyze are not seen by any other sessions. So no changes to plans until the entire INSERT is complete and COMMIT happens (which could be a while - too long in our case). Figuring out how to improve on this situation is tricky. Cheers Mark
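A sketch of the chunking workaround discussed above, using the test table from the original post (the chunk boundaries are arbitrary); because each piece is a separate top-level statement that autocommits, an ANALYZE between pieces can see the newly added rows:

```sql
-- Instead of one giant INSERT inside a single long transaction,
-- load in separately committed chunks and refresh stats in between.
INSERT INTO plan
SELECT id + 201, typ, current_date + id * '1 seconds'::interval, val
FROM plan
WHERE id <= 100;

ANALYZE plan;  -- the committed chunk is now visible to the planner

INSERT INTO plan
SELECT id + 201, typ, current_date + id * '1 seconds'::interval, val
FROM plan
WHERE id BETWEEN 101 AND 200;

ANALYZE plan;
```

This is exactly the "secret incantation" objection: correctness of plans ends up depending on how the data was loaded, not just what was asked for.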
Re: [PERFORM] In progress INSERT wrecks plans on table
On 07/05/13 19:33, Simon Riggs wrote: On 7 May 2013 07:32, Mark Kirkwood wrote: On 07/05/13 18:10, Simon Riggs wrote: On 7 May 2013 01:23, wrote: I'm thinking that a variant of (2) might be simpler to implement: (I think Matt C essentially beat me to this suggestion - he originally discovered this issue). It is probably good enough for only *new* plans to react to the increased/increasing number of in progress rows. So this would require backends doing significant numbers of row changes to either directly update pg_statistic or report their in progress numbers to the stats collector. The key change here is the partial execution numbers would need to be sent. Clearly one would need to avoid doing this too often (!) - possibly only when number of changed rows > autovacuum_analyze_scale_factor proportion of the relation concerned or similar. Are you loading using COPY? Why not break down the load into chunks? INSERT - but we could maybe work around it by chunking the INSERT. However that *really* breaks the idea that in SQL you just say what you want, not how the database engine should do it! And more practically means that the most obvious and clear way to add your new data has nasty side effects, and you have to tiptoe around muttering secret incantations to make things work well :-) Yes, we'd need to break up SQL statements into pieces and use external transaction snapshots to do that. I'm still thinking that making postgres smarter about having current stats for getting the actual optimal plan is the best solution. I agree. The challenge now is to come up with something that actually works; most of the ideas have been very vague and ignore the many downsides. The hard bit is the analysis and balanced thinking, not the developing. Yeah - seeing likely downsides can be a bit tricky too.
I'll have a play with some prototyping ideas, since this is actually an area of postgres (analyze/stats collector) that I've fiddled with before :-) Cheers Mark
Re: [PERFORM] In progress INSERT wrecks plans on table
On 07/05/13 18:10, Simon Riggs wrote: On 7 May 2013 01:23, wrote: I'm thinking that a variant of (2) might be simpler to implement: (I think Matt C essentially beat me to this suggestion - he originally discovered this issue). It is probably good enough for only *new* plans to react to the increased/increasing number of in progress rows. So this would require backends doing significant numbers of row changes to either directly update pg_statistic or report their in progress numbers to the stats collector. The key change here is the partial execution numbers would need to be sent. Clearly one would need to avoid doing this too often (!) - possibly only when number of changed rows > autovacuum_analyze_scale_factor proportion of the relation concerned or similar. Are you loading using COPY? Why not break down the load into chunks? INSERT - but we could maybe work around it by chunking the INSERT. However that *really* breaks the idea that in SQL you just say what you want, not how the database engine should do it! And more practically means that the most obvious and clear way to add your new data has nasty side effects, and you have to tiptoe around muttering secret incantations to make things work well :-) I'm still thinking that making postgres smarter about having current stats for getting the actual optimal plan is the best solution. Cheers Mark
Re: [PERFORM] In progress INSERT wrecks plans on table
> Simon Riggs wrote: > > Patch works and improves things, but we're still swamped by the block > accesses via the index. Which *might* be enough to stop it making the server go unresponsive, we'll look at the effect of this in the next few days, nice work! > > Which brings me back to Mark's original point, which is that we are > x100 times slower in this case and it *is* because the choice of > IndexScan is a bad one for this situation. > > After some thought on this, I do think we need to do something about > it directly, rather than by tuning infrastructure (as I just > attempted). The root cause here is that IndexScan plans are sensitive > to mistakes in data distribution, much more so than other plan types. > > The two options, broadly, are to either > > 1. avoid IndexScans in the planner unless they have a *significantly* > better cost. At the moment we use IndexScans if cost is lowest, even > if that is only by a whisker. > > 2. make IndexScans adaptive so that they switch to other plan types > mid-way through execution. > > (2) seems fairly hard generically, since we'd have to keep track of > the tids returned from the IndexScan to allow us to switch to a > different plan and avoid re-issuing rows that we've already returned. > But maybe if we adapted the IndexScan plan type so that it adopted a > more page oriented approach internally, it could act like a > bitmapscan. Anyway, that would need some proof that it would work and > sounds like a fair task. > > (1) sounds more easily possible and plausible. At the moment we have > enable_indexscan = off. If we had something like > plan_cost_weight_indexscan = N, we could selectively increase the cost > of index scans so that they would be less likely to be selected. i.e. > plan_cost_weight_indexscan = 2 would mean an indexscan would need to > be half the cost of any other plan before it was selected. (parameter > name selected so it could apply to all parameter types).
The reason to > apply this weighting would be to calculate "risk adjusted cost" not > just estimated cost. > I'm thinking that a variant of (2) might be simpler to implement: (I think Matt C essentially beat me to this suggestion - he originally discovered this issue). It is probably good enough for only *new* plans to react to the increased/increasing number of in progress rows. So this would require backends doing significant numbers of row changes to either directly update pg_statistic or report their in progress numbers to the stats collector. The key change here is the partial execution numbers would need to be sent. Clearly one would need to avoid doing this too often (!) - possibly only when number of changed rows > autovacuum_analyze_scale_factor proportion of the relation concerned or similar. regards Mark
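The proposed plan_cost_weight_indexscan parameter does not exist; the nearest existing knob is the much blunter per-session cost penalty applied by enable_indexscan (a sketch, using the test table from this thread):

```sql
-- enable_indexscan = off does not truly forbid index scans; it adds a
-- large cost penalty, so they are chosen only when nothing else can
-- satisfy the query. Bitmap scans remain available.
SET enable_indexscan = off;
EXPLAIN SELECT * FROM plan WHERE typ = 3 AND dat IS NOT NULL;
RESET enable_indexscan;
```

The all-or-nothing nature of this switch is precisely what the proposed weighting parameter was meant to improve on.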
Re: [PERFORM] In progress INSERT wrecks plans on table
On 05/05/13 00:49, Simon Riggs wrote: On 3 May 2013 13:41, Simon Riggs wrote: (3) to make the check on TransactionIdIsInProgress() into a heuristic, since we don't *need* to check that, so if we keep checking the same xid repeatedly we can reduce the number of checks or avoid xids that seem to be long running. That's slightly more coding than my quick hack here but seems worth it. I think we need both (1) and (3) but the attached patch does just (1). This is a similar optimisation to the one I introduced for TransactionIdIsKnownCompleted(), except this applies to repeated checking of as yet-incomplete xids, and to bulk concurrent transactions. ISTM we can improve performance of TransactionIdIsInProgress() by caching the procno of our last xid. Mark, could you retest with both these patches? Thanks. Thanks Simon, will do and report back.
Re: [PERFORM] In progress INSERT wrecks plans on table
> mark.kirkw...@catalyst.net.nz wrote on 03.05.2013 00:19: >> I think the idea of telling postgres that we are doing a load is >> probably >> the wrong way to go about this. We have a framework that tries to >> automatically figure out the best plans...I think some more thought >> about >> how to make that understand some of the more subtle triggers for a >> time-to-do-new-plans moment is the way to go. I understand this is >> probably hard - and may imply some radical surgery to how the stats >> collector and planner interact. > > I wonder if "freezing" (analyze, then disable autovacuum) the statistics > for the large number of rows would work. > > > I'm thinking that the issue is actually the opposite - it is that a new plan is needed because the new (uncommitted) rows are changing the data distribution. So we want more plan instability rather than plan stability :-) Cheers Mark
Re: [PERFORM] In progress INSERT wrecks plans on table
> On 2 May 2013 01:49, Mark Kirkwood wrote: > > I think we need a problem statement before we attempt a solution, > which is what Tom is alluding to. > Actually no - I think Tom (quite correctly) was saying that the patch was not a viable solution. With which I agree. I believe the title of this thread is the problem statement. > ISTM that you've got a case where the plan is very sensitive to a > table load. Which is a pretty common situation and one that can be > solved in various ways. I don't see much that Postgres can do because > it can't know ahead of time you're about to load rows. We could > imagine an optimizer that set thresholds on plans that caused the > whole plan to be recalculated half way thru a run, but that would be a > lot of work to design and implement and even harder to test. Having > static plans at least allows us to discuss what it does after the fact > with some ease. > > The plan is set using stats that are set when there are very few > non-NULL rows, and those increase massively on load. The way to cope > is to run the ANALYZE immediately after the load and then don't allow > auto-ANALYZE to reset them later. No. We do run analyze immediately after the load. The surprise was that this was not sufficient - the (small) amount of time where non optimal plans were being used due to the in progress row activity was enough to cripple the system - that is the problem. The analysis of why not led to the test case included in the original email. And sure it is deliberately crafted to display the issue, and is therefore open to criticism for being artificial. However it was purely meant to make it easy to see what I was talking about. Currently we are working around this by coercing one of the predicates in the query to discourage the attractive looking but dangerous index. I think the idea of telling postgres that we are doing a load is probably the wrong way to go about this. 
We have a framework that tries to automatically figure out the best plans...I think some more thought about how to make that understand some of the more subtle triggers for a time-to-do-new-plans moment is the way to go. I understand this is probably hard - and may imply some radical surgery to how the stats collector and planner interact. Regards Mark
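The predicate-coercion workaround mentioned above can be sketched against the thread's test table like this (the exact no-op expression is illustrative; any rewrite that stops the planner matching the column to its index will do):

```sql
-- plan_dat is a btree index on (dat). Wrapping dat in a no-op
-- expression means the predicate no longer matches that index,
-- steering the planner away from the risky index scan on plan_dat.
EXPLAIN SELECT *
FROM plan
WHERE typ = 3
  AND (dat + '0 seconds'::interval) IS NOT NULL;
```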
Re: [PERFORM] In progress INSERT wrecks plans on table
On 02/05/13 02:06, Tom Lane wrote: Mark Kirkwood writes: I am concerned that the deafening lack of any replies to my original message is a result of folk glancing at your original quick reply and thinking... incomplete problem spec...ignore... when that is not the case - yes I should have muttered "9.2" in the original email, but we have covered that now. No, I think it's more that we're trying to get to beta, and so anything that looks like new development is getting shuffled to folks' "to look at later" queues. The proposed patch is IMO a complete nonstarter anyway; but I'm not sure what a less bogus solution would look like. Yeah, I did think that beta might be consuming everyone's attention (of course immediately *after* sending the email)! And yes, the patch was merely to illustrate the problem rather than any serious attempt at a solution. Regards Mark
Re: [PERFORM] In progress INSERT wrecks plans on table
On 26/04/13 15:34, Gavin Flower wrote: On 26/04/13 15:19, Mark Kirkwood wrote: While in general you are quite correct - in the above case (particularly as I've supplied a test case) it should be pretty obvious that any moderately modern version of postgres on any supported platform will exhibit this. > While I admit that I did not look closely at your test case - I am aware that several times changes to Postgres from one minor version to another, can have drastic unintended side effects (which might, or might not, be relevant to your situation). Besides, it helps set the scene, and is one less thing that needs to be deduced. Indeed - however, my perhaps slightly grumpy reply to your email was based on an impression of over-keenness to dismiss my message without reading it (!) and a - dare I say it - one-size-fits-all presentation of "here are the hoops to jump through". Now I spent a reasonable amount of time preparing the message and its attendant test case - and a comment such as yours based on *not reading it* ...errrm... well, let's say I think we can/should do better. I am concerned that the deafening lack of any replies to my original message is a result of folk glancing at your original quick reply and thinking... incomplete problem spec...ignore... when that is not the case - yes, I should have muttered "9.2" in the original email, but we have covered that now. Regards Mark
Re: [PERFORM] Bad Execution Plan with "OR" Clauses Across Outer-Joined Tables
It's an interesting idea; however, when I rewrote the original query to use "WHERE EXISTS" rather than "WHERE IN", I got the same bad execution plan. I think this really has to do with the Postgres optimizer's limitations with respect to outer joins. In my case it's certainly possible to rewrite the query by hand to eliminate the outer join and get the same results. And after posting the original problem, I have also found that with some work it's possible to make Hibernate generate a query that eliminates the outer join and get the same results. But teaching the Postgres optimizer to handle such cases would be a nice improvement. Then again, having lived through many years of Oracle optimizer bugs, it might be easier said than done. On Tue, Apr 30, 2013 at 3:24 PM, Vitalii Tymchyshyn wrote: > What I can say is that hibernate has "exists" in both HQL and criteria API > (e.g. see > http://www.cereslogic.com/pages/2008/09/22/hibernate-criteria-subqueries-exists/ > for > criteria).
So, maybe it's easier for you to tune your hibernate query to > use exists > > > 2013/4/30 Mark Hampton > >> I have a Hibernate-generated query (That's not going to change, so let's >> just focus on the Postgres side for now) like this: >> >> SELECT * >> from PERSON p >> where p.PERSON_ID in ( >> select distinct p2.PERSON_ID >> from PERSON p2 >> left outer join PERSON_ALIAS pa on >> p2.PERSON_ID = pa.PERSON_ID >> where (lower(p2.SURNAME) = 'duck' or >> lower(pa.SURNAME) = 'duck') and >> (lower(p2.FORENAME) = 'donald' or >> lower(pa.FORENAME) = 'donald') >> ) >> order by p.PERSON_ID asc; >> >> There are function-based indexes on PERSON and PERSON_ALIAS as follows: >> >> CREATE INDEX PERSON_FORENAME_LOWER_FBIDX ON PERSON (LOWER(FORENAME) >> VARCHAR_PATTERN_OPS); >> CREATE INDEX PERSON_SURNAME_LOWER_FBIDX ON PERSON (LOWER(SURNAME) >> VARCHAR_PATTERN_OPS); >> CREATE INDEX PERSON_ALIAS_FORENAME_LOWER_FBIDX ON PERSON_ALIAS >> (LOWER(FORENAME) VARCHAR_PATTERN_OPS); >> CREATE INDEX PERSON_ALIAS_SURNAME_LOWER_FBIDX ON PERSON_ALIAS >> (LOWER(SURNAME) VARCHAR_PATTERN_OPS); >> >> The problem is that the above query doesn't use the indexes. The "or" >> clauses across the outer-join seem to be the culprit.
If I rewrite the >> query as follows, Postgres will use the index: >> >> SELECT * >> from PERSON p >> where (p.PERSON_ID in ( >> select p2.PERSON_ID >> from TRAVELER.PERSON p2 >> join TRAVELER.OTHER_NAME pa on p2.PERSON_ID = >> pa.PERSON_ID >> where lower(p2.SURNAME) = 'duck' and >> lower(pa.FORENAME) = 'donald' >> ) or >> p.PERSON_ID in ( >> select p2.PERSON_ID >> from TRAVELER.PERSON p2 >> join TRAVELER.OTHER_NAME pa on p2.PERSON_ID = >> pa.PERSON_ID >> where lower(pa.SURNAME) = 'duck' and >> lower(p2.FORENAME) = 'donald' >> ) or >> p.PERSON_ID in ( >> select p2.PERSON_ID >> from TRAVELER.PERSON p2 >> where lower(p2.SURNAME) = 'duck' and >> lower(p2.FORENAME) = 'donald' >> ) or >> p.PERSON_ID in ( >> select pa.PERSON_ID >> from TRAVELER.OTHER_NAME pa >> where lower(pa.SURNAME) = 'duck' and >> lower(pa.FORENAME) = 'donald' >> )) >> order by p.PERSON_ID asc; >> >> So my question is this: Is there a way to get the Postgres optimizer to >> "rewrite" the query execution plan to use the equivalent, but much more >> efficient latter form? >> >> And before you ask: yes, there are better ways of writing this query. >> But we're dealing with Java developers and Hibernate here. It's a legacy >> system, and the policy is to avoid hand-written SQL, so for the moment >> let's not go down that rabbit hole, and focus on the issue of what the >> optimizer can and cannot do. >> > > > > -- > Best regards, > Vitalii Tymchyshyn >
[PERFORM] Bad Execution Plan with "OR" Clauses Across Outer-Joined Tables
I have a Hibernate-generated query (that's not going to change, so let's just focus on the Postgres side for now) like this:

SELECT *
from PERSON p
where p.PERSON_ID in (
    select distinct p2.PERSON_ID
    from PERSON p2
        left outer join PERSON_ALIAS pa on p2.PERSON_ID = pa.PERSON_ID
    where (lower(p2.SURNAME) = 'duck' or lower(pa.SURNAME) = 'duck') and
          (lower(p2.FORENAME) = 'donald' or lower(pa.FORENAME) = 'donald')
)
order by p.PERSON_ID asc;

There are function-based indexes on PERSON and PERSON_ALIAS as follows:

CREATE INDEX PERSON_FORENAME_LOWER_FBIDX ON PERSON (LOWER(FORENAME) VARCHAR_PATTERN_OPS);
CREATE INDEX PERSON_SURNAME_LOWER_FBIDX ON PERSON (LOWER(SURNAME) VARCHAR_PATTERN_OPS);
CREATE INDEX PERSON_ALIAS_FORENAME_LOWER_FBIDX ON PERSON_ALIAS (LOWER(FORENAME) VARCHAR_PATTERN_OPS);
CREATE INDEX PERSON_ALIAS_SURNAME_LOWER_FBIDX ON PERSON_ALIAS (LOWER(SURNAME) VARCHAR_PATTERN_OPS);

The problem is that the above query doesn't use the indexes. The "or" clauses across the outer-join seem to be the culprit. If I rewrite the query as follows, Postgres will use the index:

SELECT *
from PERSON p
where (p.PERSON_ID in (
        select p2.PERSON_ID
        from TRAVELER.PERSON p2
            join TRAVELER.OTHER_NAME pa on p2.PERSON_ID = pa.PERSON_ID
        where lower(p2.SURNAME) = 'duck' and lower(pa.FORENAME) = 'donald'
    ) or
    p.PERSON_ID in (
        select p2.PERSON_ID
        from TRAVELER.PERSON p2
            join TRAVELER.OTHER_NAME pa on p2.PERSON_ID = pa.PERSON_ID
        where lower(pa.SURNAME) = 'duck' and lower(p2.FORENAME) = 'donald'
    ) or
    p.PERSON_ID in (
        select p2.PERSON_ID
        from TRAVELER.PERSON p2
        where lower(p2.SURNAME) = 'duck' and lower(p2.FORENAME) = 'donald'
    ) or
    p.PERSON_ID in (
        select pa.PERSON_ID
        from TRAVELER.OTHER_NAME pa
        where lower(pa.SURNAME) = 'duck' and lower(pa.FORENAME) = 'donald'
    ))
order by p.PERSON_ID asc;

So my question is this: Is there a way to get the Postgres optimizer to "rewrite" the query execution plan to use the equivalent, but much more efficient, latter form?
And before you ask: yes, there are better ways of writing this query. But we're dealing with Java developers and Hibernate here. It's a legacy system, and the policy is to avoid hand-written SQL, so for the moment let's not go down that rabbit hole, and focus on the issue of what the optimizer can and cannot do.
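For completeness, the four IN sub-selects in the hand rewrite above can also be collapsed into a single UNION (shown only as a sketch - the premise of the thread is that hand-written SQL is off the table here), which likewise keeps each arm index-friendly:

```sql
SELECT p.*
FROM PERSON p
WHERE p.PERSON_ID IN (
    SELECT p2.PERSON_ID
    FROM PERSON p2 JOIN PERSON_ALIAS pa ON p2.PERSON_ID = pa.PERSON_ID
    WHERE lower(p2.SURNAME) = 'duck' AND lower(pa.FORENAME) = 'donald'
    UNION
    SELECT p2.PERSON_ID
    FROM PERSON p2 JOIN PERSON_ALIAS pa ON p2.PERSON_ID = pa.PERSON_ID
    WHERE lower(pa.SURNAME) = 'duck' AND lower(p2.FORENAME) = 'donald'
    UNION
    SELECT p2.PERSON_ID
    FROM PERSON p2
    WHERE lower(p2.SURNAME) = 'duck' AND lower(p2.FORENAME) = 'donald'
    UNION
    SELECT pa.PERSON_ID
    FROM PERSON_ALIAS pa
    WHERE lower(pa.SURNAME) = 'duck' AND lower(pa.FORENAME) = 'donald'
)
ORDER BY p.PERSON_ID ASC;
```

UNION (rather than UNION ALL) also removes the need for the DISTINCT in the original sub-select.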
Re: [PERFORM] In progress INSERT wrecks plans on table
On 26/04/13 14:56, Gavin Flower wrote: On 26/04/13 14:33, Mark Kirkwood wrote: Recently we encountered the following unhappy sequence of events:
1/ system running happily
2/ batch load into table begins
3/ very quickly (some) preexisting queries on said table go orders of magnitude slower
4/ database instance becomes unresponsive
5/ application outage
After looking down a few false leads, we've isolated the cause to the following: The accumulating in-progress row changes are such that previously optimal plans are optimal no longer. Now this situation will fix itself when the next autoanalyze happens (and a new plan will be chosen) - however that cannot occur until the batch load is completed and committed (approx 70 seconds). However during that time there is enough of a performance degradation for queries still using the old plan to cripple the server. Now that we know what is happening we can work around it. But I'm wondering - is there any way (or if not should there be one) to let postgres handle this automatically? I experimented with a quick hack to src/backend/commands/analyze.c (attached) that lets another session's ANALYZE see in progress rows - which works but a) may cause other problems and b) does not help autoanalyze, which has to wait for COMMIT + stats message.
I've attached a (synthetic) test case that shows the issue, I'll reproduce the output below to hopefully make the point obvious: Table "public.plan" Column |Type | Modifiers +-+--- id | integer | not null typ| integer | not null dat| timestamp without time zone | val| text| not null Indexes: "plan_id" UNIQUE, btree (id) "plan_dat" btree (dat) "plan_typ" btree (typ) [Session 1] EXPLAIN ANALYZE SELECT * FROM plan WHERE typ = 3 AND dat IS NOT NULL; QUERY PLAN - Index Scan using plan_dat on plan (cost=0.00..265.47 rows=55 width=117) (actual time=0.130..4.409 rows=75 loops=1) Index Cond: (dat IS NOT NULL) Filter: (typ = 3) Rows Removed by Filter: 5960 Total runtime: 4.440 ms (5 rows) [Session 2] BEGIN; INSERT INTO plan SELECT id + 201,typ,current_date + id * '1 seconds'::interval ,val FROM plan ; [Session 1] EXPLAIN ANALYZE SELECT * FROM plan WHERE typ = 3 AND dat IS NOT NULL; QUERY PLAN --- Index Scan using plan_dat on plan (cost=0.00..551.35 rows=91 width=117) (actual time=0.131..202.699 rows=75 loops=1) Index Cond: (dat IS NOT NULL) Filter: (typ = 3) Rows Removed by Filter: 5960 Total runtime: 202.729 ms (5 rows) [Session 2] COMMIT; [Session 1...wait for autoanalyze to finish then] EXPLAIN ANALYZE SELECT * FROM plan WHERE typ = 3 AND dat IS NOT NULL; QUERY PLAN --- Bitmap Heap Scan on plan (cost=407.87..44991.95 rows=10116 width=117) (actual time=2.692..6.582 rows=75 loops=1) Recheck Cond: (typ = 3) Filter: (dat IS NOT NULL) Rows Removed by Filter: 19925 -> Bitmap Index Scan on plan_typ (cost=0.00..405.34 rows=20346 width=0) (actual time=2.573..2.573 rows=2 loops=1) Index Cond: (typ = 3) Total runtime: 6.615 ms Regards Mark Hmm... You need to specify: 1. version of Postgres 2. Operating system 3. changes to postgresql.conf 4. CPU/RAM etc 5. 
anything else that might be relevant While in general you are quite correct - in the above case (particularly as I've supplied a test case) it should be pretty obvious that any moderately modern version of postgres on any supported platform will exhibit this. I produced the above test case on Postgres 9.2.4 Ubuntu 13.04, with no changes to the default postgresql.conf Now our actual production server is a 32 CPU box with 512GB RAM, and 16 SAS SSD running Postgres 9.2.4 on Ubuntu 12.04. And yes there are quite a few changes from the defaults there - and I wasted quite a lot of time chasing issues with high CPU and RAM, and changing various configs to see if they helped - before identifying that the issue was in progress row changes and planner statistics. Also in the "real" case with much bigger datasets the difference between the plan being optimal and it *not* being optimal is a factor of 2000x elapsed time instead of a mere 50x ! regards Mark -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
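The "wait for autoanalyze to finish" step in the test case above can be observed directly from the statistics views rather than by guessing. A minimal sketch (the view and columns are standard `pg_stat_user_tables`; the table name is from the test case):

```sql
-- Check when the planner statistics for "plan" were last refreshed;
-- once last_autoanalyze advances past the batch COMMIT, the new
-- bitmap plan becomes available to subsequent queries.
SELECT relname, last_analyze, last_autoanalyze
FROM pg_stat_user_tables
WHERE relname = 'plan';
```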
[PERFORM] In progress INSERT wrecks plans on table
Recently we encountered the following unhappy sequence of events:

1/ system running happily
2/ batch load into table begins
3/ very quickly (some) preexisting queries on said table go orders of magnitude slower
4/ database instance becomes unresponsive
5/ application outage

After looking down a few false leads, we've isolated the cause to the following: the accumulating in-progress row changes are such that previously optimal plans are optimal no longer. Now this situation will fix itself when the next autoanalyze happens (and a new plan will be chosen) - however that cannot occur until the batch load is completed and committed (approx 70 seconds). During that time there is enough of a performance degradation for queries still using the old plan to cripple the server.

Now that we know what is happening we can work around it. But I'm wondering - is there any way (or, if not, should there be one) to let Postgres handle this automatically? I experimented with a quick hack to src/backend/commands/analyze.c (attached) that lets another session's ANALYZE see in-progress rows - which works, but a) may cause other problems and b) does not help autoanalyze, which has to wait for COMMIT + stats message.

I've attached a (synthetic) test case that shows the issue; I'll reproduce the output below to hopefully make the point obvious:

                 Table "public.plan"
 Column |            Type             | Modifiers
--------+-----------------------------+-----------
 id     | integer                     | not null
 typ    | integer                     | not null
 dat    | timestamp without time zone |
 val    | text                        | not null
Indexes:
    "plan_id" UNIQUE, btree (id)
    "plan_dat" btree (dat)
    "plan_typ" btree (typ)

[Session 1]
EXPLAIN ANALYZE SELECT * FROM plan WHERE typ = 3 AND dat IS NOT NULL;
                                 QUERY PLAN
-----------------------------------------------------------------------------
 Index Scan using plan_dat on plan  (cost=0.00..265.47 rows=55 width=117) (actual time=0.130..4.409 rows=75 loops=1)
   Index Cond: (dat IS NOT NULL)
   Filter: (typ = 3)
   Rows Removed by Filter: 5960
 Total runtime: 4.440 ms
(5 rows)

[Session 2]
BEGIN;
INSERT INTO plan
SELECT id + 201, typ, current_date + id * '1 seconds'::interval, val
FROM plan;

[Session 1]
EXPLAIN ANALYZE SELECT * FROM plan WHERE typ = 3 AND dat IS NOT NULL;
                                 QUERY PLAN
-----------------------------------------------------------------------------
 Index Scan using plan_dat on plan  (cost=0.00..551.35 rows=91 width=117) (actual time=0.131..202.699 rows=75 loops=1)
   Index Cond: (dat IS NOT NULL)
   Filter: (typ = 3)
   Rows Removed by Filter: 5960
 Total runtime: 202.729 ms
(5 rows)

[Session 2]
COMMIT;

[Session 1 ...wait for autoanalyze to finish, then]
EXPLAIN ANALYZE SELECT * FROM plan WHERE typ = 3 AND dat IS NOT NULL;
                                 QUERY PLAN
-----------------------------------------------------------------------------
 Bitmap Heap Scan on plan  (cost=407.87..44991.95 rows=10116 width=117) (actual time=2.692..6.582 rows=75 loops=1)
   Recheck Cond: (typ = 3)
   Filter: (dat IS NOT NULL)
   Rows Removed by Filter: 19925
   ->  Bitmap Index Scan on plan_typ  (cost=0.00..405.34 rows=20346 width=0) (actual time=2.573..2.573 rows=2 loops=1)
         Index Cond: (typ = 3)
 Total runtime: 6.615 ms

Regards

Mark

plan.tar.gz
Description: application/gzip

*** analyze.c.orig      2013-04-26 10:40:06.634942283 +1200
--- analyze.c   2013-04-26 11:36:13.537404101 +1200
***************
*** 1173,1183 ****
             * has to adjust the numbers we send to the stats
             * collector to make this come out right.)
             */
!           if (TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetXmin(targtuple.t_data)))
!           {
!               sample_it = true;
!               liverows += 1;
!           }
            break;
        case HEAPTUPLE_DELETE_IN_PROGRESS:
--- 1173,1181 ----
             * has to adjust the numbers we send to the stats
             * collector to make this come out right.)
             */
!           /* Amend so insert in progress tuples are counted */
!           sample_it = true;
!           liverows += 1;
            break;
        case HEAPTUPLE_DELETE_IN_PROGRESS:
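Mark's "we can work around it" is not spelled out in the thread. One workaround consistent with his diagnosis is to commit the batch load in smaller chunks, so the new rows become visible to (auto)analyze while the load is still running. The sketch below is a hypothetical illustration against the test-case table - the chunk boundaries and the explicit ANALYZE are assumptions, not the poster's actual fix:

```sql
-- Instead of one ~70-second transaction, commit in chunks so statistics
-- (and therefore plans) can be refreshed as the load progresses.
BEGIN;
INSERT INTO plan
SELECT id + 201, typ, current_date + id * '1 seconds'::interval, val
FROM plan
WHERE id <= 100;                      -- illustrative chunk boundary
COMMIT;

ANALYZE plan;                         -- or rely on autoanalyze between chunks

BEGIN;
INSERT INTO plan
SELECT id + 201, typ, current_date + id * '1 seconds'::interval, val
FROM plan
WHERE id > 100 AND id <= 200;
COMMIT;
```

The trade-off is losing all-or-nothing atomicity for the batch, which is why the original post asks whether Postgres could handle the statistics problem automatically instead.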
Re: [PERFORM] INDEX Performance Issue
Hi Jeff,

I've tried this test using the -S flag: './pgbench -c 4 -j 2 -T 600 -S pgbench'

Desktop gives me:

./pgbench -c 4 -j 2 -T 600 -S pgbench
starting vacuum...end.
transaction type: SELECT only
scaling factor: 1
query mode: simple
number of clients: 4
number of threads: 2
duration: 600 s
number of transactions actually processed: 35261835
tps = 58769.715695 (including connections establishing)
tps = 58770.258977 (excluding connections establishing)

Server:

./pgbench -c 4 -j 2 -T 600 -S pgbench
starting vacuum...end.
transaction type: SELECT only
scaling factor: 1
query mode: simple
number of clients: 4
number of threads: 2
duration: 600 s
number of transactions actually processed: 22642303
tps = 37737.157641 (including connections establishing)
tps = 37738.167325 (excluding connections establishing)

On 8 April 2013 21:39, Jeff Janes wrote:
> On Mon, Apr 8, 2013 at 12:31 PM, Mark Davidson wrote:
>
>> Thanks for your response Vasilis. I've run pgbench on both machines
>> `./pgbench -c 10 -t 1 pgbench`, getting 99.800650 tps on my local
>> machine and 23.825332 tps on the server, so quite a significant difference.
>
> These results are almost certainly being driven by how fast your machines
> can fsync the WAL data. The type of query you originally posted does not
> care about that at all, so these results are not useful to you. You could
> run "pgbench -S", which is getting closer to the nature of the work
> your original query does (but still not all that close).
>
> Cheers,
>
> Jeff
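As a quick sanity check, the reported tps figures line up with transactions / duration for the two 600-second runs above (shell integer arithmetic, so the results are truncated):

```shell
desktop_tx=35261835   # desktop: transactions processed in 600 s
server_tx=22642303    # server: transactions processed in 600 s
dur=600

echo $((desktop_tx / dur))   # 58769 -- matches the reported 58769.7 tps
echo $((server_tx / dur))    # 37737 -- matches the reported 37737.2 tps
```

The desktop sustains roughly 1.56x the server's read-only throughput, a much smaller gap than the ~4x seen in the fsync-bound default (TPC-B) runs.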
Re: [PERFORM] INDEX Performance Issue
Sorry Vasilis, I missed you asking that. I just did './pgbench -i pgbench' and didn't specify any values. I can try some specific ones if you can suggest any.

On 8 April 2013 21:28, Vasilis Ventirozos wrote:
>
> On Mon, Apr 8, 2013 at 11:18 PM, Mark Davidson wrote:
>
>> Wow my results are absolutely appalling compared to both of those, which
>> is really interesting. Are you running Postgres 9.2.4 on both instances?
>> Any specific configuration changes?
>> Thinking there must be something up with my setup to be getting such a
>> low tps compared with you.
>
> Both installations are 9.2.4 and both db's have absolutely default
> configurations. I can't really explain why there is so much difference
> between our results; I can only imagine the initialization, which is why I
> asked how you populated your pgbench database (scale factor / fill factor).
>
> Vasilis Ventirozos
>
>> On 8 April 2013 21:02, Vasilis Ventirozos wrote:
>>
>>> -c 10 means 10 clients, so that should take advantage of all your cores
>>> (see below):
>>>
>>> %Cpu0 : 39.3 us, 21.1 sy, 0.0 ni, 38.7 id, 0.9 wa, 0.0 hi, 0.0 si, 0.0 st
>>> %Cpu1 : 38.0 us, 25.0 sy, 0.0 ni, 26.0 id, 4.2 wa, 0.0 hi, 6.8 si, 0.0 st
>>> %Cpu2 : 39.3 us, 20.4 sy, 0.0 ni, 39.0 id, 1.3 wa, 0.0 hi, 0.0 si, 0.0 st
>>> %Cpu3 : 40.0 us, 18.7 sy, 0.0 ni, 40.0 id, 1.3 wa, 0.0 hi, 0.0 si, 0.0 st
>>> %Cpu4 : 13.9 us, 7.1 sy, 0.0 ni, 79.1 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
>>> %Cpu5 : 13.1 us, 8.4 sy, 0.0 ni, 78.5 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
>>> %Cpu6 : 14.8 us, 6.4 sy, 0.0 ni, 78.8 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
>>> %Cpu7 : 15.7 us, 6.7 sy, 0.0 ni, 77.7 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
>>>
>>> I am pasting the results of the same test on an i7-2600 with 16GB and a
>>> SATA3 SSD, and the results from a VM with 2 cores and a normal 7200 rpm hdd.
>>>
>>> -- DESKTOP
>>> vasilis@Disorder ~ $ pgbench -c 10 -t 1 bench
>>> starting vacuum...end.
>>> transaction type: TPC-B (sort of)
>>> scaling factor: 1
>>> query mode: simple
>>> number of clients: 10
>>> number of threads: 1
>>> number of transactions per client: 1
>>> number of transactions actually processed: 10/10
>>> tps = 1713.338111 (including connections establishing)
>>> tps = 1713.948478 (excluding connections establishing)
>>>
>>> -- VM
>>> postgres@pglab1:~/postgresql-9.2.4/contrib/pgbench$ ./pgbench -c 10 -t 1 bench
>>> starting vacuum...end.
>>> transaction type: TPC-B (sort of)
>>> scaling factor: 1
>>> query mode: simple
>>> number of clients: 10
>>> number of threads: 1
>>> number of transactions per client: 1
>>> number of transactions actually processed: 10/10
>>> tps = 1118.976496 (including connections establishing)
>>> tps = 1119.180126 (excluding connections establishing)
>>>
>>> I am assuming that you didn't populate your pgbench db with the default
>>> values; if you tell me how you did, I will be happy to re-run the test and
>>> see the differences.
>>>
>>> On Mon, Apr 8, 2013 at 10:31 PM, Mark Davidson wrote:
>>>
>>>> Thanks for your response Vasilis. I've run pgbench on both machines
>>>> `./pgbench -c 10 -t 1 pgbench`, getting 99.800650 tps on my local
>>>> machine and 23.825332 tps on the server, so quite a significant difference.
>>>> Could this purely be down to the CPU clock speed or is it likely
>>>> something else causing the issue?
>>>> I have run ANALYZE on both databases and tried the queries a number of
>>>> times on each to make sure the results are consistent; this is the case.
>>>>
>>>> On 8 April 2013 18:19, Vasilis Ventirozos wrote:
>>>>
>>>>> Hello Mark,
>>>>> PostgreSQL currently doesn't support parallel query, so a faster CPU,
>>>>> even if it has fewer cores, would be faster for a single query. For
>>>>> benchmarking you can try pgbench, which you will find in contrib.
>>>>> The execution plan may be different because of different statistics;
>>>>> have you analyzed both databases when you compared the execution plans?
>>>>>
>>>>> Vasilis Ventirozos
>>>>>
>>>>> Been trying to progress with this t
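The "scale factor / fill factor" question above matters because `pgbench -i` without `-s` builds the smallest possible database: each unit of scale factor corresponds to 100,000 rows in pgbench_accounts, so every run in this thread (scaling factor: 1) fits trivially in cache and mostly measures fsync and CPU rather than the disks. A quick illustration (the `-s 100` value is just an example, not a recommendation from the thread):

```shell
# pgbench -i -s <scale> populates pgbench_accounts with scale * 100,000 rows,
# e.g.:  pgbench -i -s 100 bench
scale=100
echo $((scale * 100000))   # 10000000 rows in pgbench_accounts
```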
Re: [PERFORM] INDEX Performance Issue
Wow my results are absolutely appalling compared to both of those, which is really interesting. Are you running Postgres 9.2.4 on both instances? Any specific configuration changes? Thinking there must be something up with my setup to be getting such a low tps compared with you.

On 8 April 2013 21:02, Vasilis Ventirozos wrote:
>
> -c 10 means 10 clients, so that should take advantage of all your cores
> (see below):
>
> %Cpu0 : 39.3 us, 21.1 sy, 0.0 ni, 38.7 id, 0.9 wa, 0.0 hi, 0.0 si, 0.0 st
> %Cpu1 : 38.0 us, 25.0 sy, 0.0 ni, 26.0 id, 4.2 wa, 0.0 hi, 6.8 si, 0.0 st
> %Cpu2 : 39.3 us, 20.4 sy, 0.0 ni, 39.0 id, 1.3 wa, 0.0 hi, 0.0 si, 0.0 st
> %Cpu3 : 40.0 us, 18.7 sy, 0.0 ni, 40.0 id, 1.3 wa, 0.0 hi, 0.0 si, 0.0 st
> %Cpu4 : 13.9 us, 7.1 sy, 0.0 ni, 79.1 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
> %Cpu5 : 13.1 us, 8.4 sy, 0.0 ni, 78.5 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
> %Cpu6 : 14.8 us, 6.4 sy, 0.0 ni, 78.8 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
> %Cpu7 : 15.7 us, 6.7 sy, 0.0 ni, 77.7 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
>
> I am pasting the results of the same test on an i7-2600 with 16GB and a
> SATA3 SSD, and the results from a VM with 2 cores and a normal 7200 rpm hdd.
>
> -- DESKTOP
> vasilis@Disorder ~ $ pgbench -c 10 -t 1 bench
> starting vacuum...end.
> transaction type: TPC-B (sort of)
> scaling factor: 1
> query mode: simple
> number of clients: 10
> number of threads: 1
> number of transactions per client: 1
> number of transactions actually processed: 10/10
> tps = 1713.338111 (including connections establishing)
> tps = 1713.948478 (excluding connections establishing)
>
> -- VM
> postgres@pglab1:~/postgresql-9.2.4/contrib/pgbench$ ./pgbench -c 10 -t 1 bench
> starting vacuum...end.
> transaction type: TPC-B (sort of)
> scaling factor: 1
> query mode: simple
> number of clients: 10
> number of threads: 1
> number of transactions per client: 1
> number of transactions actually processed: 10/10
> tps = 1118.976496 (including connections establishing)
> tps = 1119.180126 (excluding connections establishing)
>
> I am assuming that you didn't populate your pgbench db with the default
> values; if you tell me how you did, I will be happy to re-run the test and
> see the differences.
>
> On Mon, Apr 8, 2013 at 10:31 PM, Mark Davidson wrote:
>
>> Thanks for your response Vasilis. I've run pgbench on both machines
>> `./pgbench -c 10 -t 1 pgbench`, getting 99.800650 tps on my local
>> machine and 23.825332 tps on the server, so quite a significant difference.
>> Could this purely be down to the CPU clock speed or is it likely
>> something else causing the issue?
>> I have run ANALYZE on both databases and tried the queries a number of
>> times on each to make sure the results are consistent; this is the case.
>>
>> On 8 April 2013 18:19, Vasilis Ventirozos wrote:
>>
>>> Hello Mark,
>>> PostgreSQL currently doesn't support parallel query, so a faster CPU,
>>> even if it has fewer cores, would be faster for a single query. For
>>> benchmarking you can try pgbench, which you will find in contrib.
>>> The execution plan may be different because of different statistics;
>>> have you analyzed both databases when you compared the execution plans?
>>>
>>> Vasilis Ventirozos
>>>
>>>> Been trying to progress with this today. Decided to setup the database
>>>> on my local machine to try a few things and I'm getting much more sensible
>>>> results and a totally different query plan http://explain.depesz.com/s/KGd
>>>> - in this case the query took about a minute but does sometimes take
>>>> around 80 seconds.
>>>>
>>>> The config is exactly the same between the two databases. The databases
>>>> themselves are identical, with all indexes the same on the tables.
>>>>
>>>> The server has 2 x Intel Xeon E5420 running at 2.5GHz each, 16GB RAM,
>>>> and the database is just on a SATA HDD, which is a Western Digital
>>>> WD5000AAKS.
>>>> My desktop has a single i5-3570K running at 3.4GHz, 16GB RAM, and the
>>>> database is running on a SATA HDD, which is a Western Digital WD1002FAEX-0.
>>>>
>>>> Could anyone offer any reasoning as to why the plan would be so
>>>> different across the two machines? I would have thought that the server
>>>> would perform a lot better since it has more cores, or is Postgres more
>>>> affected by the CPU speed? Could anyone suggest a way to benchmark the
>>>> machines for their postgr