[PERFORM] pgtune + configurations with 9.3
Greetings all, I'm trying to wrap my head around updating my configuration files, which have been largely static since before 8.4. I've got some beefy hardware, but some tables are over 57GB raw and end up at 140GB after indexes are applied. One index creation took 7 hours today, so it's time to dive in and see where I'm lacking and what I should be tweaking.

I looked at pgtune again today and the numbers it's spitting out took me aback; they are huge. In historical conversations and attempts, a few of these larger numbers netted reduced rather than better performance (but that was on older versions of Postgres). So I come here today to seek out some type of affirmation that these numbers look good and that I should put them into my config, either staged or in one fell swoop. At the same time I will start migrating my config to the latest 9.3 template...

Postgres version: 9.3.4, Slony 2.1.3 (migrating to 2.2)
CentOS 6.x, kernel 2.6.32-431.5.1.el6.x86_64
Big HP boxen: 32 cores, 256GB of RAM
DB is roughly 175GB in size, but many tables are hundreds of millions of rows.

The pgtune configuration spit out based on the information above:

max_connections = 300
shared_buffers = 64GB
effective_cache_size = 192GB
work_mem = 223696kB
maintenance_work_mem = 2GB
checkpoint_segments = 32
checkpoint_completion_target = 0.7
wal_buffers = 16MB
default_statistics_target = 100

My current configuration:

max_connections = 300
shared_buffers = 2000MB
effective_cache_size = 7GB
work_mem = 6GB
maintenance_work_mem = 10GB   # bumped this to try to get my reindexes done
checkpoint_segments = 100
#wal_buffers = 64kB
#default_statistics_target = 10

Here is my complete configuration (this is my Slony slave server, so fsync is off and archiving is off; on my primary, fsync=on and archive=on):
listen_addresses = '*'
max_connections = 300
shared_buffers = 2000MB
max_prepared_transactions = 0
work_mem = 6GB
maintenance_work_mem = 10GB
fsync = off
checkpoint_segments = 100
checkpoint_timeout = 10min
checkpoint_warning = 3600s
wal_level = archive
archive_mode = off
archive_command = 'tar -czvpf /pg_archives/%f.tgz %p'
archive_timeout = 10min
random_page_cost = 2.0
effective_cache_size = 7GB
log_destination = 'stderr'
logging_collector = on
log_directory = '/data/logs'
log_filename = 'pgsql-%m-%d.log'
log_truncate_on_rotation = on
log_rotation_age = 1d
log_min_messages = info
log_min_duration_statement = 15s
log_line_prefix = '%t %d %u %r %p %m'
log_lock_waits = on
log_timezone = 'US/Pacific'
autovacuum_max_workers = 3
autovacuum_vacuum_threshold = 1000
autovacuum_analyze_threshold = 2000
datestyle = 'iso, mdy'
timezone = 'US/Pacific'
lc_messages = 'en_US.UTF-8'
lc_monetary = 'en_US.UTF-8'
lc_numeric = 'en_US.UTF-8'
lc_time = 'en_US.UTF-8'
deadlock_timeout = 5s

Also, while it apparently doesn't matter in 9.3 anymore, my sysctl.conf has:

kernel.shmmax = 68719476736
kernel.shmall = 4294967296

And pgtune recommended:

kernel.shmmax = 137438953472
kernel.shmall = 33554432

Also of note in my sysctl.conf:

vm.zone_reclaim_mode = 0
vm.swappiness = 10

Thanks for the assistance; watching these index creations crawl along when you know you have so many more compute cycles to give makes one go crazy.

Tory
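One sanity check worth doing before adopting either set of numbers: work_mem is allocated per sort/hash operation, per backend, so the worst case scales with max_connections. A back-of-envelope sketch (the figures come from the settings above; "one sort per backend" is a simplifying assumption, since complex queries can use several work_mem allocations at once):

```shell
# Rough worst-case memory math for work_mem (assumes one sort/hash per
# backend, which understates complex queries and overstates idle ones).
ram_mb=$((256 * 1024))            # 256 GB of RAM
cur=$((6 * 1024 * 300))           # current work_mem = 6GB x 300 connections
pgtune=$((223696 * 300 / 1024))   # pgtune work_mem = 223696kB x 300 connections
echo "RAM: ${ram_mb} MB"
echo "current worst case: ${cur} MB"
echo "pgtune worst case: ${pgtune} MB"
```

The current 6GB setting only survives because few sessions actually use that much at once; pgtune's smaller per-backend figure is what keeps 300 connections bounded on a 256GB box.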
Re: [PERFORM] Incredibly slow restore times after 9.0 -> 9.2 upgrade
I just had a thought- I know some of these tables are in need of a vacuuming. Could it be that the dump is dumping a bunch of garbage that the restore then has to sift through? I don't know enough details to know if this is a dumb thought or not.

The restore to RDS took roughly the same amount of time. My next move is to try on a fast instance store, and also do a postgres 9 restore of a pure SQL dump, but that won't really be a great test since I use custom format. I'm assuming here that I can't take the custom dump from 9.2 and apply it to 9.0, or can I?

--
View this message in context: http://postgresql.1045698.n5.nabble.com/Incredibly-slow-restore-times-after-9-0-9-2-upgrade-tp5824701p5825052.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Incredibly slow restore times after 9.0 -> 9.2 upgrade
jmcdonagh <joseph.e.mcdon...@gmail.com> writes:

> I just had a thought- I know some of these tables are in need of a
> vacuuming. Could it be that the dump is dumping a bunch of garbage that
> the restore has to sift through? I don't know enough details to know if
> this is a dumb thought or not.

No. However, it's true that the dump will take a bit longer, having to scan a bloated table rather than a tight one. The dump will only output the live rows; psql or pg_restore (whichever you're using on the target side) will not have to step over any junk.

HTH

> The restore to RDS took roughly the same amount of time. My next move is
> to try on a fast instance store, and also do a postgres 9 restore of a
> pure SQL dump, but that won't really be a great test since I use custom
> format. I'm assuming here that I can't take the custom dump from 9.2 and
> apply it to 9.0, or can I?

--
Jerry Sievers
Postgres DBA/Development Consulting
e: postgres.consult...@comcast.net
p: 312.241.7800
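One aside on the custom-format dump in play here (not something Jerry raised, just a knob worth ruling out when restores are this slow): pg_restore can run the data loading and index builds in parallel workers with -j. A minimal sketch, where the dump path and database name are hypothetical placeholders:

```shell
# Parallel restore of a custom-format dump: -j N runs table COPY and
# index creation in N worker processes. Paths/names are placeholders.
cmd="pg_restore -j 8 --no-owner -d devdb /backups/prod.dump"
echo "$cmd"
```

With 15-hour restores dominated by serial COPY and index builds, the worker count (here 8) would normally be sized to the CPUs and I/O headroom of the target box.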
Re: [PERFORM] Incredibly slow restore times after 9.0 -> 9.2 upgrade
On 29.10.2014 16:12, jmcdonagh wrote:
> Hi Tomas- thank you for your thoughtful response!
>
> Tomas Vondra wrote:
>> On 28.10.2014 21:55, jmcdonagh wrote:
>>> Hi, we have a nightly job that restores current production data to
>>> the development databases in a 'warm spare' database so that if the
>>> developers need fresh data, it's ready during the day. When we moved
>>> from 9.0 to 9.2 suddenly the restores began to take from a few hours
>>> to more like 15 hours or so. We're in Amazon EC2, I've tried new EBS
>>> volumes, warmed them up, threw IOPS at them, pretty much all the
>>> standard stuff to get more disk performance.
>>
>> So, if I understand it correctly, you've been restoring into 9.0, then
>> you switched to 9.2 and it's much slower?
>
> Yes- but since the move was done utilizing snapshots, the move involves
> new volumes. I have created new volumes since then to rule out a single
> bad volume.

My advice would be to do some basic low-level performance tests to rule this out. Use dd or (better) fio to test basic I/O performance; it's much easier to spot issues that way.

> Tomas Vondra wrote:
>> Is the 9.2 configured equally to 9.0? If you do something like this
>>
>>   SELECT name, setting FROM pg_settings WHERE source = 'configuration file';
>>
>> on both versions, what do you get?
>
> I no longer have the 9.0 box up, but we do track configuration via
> puppet and git. The only configuration change made for 9.2 is:
>
>   -#standard_conforming_strings = off
>   +standard_conforming_strings = off

Compared to 9.0, I suppose? Anyway, post the non-default config values, at least for 9.2, please.

> Cause we have an old app that needs this setting on, otherwise we'd
> spend a lot of time trying to fix it.

I doubt standard_conforming_strings has anything to do with the issues.

> Tomas Vondra wrote:
>>> Here's the thing, the disk isn't saturated. The behavior I'm seeing
>>> seems very odd to me; I'm seeing the source disk which holds the dump
>>> saturated by reads, which is great, but then I just see nothing being
>>> written to the postgres volume. Just nothing happening, then a small
>>> burst. There is no write queue backup on the destination disk either.
>>> If I look at pg_stat_activity I'll see something like:
>>>
>>>   COPY salesforce_reconciliation (salesforce_id, email,
>>>   advisor_salesforce_id, processed) FROM stdin
>>>
>>> and even for small tables, that seems to take a very long time even
>>> though the destination disk is almost at 0 utilization.
>>
>> So, where's the bottleneck? Clearly, there's one, so is it a CPU, a
>> disk or something else? Or maybe network, because you're using EBS?
>> What do you mean by 'utilization'? How do you measure that?
>
> The bottleneck is I/O somehow. I say somehow, because I see iowait
> averaging about 50% between two CPUs, but there are just no writes to
> the destination EBS volume really happening, just reads from the disk
> where the source dump is located, then bursts of writes to the
> destination volume every so often. It's kind of puzzling. This is
> happening on multiple database servers, in multiple availability
> zones. Driving me bonkers. What I mean by utilization is %util from
> iostat -m -x 1.

I find this rather contradictory. At one moment you say the disk isn't saturated, the next moment you say you're I/O bound.

Also, iowait (as reported e.g. by 'top') is tricky to interpret correctly, especially on multi-CPU systems (nice intro to the complexity: [1]). It's really difficult to interpret the 50% iowait without more info about what's happening on the machine.

IMHO, the utilization (as reported by iostat) is much easier to interpret, because it means '% of time the device was servicing requests'. It has issues too, because 100% does not mean 'saturated' (especially on RAID arrays that can service multiple requests in parallel), but it's better than iowait.

If I had to guess based on your info, I'd bet you're CPU bound, so there's very little idle time and about 50% of it is spent waiting for I/O requests (hence the 50% iowait). But in total the amount of I/O is very small, so %util is ~0.
Please, post a few lines of 'iostat -x -k 1' output. Samples from 'top' and 'vmstat 1' would be handy too.

regards
Tomas

[1] http://veithen.blogspot.cz/2013/11/iowait-linux.html
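For the dd/fio check Tomas suggests earlier in the thread, a minimal dd sketch. TARGET is an assumed mount point on the volume under test, and the 64MB size is only to keep the example quick; a real run should be much larger than RAM, or use oflag=direct, so the page cache doesn't flatter the result:

```shell
# Crude sequential-write throughput check. fio gives far better numbers
# (random I/O, queue depths), but this catches gross problems quickly.
# conv=fsync makes dd flush to the device before reporting a rate.
target="${TARGET:-/tmp/ddtest.bin}"   # point TARGET at the EBS volume
dd if=/dev/zero of="$target" bs=1M count=64 conv=fsync 2>&1 | tail -n 1
rm -f "$target"
```

If the sequential numbers look healthy here but restores still crawl, that points back at CPU or per-request latency rather than raw disk bandwidth.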