Re: [PERFORM] Can't get Dell PE T420 (Perc H710) perform better than a MacMini with PostgreSQL
Hi Pietro,

On Wed, Apr 1, 2015 at 3:56 PM, Pietro Pugni pietro.pu...@gmail.com wrote:
> T420: went from 311 seconds (default postgresql.conf) to 195 seconds doing tuning adjustments over RAID, kernel and postgresql.conf; MacMini: 40 seconds.

I'm afraid the problem is that PostgreSQL is not configured properly (and neither are the operating system and probably the controller, although pg_test_fsync shows that things are not as bad there as with postgresql.conf). It is pretty useless to benchmark a database using an out-of-the-box configuration. You need to configure at least the shared-memory-related, checkpoint-related and autovacuum-related settings.

And as a first step, please compare postgresql.conf on the Mac and on the server: sometimes (with some Mac installers) the default postgresql.conf is not the same as on the server.

Best regards,
Ilya

--
Ilya Kosmodemiansky, PostgreSQL-Consulting.com
tel. +14084142500 cell. +4915144336040
i...@postgresql-consulting.com

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
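As a rough sketch of the advice above, the handful of settings worth changing before any benchmark might look like this (the values are illustrative assumptions for a dedicated box with ~32GB RAM and a pre-9.5 server, not recommendations for this particular machine):

```
# postgresql.conf -- hedged starting values, adjust to your RAM and workload
shared_buffers = 8GB                    # ~25% of RAM
checkpoint_segments = 64                # fewer forced checkpoints (pre-9.5 setting)
checkpoint_timeout = 30min
checkpoint_completion_target = 0.9      # spread checkpoint I/O out
autovacuum_vacuum_scale_factor = 0.05   # more aggressive than the 0.2 default
```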
Re: [PERFORM] Very slow checkpoints
On Wed, Mar 18, 2015 at 12:58 PM, Steven Jones steven.jones1...@outlook.com wrote:
> Yes BBU is on the controller; 1024Mb. It is a HP P410i controller, with write caching turned on on the controller; off at disk level.

vm.dirty_background_bytes=67108864 and vm.dirty_bytes=536870912 look reasonable for a 512MB BBU; you can scale them up for 1024MB, or recalculate them via dirty_background_ratio. By the way, which kernel do you use?

> We don't at this stage have any graphs, but we will set it up over the next 24hrs at least.

Do not forget to include iostat statistics in them; at least latency, %iowait and %util are very helpful.

And I am always suspicious about ZFS under heavy writes. It is reliable and quite comfortable in terms of configuration, but for speed, ext4 or xfs with barriers disabled looks more reasonable.

--
Ilya
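The numbers quoted above follow a simple pattern (536870912 bytes is exactly 512MB, 67108864 is one eighth of that), so scaling them for the 1024MB BBU can be sketched like this. The "dirty_bytes = BBU size, background = 1/8 of that" rule is an inference from the thread's numbers, not an official formula:

```shell
# Scale the vm.dirty_* byte limits with the BBU cache size.
# Assumed rule of thumb: dirty_bytes ~= BBU size, dirty_background_bytes ~= 1/8 of it.
BBU_MB=1024
DIRTY_BYTES=$(( BBU_MB * 1024 * 1024 ))       # full BBU cache size in bytes
DIRTY_BG_BYTES=$(( DIRTY_BYTES / 8 ))         # start background writeback much earlier
echo "vm.dirty_bytes=$DIRTY_BYTES"
echo "vm.dirty_background_bytes=$DIRTY_BG_BYTES"
# Apply (as root): sysctl -w vm.dirty_bytes=$DIRTY_BYTES vm.dirty_background_bytes=$DIRTY_BG_BYTES
```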
Re: [PERFORM] Very slow checkpoints
On Wed, Mar 18, 2015 at 1:21 PM, Steven Jones steven.jones1...@outlook.com wrote:
> #autovacuum = on                       # Enable autovacuum subprocess?  'on'
> #log_autovacuum_min_duration = -1      # -1 disables, 0 logs all actions
> #autovacuum_max_workers = 3            # max number of autovacuum subprocesses
> #autovacuum_naptime = 1min             # time between autovacuum runs
> autovacuum_vacuum_threshold = 500      # min number of row updates before vacuum
> autovacuum_analyze_threshold = 500     # min number of row updates before analyze
> #autovacuum_vacuum_scale_factor = 0.2  # fraction of table size before vacuum
> #autovacuum_analyze_scale_factor = 0.1 # fraction of table size before analyze

If I were you, I'd use the _scale_factor settings instead of the thresholds, because that makes your autovacuum aggressive enough (you need that on such a workload) without firing too frequently (vacuuming has its price). autovacuum_vacuum_scale_factor = 0.01 and autovacuum_analyze_scale_factor = 0.05 will be OK.

And if you see all your autovacuum workers active all the time (more than 80% of the time, for example), that is a reason to increase autovacuum_max_workers.
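To see why the scale factor dominates on large tables, the vacuum trigger condition (dead tuples > threshold + scale_factor * reltuples) can be worked through for a hypothetical 10M-row table:

```shell
# Autovacuum fires when: dead_tuples > threshold + scale_factor * reltuples.
# Compare the default scale factor (0.2) with the suggested 0.01 on 10M rows.
ROWS=10000000
THRESHOLD=50                                   # default autovacuum_vacuum_threshold
DEFAULT_TRIGGER=$(( THRESHOLD + ROWS * 2 / 10 ))   # scale_factor = 0.2
TUNED_TRIGGER=$(( THRESHOLD + ROWS / 100 ))        # scale_factor = 0.01
echo "default: vacuum after $DEFAULT_TRIGGER dead rows"
echo "tuned:   vacuum after $TUNED_TRIGGER dead rows"
```

With the default, 2 million rows must die before vacuum runs; with 0.01 it is only ~100k, which is what "aggressive enough" means here.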
Re: [PERFORM] Very slow checkpoints
On Wed, Mar 18, 2015 at 12:21 PM, Steven Jones steven.jones1...@outlook.com wrote:
> - typical DW - relatively constant periodic data loads - i.e. heavy write
> - we receive large CSV files ~ 5-10Gb every 15 minutes spread out across 5-7 minutes
> - DB size is ~2.5Tb; rotating load of 30 days keeps the database stable

And an important addition: how is your autovacuum configured?
Re: [PERFORM] Very slow checkpoints
Hi Steven,

On Wed, Mar 18, 2015 at 12:21 PM, Steven Jones steven.jones1...@outlook.com wrote:
> - system is HP blade; 128Gb RAM, 2x 8-core, 12x 10k RPM RAID1+0 (database)

Do you have a BBU on your controller? And how is your controller configured; I mean cache mode, IO mode and disk write cache mode. You have 15K SAS (which form factor?) under WAL and 10K SAS under the database, am I correct?

> Full iostat/iotop, configuration, checkpoint stats, etc. are pasted below for completeness. Highlights are:
> checkpoint_segments=512
> shared_buffers=16GB
> checkpoint_timeout=15min
> checkpoint_completion_target=0.1

It looks like your checkpoint settings are a bit strange, besides everything else. If you choose a high value for checkpoint_segments, your aim is to avoid checkpoints by timeout (or vice versa). If you have checkpoint_segments=512, your checkpoint_timeout should be about 60min. And in any case, use checkpoint_completion_target=0.9 or 0.7 in order to spread the disk load between checkpoints.

> --- sysctl settings for dirty pages
> vm.dirty_background_bytes = 0
> vm.dirty_background_ratio = 5
> vm.dirty_bytes = 0
> vm.dirty_expire_centisecs = 3000
> vm.dirty_ratio = 10
> vm.dirty_writeback_centisecs = 500

The values for these settings really depend on the RAID controller (and BBU size).

And about the further problem description: do you have any graphical representation of your % disk utilization?

Best regards,
Ilya
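The mismatch between checkpoint_segments=512 and checkpoint_timeout=15min can be made concrete: each WAL segment is 16MB, so a segment-forced checkpoint only happens after roughly 8GB of WAL, which is why the thread suggests pairing such a high segment count with a ~60min timeout:

```shell
# How much WAL must accumulate before checkpoint_segments forces a checkpoint.
SEGMENTS=512
SEGMENT_MB=16                                  # default WAL segment size
WAL_BETWEEN_CHECKPOINTS_MB=$(( SEGMENTS * SEGMENT_MB ))
echo "${WAL_BETWEEN_CHECKPOINTS_MB} MB of WAL before a segment-forced checkpoint"
```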
Re: [PERFORM] MusicBrainz postgres performance issues
On Mar 15, 2015, at 13:45, Josh Krupka jkru...@gmail.com wrote:
> Hmm that's definitely odd that it's swapping since it has plenty of free memory at the moment. Is it still under heavy load right now? Has the output of free consistently looked like that during your trouble times?

And it seems better to lower vm.swappiness.
Re: [PERFORM] MusicBrainz postgres performance issues
Hi!

What does your pg_stat_bgwriter show for one day?

On Mar 15, 2015, at 11:54, Robert Kaye r...@musicbrainz.org wrote:
> Hi!
>
> We at MusicBrainz have been having trouble with our Postgres install for the past few days. I've collected all the relevant information here: http://blog.musicbrainz.org/2015/03/15/postgres-troubles/
>
> If anyone could provide tips, suggestions or other relevant advice for what to poke at next, we would love it. Thanks!
>
> --ruaok
> Robert Kaye -- r...@musicbrainz.org -- http://musicbrainz.org
Re: [PERFORM] MusicBrainz postgres performance issues
On Sun, Mar 15, 2015 at 8:46 PM, Andres Freund and...@2ndquadrant.com wrote:
> That imo doesn't really have anything to do with it. The primary benefit of a BBU with writeback caching is accelerating (near-)synchronous writes. Like the WAL.
>
> Greetings,
> Andres Freund
> -- Andres Freund, http://www.2ndQuadrant.com/ -- PostgreSQL Development, 24x7 Support, Training & Services

My point was that without a proper RAID controller (and today a BBU is surely needed for a controller to count as a proper one), heavy writes of any kind make it absolutely impossible to live with large shared_buffers and without IO problems.
Re: [PERFORM] MusicBrainz postgres performance issues
On Sun, Mar 15, 2015 at 8:20 PM, Andres Freund and...@2ndquadrant.com wrote:
> On 2015-03-15 11:09:34 -0600, Scott Marlowe wrote:
>> shared_mem of 12G is almost always too large. I'd drop it down to ~1G or so.
>
> I think that's outdated wisdom, i.e. not generally true.

Quite agreed, with the note that a properly configured controller with a BBU is needed.

> A new enough kernel, a sane filesystem (i.e. not ext3) and sane checkpoint configuration takes care of most of the other disadvantages.

Most likely. And better to be sure that the filesystem is mounted without barriers.

And I agree with Scott: work_mem of 64MB AND max_connections = 500 is a recipe for disaster. The problem could be the session mode of pgbouncer; if you can work with transaction mode, do it.

Best regards,
Ilya
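The "recipe for disaster" is simple arithmetic: work_mem is a per-sort/per-hash allocation, so each of the 500 backends can grab at least one 64MB chunk at once (complex queries can grab several, so this is a lower bound on the worst case):

```shell
# Lower bound on worst-case memory demand from work_mem * max_connections.
WORK_MEM_MB=64
MAX_CONNECTIONS=500
WORST_CASE_MB=$(( WORK_MEM_MB * MAX_CONNECTIONS ))
echo "at least ${WORST_CASE_MB} MB if every backend runs one sort/hash at once"
```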
Re: [PERFORM] Server vacuuming the same table again and again
Hi Dmitry,

On Fri, Apr 25, 2014 at 9:47 AM, Дмитрий Шалашов skau...@gmail.com wrote:
> cancelled autovacuum and it seems to help. In the morning autovacuum was back. And then it finished and I went to work.

Actually, these two things are tightly bound, and there is no way to avoid vacuum; you can only postpone it. This kind of work is eventually supposed to be done. What you really need to do as a first step is configure your autovacuum aggressively enough, and then maybe ionice autovacuum instead of the mission-critical checkpointer or bgwriter.

Which exact values do you have in the following settings:
autovacuum_analyze_scale_factor
autovacuum_analyze_threshold
autovacuum_freeze_max_age
autovacuum_max_workers
autovacuum_naptime
autovacuum_vacuum_cost_delay
autovacuum_vacuum_cost_limit
autovacuum_vacuum_scale_factor
autovacuum_vacuum_threshold
log_autovacuum_min_duration
?

Best regards,
Ilya

> Best regards,
> Dmitriy Shalashov
Re: [PERFORM] Server vacuuming the same table again and again
Dmitry,

How exactly is the filesystem under your database mounted? (mount -l)

And just in case: while increasing checkpoint_segments, better to also increase checkpoint_timeout, otherwise checkpoints will still be frequent because the segment threshold will never be reached. You can monitor pg_stat_bgwriter to understand which type of checkpoint happens more frequently.

On Fri, Apr 25, 2014 at 10:22 AM, Дмитрий Шалашов skau...@gmail.com wrote:
> I understand that autovacuum has to be done, but not right after previous autovacuum? And then again and again.

That is exactly what happens: your autovacuum is not aggressive enough, and that is why it runs constantly instead of doing its job in small portions. You should try something like this:

autovacuum | on
autovacuum_analyze_scale_factor | 0.05
autovacuum_analyze_threshold | 5
autovacuum_freeze_max_age | 2
autovacuum_max_workers | 10  # set 10 for example, and then you can see: if they are all working constantly, maybe you need more (or fewer if not)
autovacuum_multixact_freeze_max_age | 4
autovacuum_naptime | 1
autovacuum_vacuum_cost_delay | 10
autovacuum_vacuum_cost_limit | -1
autovacuum_vacuum_scale_factor | 0.01
autovacuum_vacuum_threshold | 10
log_autovacuum_min_duration | -1

Best regards,
Ilya

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
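The pg_stat_bgwriter check suggested above boils down to comparing checkpoints_timed against checkpoints_req; a high share of "requested" (WAL-volume-triggered) checkpoints means the segment threshold fires before the timeout. The sample numbers below are made up for illustration:

```shell
# Ratio of segment-forced checkpoints; on a real server the two counters come from:
#   SELECT checkpoints_timed, checkpoints_req FROM pg_stat_bgwriter;
CHECKPOINTS_TIMED=120     # assumed sample value
CHECKPOINTS_REQ=480       # assumed sample value
TOTAL=$(( CHECKPOINTS_TIMED + CHECKPOINTS_REQ ))
REQ_PCT=$(( 100 * CHECKPOINTS_REQ / TOTAL ))
echo "${REQ_PCT}% of checkpoints were forced by WAL volume"
```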
Re: [PERFORM] Server vacuuming the same table again and again
Dmitry,

On Fri, Apr 25, 2014 at 11:31 AM, Дмитрий Шалашов skau...@gmail.com wrote:
> Thanks, we'll try new autovacuum settings!

I think things with vacuum will be much better. If not, try to find out whether you have long-running transactions (several minutes or more) and try to avoid them.

> First, I have to say that load comes and goes in waves - we don't yet understand why. All new waves have that behaviour - free disk, idle cpu. First such wave was before the checkpoint_segments change, next waves after.

That could be a complicated problem caused by many things, from suboptimal SQL queries to network issues, and may not be easy to guess.
- How many locks do you have during the wave in comparison with the normal workload?
- Do you use some connection pooling (pgbouncer etc.)?
- How about the long-running transactions I mentioned above?
- Are you using pg_stat_statements or any other method for detecting slow queries?

2014-04-25 13:22 GMT+04:00 Ilya Kosmodemiansky:
> On Fri, Apr 25, 2014 at 10:29 AM, Дмитрий Шалашов skau...@gmail.com wrote:
>> Previously during load disk was 100% busy; now we have around 100 active state queries, 100% loaded proc, but disk is virtually idle...
>
> Did that happen after changing the checkpoint_segments setting?

2014-04-25 12:22 GMT+04:00 Дмитрий Шалашов:
> Hi Ilya!
>
> And after cancelling that first autovacuum I started another one by hand; from there no autovacuum was cancelled.
>
>> ionice autovacuum instead of the mission-critical checkpointer or bgwriter
>
> Yeah, that was desperate. I restarted the server when I had a chance, to drop my ionice settings back to defaults.
> Which exact values do you have in the following settings:

autovacuum_analyze_scale_factor = 0.1
autovacuum_analyze_threshold = 50
autovacuum_freeze_max_age = 2
autovacuum_max_workers = 3
autovacuum_naptime = 60
autovacuum_vacuum_cost_delay = 20
autovacuum_vacuum_cost_limit = -1
autovacuum_vacuum_scale_factor = 0.2
autovacuum_vacuum_threshold = 50
log_autovacuum_min_duration = 0

All defaults except the last one, I believe.

Meanwhile I noticed in the night logs:

checkpoints are occurring too frequently (138 seconds apart)
Consider increasing the configuration parameter "checkpoint_segments".

Increased checkpoint_segments to 256 and reloaded the config.

Best regards,
Dmitriy Shalashov
Re: [PERFORM] PGSQL, checkpoints, and file system syncs
Hi Reza,

vm.dirty_bytes indeed makes sense, but just in case: how exactly is your ext4 mounted? In particular, have you disabled barriers?

Ilya

On Thu, Apr 3, 2014 at 8:11 PM, Reza Taheri rtah...@vmware.com wrote:
> Try setting the vm.dirty_bytes sysctl. Something like 256MB might be a good starting point.
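Checking the barrier setting amounts to inspecting the mount options. The sketch below greps a hard-coded sample line instead of the live /proc/mounts (the device and mountpoint are made up); remember that disabling barriers is only safe with a battery-backed write cache:

```shell
# Check whether an ext4 filesystem is mounted with barriers disabled.
# On a real system: grep ext4 /proc/mounts (or run: mount -l).
MOUNTS_LINE="/dev/sda3 /var/lib/postgresql ext4 rw,noatime,nobarrier 0 0"
if echo "$MOUNTS_LINE" | grep -q "nobarrier"; then
  echo "barriers disabled (only safe with a BBU)"
else
  echo "barriers enabled (the ext4 default)"
fi
```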
Re: [PERFORM] Why shared_buffers max is 8GB?
Hi Alexey,

On Wed, Mar 26, 2014 at 1:21 PM, Alexey Vasiliev leopard...@inbox.ru wrote:
> I read from several sources that the maximum shared_buffers is 8GB.

I believe that was an issue on some older versions, and that is why it was mentioned in several talks. Today it is a sort of apocrypha.

> Is this true? If yes, why exactly is this number the maximum shared_buffers for good performance (on 64-bit Linux)?

25% of available RAM is a good starting point. Sometimes, if you have a heavy workload _and_ it is possible to fit the whole database in memory, it is better to use something larger, about ~75% of RAM.

Best regards,
Ilya
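The two heuristics above can be turned into numbers for a concrete machine; the 64GB figure is an assumed example, and the percentages are the thread's rules of thumb, not hard limits:

```shell
# Rule-of-thumb shared_buffers sizing for a dedicated 64GB server.
RAM_GB=64
START_GB=$(( RAM_GB * 25 / 100 ))       # conservative starting point
IN_MEMORY_GB=$(( RAM_GB * 75 / 100 ))   # only if the whole database fits in RAM
echo "start with shared_buffers = ${START_GB}GB"
echo "fully-cached workload: up to ~${IN_MEMORY_GB}GB"
```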
Re: [PERFORM] Why shared_buffers max is 8GB?
On Wed, Mar 26, 2014 at 5:14 PM, Shaun Thomas stho...@optionshouse.com wrote:
> * Checkpoints must commit dirty shared buffers to disk. The larger this is, the more risk you have when checkpoints come, up to and including an unresponsive database. Writing to disks isn't free, and sadly this is still on the slower side unless all of your storage is SSD-based. You don't want to set this too much higher than your disk write cache.

On some heavily loaded machines we use 48GB of shared buffers (and sometimes more, depending on the amount of RAM). Of course, that works only with a good enough hardware RAID with a large BBU, a well-tuned Linux (dirty bytes appropriate to the RAID cache size, etc.) and aggressively tuned checkpoints and background writer:

bgwriter_delay | 10
bgwriter_lru_maxpages | 1000
bgwriter_lru_multiplier | 10
checkpoint_completion_target | 0.9
checkpoint_segments | 300
checkpoint_timeout | 3600

and it really makes sense.
Re: [PERFORM] slow join not using index properly
> : 23558kB Buffers: shared hit=3991
>   -> Seq Scan on collection_data (cost=0.00..10020.40 rows=602940 width=8) (actual time=0.006..146.447 rows=603066 loops=1)
>      Buffers: shared hit=3991
> Total runtime: 27854.200 ms

I also tried including the MIN( a1.context_key ) in the first select statement as you had written it, but upon doing that it became necessary to add a GROUP BY clause, and doing that changed the final number of rows selected:

ERROR: column "a1.ancestor_key" must appear in the GROUP BY clause or be used in an aggregate function
LINE 4: min( a1.context_key ), ancestor_key

Including the LIMIT 1 at the end of the first select statement gave a syntax error that I couldn't seem to get past, so I think it might be invalid:

ERROR: syntax error at or near "UNION"
LINE 8: UNION (

So I landed on the version that I posted above, which seems to select the same set in all of the cases that I tried. Anyway, thanks again for taking a stab at helping, I do appreciate it. If you have any other ideas that might be of help I'd certainly be happy to hear them.

Take care,
/Stefan

On Thu, Mar 20, 2014 at 11:02 PM, Ilya Kosmodemiansky ilya.kosmodemian...@postgresql-consulting.com wrote:
> Hi Stefan!
>
> Probably you need to rewrite your query like this (check it first):
>
> with RECURSIVE qq(cont_key, anc_key) as (
>   select min(a1.context_key), ancestor_key from virtual_ancestors a1
>   union
>   select (SELECT a1.context_key, ancestor_key FROM virtual_ancestors a1
>           where context_key > cont_key order by context_key limit 1)
>   from qq where cont_key is not null
> )
> select a1.cont_key
> from qq a1, collection_data, virtual_ancestors a2
> WHERE a1.anc_key = collection_data.context_key
>   AND collection_data.collection_context_key = a2.context_key
>   AND a2.ancestor_key = ?
> best regards, Ilya

On Fri, Mar 21, 2014 at 12:56 AM, Stefan Amshey srams...@gmail.com wrote:
> We have a slow performing query that we are trying to improve, and it appears to be performing a sequential scan at a point where it should be utilizing an index. Can anyone tell me why postgres is opting to do it this way?
>
> The original query is as follows:
>
> SELECT DISTINCT a1.context_key
> FROM virtual_ancestors a1, collection_data, virtual_ancestors a2
> WHERE a1.ancestor_key = collection_data.context_key
>   AND collection_data.collection_context_key = a2.context_key
>   AND a2.ancestor_key = ?
>
> The key relationships should all be using indexed columns, but the query plan that postgres comes up with ends up performing a sequential scan on the collection_data table (in this case about 602k rows) where we would have expected it to utilize the index:
>
> HashAggregate (cost=60905.73..60935.73 rows=3000 width=4) (actual time=3366.165..3367.354 rows=3492 loops=1)
>   Buffers: shared hit=16291 read=1222
>   -> Nested Loop (cost=17546.26..60898.23 rows=3000 width=4) (actual time=438.332..3357.918 rows=13037 loops=1)
>      Buffers: shared hit=16291 read=1222
>      -> Hash Join (cost=17546.26..25100.94 rows=98 width=4) (actual time=408.554..415.767 rows=2092 loops=1)
>         Hash Cond: (a2.context_key = collection_data.collection_context_key)
>         Buffers: shared hit=4850 read=3
>         -> Index Only Scan using virtual_ancestors_pkey on virtual_ancestors a2 (cost=0.00..233.32 rows=270 width=4) (actual time=8.532..10.703 rows=1960 loops=1)
>            Index Cond: (ancestor_key = 1072173)
>            Heap Fetches: 896
>            Buffers: shared hit=859 read=3
>         -> Hash (cost=10015.56..10015.56 rows=602456 width=8) (actual time=399.708..399.708 rows=602570 loops=1)
>            Buckets: 65536 Batches: 1 Memory Usage: 23538kB
>            Buffers: shared hit=3991
>            ## sequential scan occurs here ##
>            -> Seq Scan on collection_data (cost=0.00..10015.56 rows=602456 width=8) (actual time=0.013..163.509 rows=602570 loops=1)
>               Buffers: shared hit=3991
>      -> Index Only Scan using virtual_ancestors_pkey on virtual_ancestors a1 (cost=0.00..360.70 rows=458 width=8) (actual time=1.339..1.403 rows=6 loops=2092)
>         Index Cond: (ancestor_key = collection_data.context_key)
>         Heap Fetches: 7067
>         Buffers: shared hit=11441 read=1219
> Total runtime: 3373.058 ms
>
> The table definitions are as follows:
>
> Table "public.virtual_ancestors"
>     Column    |   Type   | Modifiers
> --------------+----------+-----------
>  ancestor_key | integer  | not null
>  context_key  | integer  | not null
>  degree       | smallint | not null
> Indexes: "virtual_ancestors_pkey" PRIMARY KEY
Re: [PERFORM] pg_dump vs pg_basebackup
Hi gianfranco,

How large exactly is your database, and how heavy is the workload on it? Usually, if you have more than ~200GB, it is better to use pg_basebackup, because pg_dump will take too long. And please bear in mind that pg_dump makes a dump, which is actually not the same thing as a backup.

Best regards,
Ilya

On Tue, Mar 25, 2014 at 5:45 AM, gianfranco caca limpc...@yahoo.com wrote:
> Hai,
> Can anyone tell me the difference and performance between pg_dump and pg_basebackup if I want to backup a large database.
> Thanks
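For a large database, a pg_basebackup invocation might look like the sketch below. The directory name and option mix are assumptions, not values from the thread (on the 9.3-era servers being discussed, -X fetch is the WAL option compatible with tar format); the sketch only builds and prints the command rather than running it:

```shell
# Hypothetical pg_basebackup command: compressed tar-format base backup
# that also fetches the WAL needed to make the copy consistent.
BACKUP_DIR="/backup/base_$(date +%Y%m%d)"      # assumed backup location
CMD="pg_basebackup -D $BACKUP_DIR -F tar -z -X fetch -P"
echo "$CMD"
# On the real server you would run the command itself rather than echo it.
```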
Re: [PERFORM] pg_dump vs pg_basebackup
Yes, you need to set recovery_target_time in your recovery.conf while performing recovery (http://www.postgresql.org/docs/9.3/static/recovery-target-settings.html). That can be a tricky thing, depending on what exactly you need: all transactions which were not committed at the given timestamp will be rolled back, so read the URL above carefully.

On Tue, Mar 25, 2014 at 8:19 AM, gianfranco caca limpc...@yahoo.com wrote:
> Hai ilya,
> Thanks for the respond. The database is estimated over 100gb and the workload will be high. Can we use a pg_basebackup with pitr to restore based on transaction time?
> Thanks
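A minimal recovery.conf for such point-in-time recovery might look like this; the restore_command path and the timestamp are placeholders, not values from the thread:

```
restore_command = 'cp /wal_archive/%f %p'      # assumed WAL archive location
recovery_target_time = '2014-03-25 08:00:00'   # placeholder timestamp
recovery_target_inclusive = false              # stop just before the target
```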
Re: [PERFORM] pg_dump vs pg_basebackup
Joshua, that is a really good point. An alternative is to use pg_basebackup through an ssh tunnel with compression, but rsync is much simpler.

On Tue, Mar 25, 2014 at 3:56 PM, Joshua D. Drake j...@commandprompt.com wrote:
> On 03/25/2014 05:05 AM, Claudio Freire wrote:
>> On Tue, Mar 25, 2014 at 4:39 AM, David Johnston pol...@yahoo.com wrote:
>>> Hai,
>>> Can anyone tell me the difference and performance between pg_dump and pg_basebackup if I want to backup a large database.
>
> Honestly, neither is particularly good at backing up large databases. I would look into PITR with rsync.
>
> JD
> --
> Command Prompt, Inc. - http://www.commandprompt.com/ 509-416-6579
> PostgreSQL Support, Training, Professional Services and Development
> High Availability, Oracle Conversion, Postgres-XC, @cmdpromptinc
> "Political Correctness is for cowards."
Re: [PERFORM] pg_dump vs pg_basebackup
Joshua,

On Tue, Mar 25, 2014 at 4:22 PM, Joshua D. Drake j...@commandprompt.com wrote:
> The advantage is that you can create backups that don't have to be restored, just started. You can also use the differential portions of rsync to do it multiple times a day without much issue.

Are you sure that is a nice idea on a database with a heavy write workload? And also, I'm not sure that differential backups using rsync will be recoverable, if that is actually what you meant.
Re: [PERFORM] pg_dump vs pg_basebackup
Magnus,

That is correct, but I'm afraid that such all-in-one functionality also hides from you how backup really works. Probably this sort of knowledge is so essential for a DBA that it is better to learn both methods, at least to be able to choose correctly? But maybe it is a rhetorical question.

On Tue, Mar 25, 2014 at 4:21 PM, Magnus Hagander mag...@hagander.net wrote:
> I would say that's the one thing that rsync is *not*. pg_basebackup takes care of a lot of things under the hood. rsync is a lot more complicated, in particular in failure scenarios, since you have to manually deal with pg_start/stop_backup().
>
> There are definitely reasons you'd prefer rsync over pg_basebackup, but I don't believe simplicity is one of them.
>
> //Magnus
> Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [PERFORM] pg_dump vs pg_basebackup
OK, agreed. I've got your point ;-)

On Tue, Mar 25, 2014 at 4:40 PM, Magnus Hagander mag...@hagander.net wrote:
Oh, I agree it's good that you should know both methods. I only disagree with the choice of rsync being made with the argument of simplicity. Simplicity is one of the main reasons to choose the *other* method (pg_basebackup), and the rsync method is for more advanced use cases. But it's definitely good to know both!

//Magnus
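To make the trade-off discussed in this thread concrete, the two approaches can be sketched roughly as follows. Host names, paths, the backup label and the replication user are placeholders, and the rsync variant shows only the bare skeleton of the low-level pg_start_backup/pg_stop_backup procedure on a 9.x server, not a recovery-tested script:

```shell
# Method 1: pg_basebackup handles consistency under the hood
# (-F t: tar format, -z: gzip, -P: progress reporting)
pg_basebackup -h dbhost -U replicator -D /backups/base -F t -z -P

# Method 2: rsync around the low-level backup API; consistency is your job
psql -h dbhost -c "SELECT pg_start_backup('nightly', true);"
rsync -a --delete --exclude=pg_xlog/ dbhost:/var/lib/pgsql/9.3/data/ /backups/base/
psql -h dbhost -c "SELECT pg_stop_backup();"
# WAL segments written between start and stop must be archived as well,
# otherwise the rsync'ed copy is not recoverable
```

The attraction of the second method is that a repeated rsync run is differential and therefore cheap, but every copy still needs its matching WAL to be restorable.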
Re: [PERFORM] slow join not using index properly
persists despite re-clustering the index. We are using PostgreSQL 9.2.5 on x86_64-unknown-linux-gnu, compiled by gcc (Debian 4.4.5-8) 4.4.5, 64-bit. Interestingly, on an instance running PostgreSQL 9.2.4 on the same platform and compiler, where I copied the 2 tables over to a temporary database, the plan comes out differently:

HashAggregate  (cost=39692.03..39739.98 rows=4795 width=4) (actual time=73.285..75.141 rows=3486 loops=1)
  Buffers: shared hit=22458
  ->  Nested Loop  (cost=0.00..39680.05 rows=4795 width=4) (actual time=0.077..63.116 rows=13007 loops=1)
        Buffers: shared hit=22458
        ->  Nested Loop  (cost=0.00..32823.38 rows=164 width=4) (actual time=0.056..17.685 rows=2084 loops=1)
              Buffers: shared hit=7529
              ->  Index Only Scan using virtual_ancestors_pkey on virtual_ancestors a2  (cost=0.00..1220.85 rows=396 width=4) (actual time=0.025..2.732 rows=1954 loops=1)
                    Index Cond: (ancestor_key = 1072173)
                    Heap Fetches: 1954
                    Buffers: shared hit=1397
              ## Note the index scan here - this is what it SHOULD be doing ##
              ->  Index Scan using collection_data_context_key_index on collection_data  (cost=0.00..79.24 rows=56 width=8) (actual time=0.004..0.005 rows=1 loops=1954)
                    Index Cond: (collection_context_key = a2.context_key)
                    Buffers: shared hit=6132
        ->  Index Only Scan using virtual_ancestors_pkey on virtual_ancestors a1  (cost=0.00..35.40 rows=641 width=8) (actual time=0.007..0.015 rows=6 loops=2084)
              Index Cond: (ancestor_key = collection_data.context_key)
              Heap Fetches: 13007
              Buffers: shared hit=14929
Total runtime: 76.431 ms

Why can't I get the PostgreSQL 9.2.5 instance to use the optimal plan? Thanks in advance!

/Stefan
--
Stefan Amshey
Re: [PERFORM] Problem with ExclusiveLock on inserts
Vladimir,

pgbouncer works with pl/proxy in transaction pooling mode. The widespread claim that statement pooling mode is for pl/proxy does not imply any limitation on transaction pooling mode, as long as you have autocommit on the client. Anyway, try to reduce connections.

Try to make your autovacuum a bit more aggressive:

autovacuum_analyze_scale_factor=0.05   # or something like that
autovacuum_analyze_threshold=5
autovacuum_freeze_max_age=2
autovacuum_max_workers=20              # that is fine for slow disks
autovacuum_naptime=1
autovacuum_vacuum_cost_delay=5         # or at least 10
autovacuum_vacuum_cost_limit=-1
autovacuum_vacuum_scale_factor=0.01    # this one needs to be really aggressive; otherwise you simply postpone huge vacuums and the related disk I/O, and smaller portions are better
autovacuum_vacuum_threshold=20

You will probably also need some ionice for the autovacuum workers.

On Thu, Feb 13, 2014 at 11:26 AM, Бородин Владимир r...@simply.name wrote:
On 13.02.2014, at 13:29, Ilya Kosmodemiansky hydrobi...@gmail.com wrote:
Vladimir, and, any effect on your problem?

It worked without problems longer than with the previous configuration, but the problem repeated again several minutes ago :(

On Thu, Feb 13, 2014 at 9:35 AM, Бородин Владимир r...@simply.name wrote:
I have limited max connections to 1000, reduced shared buffers to 8G and restarted postgres.

1000 is still too much in most cases. With pgbouncer in transaction pooling mode, a pool size of 8-32, max_connections = 100 (the default value) and 500-1500 client connections looks more reasonable.

Clients for this db are plproxy hosts. As far as I know plproxy can work only with statement pooling.

I have also noticed that these big tables stopped vacuuming automatically a couple of weeks ago. It could be the reason of the problem; I will now try to tune autovacuum parameters to turn it back on. But yesterday I ran vacuum analyze for all relations manually, and that did not help.

How do your autovacuum parameters look now?

They were all default except for vacuum_defer_cleanup_age = 10.
I have increased autovacuum_max_workers = 20 because I have 10 databases with about 10 tables each. That did not make it better (I haven't seen more than two autovacuum workers running simultaneously). Then I tried to set vacuum_cost_limit = 1000. It is still not vacuuming the big tables. Right now the parameters look like this:

root@rpopdb01e ~ # fgrep vacuum /var/lib/pgsql/9.3/data/conf.d/postgresql.conf
#vacuum_cost_delay = 0                  # 0-100 milliseconds
#vacuum_cost_page_hit = 1               # 0-1 credits
#vacuum_cost_page_miss = 10             # 0-1 credits
#vacuum_cost_page_dirty = 20            # 0-1 credits
vacuum_cost_limit = 1000                # 1-1 credits
vacuum_defer_cleanup_age = 10           # number of xacts by which cleanup is delayed
autovacuum = on                         # Enable autovacuum subprocess?  'on'
log_autovacuum_min_duration = 0         # -1 disables, 0 logs all actions and
autovacuum_max_workers = 20             # max number of autovacuum subprocesses
#autovacuum_naptime = 1min              # time between autovacuum runs
#autovacuum_vacuum_threshold = 50       # min number of row updates before vacuum
#autovacuum_analyze_threshold = 50      # min number of row updates before
#autovacuum_vacuum_scale_factor = 0.2   # fraction of table size before vacuum
#autovacuum_analyze_scale_factor = 0.1  # fraction of table size before analyze
#autovacuum_freeze_max_age = 2          # maximum XID age before forced vacuum
#autovacuum_vacuum_cost_delay = 20ms    # default vacuum cost delay for autovacuum, in milliseconds; -1 means use vacuum_cost_delay
#autovacuum_vacuum_cost_limit = -1      # default vacuum cost limit for autovacuum, -1 means use vacuum_cost_limit
#vacuum_freeze_min_age = 5000
#vacuum_freeze_table_age = 15000
root@rpopdb01e ~ #

On 13.02.2014, at 0:14, Ilya Kosmodemiansky hydrobi...@gmail.com wrote:
On Wed, Feb 12, 2014 at 8:57 PM, Бородин Владимир r...@simply.name wrote:
Yes, this is legacy, I will fix it. We had lots of inactive connections, but right now we use pgbouncer for this. When the workload is normal we have something like 80-120 backends; fewer than 10 of them are in active state. When we have the locking problem we get lots of sessions (sometimes more than 1000 of them in active state). According to vmstat the number of context switches is not that big (less than 20k), so I don't think it is the main reason. Yes, it can aggravate the problem, but imho not create it.

I'm afraid that is the problem: more than 1000 backends, most of them simply waiting.

I don't understand the correlation of shared buffers size and synchronous_commit. Could you please
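The autovacuum knobs suggested above combine into one trigger condition: a table is vacuumed once its dead tuples exceed autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * reltuples. A small sketch of the arithmetic (the table size is made up for illustration) shows why the default scale factor postpones huge vacuums on big tables:

```python
def autovacuum_trigger(reltuples, scale_factor, threshold):
    """Dead tuples that must accumulate before autovacuum vacuums the table."""
    return threshold + scale_factor * reltuples

big_table = 100_000_000  # hypothetical 100M-row table

# PostgreSQL 9.x defaults: autovacuum_vacuum_scale_factor=0.2, threshold=50
default_trigger = autovacuum_trigger(big_table, 0.2, 50)      # about 20 million dead tuples
# the aggressive settings suggested above: scale_factor=0.01, threshold=20
aggressive_trigger = autovacuum_trigger(big_table, 0.01, 20)  # about 1 million dead tuples

print(round(default_trigger), round(aggressive_trigger))
```

With the defaults, a 100M-row table accumulates roughly twenty times more dead tuples (and the correspondingly larger burst of disk I/O) before each vacuum than with the aggressive settings; smaller, more frequent vacuums spread that I/O out.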
Re: [PERFORM] Problem with ExclusiveLock on inserts
Vladimir,

And, any effect on your problem?

On Thu, Feb 13, 2014 at 9:35 AM, Бородин Владимир r...@simply.name wrote:
I have limited max connections to 1000, reduced shared buffers to 8G and restarted postgres.

1000 is still too much in most cases. With pgbouncer in transaction pooling mode, a pool size of 8-32, max_connections = 100 (the default value) and 500-1500 client connections looks more reasonable.

I have also noticed that these big tables stopped vacuuming automatically a couple of weeks ago. It could be the reason of the problem; I will now try to tune autovacuum parameters to turn it back on. But yesterday I ran vacuum analyze for all relations manually, and that did not help.

How do your autovacuum parameters look now?

On 13.02.2014, at 0:14, Ilya Kosmodemiansky hydrobi...@gmail.com wrote:
On Wed, Feb 12, 2014 at 8:57 PM, Бородин Владимир r...@simply.name wrote:
Yes, this is legacy, I will fix it. We had lots of inactive connections, but right now we use pgbouncer for this. When the workload is normal we have something like 80-120 backends; fewer than 10 of them are in active state. When we have the locking problem we get lots of sessions (sometimes more than 1000 of them in active state). According to vmstat the number of context switches is not that big (less than 20k), so I don't think it is the main reason. Yes, it can aggravate the problem, but imho not create it.

I'm afraid that is the problem: more than 1000 backends, most of them simply waiting.

I don't understand the correlation of shared buffers size and synchronous_commit. Could you please explain your statement?

You need to fsync your huge shared buffers every time your database performs a checkpoint. By default that usually happens too often, because checkpoint_timeout is 5min by default. Without a BBU, on software RAID, that leads to an I/O spike, and your commit waits for WAL.

On 12.02.2014, at 23:37, Ilya Kosmodemiansky hydrobi...@gmail.com wrote:
Another thing which is arguable is the concurrency degree.
How many of your max_connections = 4000 are actually running? 4000 definitely looks like overkill, and they could be a serious source of contention, especially when you have barriers enabled and software RAID. Also, for 32GB of shared buffers with synchronous_commit = on, especially under a heavy workload, one should definitely have a BBU, otherwise performance will be poor.

On Wed, Feb 12, 2014 at 8:20 PM, Бородин Владимир r...@simply.name wrote:
Oh, I haven't thought about barriers, sorry. Although I use soft raid without batteries, I have turned barriers off on one cluster shard to try:

root@rpopdb01e ~ # mount | fgrep data
/dev/md2 on /var/lib/pgsql/9.3/data type ext4 (rw,noatime,nodiratime)
root@rpopdb01e ~ # mount -o remount,nobarrier /dev/md2
root@rpopdb01e ~ # mount | fgrep data
/dev/md2 on /var/lib/pgsql/9.3/data type ext4 (rw,noatime,nodiratime,nobarrier)
root@rpopdb01e ~ #

On 12.02.2014, at 21:56, Ilya Kosmodemiansky hydrobi...@gmail.com wrote:
My question was actually about the barrier option; by default it is enabled on RHEL6/ext4 and can cause a serious I/O bottleneck before the disks are actually involved. What does mount say without arguments?

On Feb 12, 2014, at 18:43, Бородин Владимир r...@simply.name wrote:
root@rpopdb01e ~ # fgrep data /etc/fstab
UUID=f815fd3f-e4e4-43a6-a6a1-bce1203db3e0 /var/lib/pgsql/9.3/data ext4 noatime,nodiratime 0 1
root@rpopdb01e ~ #

According to iostat the disks are not the bottleneck.

On 12.02.2014, at 21:30, Ilya Kosmodemiansky hydrobi...@gmail.com wrote:
Hi Vladimir, just in case: how is your ext4 mounted?

On Feb 12, 2014, at 17:59, Бородин Владимир r...@simply.name wrote:
Hi all. Today I started getting errors like the one below in the logs (it seems I have not changed anything in the last week). When it happens, the db gets lots of connections in active state, eats 100% cpu, and clients get errors (due to timeout).

2014-02-12 15:44:24.562 MSK,rpop,rpopdb_p6,30061,localhost:58350,52fb5e53.756d,1,SELECT waiting,2014-02-12 15:43:15 MSK,143/264877,1002850566,LOG,0,process 30061 still waiting for ExclusiveLock on extension of relation 26118 of database 24590 after 1000.082 ms,SQL statement insert into rpop.rpop_imap_uidls (folder_id, uidl) values (i_folder_id, i_uidl)

I have read several topics [1, 2, 3, 4] with similar problems but haven't found a good solution. Below is some more diagnostics. I am running PostgreSQL 9.3.2 installed from RPM packages on RHEL 6.4. The host is running with the following CPU (32 cores) and memory:

root@rpopdb01e ~ # fgrep -m1 'model name' /proc/cpuinfo
model name : Intel(R) Xeon(R) CPU E5-2660 0 @ 2.20GHz
root@rpopdb01e ~ # free -m
             total       used       free     shared    buffers     cached
Mem:        129028     123558       5469          0        135     119504
-/+ buffers/cache:       3918     125110
Swap
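The pooling setup suggested in this thread (transaction pooling, pool size 8-32, max_connections = 100, 500-1500 client connections) would look roughly like this in pgbouncer.ini; the database name, host, auth file path and exact numbers are placeholders, not values from this system:

```ini
[databases]
rpopdb = host=127.0.0.1 port=5432 dbname=rpopdb

[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction     ; works with pl/proxy as long as clients keep autocommit on
default_pool_size = 16      ; within the 8-32 range suggested above
max_client_conn = 1000      ; client connections, per the 500-1500 guideline
```

The point is that the server then never sees more than default_pool_size backends per database/user pair, while up to max_client_conn application connections queue at the pooler.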
Re: [PERFORM] Problem with ExclusiveLock on inserts
Hi Vladimir,

Just in case: how is your ext4 mounted?

Best regards, Ilya

On Feb 12, 2014, at 17:59, Бородин Владимир r...@simply.name wrote:
Hi all. Today I started getting errors like the one below in the logs (it seems I have not changed anything in the last week). When it happens, the db gets lots of connections in active state, eats 100% cpu, and clients get errors (due to timeout).

2014-02-12 15:44:24.562 MSK,rpop,rpopdb_p6,30061,localhost:58350,52fb5e53.756d,1,SELECT waiting,2014-02-12 15:43:15 MSK,143/264877,1002850566,LOG,0,process 30061 still waiting for ExclusiveLock on extension of relation 26118 of database 24590 after 1000.082 ms,SQL statement insert into rpop.rpop_imap_uidls (folder_id, uidl) values (i_folder_id, i_uidl)

I have read several topics [1, 2, 3, 4] with similar problems but haven't found a good solution. Below is some more diagnostics.

I am running PostgreSQL 9.3.2 installed from RPM packages on RHEL 6.4. The host is running with the following CPU (32 cores) and memory:

root@rpopdb01e ~ # fgrep -m1 'model name' /proc/cpuinfo
model name : Intel(R) Xeon(R) CPU E5-2660 0 @ 2.20GHz
root@rpopdb01e ~ # free -m
             total       used       free     shared    buffers     cached
Mem:        129028     123558       5469          0        135     119504
-/+ buffers/cache:       3918     125110
Swap:        16378          0      16378
root@rpopdb01e ~ #

PGDATA lives on a RAID6 array of 8 SSD disks with ext4; iostat and atop say the disks are mostly idle. Right now PGDATA takes only 95G. The settings changed in postgresql.conf are here [5]. When it happens, the last query from here [6] shows that almost all queries are waiting for an ExclusiveLock, although they do a simple insert.
(extend,26647,26825,,,) | 5459 | ExclusiveLock | 1 | (extend,26647,26825,,,) | 8053 | ExclusiveLock | 5459,8053
(extend,26647,26828,,,) | 5567 | ExclusiveLock | 1 | (extend,26647,26828,,,) | 5490 | ExclusiveLock | 5567,5490
(extend,24584,25626,,,) | 5611 | ExclusiveLock | 1 | (extend,24584,25626,,,) | 3963 | ExclusiveLock | 5611,3963

I have several databases running on one host with one postmaster process, and the ExclusiveLock is being waited on for many oids. I suppose the only thing they all have in common is that they are bigger than the others and almost never get updates or deletes (only inserts and reads). Some more info about one such table is here [7].

I have tried to look at the source code (src/backend/access/heap/hio.c) to understand when the exclusive lock can be taken, but I could only read the comments :) I have also examined the FSM for these tables and their indexes and found that for most of them there are free pages, but there are, for example, cases like this:

rpopdb_p0=# select count(*) from pg_freespace('rpop.rpop_uidl') where avail != 0;
 count
--------
 115953
(1 row)

rpopdb_p0=# select count(*) from pg_freespace('rpop.pk_rpop_uidl') where avail != 0;
 count
-------
     0
(1 row)

rpopdb_p0=# \dS+ rpop.rpop_uidl
                              Table "rpop.rpop_uidl"
 Column |          Type          | Modifiers | Storage  | Stats target | Description
--------+------------------------+-----------+----------+--------------+-------------
 popid  | bigint                 | not null  | plain    |              |
 uidl   | character varying(200) | not null  | extended |              |
Indexes:
    "pk_rpop_uidl" PRIMARY KEY, btree (popid, uidl)
Has OIDs: no

rpopdb_p0=#

My questions are:
1. Do we consume 100% cpu (in system) trying to get a page from the FSM? Or does it happen while acquiring the exclusive lock? How can I dig into it?
2. By how much do we extend the relation when we get the exclusive lock on it?
3. Why is the extended page not visible to other backends?
4. Is there any possibility of a situation where backend A got the exclusive lock on some relation to extend it, then the OS CPU scheduler made a context switch to backend B while backend B is waiting for the exclusive lock on the same relation, and so on for many backends?
5. (and the main question) What can I do to get rid of such situations? It is a production cluster and I do not have any ideas what to do with this situation :( Any help would be really appreciated.

[1] http://www.postgresql.org/message-id/8bca3aa10906011613l8ac2423h8153bbd2513dc...@mail.gmail.com
[2] http://pgsql.performance.narkive.com/IrkPbl3f/postgresql-9-2-3-performance-problem-caused-exclusive-locks
[3] http://www.postgresql.org/message-id/50a2c93e.9070...@dalibo.com
[4] http://www.postgresql.org/message-id/cal_0b1sypyeoynkynv95nnv2d+4jxtug3hkkf6fahfw7gvg...@mail.gmail.com
[5] http://pastebin.com/raw.php?i=Bd40Vn6h
[6] http://wiki.postgresql.org/wiki/Lock_dependency_information
[7] http://pastebin.com/raw.php?i=eGrtG524

--
Vladimir
Re: [PERFORM] Problem with ExclusiveLock on inserts
My question was actually about the barrier option; by default it is enabled on RHEL6/ext4 and can cause a serious I/O bottleneck before the disks are actually involved. What does mount say without arguments?

On Feb 12, 2014, at 18:43, Бородин Владимир r...@simply.name wrote:
root@rpopdb01e ~ # fgrep data /etc/fstab
UUID=f815fd3f-e4e4-43a6-a6a1-bce1203db3e0 /var/lib/pgsql/9.3/data ext4 noatime,nodiratime 0 1
root@rpopdb01e ~ #

According to iostat the disks are not the bottleneck.
Re: [PERFORM] Problem with ExclusiveLock on inserts
another thing which is arguable - concurrency degree. How many of your max_connections = 4000 are actually running? 4000 definitely looks like an overkill and they could be a serious source of concurrency, especially then you have had barrier enabled and software raid. Plus for 32Gb of shared buffers with synchronous_commit = on especially on heavy workload one should definitely have bbu, otherwise performance will be poor. On Wed, Feb 12, 2014 at 8:20 PM, Бородин Владимир r...@simply.name wrote: Oh, I haven't thought about barriers, sorry. Although I use soft raid without batteries I have turned barriers off on one cluster shard to try. root@rpopdb01e ~ # mount | fgrep data /dev/md2 on /var/lib/pgsql/9.3/data type ext4 (rw,noatime,nodiratime) root@rpopdb01e ~ # mount -o remount,nobarrier /dev/md2 root@rpopdb01e ~ # mount | fgrep data /dev/md2 on /var/lib/pgsql/9.3/data type ext4 (rw,noatime,nodiratime,nobarrier) root@rpopdb01e ~ # 12.02.2014, в 21:56, Ilya Kosmodemiansky hydrobi...@gmail.com написал(а): My question was actually about barrier option, by default it is enabled on RHEL6/ext4 and could cause serious bottleneck on io before disks are actually involved. What says mount without arguments? On Feb 12, 2014, at 18:43, Бородин Владимир r...@simply.name wrote: root@rpopdb01e ~ # fgrep data /etc/fstab UUID=f815fd3f-e4e4-43a6-a6a1-bce1203db3e0 /var/lib/pgsql/9.3/data ext4 noatime,nodiratime 0 1 root@rpopdb01e ~ # According to iostat the disks are not the bottleneck. 12.02.2014, в 21:30, Ilya Kosmodemiansky hydrobi...@gmail.com написал(а): Hi Vladimir, Just in case: how is your ext4 mount? Best regards, Ilya On Feb 12, 2014, at 17:59, Бородин Владимир r...@simply.name wrote: Hi all. Today I have started getting errors like below in logs (seems that I have not changed anything for last week). When it happens the db gets lots of connections in state active, eats 100% cpu and clients get errors (due to timeout). 
2014-02-12 15:44:24.562 MSK,rpop,rpopdb_p6,30061,localhost:58350,52fb5e53.756d,1,SELECT waiting,2014-02-12 15:43:15 MSK,143/264877,1002850566,LOG,0,process 30061 still waiting for ExclusiveLock on extension of relation 26118 of database 24590 after 1000.082 ms,SQL statement insert into rpop.rpop_imap_uidls (folder_id, uidl) values (i_folder_id, i_uidl) I have read several topics [1, 2, 3, 4] with similar problems but haven't find a good solution. Below is some more diagnostics. I am running PostgreSQL 9.3.2 installed from RPM packages on RHEL 6.4. Host is running with the following CPU (32 cores) and memory: root@rpopdb01e ~ # fgrep -m1 'model name' /proc/cpuinfo model name : Intel(R) Xeon(R) CPU E5-2660 0 @ 2.20GHz root@rpopdb01e ~ # free -m total used free sharedbuffers cached Mem:129028 123558 5469 0135 119504 -/+ buffers/cache: 3918 125110 Swap:16378 0 16378 root@rpopdb01e ~ # PGDATA lives on RAID6 array of 8 ssd-disks with ext4, iostat and atop say the disks are really free. Right now PGDATA takes only 95G. The settings changed in postgresql.conf are here [5]. When it happens the last query from here [6] shows that almost all queries are waiting for ExclusiveLock, but they do a simple insert. (extend,26647,26825,,,) |5459 | ExclusiveLock | 1 | (extend,26647,26825,,,) | 8053 | ExclusiveLock | 5459,8053 (extend,26647,26828,,,) |5567 | ExclusiveLock | 1 | (extend,26647,26828,,,) | 5490 | ExclusiveLock | 5567,5490 (extend,24584,25626,,,) |5611 | ExclusiveLock | 1 | (extend,24584,25626,,,) | 3963 | ExclusiveLock | 5611,3963 I have several databases running on one host with one postmaster process and ExclusiveLock is being waited by many oids. I suppose the only common thing for all of them is that they are bigger than others and they almost do not get updates and deletes (only inserts and reads). Some more info about one of such tables is here [7]. 
I have tried to look at the source code (src/backend/access/heap/hio.c) to understand when the exclusive lock can be taken, but I could only read the comments :) I have also examined the FSM for these tables and their indexes and found that for most of them there are free pages, but there are, for example, such cases:

rpopdb_p0=# select count(*) from pg_freespace('rpop.rpop_uidl') where avail != 0;
 count
--------
 115953
(1 row)

rpopdb_p0=# select count(*) from pg_freespace('rpop.pk_rpop_uidl') where avail != 0;
 count
-------
     0
(1 row)

rpopdb_p0=# \dS+ rpop.rpop_uidl
                 Table "rpop.rpop_uidl"
 Column | Type   | Modifiers | Storage | Stats target | Description
--------+--------+-----------+---------+--------------+-------------
 popid  | bigint | not null  | plain   |              |
 uidl
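The tuning advice given in this thread (route connections through pgbouncer instead of allowing thousands of backends, checkpoint less often, and spread checkpoint writes over the interval) might be sketched as a postgresql.conf fragment. The values below are illustrative assumptions for a host like the one described, not settings taken from this thread, and checkpoint_segments applies to the 9.3-era configuration discussed here:

```
# Illustrative sketch only -- all values are assumptions, not from this thread.
max_connections = 200               # pool with pgbouncer rather than 4000 backends
shared_buffers = 32GB               # large buffers mean heavy checkpoint fsync load
checkpoint_timeout = 15min          # checkpoint less often than the 5min default
checkpoint_segments = 64            # avoid WAL-driven checkpoints (9.3 setting)
checkpoint_completion_target = 0.9  # spread checkpoint writes over the interval
synchronous_commit = on             # keep durability; needs a BBU to stay fast
```

The trade-off is standard: longer checkpoint intervals smooth I/O but lengthen crash recovery.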
Re: [PERFORM] Problem with ExclusiveLock on inserts
On Wed, Feb 12, 2014 at 8:57 PM, Бородин Владимир r...@simply.name wrote:
Yes, this is legacy, I will fix it. We had lots of inactive connections, but right now we use pgbouncer for that. When the workload is normal we have around 80-120 backends, and fewer than 10 of them are in active state. When we hit this locking problem we get lots of sessions (sometimes more than 1000 of them in active state). According to vmstat the number of context switches is not that big (less than 20k), so I don't think it is the main reason. Yes, it can aggravate the problem, but IMHO not create it.

I'm afraid that is the problem. More than 1000 backends, most of them simply waiting.

I don't understand the correlation between shared buffers size and synchronous_commit. Could you please explain your statement?

You need to fsync your huge shared buffers every time your database performs a checkpoint. By default that usually happens too often, because checkpoint_timeout is 5min by default. Without a BBU, on software RAID, that leads to an I/O spike, and your commit waits on the WAL.
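To see why a large shared_buffers combined with frequent checkpoints hurts without a write cache, a back-of-envelope calculation helps. All numbers below are assumptions for illustration (the dirty fraction in particular is invented); only the 32GB shared_buffers and the 5min default checkpoint_timeout come from the thread:

```python
# Back-of-envelope estimate of sustained checkpoint write bandwidth.
# Inputs are illustrative assumptions, not measurements from this thread.

def checkpoint_flush_rate_mb_s(shared_buffers_mb, dirty_fraction,
                               checkpoint_timeout_s, completion_target):
    """Average write rate needed to flush the dirty part of shared_buffers
    within the fraction of the checkpoint interval that writes are spread
    over (checkpoint_completion_target)."""
    dirty_mb = shared_buffers_mb * dirty_fraction
    return dirty_mb / (checkpoint_timeout_s * completion_target)

# 32GB shared_buffers, assume 25% of buffers dirty, default 5min timeout,
# writes spread over 90% of the interval.
rate = checkpoint_flush_rate_mb_s(32 * 1024, 0.25, 300, 0.9)
print(round(rate, 1))  # ~30.3 MB/s sustained, just for checkpoint writes
```

That background write load lands on the same disks that must also fsync WAL on every commit with synchronous_commit = on, which is exactly the spike-plus-stall pattern described above; a BBU absorbs the fsyncs, and a longer checkpoint_timeout lowers the rate.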