Re: [PERFORM] Postgres Replaying WAL slowly
On 06/29/2014 03:43 PM, Soni M wrote: top and sar says 100% cpu usage of one core, no sign of I/O wait. Hmm, I wonder what it's doing then... If you have perf installed on the system, you can do perf top to get a quick overlook of where the CPU time is spent. - Heikki -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] GIST optimization to limit calls to operator on sub nodes
Le 29/06/2014 22:14, Emre Hasegeli a écrit : Pujol Mathieu mathieu.pu...@realfusio.com: Hello, I already post my question in the General Mailing list, but without succeed so I try this one that seems to me more specialized. My question is about GIST index. I made my own index to handle specific data and operators. It works pretty fine but I wonder if it was possible to optimize it. When I run my operator on a GIST node (in the method gist_range_consistent) it returns NotConsistent / MaybeConsistent / FullyConsistent. NotConsistent - means that all subnodes could be ignored, gist_range_consistent return false MaybeConsistent - means that at least one subnode/leaf will be consistent, gist_range_consistent return true FullyConsistent - means that all subnodes/leaves will be consistent, gist_range_consistent return true So like with the recheck flag I would like to know if there is a way to notify postgres that it is not necessary to rerun my operator on subnodes, to speedup the search. I do not think it is possible at the moment. The GiST framework can be extended to support this use case. I am not sure about the speedup. Most of the consistent functions do not seem very expensive compared to other operations of the GiST framework. I would be happy to test it, if you would implement. Thanks for your reply. I am not sure to have time to develop inside the framework, but if I try I'll let you know my results. In my case the consistent function is costly and the number of row important so this optimization could save several hundred tests on a single request. -- Mathieu PUJOL Ingénieur Réalité Virtuelle REAL FUSIO - 3D Computer Graphics 10, rue des arts - 31000 TOULOUSE - FRANCE mathieu.pu...@realfusio.com - http://www.realfusio.com -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Guidelines on best indexing strategy for varying searches on 20+ columns
Thanks for your suggestions, very useful. See comments inline: Den 25/06/2014 kl. 23.48 skrev Merlin Moncure mmonc...@gmail.com: On Wed, Jun 25, 2014 at 3:48 AM, Niels Kristian Schjødt nielskrist...@autouncle.com wrote: Hi, I’m running a search engine for cars. It’s backed by a postgresql 9.3 installation. Now I’m unsure about the best approach/strategy on doing index optimization for the fronted search. The problem: The table containing the cars holds a around 1,5 million rows. People that searches for cars needs different criteria to search by. Some search by brand/model, some by year, some by mileage, some by price and some by special equipment etc. etc. - and often they combine a whole bunch of criteria together. Of cause some, like brand/mode and price, are used more frequently than others. In total we offer: 9 category criteria like brand/model or body type, plus 5 numeric criteria like price or mileage, plus 12 boolean criteria like equipment. Lastly people can order the results by different columns (year, price, mileage and a score we create about the cars). By default we order by our own generated score. What I’ve done so far: I have analyzed the usage of the criteria “lightly”, and created a few indexes (10). Among those, are e.g. indexes on price, mileage and a combined index on brand/model. Since we are only interested in showing results for cars which is actually for sale, the indexes are made as partial indexes on a sales state column. Questions: 1. How would you go about analyzing and determining what columns should be indexed, and how? mainly frequency of access. 2. What is the best strategy when optimizing indexes for searches happening on 20 + columns, where the use and the combinations varies a lot? (To just index everything, to index some of the columns, to do combined indexes, to only do single column indexes etc. etc.) don't make 20 indexes. consider installing pg_trgm (for optimized LIKE searching) or hstore (for optmized key value searching) and then using GIST/GIN for multiple attribute search. with 9.4 we have another fancy technique to explore: jsonb searching via GIST/GIN. Interesting, do you have any good resources on this approach? 3. I expect that it does not make sense to index all columns? well, maybe. if you only ever search one column at a time, then it might make sense. but if you need to search arbitrary criteria and frequently combine a large number, then no -- particularly if your dataset is very large and individual criteria are not very selective. So, to just clarify: I’m often combining a large number of search criteria and the individual criteria is often not very selective, in that case, are you arguing for or against indexing all columns? :-) 4. I expect it does not make sense to index boolean columns? in general, no. an important exception is if you are only interested in true or false and the number of records that have that interesting value is tiny relative to the size of the table. in that case, a partial index can be used for massive optimization. Thanks, hadn’t been thinking about using partial indexes here as an option. 5. Is it better to do a combined index on 5 frequently used columns rather than having individual indexes on each of them? Only if you search those 5 columns together a significant portion of the time. 6. Would it be a goof idea to have all indexes sorted by my default sorting? index order rarely matters. if you always search values backwards and the table is very large you may want to consider it. unfortunately this often doesn't work for composite indexes so sometimes we must explore the old school technique of reversing the value. 7. Do you have so experiences with other approaches that could greatly improve performance (e.g. forcing indexes to stay in memory etc.)? as noted above, fancy indexing is the first place to look. start with pg_trgm (for like optmization), hstore, and the new json stuff. the big limitation you will hit is that that most index strategies, at least fo the prepackaged stuff will support '=', or partial string (particularly with pg_trgm like), but not or : for range operations you have to post process the search or try to work the index from another angle. merlin
Re: [PERFORM] GIST optimization to limit calls to operator on sub nodes
Le 29/06/2014 22:30, Tom Lane a écrit : Emre Hasegeli e...@hasegeli.com writes: Pujol Mathieu mathieu.pu...@realfusio.com: I made my own index to handle specific data and operators. It works pretty fine but I wonder if it was possible to optimize it. When I run my operator on a GIST node (in the method gist_range_consistent) it returns NotConsistent / MaybeConsistent / FullyConsistent. NotConsistent - means that all subnodes could be ignored, gist_range_consistent return false MaybeConsistent - means that at least one subnode/leaf will be consistent, gist_range_consistent return true FullyConsistent - means that all subnodes/leaves will be consistent, gist_range_consistent return true So like with the recheck flag I would like to know if there is a way to notify postgres that it is not necessary to rerun my operator on subnodes, to speedup the search. I do not think it is possible at the moment. The GiST framework can be extended to support this use case. I am not sure about the speedup. Most of the consistent functions do not seem very expensive compared to other operations of the GiST framework. I would be happy to test it, if you would implement. I don't actually understand what's being requested here that the NotConsistent case doesn't already cover. regards, tom lane Hi, The NotConsistent case is correctly covered, the sub nodes are not tested because I know that no child could pass the consistent_test. The MaybeConsistent case is also correctly covered, all sub nodes are tested because I don't know which sub nodes will pass the consistent_test. My problem is with the FullyConsistent, because when I test a node I can know that all it's childs nodes and leaves will pass the test, so I want to notify GIST framework that it can't skip consistent test on those nodes. Like we can notify it when testing a leaf that it could skip consistent test on the row. Maybe I miss something on the API to do that. On my tests, the recheck_flag works only for leaves. Thanks Mathieu -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
[PERFORM] Volatility - docs vs behaviour?
Hi all The docs say: For best optimization results, you should label your functions with the strictest volatility category that is valid for them. http://www.postgresql.org/docs/current/interactive/xfunc-volatility.html ... but I recall discussion here suggesting that in fact IMMUTABLE functions may not be inlined where you'd expect, e.g. http://www.postgresql.org/message-id/CAFj8pRBF3Qr7WtQwO1H_WN=hhfgk0semwhde+odz3iyv-tr...@mail.gmail.com That's always seemed counter to my expectations. Am I just misunderstanding? Tom's comment seemed to confirm what Pavel was saying. I know STRICT can prevent inlining (unfortunately, though necessarily), but it seems inexplicable that IMMUTABLE should. If it can, then the documentation is wrong. Which is it? -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] GIST optimization to limit calls to operator on sub nodes
Pujol Mathieu mathieu.pu...@realfusio.com writes: Le 29/06/2014 22:30, Tom Lane a écrit : I don't actually understand what's being requested here that the NotConsistent case doesn't already cover. The NotConsistent case is correctly covered, the sub nodes are not tested because I know that no child could pass the consistent_test. The MaybeConsistent case is also correctly covered, all sub nodes are tested because I don't know which sub nodes will pass the consistent_test. My problem is with the FullyConsistent, because when I test a node I can know that all it's childs nodes and leaves will pass the test, so I want to notify GIST framework that it can't skip consistent test on those nodes. Like we can notify it when testing a leaf that it could skip consistent test on the row. Maybe I miss something on the API to do that. On my tests, the recheck_flag works only for leaves. Hm ... that doesn't seem like a case that'd come up often enough to be worth complicating the APIs for, unless maybe you are expecting a lot of exact-duplicate index entries. If you are, you might find that GIN is a better fit for your problem than GIST --- it's designed to be efficient for lots-of-duplicates. Another view of this is that if you can make exact satisfaction checks at upper-page entries, you're probably storing too much information in the index entries (and thereby bloating the index). The typical tradeoff in GIST indexes is something like storing bounding boxes for geometric objects --- which is necessarily lossy, but it results in small indexes that are fast to search. It's particularly important for upper-page entries to be small, so that fanout is high and you have a better chance of keeping all the upper pages in cache. If you've got a compelling example where this actually makes sense, I'd be curious to hear the details. regards, tom lane -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Postgres Replaying WAL slowly
Here's what 'perf top' said on streaming replica : Samples: 26K of event 'cpu-clock', Event count (approx.): 19781 95.97% postgres [.] 0x002210f3 0.41% perf [.] 0x0005f225 0.39% libc-2.12.so [.] __strstr_sse2 0.22% libc-2.12.so [.] memchr 0.22% [kernel] [k] kallsyms_expand_symbol 0.18% perf [.] symbols__insert 0.18% [kernel] [k] format_decode 0.15% libc-2.12.so [.] __GI___strcmp_ssse3 0.13% [kernel] [k] string 0.12% [kernel] [k] number 0.12% [kernel] [k] vsnprintf 0.12% libc-2.12.so [.] _IO_vfscanf 0.11% perf [.] dso__find_symbol 0.11% [kernel] [k] _spin_unlock_irqrestore 0.10% perf [.] hex2u64 0.10% postgres [.] hash_search_with_hash_value 0.09% perf [.] rb_next 0.08% libc-2.12.so [.] memcpy 0.07% libc-2.12.so [.] __strchr_sse2 0.07% [kernel] [k] clear_page 0.06% [kernel] [k] strnlen 0.05% perf [.] perf_evsel__parse_sample 0.05% perf [.] rb_insert_color 0.05% [kernel] [k] pointer On Mon, Jun 30, 2014 at 2:05 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 06/29/2014 03:43 PM, Soni M wrote: top and sar says 100% cpu usage of one core, no sign of I/O wait. Hmm, I wonder what it's doing then... If you have perf installed on the system, you can do perf top to get a quick overlook of where the CPU time is spent. - Heikki -- Regards, Soni Maula Harriz
Re: [PERFORM] Volatility - docs vs behaviour?
Craig Ringer cr...@2ndquadrant.com writes: The docs say: For best optimization results, you should label your functions with the strictest volatility category that is valid for them. Yeah ... ... but I recall discussion here suggesting that in fact IMMUTABLE functions may not be inlined where you'd expect, e.g. http://www.postgresql.org/message-id/CAFj8pRBF3Qr7WtQwO1H_WN=hhfgk0semwhde+odz3iyv-tr...@mail.gmail.com The reason that case behaved surprisingly was exactly that the user had violated the above bit of documentation, ie, he'd marked the function *incorrectly* as being immutable when in fact its contained functions were only stable. I know STRICT can prevent inlining (unfortunately, though necessarily), but it seems inexplicable that IMMUTABLE should. I don't see why you find that inexplicable. If the planner were to inline this function, it would then fail to reduce a call with constant argument to a constant, which is presumably what the user desires from marking it immutable (questions of correctness in the face of timezone changes notwithstanding). Just as we keep the wrapper on when it's necessary to hide possible non-strictness of the body of a function, we must do so when inlining would raise the visible volatility of an expression. It's true that the above-quoted bit of advice presumes that you correctly identify the strictest volatility category that is valid for a given function. If you're too lazy or uninformed to do that, it might be better to leave the settings at defaults (volatile/nonstrict) and hope the planner can figure out that it's safe to inline anyway. regards, tom lane -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Postgres Replaying WAL slowly
On 06/30/2014 05:46 PM, Soni M wrote: Here's what 'perf top' said on streaming replica : Samples: 26K of event 'cpu-clock', Event count (approx.): 19781 95.97% postgres [.] 0x002210f3 Ok, so it's stuck doing something.. Can you get build with debug symbols installed, so that we could see the function name? - Heikki -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Postgres Replaying WAL slowly
On Jun 30, 2014, at 9:14 AM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 06/30/2014 05:46 PM, Soni M wrote: Here's what 'perf top' said on streaming replica : Samples: 26K of event 'cpu-clock', Event count (approx.): 19781 95.97% postgres [.] 0x002210f3 Ok, so it's stuck doing something.. Can you get build with debug symbols installed, so that we could see the function name? - Heikki Looks like StandbyReleaseLocks: Samples: 10K of event 'cpu-clock', Event count (approx.): 8507 89.21% postgres [.] StandbyReleaseLocks 0.89% libc-2.12.so [.] __strstr_sse2 0.83% perf [.] 0x0005f1e5 0.74% [kernel] [k] kallsyms_expand_symbol 0.52% libc-2.12.so [.] memchr 0.47% perf [.] symbols__insert 0.47% [kernel] [k] format_decode -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Postgres Replaying WAL slowly
On 2014-06-30 19:14:24 +0300, Heikki Linnakangas wrote: On 06/30/2014 05:46 PM, Soni M wrote: Here's what 'perf top' said on streaming replica : Samples: 26K of event 'cpu-clock', Event count (approx.): 19781 95.97% postgres [.] 0x002210f3 Ok, so it's stuck doing something.. Can you get build with debug symbols installed, so that we could see the function name? My guess it's a spinlock, probably xlogctl-info_lck via RecoveryInProgress(). Unfortunately inline assembler doesn't always seem to show up correctly in profiles... What worked for me was to build with -fno-omit-frame-pointer - that normally shows the callers, even if it can't generate a proper symbol name. Soni: Do you use Hot Standby? Are there connections active while you have that problem? Any other processes with high cpu load? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Postgres Replaying WAL slowly
On Tue, Jul 1, 2014 at 12:14 AM, Andres Freund and...@2ndquadrant.com wrote: My guess it's a spinlock, probably xlogctl-info_lck via RecoveryInProgress(). Unfortunately inline assembler doesn't always seem to show up correctly in profiles... What worked for me was to build with -fno-omit-frame-pointer - that normally shows the callers, even if it can't generate a proper symbol name. Soni: Do you use Hot Standby? Are there connections active while you have that problem? Any other processes with high cpu load? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services It is 96.62% postgres [.] StandbyReleaseLocks as Jeff said. It runs quite long time, more than 5 minutes i think i also use hot standby. we have 4 streaming replica, some of them has active connection some has not. this issue has last more than 4 days. On one of the standby, above postgres process is the only process that consume high cpu load. -- Regards, Soni Maula Harriz
Re: [PERFORM] Guidelines on best indexing strategy for varying searches on 20+ columns
On Wed, Jun 25, 2014 at 1:48 AM, Niels Kristian Schjødt nielskrist...@autouncle.com wrote: Hi, I’m running a search engine for cars. It’s backed by a postgresql 9.3 installation. Now I’m unsure about the best approach/strategy on doing index optimization for the fronted search. The problem: The table containing the cars holds a around 1,5 million rows. People that searches for cars needs different criteria to search by. Some search by brand/model, some by year, some by mileage, some by price and some by special equipment etc. etc. - and often they combine a whole bunch of criteria together. Of cause some, like brand/mode and price, are used more frequently than others. In total we offer: 9 category criteria like brand/model or body type, plus 5 numeric criteria like price or mileage, plus 12 boolean criteria like equipment. Lastly people can order the results by different columns (year, price, mileage and a score we create about the cars). By default we order by our own generated score. What I’ve done so far: I have analyzed the usage of the criteria “lightly”, and created a few indexes (10). Among those, are e.g. indexes on price, mileage and a combined index on brand/model. Since we are only interested in showing results for cars which is actually for sale, the indexes are made as partial indexes on a sales state column. I'd probably partition the data on whether it is for sale, and then search only the for-sale partition. Questions: 1. How would you go about analyzing and determining what columns should be indexed, and how? I'd start out with intuition about which columns are likely to be used most often, and in a selective way. And followup by logging slow queries so they can be dissected at leisure. 2. What is the best strategy when optimizing indexes for searches happening on 20 + columns, where the use and the combinations varies a lot? (To just index everything, to index some of the columns, to do combined indexes, to only do single column indexes etc. etc.) There is no magic index. Based on your description, you are going to be seq scanning your table a lot. Focus on making it as small as possible, but vertical partitioning it so that the not-for-sale entries are hived off to an historical table, and horizontally partitioning it so that large columns rarely used in the where clause are in a separate table (Ideally you would tell postgresql to aggressively toast those columns, but there is no knob with which to do that) 3. I expect that it does not make sense to index all columns? You mean individually, or jointly? Either way, probably not. 4. I expect it does not make sense to index boolean columns? In some cases it can, for example if the data distribution is very lopsided and the value with the smaller side is frequently specified. 5. Is it better to do a combined index on 5 frequently used columns rather than having individual indexes on each of them? How often are the columns specified together? If they are completely independent it probably makes little sense to index them together. 6. Would it be a goof idea to have all indexes sorted by my default sorting? You don't get to choose. An btree index is sorted by the columns specified in the index, according to the operators specified (or defaulted). Unless you mean that you want to add the default sort column to be the lead column in each index, that actually might make sense. 7. Do you have so experiences with other approaches that could greatly improve performance (e.g. forcing indexes to stay in memory etc.)? If your queries are as unstructured as you imply, I'd forget about indexes for the most part, as you will have a hard time findings ones that work. Concentrate on making seq scans as fast as possible. If most of your queries end in something like ORDER by price limit 10 then concentrate on index scans over price. You will probably want to include heuristics in your UI such that if people configure queries to download half your database, you disallow that. You will probably find that 90% of the workload comes from people who are just playing around with your website and don't actually intend to do business with you. Cheers, Jeff -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Postgres Replaying WAL slowly
On Jun 30, 2014, at 10:29 AM, Soni M diptat...@gmail.com wrote: On Tue, Jul 1, 2014 at 12:14 AM, Andres Freund and...@2ndquadrant.com wrote: My guess it's a spinlock, probably xlogctl-info_lck via RecoveryInProgress(). Unfortunately inline assembler doesn't always seem to show up correctly in profiles... What worked for me was to build with -fno-omit-frame-pointer - that normally shows the callers, even if it can't generate a proper symbol name. Soni: Do you use Hot Standby? Are there connections active while you have that problem? Any other processes with high cpu load? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services It is 96.62% postgres [.] StandbyReleaseLocks as Jeff said. It runs quite long time, more than 5 minutes i think i also use hot standby. we have 4 streaming replica, some of them has active connection some has not. this issue has last more than 4 days. On one of the standby, above postgres process is the only process that consume high cpu load. compiled with -fno-omit-frame-pointer doesn't yield much more info: 76.24% postgres [.] StandbyReleaseLocks 2.64% libcrypto.so.1.0.1e[.] md5_block_asm_data_order 2.19% libcrypto.so.1.0.1e[.] RC4 2.17% postgres [.] RecordIsValid 1.20% [kernel] [k] copy_user_generic_unrolled 1.18% [kernel] [k] _spin_unlock_irqrestore 0.97% [vmxnet3] [k] vmxnet3_poll_rx_only 0.87% [kernel] [k] __do_softirq 0.77% [vmxnet3] [k] vmxnet3_xmit_frame 0.69% postgres [.] hash_search_with_hash_value 0.68% [kernel] [k] fin However, this server started progressing through the WAL files quite a bit better before I finished compiling, so we'll leave it running with this version and see if there's more info available the next time it starts replaying slowly.
Re: [PERFORM] Postgres Replaying WAL slowly
On 2014-06-30 11:34:52 -0700, Jeff Frost wrote: On Jun 30, 2014, at 10:29 AM, Soni M diptat...@gmail.com wrote: It is 96.62% postgres [.] StandbyReleaseLocks as Jeff said. It runs quite long time, more than 5 minutes i think i also use hot standby. we have 4 streaming replica, some of them has active connection some has not. this issue has last more than 4 days. On one of the standby, above postgres process is the only process that consume high cpu load. compiled with -fno-omit-frame-pointer doesn't yield much more info: You'd need to do perf record -ga instead of perf record -a to see additional information. But: 76.24% postgres [.] StandbyReleaseLocks already is quite helpful. What are you doing on that system? Is there anything requiring large amounts of access exclusive locks on the primary? Possibly large amounts of temporary relations? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Postgres Replaying WAL slowly
2014-06-30 20:34 GMT+02:00 Jeff Frost j...@pgexperts.com: On Jun 30, 2014, at 10:29 AM, Soni M diptat...@gmail.com wrote: On Tue, Jul 1, 2014 at 12:14 AM, Andres Freund and...@2ndquadrant.com wrote: My guess it's a spinlock, probably xlogctl-info_lck via RecoveryInProgress(). Unfortunately inline assembler doesn't always seem to show up correctly in profiles... For this kind of issues a systemtap or dtrace can be useful http://postgres.cz/wiki/Monitorov%C3%A1n%C3%AD_lwlocku_pomoc%C3%AD_systemtapu you can identify what locking is a problem - please, use a google translate Regards Pavel What worked for me was to build with -fno-omit-frame-pointer - that normally shows the callers, even if it can't generate a proper symbol name. Soni: Do you use Hot Standby? Are there connections active while you have that problem? Any other processes with high cpu load? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ http://www.2ndquadrant.com/ PostgreSQL Development, 24x7 Support, Training Services It is 96.62% postgres [.] StandbyReleaseLocks as Jeff said. It runs quite long time, more than 5 minutes i think i also use hot standby. we have 4 streaming replica, some of them has active connection some has not. this issue has last more than 4 days. On one of the standby, above postgres process is the only process that consume high cpu load. compiled with -fno-omit-frame-pointer doesn't yield much more info: 76.24% postgres [.] StandbyReleaseLocks 2.64% libcrypto.so.1.0.1e[.] md5_block_asm_data_order 2.19% libcrypto.so.1.0.1e[.] RC4 2.17% postgres [.] RecordIsValid 1.20% [kernel] [k] copy_user_generic_unrolled 1.18% [kernel] [k] _spin_unlock_irqrestore 0.97% [vmxnet3] [k] vmxnet3_poll_rx_only 0.87% [kernel] [k] __do_softirq 0.77% [vmxnet3] [k] vmxnet3_xmit_frame 0.69% postgres [.] hash_search_with_hash_value 0.68% [kernel] [k] fin However, this server started progressing through the WAL files quite a bit better before I finished compiling, so we'll leave it running with this version and see if there's more info available the next time it starts replaying slowly.
Re: [PERFORM] Postgres Replaying WAL slowly
On Jun 30, 2014, at 11:39 AM, Andres Freund and...@2ndquadrant.com wrote: On 2014-06-30 11:34:52 -0700, Jeff Frost wrote: On Jun 30, 2014, at 10:29 AM, Soni M diptat...@gmail.com wrote: It is 96.62% postgres [.] StandbyReleaseLocks as Jeff said. It runs quite long time, more than 5 minutes i think i also use hot standby. we have 4 streaming replica, some of them has active connection some has not. this issue has last more than 4 days. On one of the standby, above postgres process is the only process that consume high cpu load. compiled with -fno-omit-frame-pointer doesn't yield much more info: You'd need to do perf record -ga instead of perf record -a to see additional information. Ah! That's right. Here's how that looks: Samples: 473K of event 'cpu-clock', Event count (approx.): 473738 + 68.42% init [kernel.kallsyms] [k] native_safe_halt + 26.07% postgres postgres [.] StandbyReleaseLocks + 2.82% swapper [kernel.kallsyms] [k] native_safe_halt + 0.19% ssh libcrypto.so.1.0.1e [.] md5_block_asm_data_order + 0.19% postgres postgres [.] RecordIsValid + 0.16% ssh libcrypto.so.1.0.1e [.] RC4 + 0.10% postgres postgres [.] hash_search_with_hash_value + 0.06% postgres [kernel.kallsyms] [k] _spin_unlock_irqrestore + 0.05% init [vmxnet3] [k] vmxnet3_poll_rx_only + 0.04% postgres [kernel.kallsyms] [k] copy_user_generic_unrolled + 0.04% init [kernel.kallsyms] [k] finish_task_switch + 0.04% init [kernel.kallsyms] [k] __do_softirq + 0.04% ssh [kernel.kallsyms] [k] _spin_unlock_irqrestore + 0.04% ssh [vmxnet3] [k] vmxnet3_xmit_frame + 0.03% postgres postgres [.] PinBuffer + 0.03% init [vmxnet3] [k] vmxnet3_xmit_frame + 0.03% ssh [kernel.kallsyms] [k] copy_user_generic_unrolled + 0.03% postgres postgres [.] XLogReadBufferExtended + 0.03% ssh ssh [.] 0x0002aa07 + 0.03% init [kernel.kallsyms] [k] _spin_unlock_irqrestore + 0.03% ssh [vmxnet3] [k] vmxnet3_poll_rx_only + 0.02% ssh [kernel.kallsyms] [k] __do_softirq + 0.02% postgres libc-2.12.so [.] _wordcopy_bwd_dest_aligned + 0.02% postgres postgres [.] mdnblocks + 0.02% ssh libcrypto.so.1.0.1e [.] 0x000e25a1 + 0.02% scp [kernel.kallsyms] [k] copy_user_generic_unrolled + 0.02% ssh libc-2.12.so [.] memcpy + 0.02% postgres libc-2.12.so [.] memcpy But: 76.24% postgres [.] StandbyReleaseLocks already is quite helpful. What are you doing on that system? Is there anything requiring large amounts of access exclusive locks on the primary? Possibly large amounts of temporary relations? The last time we did a 100% logging run, the peak temp table creation was something like 120k/hr, but the replicas seemed able to keep up with that just fine. Hopefully Soni can answer whether that has increased significantly since May. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Postgres Replaying WAL slowly
On Jun 30, 2014, at 12:17 PM, Jeff Frost j...@pgexperts.com wrote: already is quite helpful. What are you doing on that system? Is there anything requiring large amounts of access exclusive locks on the primary? Possibly large amounts of temporary relations? The last time we did a 100% logging run, the peak temp table creation was something like 120k/hr, but the replicas seemed able to keep up with that just fine. Sampling pg_locks on the primary shows ~50 locks with ExclusiveLock mode: mode | count --+--- AccessExclusiveLock |11 AccessShareLock | 2089 ExclusiveLock|46 RowExclusiveLock |81 RowShareLock |17 ShareLock| 4 ShareUpdateExclusiveLock | 5 Seems to be relatively consistent. Of course, it's hard to say what it looked like back when the issue began.
Re: [PERFORM] Postgres Replaying WAL slowly
Jeff Frost j...@pgexperts.com writes: Sampling pg_locks on the primary shows ~50 locks with ExclusiveLock mode: mode | count --+--- AccessExclusiveLock |11 AccessShareLock | 2089 ExclusiveLock|46 RowExclusiveLock |81 RowShareLock |17 ShareLock| 4 ShareUpdateExclusiveLock | 5 That's not too helpful if you don't pay attention to what the lock is on; it's likely that all the ExclusiveLocks are on transactions' own XIDs, which isn't relevant to the standby's behavior. The AccessExclusiveLocks are probably interesting though --- you should look to see what those are on. regards, tom lane -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Postgres Replaying WAL slowly
On Jun 30, 2014, at 12:32 PM, Tom Lane t...@sss.pgh.pa.us wrote: Jeff Frost j...@pgexperts.com writes: Sampling pg_locks on the primary shows ~50 locks with ExclusiveLock mode: mode | count --+--- AccessExclusiveLock |11 AccessShareLock | 2089 ExclusiveLock|46 RowExclusiveLock |81 RowShareLock |17 ShareLock| 4 ShareUpdateExclusiveLock | 5 That's not too helpful if you don't pay attention to what the lock is on; it's likely that all the ExclusiveLocks are on transactions' own XIDs, which isn't relevant to the standby's behavior. The AccessExclusiveLocks are probably interesting though --- you should look to see what those are on. You're right about the ExclusiveLocks. Here's how the AccessExclusiveLocks look: locktype | database | relation | page | tuple | virtualxid | transactionid | classid | objid| objsubid | virtualtransaction | pid |mode | granted --+--++--+---++---+-++--++---+-+- relation | 111285 | 3245291551 | | || | || | 233/170813 | 23509 | AccessExclusiveLock | t relation | 111285 | 3245292820 | | || | || | 5/22498235 | 23427 | AccessExclusiveLock | t relation | 111285 | 3245292833 | | || | || | 173/1723993| 23407 | AccessExclusiveLock | t relation | 111285 | 3245287874 | | || | || | 133/3818415| 23348 | AccessExclusiveLock | t relation | 111285 | 3245292836 | | || | || | 173/1723993| 23407 | AccessExclusiveLock | t relation | 111285 | 3245292774 | | || | || | 5/22498235 | 23427 | AccessExclusiveLock | t relation | 111285 | 3245292734 | | || | || | 5/22498235 | 23427 | AccessExclusiveLock | t relation | 111285 | 3245292827 | | || | || | 173/1723993| 23407 | AccessExclusiveLock | t relation | 111285 | 3245288540 | | || | || | 133/3818415| 23348 | AccessExclusiveLock | t relation | 111285 | 3245292773 | | || | || | 5/22498235 | 23427 | AccessExclusiveLock | t relation | 111285 | 3245292775 | | || | || | 5/22498235 | 23427 | AccessExclusiveLock | t relation | 111285 | 3245292743 | | || | || | 5/22498235 | 23427 | AccessExclusiveLock | t relation | 111285 | 3245292751 | | || | || | 5/22498235 | 23427 | AccessExclusiveLock | t relation | 111285 | 3245288669 | | || | || | 133/3818415| 23348 | AccessExclusiveLock | t relation | 111285 | 3245292817 | | || | || | 5/22498235 | 23427 | AccessExclusiveLock | t relation | 111285 | 3245288657 | | || | || | 133/3818415| 23348 | AccessExclusiveLock | t object | 111285 || | || | 2615 | 1246019760 |0 | 233/170813 | 23509 | AccessExclusiveLock | t relation | 111285 | 3245292746 | | || | || | 5/22498235 | 23427 | AccessExclusiveLock | t relation | 111285 | 3245287876 | | || | || | 133/3818415| 23348 | AccessExclusiveLock | t relation | 111285 | 3245292739 | | || | || | 5/22498235 | 23427 | AccessExclusiveLock | t relation | 111285 | 3245292826 | | || | || | 5/22498235 | 23427 | AccessExclusiveLock | t relation | 111285 | 3245292825 | | || | || | 5/22498235 | 23427 | AccessExclusiveLock | t relation | 111285 | 3245292832 | | |
Re: [PERFORM] Postgres Replaying WAL slowly
On Mon, Jun 30, 2014 at 4:42 PM, Jeff Frost j...@pgexperts.com wrote: And if you go fishing in pg_class for any of the oids, you don't find anything: That is probably because you are connected in the wrong database. Once you connect to the database of interest, you don't even need to query pg_class, just cast relation attribute to regclass: SELECT relation::regclass, ... FROM pg_locks WHERE database = (SELECT oid FROM pg_database WHERE datname = current_database()); Regards, -- Matheus de Oliveira Analista de Banco de Dados Dextra Sistemas - MPS.Br nível F! www.dextra.com.br/postgres
Re: [PERFORM] Postgres Replaying WAL slowly
On Jun 30, 2014, at 12:54 PM, Matheus de Oliveira matioli.math...@gmail.com wrote: On Mon, Jun 30, 2014 at 4:42 PM, Jeff Frost j...@pgexperts.com wrote: And if you go fishing in pg_class for any of the oids, you don't find anything: That is probably because you are connected in the wrong database. Once you connect to the database of interest, you don't even need to query pg_class, just cast relation attribute to regclass: SELECT relation::regclass, ... FROM pg_locks WHERE database = (SELECT oid FROM pg_database WHERE datname = current_database()); Yah, i thought about that too, but verified I am in the correct DB. Just for clarity sake: SELECT relation::regclass FROM pg_locks WHERE database = (SELECT oid FROM pg_database WHERE datname = current_database()) and mode = 'AccessExclusiveLock'; relation 3245508214 3245508273 3245508272 3245508257 3245508469 3245508274 3245508373 3245508468 3245508210 3245508463 3245508205 3245508260 3245508265 3245508434 (16 rows)
Re: [PERFORM] Postgres Replaying WAL slowly
On 2014-06-30 12:57:56 -0700, Jeff Frost wrote: On Jun 30, 2014, at 12:54 PM, Matheus de Oliveira matioli.math...@gmail.com wrote: On Mon, Jun 30, 2014 at 4:42 PM, Jeff Frost j...@pgexperts.com wrote: And if you go fishing in pg_class for any of the oids, you don't find anything: That is probably because you are connected in the wrong database. Once you connect to the database of interest, you don't even need to query pg_class, just cast relation attribute to regclass: SELECT relation::regclass, ... FROM pg_locks WHERE database = (SELECT oid FROM pg_database WHERE datname = current_database()); Yah, i thought about that too, but verified I am in the correct DB. Just for clarity sake: So these are probably relations created in uncommitted transactions. Possibly ON COMMIT DROP temp tables? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Postgres Replaying WAL slowly
On Jun 30, 2014, at 1:15 PM, Andres Freund and...@2ndquadrant.com wrote: On 2014-06-30 12:57:56 -0700, Jeff Frost wrote: On Jun 30, 2014, at 12:54 PM, Matheus de Oliveira matioli.math...@gmail.com wrote: On Mon, Jun 30, 2014 at 4:42 PM, Jeff Frost j...@pgexperts.com wrote: And if you go fishing in pg_class for any of the oids, you don't find anything: That is probably because you are connected in the wrong database. Once you connect to the database of interest, you don't even need to query pg_class, just cast relation attribute to regclass: SELECT relation::regclass, ... FROM pg_locks WHERE database = (SELECT oid FROM pg_database WHERE datname = current_database()); Yah, i thought about that too, but verified I am in the correct DB. Just for clarity sake: So these are probably relations created in uncommitted transactions. Possibly ON COMMIT DROP temp tables? That would make sense. There are definitely quite a few of those being used. Another item of note is the system catalogs are quite bloated: schemaname | tablename | tbloat | wastedmb | idxbloat | wastedidxmb +--++--+--+- pg_catalog | pg_attribute | 3945 | 106.51 | 2770 | 611.28 pg_catalog | pg_class | 8940 |45.26 | 4420 | 47.89 pg_catalog | pg_type | 4921 |18.45 | 5850 | 81.16 pg_catalog | pg_depend|933 |10.23 |11730 | 274.37 pg_catalog | pg_index | 3429 | 8.36 | 3920 | 24.24 pg_catalog | pg_shdepend |983 | 2.67 | 9360 | 30.56 (6 rows) Would that cause the replica to spin on StandbyReleaseLocks?
Re: [PERFORM] Postgres Replaying WAL slowly
Jeff Frost j...@pgexperts.com writes: On Jun 30, 2014, at 1:15 PM, Andres Freund and...@2ndquadrant.com wrote: So these are probably relations created in uncommitted transactions. Possibly ON COMMIT DROP temp tables? That would make sense. There are definitely quite a few of those being used. Uh-huh. I doubt that the mechanism that handles propagation of AccessExclusiveLocks to the standby is smart enough to ignore locks on temp tables :-( Another item of note is the system catalogs are quite bloated: Would that cause the replica to spin on StandbyReleaseLocks? AFAIK, no. It's an unsurprising consequence of heavy use of short-lived temp tables though. So it seems like we have a candidate explanation. I'm a bit surprised that StandbyReleaseLocks would get this slow if there are only a dozen AccessExclusiveLocks in place at any one time, though. Perhaps that was a low point and there are often many more? regards, tom lane -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Postgres Replaying WAL slowly
On Jun 30, 2014, at 1:39 PM, Tom Lane t...@sss.pgh.pa.us wrote: Another item of note is the system catalogs are quite bloated: Would that cause the replica to spin on StandbyReleaseLocks? AFAIK, no. It's an unsurprising consequence of heavy use of short-lived temp tables though. Yah, this has been an issue in the past, so we tend to cluster them regularly during off-hours to minimize the issue. So it seems like we have a candidate explanation. I'm a bit surprised that StandbyReleaseLocks would get this slow if there are only a dozen AccessExclusiveLocks in place at any one time, though. Perhaps that was a low point and there are often many more? Entirely possible that it was a low point. We'll set up some monitoring to track the number of AccessExclusiveLocks and see how much variance there is throughout the day. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Postgres Replaying WAL slowly
On Jun 30, 2014, at 1:46 PM, Jeff Frost j...@pgexperts.com wrote: So it seems like we have a candidate explanation. I'm a bit surprised that StandbyReleaseLocks would get this slow if there are only a dozen AccessExclusiveLocks in place at any one time, though. Perhaps that was a low point and there are often many more? Entirely possible that it was a low point. We'll set up some monitoring to track the number of AccessExclusiveLocks and see how much variance there is throughout the day. Since we turned on the monitoring for that, we had a peak of 13,550 AccessExclusiveLocks. So far most of the samples have been in the double digit, with that and two other outliers: 6,118 and 12,747.
Re: [PERFORM] Postgres Replaying WAL slowly
Jeff Frost j...@pgexperts.com writes: So it seems like we have a candidate explanation. I'm a bit surprised that StandbyReleaseLocks would get this slow if there are only a dozen AccessExclusiveLocks in place at any one time, though. Perhaps that was a low point and there are often many more? Since we turned on the monitoring for that, we had a peak of 13,550 AccessExclusiveLocks. Ah ... that's more like a number I can believe something would have trouble coping with. Did you see a noticeable slowdown with this? Now that we've seen that number, of course it's possible there was an even higher peak occurring when you saw the trouble. Perhaps there's an O(N^2) behavior in StandbyReleaseLocks, or maybe it just takes awhile to handle that many locks. Did you check whether the locks were all on temp tables of the ON COMMIT DROP persuasion? regards, tom lane -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Postgres Replaying WAL slowly
On 2014-06-30 19:04:20 -0400, Tom Lane wrote: Jeff Frost j...@pgexperts.com writes: So it seems like we have a candidate explanation. I'm a bit surprised that StandbyReleaseLocks would get this slow if there are only a dozen AccessExclusiveLocks in place at any one time, though. Perhaps that was a low point and there are often many more? Since we turned on the monitoring for that, we had a peak of 13,550 AccessExclusiveLocks. Any chance the workload also uses lots of subtransactions? Ah ... that's more like a number I can believe something would have trouble coping with. Did you see a noticeable slowdown with this? Now that we've seen that number, of course it's possible there was an even higher peak occurring when you saw the trouble. Perhaps there's an O(N^2) behavior in StandbyReleaseLocks, or maybe it just takes awhile to handle that many locks. I don't think there's a O(n^2) in StandbyReleaseLocks() itself, but in combination with StandbyReleaseLockTree() it looks possibly bad. The latter will call StandbyReleaseLocks() for every xid/subxid, and each of the StandbyReleaseLocks() will then trawl the entire RecoveryLockList... It'd probably be better to implement ReleaseLocksTree() by sorting the subxid list and bsearch that while iterating RecoveryLockList. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Postgres Replaying WAL slowly
On Jun 30, 2014, at 4:04 PM, Tom Lane t...@sss.pgh.pa.us wrote: Ah ... that's more like a number I can believe something would have trouble coping with. Did you see a noticeable slowdown with this? Now that we've seen that number, of course it's possible there was an even higher peak occurring when you saw the trouble. Perhaps there's an O(N^2) behavior in StandbyReleaseLocks, or maybe it just takes awhile to handle that many locks. Did you check whether the locks were all on temp tables of the ON COMMIT DROP persuasion? Unfortunately not, because I went for a poor man's: SELECT count(*) FROM pg_locks WHERE mode = 'AccessExclusiveLock' run in cron every minute. That said, I'd bet it was mostly ON COMMIT DROP temp tables. The unfortunate thing is I wouldn't know how to correlate that spike with the corresponding slowdown because the replica is about 5.5hrs lagged at the moment. Hopefully it will get caught up tonight and we can see if there's a correlation tomorrow.
Re: [PERFORM] Volatility - docs vs behaviour?
On 06/30/2014 11:49 PM, Tom Lane wrote: Craig Ringer cr...@2ndquadrant.com writes: The docs say: For best optimization results, you should label your functions with the strictest volatility category that is valid for them. Yeah ... ... but I recall discussion here suggesting that in fact IMMUTABLE functions may not be inlined where you'd expect, e.g. http://www.postgresql.org/message-id/CAFj8pRBF3Qr7WtQwO1H_WN=hhfgk0semwhde+odz3iyv-tr...@mail.gmail.com The reason that case behaved surprisingly was exactly that the user had violated the above bit of documentation, ie, he'd marked the function *incorrectly* as being immutable when in fact its contained functions were only stable. Yes, I realise that's the case with this particular incident. It's the more general case I'm interested in - whether this can be true in general, not just when the user does something dumb. It sounds like you're saying that the behaviour observed here is specific to cases where the user incorrectly identifies the function volatility. In which case we don't care, that's fine, no problem here. My concern was only with whether the advice that the highest volatility category should be used is always true for *correct* immutable functions too. I know STRICT can prevent inlining (unfortunately, though necessarily), but it seems inexplicable that IMMUTABLE should. I don't see why you find that inexplicable. If the planner were to inline this function, it would then fail to reduce a call with constant argument to a constant, which is presumably what the user desires from marking it immutable (questions of correctness in the face of timezone changes notwithstanding). Just as we keep the wrapper on when it's necessary to hide possible non-strictness of the body of a function, we must do so when inlining would raise the visible volatility of an expression. If the input is constant, then clearly it should be evaluated and a constant substituted. If it _isn't_ a constant input, then why would STRICT inline when IMMUTABLE doesn't? It's true that the above-quoted bit of advice presumes that you correctly identify the strictest volatility category that is valid for a given function. If you're too lazy or uninformed to do that, it might be better to leave the settings at defaults (volatile/nonstrict) and hope the planner can figure out that it's safe to inline anyway. I was unaware that the planner made any attempt to catch users' errors in marking the strictness of functions. I thought it pretty much trusted the user not to lie about the mutability of functions invoked indirectly. I'm not really sure where in the inlining code to look to figure that out. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Volatility - docs vs behaviour?
Craig Ringer cr...@2ndquadrant.com writes: I was unaware that the planner made any attempt to catch users' errors in marking the strictness of functions. I thought it pretty much trusted the user not to lie about the mutability of functions invoked indirectly. I'm not really sure where in the inlining code to look to figure that out. It's in optimizer/util/clauses.c: /* * Additional validity checks on the expression. It mustn't return a set, * and it mustn't be more volatile than the surrounding function (this is * to avoid breaking hacks that involve pretending a function is immutable * when it really ain't). If the surrounding function is declared strict, * then the expression must contain only strict constructs and must use * all of the function parameters (this is overkill, but an exact analysis * is hard). */ if (expression_returns_set(newexpr)) goto fail; if (funcform-provolatile == PROVOLATILE_IMMUTABLE contain_mutable_functions(newexpr)) goto fail; else if (funcform-provolatile == PROVOLATILE_STABLE contain_volatile_functions(newexpr)) goto fail; As the comment says, this wasn't really coded with an eye towards catching user error. Rather, there are known use-cases where people intentionally use SQL wrapper functions to lie about the mutability of some underlying function; inlining would expose the truth of the matter and thus defeat such hacks. Now I'd be the first to agree that this isn't a terribly high-performance way of doing that, but the point here was to not change the behavior that existed before SQL inlining did. regards, tom lane -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance