Re: [HACKERS] Wait events monitoring future development
On Mon, Aug 8, 2016 at 7:03 PM, Bruce Momjian wrote: > It seems asking users to run pg_test_timing before deploying to check > the overhead would be sufficient. I'am not sure. Time measurement for waits is slightly more complicated than a time measurement for explain analyze: a good workload plus using gettimeofday in a straightforward manner can cause huge overhead. Thats why a proper testing is important - if we can see a significant performance drop if we have for example large shared_buffers with the same concurrency, that shows gettimeofday is too expensive to use. Am I correct, that we do not have such accurate tests now? My another concern is, that it is a bad idea to release a feature, which allegedly has huge performance impact even if it is not turned on by default. I often meet people who do not use exceptions in plpgsql because a tip "A block containing an EXCEPTION clause is significantly more expensive to enter ..." in PostgreSQL documentation -- Ilya Kosmodemiansky, PostgreSQL-Consulting.com tel. +14084142500 cell. +4915144336040 i...@postgresql-consulting.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Wait events monitoring future development
Hi, I've summarized Wait events monitoring discussion at Developer unconference in Ottawa this year on wiki: https://wiki.postgresql.org/wiki/PgCon_2016_Developer_Unconference/Wait_events_monitoring (Thanks to Alexander Korotkov for patiently pushing me to make this thing finally done) If you attended, fill free to point me out if I missed something, I will put it on the wiki too. Wait event monitoring looks ones again stuck on the way through community approval in spite of huge progress done last year in that direction. The importance of the topic is beyond discussion now, if you talk to any PostgreSQL person about implementing such a tool in Postgres and if the person does not get exited, probably you talk to a full-time PostgreSQL developer;-) Obviously it needs a better design, both the user interface and implementation, and perhaps this is why full-time developers are still sceptical. In order to move forward, imho we need at least some steps, whose steps can be done in parallel 1. Further requirements need to be collected from DBAs. If you are a PostgreSQL DBA with Oracle experience and use perf for troubleshooting Postgres - you are an ideal person to share your experience, but everyone is welcome. 2. Further pg_wait_sampling performance testing needed and in different environments. According to developers, overhead is small, but many people have doubts that it can be much more significant for intensive workloads. Obviously, it is not an easy task to test, because you need to put doubtfully non-production ready code into mission-critical production for such tests. As a result it will be clear if this design should be abandoned and we need to think about less-invasive solutions or this design is acceptable. Any thoughts? Best regards, Ilya -- Ilya Kosmodemiansky, PostgreSQL-Consulting.com tel. +14084142500 cell. +4915144336040 i...@postgresql-consulting.com
Re: [HACKERS] RFC: replace pg_stat_activity.waiting with something more descriptive
On Thu, Jun 25, 2015 at 1:49 PM, Amit Kapila wrote: >> Personally I think, that tracking waits is a not a good idea for >> pg_stat_activity (at least in that straight-forward manner). > > As mentioned in the initial mail by Robert, that sometimes system becomes > slow (either due to contention on various kinds of locks or due to I/O or > due > to some other such reasons) that such kind of handy information via some > view is quite useful. Recently while working on one of the > performance/scalability > projects, I need to use gdb to attach to different processes to see what > they > are doing (of course one can use perf or some other utilities as well) and I > found most of them were trying to wait on some LW locks, now having such > an information available via view could be really useful, because sometimes > at customer sites, we can't use gdb or perf to see what's going on. Yes, I understand such a use-case. But I hardly see if suggested design can help for such cases. Basically, a DBA has two reasons to take a look on waits: 1. Long response time for particular query (or some type of queries). In that case it is good to know how much time we spend on waiting for particular resources we need to get query results 2. Overall bad performance of a database. We know, that something goes wrong and consumes resources, we need to identify which backend, which query causes the most of waits. In both cases we need a) some historical data rather than simple snapshot b) some approach how to aggregate it because the will be certainly a lot of events So my point is, we need separate interface for waits, instead of integrating in pg_stat_activity. And it should be several interfaces: one for approximate top of waiting sessions (like active_sessions_history in oracle), one for detailed tracing of a session, one for waits per resource statistics etc. >> One >> process can wait for lots of things between 2 sampling of >> pg_stat_activity and that sampling can be pretty useless. >> > > Yeah, that's right and I am not sure if we should bother about such > scenario's > as the system is generally fine in such situations, however there are other > cases where we can find most of the backends are waiting on one or other > thing. I think approach with top of waiting sessions covers both scenarios (well, with only one exception: if we have billions of very short waits and high contention is the problem) However, it maybe a good idea, to identify the resource we are waiting for from pg_stat_activity if we are waiting for a long time. > > I think this is some what different kind of utility which can give us > aggregated information and I think this will address different kind of > usecase and will have somewhat more complex design and it doesn't > look impossible to use part of what will be developed as part of this > proposal. > I think it is more than possible to mix both approaches. My proof of concept now is only about LWLocks - yours and Robert's is more general, and certainly some wait event classification will be needed for both approaches and its much better to implement one rather than two different. And at least, I will be interesting in reviewing your approach. > > With Regards, > Amit Kapila. > EnterpriseDB: http://www.enterprisedb.com -- Ilya Kosmodemiansky, PostgreSQL-Consulting.com tel. +14084142500 cell. +4915144336040 i...@postgresql-consulting.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] RFC: replace pg_stat_activity.waiting with something more descriptive
Hi all On Thu, Jun 25, 2015 at 12:37 PM, Amit Kapila wrote: > 2. Add 2 new columns to pg_stat_activity > waiting_resource - true for waits other heavy wait locks, false >otherwise > wait_event - description code for the wait event > > 3. Add new view 'pg_stat_wait_event' with following info: > pid - process id of this backend > waiting - true for any form of wait, false otherwise > wait_event_type - Heavy Weight Lock, Light Weight Lock, I/O wait, etc > wait_event - Lock (Relation), Lock (Relation Extension), etc Personally I think, that tracking waits is a not a good idea for pg_stat_activity (at least in that straight-forward manner). One process can wait for lots of things between 2 sampling of pg_stat_activity and that sampling can be pretty useless. My approach (about which Ive had a talk mentioned by Jim and which I hope to finalize and submit within a few days) is a bit different and I believe is more useful: 1. Some sort of histogram of top waits within entire database by pid. That will be an approximate one, because I hardly believe there is a possibility to make a precise one without significant overhead. 2. Some cyclic buffer of more precise wait statistic inside each worker. Sampling may be turned on if we see some issues in histogram (1) and want to have some more details. > Do you think 2nd or 3rd could be viable way to proceed for this feature? > > With Regards, > Amit Kapila. > EnterpriseDB: http://www.enterprisedb.com -- Ilya Kosmodemiansky, PostgreSQL-Consulting.com tel. +14084142500 cell. +4915144336040 i...@postgresql-consulting.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Dynamic LWLock tracing via pg_stat_lwlock (proof of concept)
On Tue, Oct 7, 2014 at 4:45 PM, Robert Haas wrote: >> It's not like it'd be significantly different today - in a read mostly >> workload that's bottlenecked on ProcArrayLock you'll not see many >> waits. There you'd have to count the total number of spinlocks cycles to >> measure anything interesting. > > Hmm, really? I've never had to do that to find bottlenecks. Not sure. Long waiting time represents the same thing better or at least well enough. I think only acquisitions count is important. >>> Having said that, if there's no blocking or spindelay any more, to me >>> that doesn't mean we should look for some other measure of contention >>> instead. It just means that the whole area is a solved problem, we >>> don't need to measure contention any more because there isn't any, and >>> we can move on to other issues once we finish partying. But mildly >>> skeptical that the outcome will be as good as all that. >> >> It's not. Just because we're not waiting in a spinlock loop doesn't mean >> there can't be contention... It's just moved one level down, into the cpu. > > I guess that's true, but how much of the contention at that level is > really important to expose to DBAs? We can put anything that is of > developer interest only int LWLOCK_STATS. For DBA all this means we are waiting for a lock. > > -- > Robert Haas > EnterpriseDB: http://www.enterprisedb.com > The Enterprise PostgreSQL Company -- Ilya Kosmodemiansky, PostgreSQL-Consulting.com tel. +14084142500 cell. +4915144336040 i...@postgresql-consulting.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Dynamic LWLock tracing via pg_stat_lwlock (proof of concept)
On Tue, Oct 7, 2014 at 4:30 PM, Andres Freund wrote: > On 2014-10-07 17:22:18 +0300, Heikki Linnakangas wrote: >> FWIW, I liked Ilya's design. Before going to sleep, store the lock ID in >> shared memory. When you wake up, clear it. That should be cheap enough to >> have it always enabled. And it can easily be extended to other "waits", e.g. >> when you're waiting for input from client. > > I think there's a few locks where that's interesting. But in my > experience many slowdowns aren't caused by actual waits, but because of > cacheline contention. And for that the number of acquisitions is much > more relevant than the waiting. The primary example for this is probably > the procarray lock. I would say, that to see particular lwlockid 50 times in 100 samples or to see it 50 times one after another or see it only 2 times, provides good and representative information for DBA. At least better than nothing. > >> I don't think counting the number of lock acquisition is that interesting. >> It doesn't give you any information on how long the waits were, for >> example. > > Sure, that's a separate thing that we should be able to answer. The point is that a lot of short waits sometimes could be as worse as one long wait. That is why it is important, but I thing propper sampling provides good estimation for this. > > Greetings, > > Andres Freund > > -- > Andres Freund http://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Training & Services -- Ilya Kosmodemiansky, PostgreSQL-Consulting.com tel. +14084142500 cell. +4915144336040 i...@postgresql-consulting.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Dynamic LWLock tracing via pg_stat_lwlock (proof of concept)
On Tue, Oct 7, 2014 at 4:12 PM, Andres Freund wrote: >> I think the easiest way to measure lwlock contention would be to put >> some counters in the lwlock itself. My guess, based on a lot of >> fiddling with LWLOCK_STATS over the years, is that there's no way to >> count lock acquisitions and releases without harming performance >> significantly - no matter where we put the counters, it's just going >> to be too expensive. However, I believe that incrementing a counter - >> even in the lwlock itself - might not be too expensive if we only do >> it when (1) a process goes to sleep or (2) spindelays occur. > > Increasing the size will be painful on its own :(. I am afraid in this case we should think about minimizing overhead but not about avoiding it at all: having such DBA-friendly feature it is worth it. Let me step down a bit, since the discussion went to details, while the whole design idea stays unclear. What actually we need: fact, that lwlock acquired? lock count? time spent in lock? overall lock duration? Usual way to explain how any of such performance tools work, is Traffic example (and any oracle/db2 wait-interface aware DBA knows it): You have some from home to office way and spend an hour to make it. You try to optimize it and found, that however you take highway with no speed limit, you usually stack in traffic turning from highway to your office and spend there about 10-30 min. Alternative is to take another way with 2 speed limit zones and one traffic light, totally you will loose 2 and 5 minutes on speed limit parts and 2 min on red light - overall better than 30 minutes in a jam and even better than 10 min in a jam. That is all about: to found bottleneck we need information that process hold certain lock, that it was held certain time or there are a lot of shorter time locks. I think, sampling even 1-2 times pro second and building sort of histogram is well enough at the moment, because it shows (not very in a very precise manner however) that process hold certain lock, that it was held certain time or there are a lot of shorter time locks. After that it is possible to implement something more precise. (As far as I know, Greg Smith works on some sort of wait events, but it seems to me there are a lot of work to do to implement exact analog of OWI) -- Ilya Kosmodemiansky, PostgreSQL-Consulting.com tel. +14084142500 cell. +4915144336040 i...@postgresql-consulting.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Dynamic LWLock tracing via pg_stat_lwlock (proof of concept)
On Fri, Oct 3, 2014 at 5:51 PM, Robert Haas wrote: > I do think that the instrumentation data gathered by LWLOCK_STATS is > useful - very useful. Sure, quite useful. But how about this comment: /* * The LWLock stats will be updated within a critical section, which * requires allocating new hash entries. Allocations within a critical * section are normally not allowed because running out of memory would * lead to a PANIC, but LWLOCK_STATS is debugging code that's not normally * turned on in production, so that's an acceptable risk. The hash entries * are small, so the risk of running out of memory is minimal in practice. */ > But it does have significant overhead. I will say that it is a bit more than overhead for production use. -- Ilya Kosmodemiansky, PostgreSQL-Consulting.com tel. +14084142500 cell. +4915144336040 i...@postgresql-consulting.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Dynamic LWLock tracing via pg_stat_lwlock (proof of concept)
On Fri, Oct 3, 2014 at 5:33 PM, Bruce Momjian wrote: > As far as gathering data, I don't think we are going to do any better in > terms of performance/simplicity/reliability than to have a single PGPROC > entry to record when we enter/exit a lock, and having a secondary > process scan the PGPROC array periodically. That was the point. > > What that gives us is almost zero overhead on backends, high > reliability, and the ability of the scan daemon to give higher weights > to locks that are held longer. Basically, if you just stored the locks > you held and released, you either have to add timing overhead to the > backends, or you have no timing information collected. By scanning > active locks, a short-lived lock might not be seen at all, while a > longer-lived lock might be seen by multiple scans. What that gives us > is a weighting of the lock time with almost zero overhead. If we want > finer-grained lock statistics, we just increase the number of scans per > second. So I could add the function, which will accumulate the data in some view/table (with weights etc). How it should be called? From specific process? From some existing maintenance process such as autovacuum? Should I implement GUC for example lwlock_pull_rate, 0 for off, from 1 to 10 for 1 to 10 samples pro second? > > I am assuming almost no one cares about the number of locks, but rather > they care about cummulative lock durations. Oracle and DB2 measure both, cummulative durations and counts. > > I am having trouble seeing any other option that has such a good > cost/benefit profile. At least cost. In Oracle documentation clearly stated, that it is all about diagnostic convenience, performance impact is significant. > > -- > Bruce Momjian http://momjian.us > EnterpriseDB http://enterprisedb.com > > + Everyone has their own god. + -- Ilya Kosmodemiansky, PostgreSQL-Consulting.com tel. +14084142500 cell. +4915144336040 i...@postgresql-consulting.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Dynamic LWLock tracing via pg_stat_lwlock (proof of concept)
On Thu, Oct 2, 2014 at 5:25 AM, Craig Ringer wrote: > It's not at all clear to me that a DTrace-like (or perf-based, rather) > approach is unsafe, slow, or unsuitable for production use. > With appropriate wrapper tools I think we could have quite a useful > library of perf-based diagnostics and tracing tools for PostgreSQL. It is not actually very slow, overhead is quite reasonable since we want such comprehensive performance diagnostics. About stability, I have had a couple of issues with postgres crushes with dtrace and dos not without. Most of them was on FreeBSD, which is still in use by many people and were caused actually by freebsd dtrace, but for me it is quite enough to have doubts about keeping dtrace aware build in production. OK, OK - maybe things were changed last couple of years or will change soon - still dtrace/perf is well enough for those who is familiar with it, but you need a really convenient wrapper to make oracle/db2 DBA happy with using such approach. > Resolving lock IDs to names might be an issue, though. I am afraid it is > > -- > Craig Ringer http://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Training & Services -- Ilya Kosmodemiansky, PostgreSQL-Consulting.com tel. +14084142500 cell. +4915144336040 i...@postgresql-consulting.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Dynamic LWLock tracing via pg_stat_lwlock (proof of concept)
On Thu, Oct 2, 2014 at 11:50 AM, Andres Freund wrote: > Not just from a oracle DBA POV ;). Generally. sure >> Saying that, principally they mean an >> Oracle Wait Interface analogue. The Basic idea is to have counters or >> sensors all around database kernel to measure what a particular >> backend is currently waiting for and how long/how often it waits. > > Yes, I can see that. I'm not sure whether lwlocks are the primary point > I'd start with though. In many cases you'll wait on so called > 'heavyweight' locks too... I try to kill two birds with one stone: make some prepositional work on main large topic and deliver some convenience about LWLock diagnostics. Maybe I'm wrong, but it seems to me it is much easier task to advocate some more desired feature: we have some heavyweight locks diagnostics tools and they are better than for lwlocks. > >> Suppose we have a PostgreSQL instance under heavy write workload, but >> we do not know any details. We could pull from time to time >> pg_stat_lwlock function which would say pid n1 currently in >> WALWriteLock and pid n2 in WALInsertLock. That means we should think >> about write ahead log tuning. Or pid n1 is in some clog-related >> LWLock, which means we need move clog to ramdisk. This is a stupid >> example, but it shows how useful LWLock tracing could be for DBAs. >> Even better idea is to collect daily LWLock distribution, find most >> frequent of them etc. > > I think it's more complicated than that - but I also think it'd be a > great help for DBAs and us postgres hackers. Sure it is more complicated, the example is stupid, just to show the point. >> An idea of this patch is to trace LWLocks with the lowest possible >> performance impact. We put integer lwLockID into procarray, then >> acquiring the LWLock we put its id to procarray and now we could pull >> procarray using a function to see if particular pid holds LWLock. > > But a backend can hold more than one lwlock at the same time? I don't > think that's something we can ignore. Yes, this one of the next steps. I have not figure out yet, how to do it less painfully than LWLOCK_STATS does. > I personally am doubtful that it makes much sense to move this into an > extension. It'll likely be tightly enough interlinked to backend code > that I don't see the point. But I'd not be surprised if others feel > differently. Thats why I asked this question, and also because I have no idea where exactly put this functions inside backend if not into extension. But probably there are some more important tasks with this work than moving the function inside, I could do this later if it will be necessary. > > I generally don't think you'll get interesting data without a fair bit > of additional work. Sure > The first problem that comes to my mind about collecting enough data is > that we have a very large number of lwlocks (fixed_number + 2 * > shared_buffers). One 'trivial' way of implementing this is to have a per > backend array collecting the information, and then a shared one > accumulating data from it over time. But I'm afraid that's not going to > fly :(. Hm. With the above sets of stats that'd be ~50MB per backend... > > Perhaps we should somehow encode this different for individual lwlock > tranches? It's far less problematic to collect all this information for > all but the buffer lwlocks... That is a good point. There are actually two things to keep in mind: i) user interface, ii) implementation i) Personally, as a DBA, I do not see much sense in unaggregated list of pid, lwlockid, wait_time or something like that. Much better to have aggregation by pid and lwlockid, for instance: - pid - lwlockid - lwlockname - total_count (or number of exclusive/shared acquirations that had to wait as you suggest, since we have a lot of lwlocks I am doubtful about how important is the information about non-waiting lwlocks) ii) Am I correct, that you suggest to go trough MainLWLockTranche and retrieve all available lwlock information to some structure like lwLockCell structure I've used in my patch? Something like hash lwlocid->usagecount? Regards, Ilya > > Greetings, > > Andres Freund > > -- > Andres Freund http://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Training & Services -- Ilya Kosmodemiansky, PostgreSQL-Consulting.com tel. +14084142500 cell. +4915144336040 i...@postgresql-consulting.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Dynamic LWLock tracing via pg_stat_lwlock (proof of concept)
s, WALWriteLock looks more useful than just 8)? To get it from tranche as T_NAME(lock) at the same time as lwLockId seems to be not a nice idea, especially to put the name into procarray to pull it. It spoils whole idea. Any advice? 4. Going through procarray to get pairs pid, lwlockid I do LWLockAcquire(ProcArrayLock, LW_SHARED); I am not quite sure is a good idea, because of its performance impact. Probably such a histogram-style thing does not need strong consistency and ProcArrayLock is not needed? So, any thoughts, should I proceed implementing this feature? Best regards, Ilya PS I am thankful to Simon Riggs and Bruce Momjian for discussing this idea with me on PGCon, and especially to Simon who gave me a brief but pretty useful tutorial how LWLock-related code lives in Postgres. Also I am thankful to Heikki Linnakangas and Magnus Hagander for answering some of my stupid questions about procarray internals. -- Ilya Kosmodemiansky, PostgreSQL-Consulting.com tel. +14084142500 cell. +4915144336040 i...@postgresql-consulting.com pg_stat_lwlock_0.1.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SKIP LOCKED DATA
That is quite useful feature to implement smth. like message queues based on database and so on. Now there is possibility to jump over luck of such feature in Postgres using current advisory lock implementation (pg_try_advisory_xact_lock to determine if somebody already acquired log on particular row). So Im not sure this is an urgent matter. Best regards, Ilya On Mon, Jan 16, 2012 at 6:33 AM, Tom Lane wrote: > It sounds to me like "silently give the wrong answers". Are you sure > there are not better, more deterministic ways to solve your problem? > > (Or in other words: the fact that Oracle has it isn't enough to persuade > me it's a good idea.) > > regards, tom lane > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Postgres 9.1 - Release Theme
Nice to hear and thumbs up! I've just start planning to migrate one of my telco 3Tb database running blunt oracle to coachDb but now of course postgres looks better. Hopefully stupid transactions will be abrogated to wbr Ilya On Thu, Apr 1, 2010 at 12:33 PM, Dave Page wrote: > On Thu, Apr 1, 2010 at 9:30 AM, Thom Brown wrote: >> I prefer to dump all my data in a big text file and grep it for the >> information I need. > > There's no need to start showing off and get all technical y'know. > > -- > Dave Page > EnterpriseDB UK: http://www.enterprisedb.com > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers > -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers