Re: [HACKERS] Better way of dealing with pgstat wait timeout during buildfarm runs?
On 2015-01-21 22:43:03 -0500, Matt Kelly wrote:
> > Sure, but nobody who is not a developer is going to care about that.
> > A typical user who sees "pgstat wait timeout", or doesn't, isn't going
> > to be able to make anything at all out of that.
>
> As a user, I wholeheartedly disagree.

Note that the change we discussed wasn't removing the message. It was
changing the log level from WARNING to LOG, which means the message no
longer goes to the client, but still goes to the server log (perhaps
surprisingly, the likelihood of the latter actually increases).

> I think the warning is incredibly valuable. Along those lines I'd also
> love to see a pg_stat_snapshot_timestamp() for monitoring code to use to
> determine if it's using a stale snapshot, as well as helping to smooth
> graphs of the statistics that are based on highly granular snapshotting.

I can see that being useful.

Greetings,

Andres Freund

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
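[Editorial note: a monitoring-side staleness check built on the proposed pg_stat_snapshot_timestamp() might look like the sketch below. This is a hypothetical illustration only — the function did not exist at the time of this exchange, and the 30-second limit is an arbitrary assumption, not anything from the thread.]

```python
from datetime import datetime, timedelta, timezone

# Assumed threshold: treat a stats snapshot older than this as untrustworthy.
STALENESS_LIMIT = timedelta(seconds=30)

def snapshot_is_stale(snapshot_ts, now, limit=STALENESS_LIMIT):
    # snapshot_ts stands in for the value a function like the proposed
    # pg_stat_snapshot_timestamp() would return; now is the observation time.
    # A snapshot older than `limit` should not be used for rate calculations.
    return (now - snapshot_ts) > limit

# Synthetic observation times for illustration.
now = datetime(2015, 1, 22, 12, 0, 0, tzinfo=timezone.utc)
fresh_ts = now - timedelta(seconds=1)
stale_ts = now - timedelta(seconds=45)
```

Monitoring code could then skip or flag a sample whose snapshot timestamp fails this check instead of graphing a misleading 0 tps.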
Re: [HACKERS] Better way of dealing with pgstat wait timeout during buildfarm runs?
> > Sure, but nobody who is not a developer is going to care about that.
> > A typical user who sees "pgstat wait timeout", or doesn't, isn't going
> > to be able to make anything at all out of that.

As a user, I wholeheartedly disagree. That warning helped me massively in
diagnosing an unhealthy database server in the past at TripAdvisor (i.e. a
high-end, server-class box, not a Raspberry Pi).

I have realtime monitoring that looks at pg_stat_database at regular
intervals, particularly at the rate of change of the xact_commit and
xact_rollback columns, similar to how check_postgres does it:
https://github.com/bucardo/check_postgres/blob/master/check_postgres.pl#L4234

When one of those servers was unhealthy, it stopped reporting statistics
for 30+ seconds at a time. My dashboard, which polled far more frequently
than that, indicated the server was normally processing 0 tps with
intermittent spikes. I went directly onto the server and sampled
pg_stat_database. That warning was the only thing that directly indicated
that the statistics collector was not to be trusted. It obviously was a
victim of what was going on in the server, but it's pretty important to
know when your methods for measuring server health are lying to you. The
spiky TPS at first glance appears like some sort of livelock, not just an
overloaded server. Now I know: 0 change in stats = collector broken.

Rereading the docs:

> Also, the collector itself emits a new report at most once per
> PGSTAT_STAT_INTERVAL milliseconds (500 ms unless altered while building
> the server).

Without context this merely reads: "We sleep for 500ms, plus the time to
write the file, plus whenever the OS decides to enforce the timer
interrupt... so like 550-650ms." It doesn't read: "When the server is
unhealthy but _still_ serving queries, the stats collector might not be
able to keep up and will just stop reporting stats altogether."

I think the warning is incredibly valuable.
Along those lines I'd also love to see a pg_stat_snapshot_timestamp() for
monitoring code to use to determine if it's using a stale snapshot, as
well as helping to smooth graphs of the statistics that are based on
highly granular snapshotting.

- Matt Kelly
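[Editorial note: the polling approach Matt describes — tracking the rate of change of xact_commit and xact_rollback between samples of pg_stat_database — can be sketched as below. This is a hypothetical illustration, not TripAdvisor's actual monitoring code; the sample dicts stand in for rows read from pg_stat_database.]

```python
def tps(prev, curr):
    # Transactions per second between two pg_stat_database samples.
    # Each sample holds the xact_commit/xact_rollback counters plus the
    # wall-clock time ("ts", seconds) at which it was taken.
    elapsed = curr["ts"] - prev["ts"]
    if elapsed <= 0:
        return 0.0
    delta = (curr["xact_commit"] - prev["xact_commit"]) + \
            (curr["xact_rollback"] - prev["xact_rollback"])
    return delta / elapsed

def stats_look_stalled(prev, curr):
    # If the counters did not move at all on a known-busy server, suspect
    # that the stats collector stopped reporting, rather than believing
    # the server is really doing 0 tps.
    return (curr["xact_commit"] == prev["xact_commit"] and
            curr["xact_rollback"] == prev["xact_rollback"])

# Two synthetic samples taken 10 seconds apart.
s1 = {"ts": 0.0, "xact_commit": 1000, "xact_rollback": 10}
s2 = {"ts": 10.0, "xact_commit": 6000, "xact_rollback": 10}
```

With these samples, tps(s1, s2) reports 500 tps; if a later pair of samples shows no counter movement, stats_look_stalled() flags the "0 tps" reading as suspect — exactly the failure mode described above.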
Re: [HACKERS] Better way of dealing with pgstat wait timeout during buildfarm runs?
On 21.1.2015 00:38, Michael Paquier wrote:
> On Wed, Jan 21, 2015 at 1:08 AM, Tomas Vondra wrote:
>> I've tried to reproduce this on my Raspberry PI 'machine' and it's not
>> very difficult to trigger this. About 7 out of 10 'make check' runs
>> fail because of 'pgstat wait timeout'.
>>
>> All the occurrences I've seen were right after some sort of VACUUM
>> (sometimes plain, sometimes ANALYZE or FREEZE), and the I/O at the
>> time looked something like this:
>>
>> Device:  rrqm/s  wrqm/s   r/s   w/s   rkB/s   wkB/s avgrq-sz avgqu-sz    await r_await  w_await  svctm  %util
>> mmcblk0    0.00   75.00  0.00  8.00    0.00   36.00     9.00     5.73 15633.75    0.00 15633.75 125.00 100.00
>>
>> So pretty terrible (this is a Class 4 SD card, supposedly able to
>> handle 4 MB/s). If hamster had a faulty SD card, it might have been
>> much worse, I guess.
>
> By experience, a class 10 card is at least necessary, along with a
> minimum amount of memory, to minimize the appearance of those warnings;
> hamster now has an 8GB class 10 card.

Well, my goal was exactly to produce those warnings ;-) and see if I can
identify some strange cases. That's why I chose just class 4. But even
then it produces a rather low number of those warnings (one or two per
check run), and mostly at the expected places with significant I/O
overload. So I'm not any wiser :-(

regards
Tomas
Re: [HACKERS] Better way of dealing with pgstat wait timeout during buildfarm runs?
On Wed, Jan 21, 2015 at 1:08 AM, Tomas Vondra wrote:
> On 25.12.2014 22:28, Tomas Vondra wrote:
>> On 25.12.2014 21:14, Andres Freund wrote:
>>> That's indeed odd. Seems to have been lost when the statsfile was
>>> split into multiple files. Alvaro, Tomas?
>>
>> The goal was to keep the logic as close to the original as possible.
>> IIRC there were "pgstat wait timeout" issues before, and in most cases
>> the conclusion was that it's probably because of overloaded I/O.
>>
>> But maybe there actually was another bug, and it's entirely possible
>> that the split introduced a new one, and that's what we're seeing now.
>> The strange thing is that the split happened ~2 years ago, which is
>> inconsistent with the sudden increase of this kind of issues. So maybe
>> something changed on that particular animal (a failing SD card causing
>> I/O stalls, perhaps)?
>>
>> Anyway, I happen to have a spare Raspberry PI, so I'll try to reproduce
>> and analyze the issue locally. But that won't happen until January.
>
> I've tried to reproduce this on my Raspberry PI 'machine' and it's not
> very difficult to trigger this. About 7 out of 10 'make check' runs
> fail because of 'pgstat wait timeout'.
>
> All the occurrences I've seen were right after some sort of VACUUM
> (sometimes plain, sometimes ANALYZE or FREEZE), and the I/O at the time
> looked something like this:
>
> Device:  rrqm/s  wrqm/s   r/s   w/s   rkB/s   wkB/s avgrq-sz avgqu-sz    await r_await  w_await  svctm  %util
> mmcblk0    0.00   75.00  0.00  8.00    0.00   36.00     9.00     5.73 15633.75    0.00 15633.75 125.00 100.00
>
> So pretty terrible (this is a Class 4 SD card, supposedly able to
> handle 4 MB/s). If hamster had a faulty SD card, it might have been
> much worse, I guess.

By experience, a class 10 card is at least necessary, along with a
minimum amount of memory, to minimize the appearance of those warnings;
hamster now has an 8GB class 10 card.
--
Michael
Re: [HACKERS] Better way of dealing with pgstat wait timeout during buildfarm runs?
On 25.12.2014 22:28, Tomas Vondra wrote:
> On 25.12.2014 21:14, Andres Freund wrote:
>> That's indeed odd. Seems to have been lost when the statsfile was
>> split into multiple files. Alvaro, Tomas?
>
> The goal was to keep the logic as close to the original as possible.
> IIRC there were "pgstat wait timeout" issues before, and in most cases
> the conclusion was that it's probably because of overloaded I/O.
>
> But maybe there actually was another bug, and it's entirely possible
> that the split introduced a new one, and that's what we're seeing now.
> The strange thing is that the split happened ~2 years ago, which is
> inconsistent with the sudden increase of this kind of issues. So maybe
> something changed on that particular animal (a failing SD card causing
> I/O stalls, perhaps)?
>
> Anyway, I happen to have a spare Raspberry PI, so I'll try to reproduce
> and analyze the issue locally. But that won't happen until January.

I've tried to reproduce this on my Raspberry PI 'machine' and it's not
very difficult to trigger this. About 7 out of 10 'make check' runs fail
because of 'pgstat wait timeout'.

All the occurrences I've seen were right after some sort of VACUUM
(sometimes plain, sometimes ANALYZE or FREEZE), and the I/O at the time
looked something like this:

Device:  rrqm/s  wrqm/s   r/s   w/s   rkB/s   wkB/s avgrq-sz avgqu-sz    await r_await  w_await  svctm  %util
mmcblk0    0.00   75.00  0.00  8.00    0.00   36.00     9.00     5.73 15633.75    0.00 15633.75 125.00 100.00

So pretty terrible (this is a Class 4 SD card, supposedly able to handle
4 MB/s). If hamster had a faulty SD card, it might have been much worse,
I guess.

This of course does not prove the absence of a bug - I plan to dig into
this a bit more. Feel free to point out some suspicious scenarios that
might be worth reproducing and analyzing.
--
Tomas Vondra                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] Better way of dealing with pgstat wait timeout during buildfarm runs?
On Sat, Dec 27, 2014 at 8:51 PM, Tom Lane wrote:
> Robert Haas writes:
>> On Sat, Dec 27, 2014 at 7:55 PM, Tom Lane wrote:
>>> This would have the effect of transferring all responsibility for
>>> dead-stats-entry cleanup to autovacuum. For ordinary users, I think
>>> that'd be just fine. It might be less fine though for people who
>>> disable autovacuum, if there still are any.
>
>> -1. I don't think it's a good idea to inflict pain on people who want
>> to schedule their vacuums manually (and yes, there are some) to get
>> clean buildfarm runs.
>
> Did you read the rest of it?

Yeah... since when do I not read your emails?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Better way of dealing with pgstat wait timeout during buildfarm runs?
Robert Haas writes:
> On Sat, Dec 27, 2014 at 7:55 PM, Tom Lane wrote:
>> This would have the effect of transferring all responsibility for
>> dead-stats-entry cleanup to autovacuum. For ordinary users, I think
>> that'd be just fine. It might be less fine though for people who
>> disable autovacuum, if there still are any.

> -1. I don't think it's a good idea to inflict pain on people who want
> to schedule their vacuums manually (and yes, there are some) to get
> clean buildfarm runs.

Did you read the rest of it?

regards, tom lane
Re: [HACKERS] Better way of dealing with pgstat wait timeout during buildfarm runs?
On Sat, Dec 27, 2014 at 7:55 PM, Tom Lane wrote:
> Heikki Linnakangas writes:
>> On 12/27/2014 12:16 AM, Alvaro Herrera wrote:
>>> Tom Lane wrote:
>>>> The argument that autovac workers need fresher stats than anything
>>>> else seems pretty dubious to start with. Why shouldn't we simplify
>>>> that down to "they use PGSTAT_STAT_INTERVAL like everybody else"?
>
>>> The point of wanting fresher stats than that, eons ago, was to avoid a
>>> worker vacuuming a table that some other worker vacuumed more recently
>>> than PGSTAT_STAT_INTERVAL. ...
>>> Nowadays we can probably disregard the whole issue, since starting a
>>> new vacuum just after the prior one finished should not cause much
>>> stress to the system thanks to the visibility map.
>
>> Vacuuming is far from free, even if the visibility map says that most
>> pages are visible to all: you still scan all indexes, if you remove any
>> dead tuples at all.
>
> With typical autovacuum settings, I kinda doubt that there's much value
> in reducing the window for this problem from 500ms to 10ms. As Alvaro
> says, this was just a partial, kluge solution from the start --- if
> we're worried about such duplicate vacuuming, we should undertake a real
> solution that closes the window altogether. In any case, timeouts
> occurring inside autovacuum are not directly causing the buildfarm
> failures, since autovacuum's log entries don't reflect into regression
> outputs. (It's possible that autovacuum's tight tolerance is
> contributing to the failures by increasing the load on the stats
> collector, but I'm not sure I believe that.)
>
> To get back to that original complaint about buildfarm runs failing,
> I notice that essentially all of those failures are coming from "wait
> timeout" warnings reported by manual VACUUM commands. Now, VACUUM itself
> has no need to read the stats files. What's actually causing these
> messages is failure to get a timely response in pgstat_vacuum_stat().
> So let me propose a drastic solution: let's dike out this bit in
> vacuum.c:
>
>     /*
>      * Send info about dead objects to the statistics collector, unless we are
>      * in autovacuum --- autovacuum.c does this for itself.
>      */
>     if ((vacstmt->options & VACOPT_VACUUM) && !IsAutoVacuumWorkerProcess())
>         pgstat_vacuum_stat();
>
> This would have the effect of transferring all responsibility for
> dead-stats-entry cleanup to autovacuum. For ordinary users, I think
> that'd be just fine. It might be less fine though for people who
> disable autovacuum, if there still are any.

-1. I don't think it's a good idea to inflict pain on people who want
to schedule their vacuums manually (and yes, there are some) to get
clean buildfarm runs.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Better way of dealing with pgstat wait timeout during buildfarm runs?
Heikki Linnakangas writes:
> On 12/27/2014 12:16 AM, Alvaro Herrera wrote:
>> Tom Lane wrote:
>>> The argument that autovac workers need fresher stats than anything
>>> else seems pretty dubious to start with. Why shouldn't we simplify
>>> that down to "they use PGSTAT_STAT_INTERVAL like everybody else"?

>> The point of wanting fresher stats than that, eons ago, was to avoid a
>> worker vacuuming a table that some other worker vacuumed more recently
>> than PGSTAT_STAT_INTERVAL. ...
>> Nowadays we can probably disregard the whole issue, since starting a
>> new vacuum just after the prior one finished should not cause much
>> stress to the system thanks to the visibility map.

> Vacuuming is far from free, even if the visibility map says that most
> pages are visible to all: you still scan all indexes, if you remove any
> dead tuples at all.

With typical autovacuum settings, I kinda doubt that there's much value
in reducing the window for this problem from 500ms to 10ms. As Alvaro
says, this was just a partial, kluge solution from the start --- if we're
worried about such duplicate vacuuming, we should undertake a real
solution that closes the window altogether. In any case, timeouts
occurring inside autovacuum are not directly causing the buildfarm
failures, since autovacuum's log entries don't reflect into regression
outputs. (It's possible that autovacuum's tight tolerance is contributing
to the failures by increasing the load on the stats collector, but I'm
not sure I believe that.)

To get back to that original complaint about buildfarm runs failing,
I notice that essentially all of those failures are coming from "wait
timeout" warnings reported by manual VACUUM commands. Now, VACUUM itself
has no need to read the stats files. What's actually causing these
messages is failure to get a timely response in pgstat_vacuum_stat().
So let me propose a drastic solution: let's dike out this bit in vacuum.c:

    /*
     * Send info about dead objects to the statistics collector, unless we are
     * in autovacuum --- autovacuum.c does this for itself.
     */
    if ((vacstmt->options & VACOPT_VACUUM) && !IsAutoVacuumWorkerProcess())
        pgstat_vacuum_stat();

This would have the effect of transferring all responsibility for
dead-stats-entry cleanup to autovacuum. For ordinary users, I think
that'd be just fine. It might be less fine though for people who disable
autovacuum, if there still are any. To analyze the effects on them, let's
break down what pgstat_vacuum_stat() actually does:

1. It causes drops of whole databases from pgstat's tables. I claim we
don't need to do that at this level. There's no issue unless the stats
collector missed the PgStat_MsgDropdb message when the database was
dropped; and even if it did, the negative consequences of having a dead
DB's stats still lying around are pretty minimal since we split up the
stats files. There will be no extra I/O except for one small record in
the global stats file. So I think we can well afford to say that with
autovac off, such missed databases only get cleaned up the next time an
autovac is forced for antiwraparound.

2. Within the current database, it causes drops of pgstat entries for
dropped tables. This would be a tad annoying if you have lots of
transient tables and no or minimal autovacuuming. However, lots of
transient tables can be a pain point anyway, because we don't currently
have any mechanism other than pgstat_vacuum_stat() for cleaning up
per-table stats for dropped tables. It seems like it might be time to do
something about that. We could probably extend the
PgStat_TableXactStatus infrastructure to keep track of whether a table
was created or deleted in the current transaction, and solve the problem
without very much new code.
Or maybe we could move the responsibility for reporting stale entries
into the code that reads the stats files for the stats views.

3. Within the current database, it causes drops of pgstat entries for
dropped functions, if you're tracking per-function execution stats.
Since that's not the default, maybe we could get away with saying that we
don't clean up such dead entries very often unless you're running
autovacuum. I don't really want to propose building the infrastructure
to support dropping such entries retail.

Or, much more simply, we could conclude that it's not that important if
pgstat_vacuum_stat() is unable to get up-to-date stats, and rejigger the
code so that we don't bleat when the file-reading request comes from that
source. This should be a back-patchable fix, whereas #2 above might be a
bit too complicated for that.

Thoughts?

regards, tom lane
Re: [HACKERS] Better way of dealing with pgstat wait timeout during buildfarm runs?
On 12/27/2014 12:16 AM, Alvaro Herrera wrote:
> Tom Lane wrote:
>> The argument that autovac workers need fresher stats than anything else
>> seems pretty dubious to start with. Why shouldn't we simplify that down
>> to "they use PGSTAT_STAT_INTERVAL like everybody else"?
>
> The point of wanting fresher stats than that, eons ago, was to avoid a
> worker vacuuming a table that some other worker vacuumed more recently
> than PGSTAT_STAT_INTERVAL. I realize now that the semantics we really
> want was something like "stats no older than XYZ" where the given value
> is the timestamp at which we start checking; if we get anything newer
> than that it would be okay, but we currently reject it because of lack
> of a more appropriate API. (If it takes more than PGSTAT_STAT_INTERVAL
> to get the stats back, a regular backend would ask for fresher stats,
> but to an autovac worker they would be good enough as long as they are
> newer than its recheck start time.)
>
> Nowadays we can probably disregard the whole issue, since starting a new
> vacuum just after the prior one finished should not cause much stress to
> the system thanks to the visibility map.

Vacuuming is far from free, even if the visibility map says that most
pages are visible to all: you still scan all indexes, if you remove any
dead tuples at all.

- Heikki
Re: [HACKERS] Better way of dealing with pgstat wait timeout during buildfarm runs?
Tom Lane wrote:
> The argument that autovac workers need fresher stats than anything else
> seems pretty dubious to start with. Why shouldn't we simplify that down
> to "they use PGSTAT_STAT_INTERVAL like everybody else"?

The point of wanting fresher stats than that, eons ago, was to avoid a
worker vacuuming a table that some other worker vacuumed more recently
than PGSTAT_STAT_INTERVAL. I realize now that the semantics we really
want was something like "stats no older than XYZ" where the given value
is the timestamp at which we start checking; if we get anything newer
than that it would be okay, but we currently reject it because of lack of
a more appropriate API. (If it takes more than PGSTAT_STAT_INTERVAL to
get the stats back, a regular backend would ask for fresher stats, but to
an autovac worker they would be good enough as long as they are newer
than its recheck start time.)

Nowadays we can probably disregard the whole issue, since starting a new
vacuum just after the prior one finished should not cause much stress to
the system thanks to the visibility map.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] Better way of dealing with pgstat wait timeout during buildfarm runs?
Alvaro Herrera writes:
> Tom Lane wrote:
>> Yeah, I've been getting more annoyed by that too lately. I keep
>> wondering though whether there's an actual bug underneath that behavior
>> that we're failing to see.

> I think the first thing to do is reconsider usage of PGSTAT_RETRY_DELAY
> instead of PGSTAT_STAT_INTERVAL in autovacuum workers. That decreases
> the wait time 50-fold, if I recall this correctly, and causes large
> amounts of extra I/O traffic.

Yeah --- that means that instead of the normal behavior that a stats file
newer than 500 msec is good enough, an autovac worker insists on a stats
file newer than 10 msec. I did some experimentation on prairiedog, and
found that it's not hard at all to see autovac workers demanding multiple
stats writes per second:

2014-12-26 16:26:52.958 EST 21026 LOG: sending inquiry for database 45116
2014-12-26 16:26:53.128 EST 21026 LOG: sending inquiry for database 45116
2014-12-26 16:26:53.188 EST 21026 LOG: sending inquiry for database 45116
2014-12-26 16:26:54.903 EST 21026 LOG: sending inquiry for database 45116
2014-12-26 16:26:55.058 EST 21026 LOG: sending inquiry for database 45116
2014-12-26 16:27:00.022 EST 21026 LOG: sending inquiry for database 45116
2014-12-26 16:27:00.285 EST 21026 LOG: sending inquiry for database 45116
2014-12-26 16:27:00.792 EST 21026 LOG: sending inquiry for database 45116
2014-12-26 16:27:01.010 EST 21026 LOG: sending inquiry for database 45116
2014-12-26 16:27:01.163 EST 21026 LOG: sending inquiry for database 45116
2014-12-26 16:27:01.193 EST 21026 LOG: sending inquiry for database 45116
2014-12-26 16:27:03.595 EST 21026 LOG: sending inquiry for database 45116
2014-12-26 16:27:03.673 EST 21026 LOG: sending inquiry for database 45116
2014-12-26 16:27:03.839 EST 21026 LOG: sending inquiry for database 45116
2014-12-26 16:27:03.878 EST 21026 LOG: sending inquiry for database 45116
2014-12-26 16:27:05.878 EST 21026 LOG: sending inquiry for database 45116
2014-12-26 16:27:06.571 EST 21026 LOG: sending inquiry for database 45116
2014-12-26 16:27:07.001 EST 21026 LOG: sending inquiry for database 45116
2014-12-26 16:27:07.769 EST 21026 LOG: sending inquiry for database 45116
2014-12-26 16:27:07.950 EST 21026 LOG: sending inquiry for database 45116
2014-12-26 16:27:10.256 EST 21026 LOG: sending inquiry for database 45116
2014-12-26 16:27:11.039 EST 21026 LOG: sending inquiry for database 45116
2014-12-26 16:27:11.402 EST 21026 LOG: sending inquiry for database 45116

The argument that autovac workers need fresher stats than anything else
seems pretty dubious to start with. Why shouldn't we simplify that down
to "they use PGSTAT_STAT_INTERVAL like everybody else"?

regards, tom lane
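[Editorial note: the effect of the 50-fold freshness difference (500 ms for PGSTAT_STAT_INTERVAL vs. 10 ms for PGSTAT_RETRY_DELAY) can be illustrated with a toy simulation. This is my own sketch, not PostgreSQL code; it only models "rewrite the stats file when an inquiry finds it older than the freshness window".]

```python
def stats_writes(poll_times_ms, freshness_ms):
    # Simulate a collector that rewrites the stats file whenever an
    # inquiry arrives and the existing file is older than freshness_ms.
    # Returns the number of file writes triggered.
    last_write = None
    writes = 0
    for t in poll_times_ms:
        if last_write is None or t - last_write > freshness_ms:
            last_write = t
            writes += 1
    return writes

# A worker sending an inquiry every 100 ms for 10 seconds.
polls = list(range(0, 10_000, 100))

backend_writes = stats_writes(polls, 500)  # PGSTAT_STAT_INTERVAL-style
autovac_writes = stats_writes(polls, 10)   # PGSTAT_RETRY_DELAY-style
```

With a 10 ms window every single inquiry forces a rewrite (100 writes over these 10 seconds), while the 500 ms window coalesces them to 17 — which is why the tight autovacuum tolerance translates into so much extra stats-file I/O.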
Re: [HACKERS] Better way of dealing with pgstat wait timeout during buildfarm runs?
Tom Lane wrote:
> Andres Freund writes:
>> So I think a better way to deal with that warning would be a good
>> idea. Besides somehow making the mechanism there are two ways to attack
>> this that I can think of, neither of them awe inspiring:

>> 1) Make that WARNING a LOG message instead. Since those don't get sent
>> to the client with default settings...
>> 2) Increase PGSTAT_MAX_WAIT_TIME even further than what 99b545 increased
>> it to.

> Yeah, I've been getting more annoyed by that too lately. I keep
> wondering though whether there's an actual bug underneath that behavior
> that we're failing to see.

I think the first thing to do is reconsider usage of PGSTAT_RETRY_DELAY
instead of PGSTAT_STAT_INTERVAL in autovacuum workers. That decreases
the wait time 50-fold, if I recall this correctly, and causes large
amounts of extra I/O traffic.

> BTW, I notice that in the current state of pgstat.c, all the logic for
> keeping track of request arrival times is dead code, because nothing is
> actually looking at DBWriteRequest.request_time. This makes me think
> that somebody simplified away some logic we maybe should have kept.

I will have a look. I remember being confused about this at some point
when reviewing that patch.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] Better way of dealing with pgstat wait timeout during buildfarm runs?
On Fri, Dec 26, 2014 at 6:28 AM, Tomas Vondra wrote:
> On 25.12.2014 21:14, Andres Freund wrote:
>> On 2014-12-25 14:36:42 -0500, Tom Lane wrote:
>>
>> My guess is that a checkpoint happened at that time. Maybe it'd be a
>> good idea to make pg_regress start postgres with log_checkpoints
>> enabled? My guess is that we'd find horrendous 'sync' times.
>>
>> Michael: Could you perhaps turn log_checkpoints on in the config?
>
> Logging timestamps (using log_line_prefix) would be also helpful.

Done. I have been wondering about those failures as well but didn't get
the time to look around. FYI, hamster is running with an 8GB class 10 SD
card able to do 40M/s in read, a card that I changed 2 weeks ago btw, the
former 4GB card having begun to show I/O errors, RIP to it. So this is
what is triggering the wait timeouts more frequently... But I have no
real explanation why REL9_4_STABLE shows no signs of failures.

For the time being, what about putting pg_stats_tmp into a RAM disk to
relieve the I/O? Whatever we come up with, I imagine that I/O will still
be tight on hamster.

--
Michael
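[Editorial note: the RAM-disk idea can be tried via the stats_temp_directory setting, which at the time pointed the collector's temporary stats files at an arbitrary directory (default pg_stat_tmp). The mount point below is an assumption for illustration; the directory must exist on a tmpfs mount writable by the postgres user.]

```
# postgresql.conf -- place the stats temp files on a RAM-backed filesystem
stats_temp_directory = '/mnt/pgstat_ram'
```

Since the permanent copy is still written to pg_stat at shutdown, this trades only crash-time freshness of the statistics for much cheaper per-interval writes.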
Re: [HACKERS] Better way of dealing with pgstat wait timeout during buildfarm runs?
On 26.12.2014 02:59, Tom Lane wrote:
> Tomas Vondra writes:
>> On 25.12.2014 22:40, Tom Lane wrote:
>>> I think that hamster has basically got a tin can and string for an I/O
>>> subsystem. It's not real clear to me whether there's actually been an
>>> increase in "wait timeout" failures recently; somebody would have to
>>> go through and count them before I'd have much faith in that thesis.

>> That's what I did. On hamster I see this (in the HEAD):

>> 2014-12-25 16:00:07 yes
>> 2014-12-24 16:00:07 yes
>> 2014-12-23 16:00:07 yes
>> 2014-12-22 16:00:07 yes
>> 2014-12-19 16:00:07 yes
>> 2014-12-15 16:00:11 no
>> 2014-10-25 16:00:06 no
>> 2014-10-24 16:00:06 no
>> 2014-10-23 16:00:06 no
>> 2014-10-22 16:00:06 no
>> 2014-10-21 16:00:07 no
>> 2014-10-19 16:00:06 no
>> 2014-09-28 16:00:06 no
>> 2014-09-26 16:00:07 no
>> 2014-08-28 16:00:06 no
>> 2014-08-12 16:00:06 no
>> 2014-08-05 22:04:48 no
>> 2014-07-19 01:53:30 no
>> 2014-07-06 16:00:06 no
>> 2014-07-04 16:00:06 no
>> 2014-06-29 16:00:06 no
>> 2014-05-09 16:00:04 no
>> 2014-05-07 16:00:04 no
>> 2014-05-04 16:00:04 no
>> 2014-04-28 16:00:04 no
>> 2014-04-18 16:00:04 no
>> 2014-04-04 16:00:04 no

>> (where "yes" means "pgstat wait timeout" is in the logs). On
>> chipmunk, the trend is much less convincing (but there's much less
>> failures, and only 3 of them failed because of the "pgstat wait
>> timeout").
>
> mereswine's history is also pretty interesting in this context. That
> series makes it look like the probability of "pgstat wait timeout"
> took a big jump around the beginning of December, especially if you
> make the unproven-but-not-unreasonable assumption that the two
> pg_upgradecheck failures since then were also wait timeout failures.
> That's close enough after commit 88fc71926392115c (Nov 19) to make me
> suspect that that was what put us over the edge: that added a bunch
> more I/O *and* a bunch more statistics demands to this one block of
> parallel tests.

Interesting. But even if this commit tipped us over the edge, it's not
proof that the split patch was perfectly correct.

> But even if we are vastly overstressing the I/O subsystem on these
> boxes, why is it manifesting like this? pgstat never fsyncs the stats
> temp file, so it should not have to wait for physical I/O I'd think.
> Or perhaps the file rename() operations get fsync'd behind the scenes
> by the filesystem?

My guess is that the amount of dirty data in the page cache reaches
dirty_ratio/dirty_bytes, effectively forcing the writes to go directly
to the disks. Those ARM machines have rather low amounts of RAM
(typically 256-512MB), and the default values for dirty_ratio are ~20%
IIRC. So that's ~50-100MB.

Tomas
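[Editorial note: Tomas's back-of-the-envelope numbers can be checked quickly. The 20% figure is the common Linux default for vm.dirty_ratio, as assumed in the thread; the exact default varies by kernel version.]

```python
def dirty_threshold_mb(ram_mb, dirty_ratio_pct=20):
    # Approximate amount of dirty page-cache data (in MB) at which Linux
    # starts throttling writers into synchronous writeback, per the
    # vm.dirty_ratio sysctl (a percentage of available memory).
    return ram_mb * dirty_ratio_pct / 100

low = dirty_threshold_mb(256)   # a 256MB ARM board
high = dirty_threshold_mb(512)  # a 512MB ARM board
```

So on a 256-512MB board roughly 51-102MB of dirty data suffices to push writes straight to the SD card, matching the ~50-100MB estimate above.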
Re: [HACKERS] Better way of dealing with pgstat wait timeout during buildfarm runs?
Tomas Vondra writes:
> On 25.12.2014 22:40, Tom Lane wrote:
>> I think that hamster has basically got a tin can and string for an I/O
>> subsystem. It's not real clear to me whether there's actually been an
>> increase in "wait timeout" failures recently; somebody would have to
>> go through and count them before I'd have much faith in that thesis.

> That's what I did. On hamster I see this (in the HEAD):

> 2014-12-25 16:00:07 yes
> 2014-12-24 16:00:07 yes
> 2014-12-23 16:00:07 yes
> 2014-12-22 16:00:07 yes
> 2014-12-19 16:00:07 yes
> 2014-12-15 16:00:11 no
> 2014-10-25 16:00:06 no
> 2014-10-24 16:00:06 no
> 2014-10-23 16:00:06 no
> 2014-10-22 16:00:06 no
> 2014-10-21 16:00:07 no
> 2014-10-19 16:00:06 no
> 2014-09-28 16:00:06 no
> 2014-09-26 16:00:07 no
> 2014-08-28 16:00:06 no
> 2014-08-12 16:00:06 no
> 2014-08-05 22:04:48 no
> 2014-07-19 01:53:30 no
> 2014-07-06 16:00:06 no
> 2014-07-04 16:00:06 no
> 2014-06-29 16:00:06 no
> 2014-05-09 16:00:04 no
> 2014-05-07 16:00:04 no
> 2014-05-04 16:00:04 no
> 2014-04-28 16:00:04 no
> 2014-04-18 16:00:04 no
> 2014-04-04 16:00:04 no

> (where "yes" means "pgstat wait timeout" is in the logs). On chipmunk,
> the trend is much less convincing (but there's much less failures, and
> only 3 of them failed because of the "pgstat wait timeout").

mereswine's history is also pretty interesting in this context. That
series makes it look like the probability of "pgstat wait timeout" took
a big jump around the beginning of December, especially if you make the
unproven-but-not-unreasonable assumption that the two pg_upgradecheck
failures since then were also wait timeout failures. That's close enough
after commit 88fc71926392115c (Nov 19) to make me suspect that that was
what put us over the edge: that added a bunch more I/O *and* a bunch
more statistics demands to this one block of parallel tests.

But even if we are vastly overstressing the I/O subsystem on these
boxes, why is it manifesting like this? pgstat never fsyncs the stats
temp file, so it should not have to wait for physical I/O I'd think.
Or perhaps the file rename() operations get fsync'd behind the scenes
by the filesystem?

regards, tom lane
Re: [HACKERS] Better way of dealing with pgstat wait timeout during buildfarm runs?
On 25.12.2014 22:40, Tom Lane wrote:
> Tomas Vondra writes:
>> The strange thing is that the split happened ~2 years ago, which is
>> inconsistent with the sudden increase of this kind of issues. So maybe
>> something changed on that particular animal (a failing SD card causing
>> I/O stalls, perhaps)?
>
> I think that hamster has basically got a tin can and string for an I/O
> subsystem. It's not real clear to me whether there's actually been an

Yes. It's called "SD card".

> increase in "wait timeout" failures recently; somebody would have to
> go through and count them before I'd have much faith in that thesis.

That's what I did. On hamster I see this (in the HEAD):

2014-12-25 16:00:07 yes
2014-12-24 16:00:07 yes
2014-12-23 16:00:07 yes
2014-12-22 16:00:07 yes
2014-12-19 16:00:07 yes
2014-12-15 16:00:11 no
2014-10-25 16:00:06 no
2014-10-24 16:00:06 no
2014-10-23 16:00:06 no
2014-10-22 16:00:06 no
2014-10-21 16:00:07 no
2014-10-19 16:00:06 no
2014-09-28 16:00:06 no
2014-09-26 16:00:07 no
2014-08-28 16:00:06 no
2014-08-12 16:00:06 no
2014-08-05 22:04:48 no
2014-07-19 01:53:30 no
2014-07-06 16:00:06 no
2014-07-04 16:00:06 no
2014-06-29 16:00:06 no
2014-05-09 16:00:04 no
2014-05-07 16:00:04 no
2014-05-04 16:00:04 no
2014-04-28 16:00:04 no
2014-04-18 16:00:04 no
2014-04-04 16:00:04 no

(where "yes" means "pgstat wait timeout" is in the logs). On chipmunk, the trend is much less convincing (but there are far fewer failures, and only 3 of them failed because of the "pgstat wait timeout").

However, it's worth mentioning that all the pgstat failures happened at "16:00:07" and most of the older failures are before that. So it may be that it was failing because of something else, and the pgstat timeout could not happen because of that. OTOH, there's a fair number of successful runs.
> However, I notice that at least the last few occurrences on "hamster"
> all seem to have been in this parallel block:
>
> test: brin gin gist spgist privileges security_label collate matview
> lock replica_identity rowsecurity object_address
>
> which as recently as 9.4 contained just these tests:
>
> test: privileges security_label collate matview lock replica_identity
>
> I'm fairly sure that the index-related tests in this batch are I/O
> intensive, and since they were not there at all six months ago, it's
> not hard to believe that this block of tests has far greater I/O
> demands than used to exist. Draw your own conclusions ...

Yes, that might be the culprit here. Would be interesting to know what's happening on the machine while the tests are running to confirm this hypothesis.

regards Tomas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Better way of dealing with pgstat wait timeout during buildfarm runs?
On 25.12.2014 22:16, Tom Lane wrote:
> Tomas Vondra writes:
>> On 25.12.2014 20:36, Tom Lane wrote:
>>> BTW, I notice that in the current state of pgstat.c, all the logic
>>> for keeping track of request arrival times is dead code, because
>>> nothing is actually looking at DBWriteRequest.request_time.
>
>> Really? Which part of the code is dead? I see pgstat_recv_inquiry() is
>> updating the request_time after receiving the inquiry, and
>> pgstat_db_requested() is looking at it when writing the files.
>
> Where is pgstat_db_requested() looking at request_time?

Oh, right. pgstat_db_requested() is not looking at the timestamp - it only checks the OID of the database. But pgstat_recv_inquiry() is checking it, comparing it to cutoff_time etc.

ISTM the main change related to this is that this:

    if (last_statwrite < last_statrequest)
        pgstat_write_statsfile(false);

got replaced by this:

    if (pgstat_write_statsfile_needed())
        pgstat_write_statsfiles(false, false);

where pgstat_write_statsfile_needed() only checks whether the request list is empty (and ignores the request/write timestamps). If I understand that correctly, this would lead to writing the files more often, and we're dealing with a timeout.

>> If we can simplify the code by keeping just OIDs, let's do that. I think
>> the main reason why we haven't done that in 187492b6 was to keep as much
>> of the existing logic (and maybe change it in a separate patch).
>
> The real point here is that I think that commit *already* changed
> the existing logic, because the time-of-receipt used to matter. In
> particular, there used to be a throttle on how often the stats file
> could get written, which seems to have vanished. I seriously doubt
> that that was a good change, especially on
> write-bandwidth-challenged platforms.

Yes - if that change causes the files to be written more frequently, it's not a good change.
But I think the time-of-receipt still matters - pgstat_recv_inquiry logic remained the same, just applied per database. ISTM the only thing that disappeared is the (last_statwrite < last_statrequest) check. I have to think about this a bit more, I haven't seen this code since the split patch. Tomas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Better way of dealing with pgstat wait timeout during buildfarm runs?
Tomas Vondra writes: > The strange thing is that the split happened ~2 years ago, which is > inconsistent with the sudden increase of this kind of issues. So maybe > something changed on that particular animal (a failing SD card causing > I/O stalls, perhaps)? I think that hamster has basically got a tin can and string for an I/O subsystem. It's not real clear to me whether there's actually been an increase in "wait timeout" failures recently; somebody would have to go through and count them before I'd have much faith in that thesis. However, I notice that at least the last few occurrences on "hamster" all seem to have been in this parallel block: test: brin gin gist spgist privileges security_label collate matview lock replica_identity rowsecurity object_address which as recently as 9.4 contained just these tests: test: privileges security_label collate matview lock replica_identity I'm fairly sure that the index-related tests in this batch are I/O intensive, and since they were not there at all six months ago, it's not hard to believe that this block of tests has far greater I/O demands than used to exist. Draw your own conclusions ... regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Better way of dealing with pgstat wait timeout during buildfarm runs?
On 25.12.2014 21:14, Andres Freund wrote:
> On 2014-12-25 14:36:42 -0500, Tom Lane wrote:
>
> My guess is that a checkpoint happened at that time. Maybe it'd be a
> good idea to make pg_regress start postgres with log_checkpoints
> enabled? My guess is that we'd find horrendous 'sync' times.
>
> Michael: Could you perhaps turn log_checkpoints on in the config?

Logging timestamps (using log_line_prefix) would also be helpful.

>> BTW, I notice that in the current state of pgstat.c, all the logic
>> for keeping track of request arrival times is dead code, because
>> nothing is actually looking at DBWriteRequest.request_time. This
>> makes me think that somebody simplified away some logic we maybe
>> should have kept. But if we're going to leave it like this, we
>> could replace the DBWriteRequest data structure with a simple OID
>> list and save a fair amount of code.
>
> That's indeed odd. Seems to have been lost when the statsfile was
> split into multiple files. Alvaro, Tomas?

The goal was to keep the logic as close to the original as possible. IIRC there were "pgstat wait timeout" issues before, and in most cases the conclusion was that it's probably because of overloaded I/O. But maybe there actually was another bug, and it's entirely possible that the split introduced a new one, and that's what we're seeing now.

The strange thing is that the split happened ~2 years ago, which is inconsistent with the sudden increase of this kind of issues. So maybe something changed on that particular animal (a failing SD card causing I/O stalls, perhaps)?

Anyway, I happen to have a spare Raspberry Pi, so I'll try to reproduce and analyze the issue locally. But that won't happen until January.
> I wondered for a second whether the split could be responsible > somehow, but there's reports of that in older backbranches as well: > http://pgbuildfarm.org/cgi-bin/show_log.pl?nm=mereswine&dt=2014-12-23%2019%3A17%3A41 Tomas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
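For reference, the two logging settings suggested above would look roughly like this in postgresql.conf (the prefix format itself is just an example; both GUCs are standard PostgreSQL settings):

```ini
# Log each checkpoint, including write/sync timings,
# so slow 'sync' phases show up in the server log.
log_checkpoints = on

# Prefix every log line with a millisecond-resolution
# timestamp (%m) and the backend PID (%p), so warnings
# can be correlated with checkpoint activity.
log_line_prefix = '%m [%p] '
```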
Re: [HACKERS] Better way of dealing with pgstat wait timeout during buildfarm runs?
Tomas Vondra writes: > On 25.12.2014 20:36, Tom Lane wrote: >> BTW, I notice that in the current state of pgstat.c, all the logic >> for keeping track of request arrival times is dead code, because >> nothing is actually looking at DBWriteRequest.request_time. > Really? Which part of the code is dead? I see pgstat_recv_inquiry() is > updating the request_time after receiving the inquiry, and > pgstat_db_requested() is looking at it when writing the files. Where is pgstat_db_requested() looking at request_time? > If we can simplify the code by keeping just OIDs, let's do that. I think > the main reason why we haven't done that in 187492b6 was to keep as much > of the existing logic (and maybe change it in a separate patch). The real point here is that I think that commit *already* changed the existing logic, because the time-of-receipt used to matter. In particular, there used to be a throttle on how often the stats file could get written, which seems to have vanished. I seriously doubt that that was a good change, especially on write-bandwidth-challenged platforms. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Better way of dealing with pgstat wait timeout during buildfarm runs?
On 25.12.2014 20:36, Tom Lane wrote: > > Yeah, I've been getting more annoyed by that too lately. I keep > wondering though whether there's an actual bug underneath that > behavior that we're failing to see. PGSTAT_MAX_WAIT_TIME is already > 10 seconds; it's hard to credit that increasing it still further > would be "fixing" anything. The other change would also mainly just > sweep the issue under the rug, if there is any issue and it's not > just that we're overloading underpowered buildfarm machines. (Maybe a > better fix would be to reduce MAX_CONNECTIONS for the tests on these > machines?) I agree that increasing the limit further is not a good solution. > BTW, I notice that in the current state of pgstat.c, all the logic > for keeping track of request arrival times is dead code, because > nothing is actually looking at DBWriteRequest.request_time. This > makes me think that somebody simplified away some logic we maybe > should have kept. But if we're going to leave it like this, we could > replace the DBWriteRequest data structure with a simple OID list and > save a fair amount of code. Really? Which part of the code is dead? I see pgstat_recv_inquiry() is updating the request_time after receiving the inquiry, and pgstat_db_requested() is looking at it when writing the files. If we can simplify the code by keeping just OIDs, let's do that. I think the main reason why we haven't done that in 187492b6 was to keep as much of the existing logic (and maybe change it in a separate patch). regards Tomas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Better way of dealing with pgstat wait timeout during buildfarm runs?
On 2014-12-25 14:36:42 -0500, Tom Lane wrote:
> I wonder whether when multiple processes are demanding statsfile updates,
> there's some misbehavior that causes them to suck CPU away from the stats
> collector and/or convince it that it doesn't need to write anything.
> There are odd things in the logs in some of these events. For example in
> today's failure on hamster,
> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=hamster&dt=2014-12-25%2016%3A00%3A07
> there are two client-visible wait-timeout warnings, one each in the
> gist and spgist tests. But in the postmaster log we find these in
> fairly close succession:
>
> [549c38ba.724d:2] WARNING: pgstat wait timeout
> [549c39b1.73e7:10] WARNING: pgstat wait timeout
> [549c38ba.724d:3] WARNING: pgstat wait timeout
>
> Correlating these with other log entries shows that the first and third
> are from the autovacuum launcher while the second is from the gist test
> session. So the spgist failure failed to get logged, and in any case the
> big picture is that we had four timeout warnings occurring in a pretty
> short span of time, in a parallel test set that's not all that demanding
> (12 parallel tests, well below our max). Not sure what to make of that.

My guess is that a checkpoint happened at that time. Maybe it'd be a good idea to make pg_regress start postgres with log_checkpoints enabled? My guess is that we'd find horrendous 'sync' times.

Michael: Could you perhaps turn log_checkpoints on in the config?

> BTW, I notice that in the current state of pgstat.c, all the logic for
> keeping track of request arrival times is dead code, because nothing is
> actually looking at DBWriteRequest.request_time. This makes me think that
> somebody simplified away some logic we maybe should have kept. But if
> we're going to leave it like this, we could replace the DBWriteRequest
> data structure with a simple OID list and save a fair amount of code.

That's indeed odd.
Seems to have been lost when the statsfile was split into multiple files. Alvaro, Tomas? I wondered for a second whether the split could be responsible somehow, but there's reports of that in older backbranches as well: http://pgbuildfarm.org/cgi-bin/show_log.pl?nm=mereswine&dt=2014-12-23%2019%3A17%3A41 Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Better way of dealing with pgstat wait timeout during buildfarm runs?
Andres Freund writes:
> So I think a better way to deal with that warning would be a good
> idea. Besides somehow making the mechanism more robust, there are two
> ways to attack this that I can think of, neither of them awe inspiring:
> 1) Make that WARNING a LOG message instead. Since those don't get sent
> to the client with default settings...
> 2) Increase PGSTAT_MAX_WAIT_TIME even further than what 99b545 increased
> it to.

Yeah, I've been getting more annoyed by that too lately. I keep wondering though whether there's an actual bug underneath that behavior that we're failing to see. PGSTAT_MAX_WAIT_TIME is already 10 seconds; it's hard to credit that increasing it still further would be "fixing" anything. The other change would also mainly just sweep the issue under the rug, if there is any issue and it's not just that we're overloading underpowered buildfarm machines. (Maybe a better fix would be to reduce MAX_CONNECTIONS for the tests on these machines?)

I wonder whether, when multiple processes are demanding statsfile updates, there's some misbehavior that causes them to suck CPU away from the stats collector and/or convince it that it doesn't need to write anything. There are odd things in the logs in some of these events. For example, in today's failure on hamster, http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=hamster&dt=2014-12-25%2016%3A00%3A07 there are two client-visible wait-timeout warnings, one each in the gist and spgist tests. But in the postmaster log we find these in fairly close succession:

[549c38ba.724d:2] WARNING: pgstat wait timeout
[549c39b1.73e7:10] WARNING: pgstat wait timeout
[549c38ba.724d:3] WARNING: pgstat wait timeout

Correlating these with other log entries shows that the first and third are from the autovacuum launcher while the second is from the gist test session.
So the spgist failure failed to get logged, and in any case the big picture is that we had four timeout warnings occurring in a pretty short span of time, in a parallel test set that's not all that demanding (12 parallel tests, well below our max). Not sure what to make of that. BTW, I notice that in the current state of pgstat.c, all the logic for keeping track of request arrival times is dead code, because nothing is actually looking at DBWriteRequest.request_time. This makes me think that somebody simplified away some logic we maybe should have kept. But if we're going to leave it like this, we could replace the DBWriteRequest data structure with a simple OID list and save a fair amount of code. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Better way of dealing with pgstat wait timeout during buildfarm runs?
Hi,

We quite regularly have buildfarm failures that are caused by 'WARNING: pgstat wait timeout' at random points during the build. Especially on some of the ARM buildfarm animals those are really frequent, to the point that it's hard to know the actual state of the buildfarm without looking at several animals. The ARM ones are probably affected more frequently because they'll often run on SD cards and such.

So I think a better way to deal with that warning would be a good idea. Besides somehow making the mechanism more robust, there are two ways to attack this that I can think of, neither of them awe inspiring:

1) Make that WARNING a LOG message instead. Since those don't get sent to the client with default settings...

2) Increase PGSTAT_MAX_WAIT_TIME even further than what 99b545 increased it to.

Does anybody have an actually good idea?

Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers