Re: [HACKERS] max_standby_delay considered harmful
rk. Linus was directly aiming to maximize the number of person-hours thrown at debugging and development, even at the possible cost of instability in the code and user-base burnout if any serious bug proved intractable." I continue to be disappointed at how contributing code to PostgreSQL is far more likely to come with a dose of argument and frustration rather than reward, and this discussion is a perfect example of such. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] testing HS/SR - 1 vs 2 performance
Erik Rijkers wrote: Everything together: the raid is what Areca call 'raid10(1E)'. (to be honest I don't remember what that 1E exactly means - extra flexibility in the number of disks, I think). Standard RAID10 only supports an even number of disks. The 1E variants also allow putting an odd number in. If you're using an even number, like in your case, the results are the same until you start losing drives, at which point the degradation performance pattern changes a bit due to differences in how things are striped. See these for more info: http://bytepile.com/raid_class.php and http://en.wikipedia.org/wiki/Non-standard_RAID_levels#IBM_ServeRAID_1E http://publib.boulder.ibm.com/infocenter/eserver/v1r2/index.jsp?topic=/diricinfo/fqy0_craid1e.html I don't think using RAID1E has anything to do with your results, but it is possible given your test configuration that part of the difference you're seeing relates to where on disk blocks are stored. If you take a hard drive and write two copies of something onto it, the second copy will be a little slower than the first. That's just because how drive speed drops over the surface as you move further along it. There's some examples of what that looks like at http://projects.2ndquadrant.it/sites/default/files/pg-hw-bench-2010.pdf on pages 21-23. Returning to your higher level results, one of the variables you weren't sure how to account for was caching effects on the standby--the possibility that it just didn't have the data cached the same as the master impacting results. The usual way I look for that is by graphing the pgbench TPS over time. In that situation, you can see it shoot upwards over time, very obvious pattern. Example at http://projects.2ndquadrant.it/sites/default/files/pgbench-intro.pdf on pages 36,37. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] max_standby_delay considered harmful
Bruce Momjian wrote: Remember, delaying wal application just delays making the standby a master and makes the slave data appear staler. We can just tell people that the larger their queries are, the larger this delay will be. If they want to control this, they can set 'statement_timeout' already. While a useful defensive component, statement_timeout is a user setting, so it can't provide guaranteed protection against a WAL application denial of service from a long running query. A user that overrides the system setting and kicks off a long query puts you right back into needing a timeout to ensure forward progress of standby replay. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] max_standby_delay considered harmful
Yeb Havinga wrote: Rob Wultsch wrote: I can not imagine setting this value to anything other than a bool and most of the time that bool would be -1. That's funny because when I was reading this thread, I was thinking the exact opposite: having max_standby_delay always set to 0 so I know the standby server is as up-to-date as possible. If you ask one person about this, you'll discover they only consider one behavior here sane, and any other setting is crazy. Ask five people, and you'll likely find someone who believes the complete opposite. Ask ten and carefully work out the trade-offs they're willing to make given the fundamental limitations of replication, and you'll arrive at the range of behaviors available right now, plus some more that haven't been built yet. There are a lot of different types of database applications out there, each with their own reliability and speed requirements to balance. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] max_standby_delay considered harmful
Heikki Linnakangas wrote: Robert Haas wrote: I am not convinced it will be unpredictable. The only caveats that I've seen so far are: - You need to run ntpd. - Queries will get cancelled like crazy if you're not using steaming replication. And also in situations where the master is idle for a while and then starts doing stuff. That's the most significant source of confusion, IMHO, I wouldn't mind the requirement of ntpd so much. I consider it mandatory to include an documentation update here that says "if you set max_standby_delay > 0, and do not run something that regularly generates activity to the master like [example], you will get unnecessary query cancellation on the standby". As well as something like what Josh was suggesting, adding warnings that this is "for advanced users only", to borrow his wording. This is why my name has been on the open items list for a while now--to make sure I follow through on that. I haven't written it yet because there were still changes to the underlying code being made up until moments before beta started, then this discussion started without a break between. There are a clear set of user land things that can be done to make up the deficiencies in the state of the server code, but we won't even get to see how they work out in the field (feedback needed to improve the 9.1 design) if this capability goes away altogether. Is it not clear that there are some people who consider the occasional bit of cancellation OK, because they can correct for at the application layer and they're willing to factor it in to their design if it allows using the otherwise idle HA standby? I'm fine with expanding that section of the documentation too, to make it more obvious that's the only situation this aspect of HS is aimed at and suitable for. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] max_standby_delay considered harmful
Greg Stark wrote: On Thu, May 6, 2010 at 2:36 AM, Tom Lane wrote: One reason I believe this isn't so critical as all that is that it only matters for cases where the operation on the master took an exclusive lock. Uhm, or a vacuum ran. Or a HOT page cleanup occurred, or a btree page split deleted old tuples. Right; because there are so many regularly expected causes for query cancellation, the proposed boolean setup really hurts the ability of a server whose primary goal is high-availability to run queries of any useful duration. For years I've been hearing "my HA standby is idle, how can I put it to use?"; that's the back story of the users I thought everyone knew were the known audience waiting for this feature. If the UI for vacuum_defer_cleanup_age that prevented these things was good, I would agree that the cases where max_standby_delay does something useful are marginal. That's why I tried to get someone working on SR to provide a hook for that purpose months ago. But since the vacuum adjustment we have in completely obtuse xid units, that leaves max_standby_delay as the only tunable here that you can even think about in terms of human time. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] On a somewhat disappointing correspondence
Bruce Momjian wrote: We are not very good at _removing_ functionality/GUCs, and based on the discussion so far, I think there is a very slim chance we would get it right for 9.0, which is why I suggested converting it to a boolean and revisiting this for 9.1. There's some feedback you can only get by exposing a complicated feature to the users and seeing what they make of it. This one hasn't even had a full week to gather beta user reports. Given that it's easy to disable (just limiting the range on what is effectively a 3-way switch to two positions), I don't understand why you're pushing at this point for its removal. You could be encouraging testing instead, which I believe is needed to know exactly what the right thing to do in 9.1 is. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] max_standby_delay considered harmful
Heikki Linnakangas wrote: Let's rip out the concept of a delay altogether, and make it a boolean. If you really want your query to finish, set it to -1 (using the current max_standby_delay nomenclature). If recovery is important to you, set it to 0. So the only user options would be "allow long-running queries to block WAL application forever" and "always cancel queries on conflict?" That would be taking away the behavior I was going to suggest as the default to many customers I work with. I expect a non-trivial subset of people using this feature will set max_standby_delay to is some small number of minutes, similarly to how archive_timeout is sized now. Enough time to get reasonably sized queries executed, not so long as to allow something that might try to run for hours on the standby to increase failover catchup time very much. The way the behavior works is admittedly limited, and certainly some people are going to want to set it to either 0 or -1. But taking it away altogether is going to cripple one category of potential Hot Standby use in the field. Consider this for a second: do you really think that Simon would have waded into this coding mess, or that I would have spent as much energy as I have highlighting issues with its use, if there wasn't demand for it? If it wouldn't hurt the usefulness of PostgreSQL 9.0 significantly to cut it, I'd have suggested that myself two months ago and saved everyone (especially myself) a lot of trouble. If you have the monitoring in place to sensibly monitor the delay between primary and standby, and you want a limit on that, you can put together a script to flip the switch in postgresql.conf if the standby falls too much behind. There's a couple of things you should do in order for max_standby_delay to working as well as it can. Watching clock sync and forcing periodic activity are two of them that always come up. Those are both trivial to script for, and something I wouldn't expect any admin to object to. If you need a script that involves changing a server setting to do something, that translates into "you can't do that" for a typical DBA. The idea of a program regularly changing a server configuration setting on a production system is one you just can't sell. That makes this idea incredibly more difficult to use in the field than any of the workarounds that cope with the known max_standby_delay issues. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] max_standby_delay considered harmful
Josh Berkus wrote: Having built-in replication in PostgreSQL was supposed to give the *majority* of users a *simple* option for 2-server failover, not cater only to the high end. If that's your criteria, 9.0 has already failed that goal. Just the fact that you have to make your own base backup and manage that whole area alone excludes "simple" as a goal anyone can claim 9.0 meets with a straight face, long before you get to the mechanics of how HS handles query cancellation. The new replication oriented features are functional, but neither are close to simple yet. Based on the complication level of replication in other database products, I wouldn't put money on that even being possible. You can make a simpler path the default one, but the minute you want to support more than one use case the complexity involved in setting up replication explodes. Anyway, I have no idea where the idea that recommending time synchronization is a somehow a "high end" requirement, given that every OS I'm aware of makes that trivial nowadays. Slave servers that drift too far away from the master time are going to cause all sorts of problems for user apps too. Any app that gauges how long ago something happened by comparing a database timestamp with now() is going to give misleading results for example, and I know I see those all the time. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] max_standby_delay considered harmful
Tom Lane wrote: 1. The timestamps we are reading from the log might be historical, if we are replaying from archive rather than reading a live SR stream. In the current implementation that means zero grace period for standby queries. Now if your only interest is catching up as fast as possible, that could be a sane behavior, but this is clearly not the only possible interest --- in fact, if that's all you care about, why did you allow standby queries at all? If the standby is not current, you may not want people to execute queries against it. In some situations, returning results against obsolete data is worse than not letting the query execute at all. As I see it, the current max_standby_delay implementation includes the expectation that the results you are getting are no more than max_standby_delay behind the master, presuming that new data is still coming in. If the standby has really fallen further behind than that, there are situations where you don't want it doing anything but catching up until that is no longer the case, and you especially don't want it returning stale query data. The fact that tuning in that direction could mean the standby never actually executes any queries is something you need to monitor for--it suggests the standby isn't powerful/well connected to the master enough to keep up--but that's not necessarily the wrong behavior. Saying "I only want the standby to execute queries if it's not too far behind the master" is the answer to "why did you allow standby queries at all?" when tuning for that use case. 2. There could be clock skew between the master and slave servers. Not the database's problem to worry about. Document that time should be carefully sync'd and move on. I'll add that. 3. There could be significant propagation delay from master to slave, if the WAL stream is being transmitted with pg_standby or some such. Again this results in cutting into the standby queries' grace period, for no defensible reason. Then people should adjust their max_standby_delay upwards to account for that. For high availability purposes, it's vital that the delay number be referenced to the commit records on the master. If lag is eating a portion of that, again it's something people should be monitoring for, but not something we can correct. The whole idea here is that max_standby_delay is an upper bound on how stale the data on the standby can be, and whether or not lag is a component to that doesn't impact how the database is being asked to act. In addition to these fundamental problems there's a fatal implementation problem: the actual comparison is not to the master's current clock reading, but to the latest commit, abort, or checkpoint timestamp read from the WAL. Right; this has been documented for months at http://wiki.postgresql.org/wiki/Hot_Standby_TODO and on the list before that, i.e. "If there's little activity in the master, that can lead to surprising results." The suggested long-term fix has been adding keepalive timestamps into SR, which seems to get reinvented every time somebody plays with this for a bit. The HS documentation improvements I'm working on will suggest that you make sure this doesn't happen, that people have some sort of keepalive WAL-generating activity on the master regularly, if they expect max_standby_delay to work reasonably in the face of an idle master. It's not ideal, but it's straightforward to work around in user space. I'm inclined to think that we should throw away all this logic and just have the slave cancel competing queries if the replay process waits more than max_standby_delay seconds to acquire a lock. This is simple, understandable, and behaves the same whether we're reading live data or not. I don't consider something that allows queries to execute when not playing recent "live" data is necessarily a step forward, from the perspective of implementations preferring high-availability. It's reasonable for some people to request that the last thing a standby that's not current (last thing received) should be doing is answering any queries, when it doesn't have current data and it should be working on catchup instead. Discussion here obviously has wandered past your fundamental objections here and onto implementation trivia, but I didn't think the difference between what you expected and what's actually committed already was properly addressed before doing that. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] testing HS/SR - 1 vs 2 performance
Erik Rijkers wrote: OS: Centos 5.4 2 quadcores: Intel(R) Xeon(R) CPU X5482 @ 3.20GHz Areca 1280ML primary and standby db both on a 12 disk array (sata 7200rpm, Seagat Barracuda ES.2) To fill in from data you already mentioned upthread: 32 GB RAM CentOS release 5.4 (Final), x86_64 Linux, 2.6.18-164.el5 Thanks for the all the reporting you've done here, really helpful. Questions to make sure I'm trying to duplicate the right thing here: Is your disk array all configured as one big RAID10 volume, so essentially a 6-disk stripe with redundancy, or something else? In particular I want know whether the WAL/database/archives are split onto separate volumes or all on one big one when you were testing. Is this is on ext3 with standard mount parameters? Also, can you confirm that every test you ran only had a single pgbench worker thread (-j 1 or not specified)? That looked to be the case from the ones I saw where you posted the whole command used. It would not surprise me to find that the CPU usage profile of a standby is just different enough from the primary that it results in the pgbench program not being scheduled enough time, due to the known Linux issues in that area. Not going to assume that, of course, just one thing I want to check when trying to replicate what you've run into. I didn't see any glaring HS performance issues like you've been reporting on last time I tried performance testing in this area, just a small percentage drop. But I didn't specifically go looking for it either. With your testing rig out of service, we're going to try and replicate that on a system here. My home server is like a scaled down version of yours (single quad-core, 8GB RAM, smaller Areca controller, 5 disks instead of 12) and it's running the same CentOS version. If the problems really universal I should see it here too. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] missing file in git repo
Aidan Van Dyk wrote: HINT: It's all been done and posted to -hackers before too... Along with comparisons on on whte "one-time" conversions fare (parsecvs, cvs2svn/git), etc, as well as long discussion on which keyword you want expanded, and which you don't, etc. bla bla bla... And in some cases, even indexed to make them easier to find again due to links at http://wiki.postgresql.org/wiki/Switching_PostgreSQL_from_CVS_to_Git There's an item on there that talks about Tom's specific requests related to git backbranch workflow, from the last time this came up. Given Andrew's breakthrough with the buildfarm today, and so many developers due to be in the same place here this month, that might be something worth working through the details of live during PGCon. Most of the other steps beyond that are on Peter's plate, which isn't as concerning to me because we at least know for sure he's on-board with pushing toward the conversion. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_migrator to /contrib in a later 9.0 beta
Bruce Momjian wrote: As a summary, let me list the migrations pg_migrator supports: 8.3 -> 8.4 8.4 -> 9.0 8.3 -> 9.0 Surprisingly, it is 8.3 -> 8.4 that has the most restrictions because it doesn't have access to the features we added in Postgres 9.0. Tom is right that the code could be cleaned up if we removed 8.3 -> 8.4, but more importantly the documentation would be clearer. I think it's extremely valuable that either 8.3 or 8.4 can be upgraded to 9.0. But let's face it: in the long run, the number of people who are going to use pg_migrator for a 8.3->8.4 migration, but that's haven't already done so, is a small user base. The feature set improvement in 8.4 had a lot of great stuff, but few that were compelling from a "now I can do something completely impossible before!" standpoint. As was noted recently during the "Native DB replication for PG" discussion over on pgsql-general last week, there are plenty of people happily running a stable 8.3 who just ignore 8.4 altogether for that reason. The replication features in 9.0 are compelling in that way though, and I expect to see plenty of upgrades to that version from both 8.3 and 8.4 installs. If that works fine right now, I would prefer to see that documented as a special case two-versions at once situation that people shouldn't necessarily expect in the future, but certainly valuable to keep going if the maintenance burden isn't so bad. Balancing out development reality with the ideal situation from the perspective of [potential|current] customers that I deal with every day, what I would prefer to see here is: 1) Commit a streamlined pg_migrator that only handles conversions with 9.0 as a target into contrib, and ship it with 9.0. Like Bruce, I had presumed that the discussion about whether that was going to happen would happen in parallel with beta (read: right now), rather than its already being too late to even consider. I think it's completely bizarre from an advocacy standpoint to even consider that you wouldn't ship such a tool with the core database, now that it's been around for long enough to have a positive track record. 2) Deprecate the pg_migrator hosted on pg_foundry as only being recommended for limited 8.3->8.4 upgrades. Essentially stop active development on the version there, and focus on the one in contrib/ instead. People who want an improved 8.3->8.4 tool can always contract with someone to backport fixes needed for their particular use case. I think we're past the point where the community at large (meaning: mainly Bruce right now) should be expected to do that, now that 9.0 is coming out, so long as 8.3 to 9.0 conversions are available too. I can't imagine suggesting to anyone that they upgrade in-place from 8.3 to 8.4 right now. Everybody I talk to who isn't already on 8.4 is delaying upgrades in anticipation of 9.0 later this year or early next. My main issues with this project continuing to be hosted in pgfoundry are: 1) Perceived lack of confidence and/or legitimacy for it as an in-place upgrade solution, which would be a terrible PR move. When people ask about in-place upgrades and I tell them "there's a tool you can download for that", they look at me in terror and ask if I'm serious that it isn't just included in the core code. The improvement between answering that way and saying "yes, the tool for 8.3 and 8.4 is included with the core distribution", from the perspective of selling people on adopting PostgreSQL, cannot be overstated. 2) Anyone who looks at pgfoundry for more than a few minutes walks away covered with the scent of dead projects. One reason for that is that related to how painful it is to develop there. I don't want to reignite a full anti-pgfoundry discussion here. Suffice it to say that there are many of us who just can't bear working with CVS anymore who have just given up on doing anything useful to projects hosted there. Something that's in core (and therefore included in the git conversion already being published) is much easier to work with and submit patches against. I'm already dumping git clones of the PG repo on every system I do serious work on. If each of those were then capable of generating pg_migrator patches I could submit, I would actually do that each time I use the tool for an upgrade and notice how it could be improved. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [COMMITTERS] pgsql: Make CheckRequiredParameterValues() depend upon correct
Aidan Van Dyk wrote: It's all about the path of least *suprise*. My "least suprise" would have been that a similar config I have now with my PITR slaves would pretty much still work, and not suddenly start accepting connections, and even worse, at some later date when I've already verified that it was doing what I expected with the configuration. The first time I setup a system to test HS, when I got to the point where the slave was up and running as a standard warm standby, I expected there to be something else I had to do in order for it to be available for queries. When I fired up psql and I was able to run queries without doing anything extra, I was surprised--but it was that fun, everything just works when I expected it to be harder than that kind of surprise. One of the reasons the version number was bumped up to 9.0 was to put people on warning that they should not assume their old setups would port forward without behavioral changes. The fact that existing warm-standby server users will be surprised to find they can run queries without doing anything special could be considered under that banner. If you feel that's not obvious enough, that could argue for more prominent documentation of that fact, rather than turning it off. The idea that it should be made harder to enable just to protect the expectations current users, and therefore introduce yet another place where PostgreSQL is less friendly to get started with than it could be, is backwards from the perspective of making things as easy as possible for new users. Arguing from a usability standpoint needs to consider both new and existing user requirements, and those are quite opposed to one another in terms of what default makes more sense IMHO. Now, if the argument is from the perspective of "this adds performance/reliability issues that weren't there before", and those go away if the feature is disabled by default, that's a respectable and indisputable reason to do so. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: pg_start_backup and pg_stop_backup Re: [HACKERS] Re: [COMMITTERS] pgsql: Make CheckRequiredParameterValues() depend upon correct
Dimitri Fontaine wrote: IMO the real fun begins when we talk about multi-slaves support and their roles (a failover slave wants the master to wait for it to have applied the WAL before to commit, a reporting slave not so much). So you'd set the Availability level on each slave and wouldn't commit on the master until each slave got what it's configured for, or something like that. Ultimately the commit is stuck waiting for the slowest committing sync operation on the list; it's the bottleneck. Let's presume that the commit waits can be done in parallel, after sending the transaction to every slave. Given that and the situation you describe, having per-node sync levels only turns out to be a useful optimization if the reporting slave commits slower than the failover slave does. The master is going to be stuck waiting for the slowest one of the batch regardless of whether you've optimized them individually. There is a related situation that I think a per-node sync option would be more obviously useful for: local failover slave, remote disaster recovery slave over a WAN, where you accept that a serious disaster taking out a whole data center will lose some transactions. In that situation, you'd probably want fsync for the local slave, while going async for the remote datacenter. If the commits are done in a serial fashion, tuning sync per-node would be much more valuable in many use cases. Regardless, I wouldn't want to burden the first sync rep version with this requirement. Let's wait until the current scope is cleared before trying to move the goalposts for the people working on that. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: pg_start_backup and pg_stop_backup Re: [HACKERS] Re: [COMMITTERS] pgsql: Make CheckRequiredParameterValues() depend upon correct
Aidan Van Dyk wrote: I remember a presentation at pgcon a while ago, it was probaly Fujii (from NTT?) about their log streaming, and at that time, they talked about different "sync" options... It's all outlined at http://wiki.postgresql.org/wiki/Streaming_Replication#Synchronization_capability -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] testing HS/SR - 1 vs 2 performance
Erik Rijkers wrote: This is the same behaviour (i.e. extreme slow standby) that I saw earlier (and which caused the original post, btw). In that earlier instance, the extreme slowness disappeared later, after many hours maybe even days (without bouncing either primary or standby). Any possibility the standby is built with assertions turned out? That's often the cause of this type of difference between pgbench results on two systems, which easy to introduce when everyone is building from source. You should try this on both systems: psql -c "show debug_assertions" just to rule that out. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] shared_buffers documentation
Jim Nasby wrote: I've also seen large shared buffer settings perform poorly outside of IO issues, presumably due to some kind of internal lock contention. I tried running 8.3 with 24G for a while, but dropped it back down to our default of 8G after noticing some performance problems. Unfortunately I don't remember the exact details, let alone having a repeatable test case We got a report for Jignesh at Sun once that he had a benchmark workload where there was a clear performance wall at around 10GB of shared_buffers. At http://blogs.sun.com/jkshah/entry/postgresql_east_2008_talk_best he says: "Shared Bufferpool getting better in 8.2, worth to increase it to 3GB (for 32-bit PostgreSQL) but still not great to increase it more than 10GB (for 64-bit PostgreSQL)" So you running into the same wall around the same amount just fuels the existing idea there's an underlying scalablity issue in there. Nobody with that right hardware has put it under the light of a profiler yet as far as I know. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] shared_buffers documentation
Kevin Grittner wrote: Perhaps, but be aware the current default benchmarked better than a larger setting in bulk loads. http://archives.postgresql.org/pgsql-hackers/2009-06/msg01382.php The apparent reason is that when there were fewer of them the WAL files were re-used before the RAID controller flushed them from BBU cache, causing an overall reduction in disk writes. I have little doubt that *without* a good BBU cached controller a larger setting is better, but it's not universally true that bigger is better on this one After running some tests, I believe what you observed is more universal than that, because I've been able to replicate a performance drop from a checkpoint_segments increase on a system without a BBWC (laptop with write caching turned off) where I really expected it to help. My working theory is that are a broader set of situations where limiting the working set of WAL files to a small number in order to decrease cache disruption applies than just when you've got hardware caching involved. However, I believe the guidelines to increasing this parameter along with shared_buffers still applies. The real case for wins with more segments is when you also have a large buffer cache, because that's where the write savings from postponed database writes to often used blocks becomes easy to measure. I've found it difficult today to demonstrate a slam-dunk bulk loading improvement through checkpoint_segments increase when shared_buffers is fixed at its default of ~32MB. If that keeps up, I might soon have enough data to bust the idea that it alone improves bulk loading performance when you haven't touched anything else in the default config, which was unexpected to me. Will report back once I've got a full handle on it. Thanks for reminding me about this counter example, it slipped by in that broader thread before and I didn't try doing that myself until today, to see that you're onto something there. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] shared_buffers documentation
Robert Haas wrote: Well, why can't they just hang out as dirty buffers in the OS cache, which is also designed to solve this problem? If the OS were guaranteed to be as suitable for this purpose as the approach taken in the database, this might work. But much like the clock sweep approach should outperform a simpler OS caching implementation in many common workloads, there are a couple of spots where making dirty writes the OS's problem can fall down: 1) That presumes that OS write coalescing will solve the problem for you by merging repeat writes, which depending on implementation it might not. 2) On some filesystems, such as ext3, any write with an fsync behind it will flush the whole write cache out and defeat this optimization. Since the spread checkpoint design has some such writes going to the data disk in the middle of the currently processing checkpoing, in those situations that's likely to push the first write of that block to disk before it can be combined with a second. If you'd have kept it in the buffer cache it might survive as long as a full checkpoint cycle longer.. 3) The "timeout" as it were for shared buffers is driven by the distance between checkpoints, typically as long as 5 minutes. The longest a filesystem will hold onto a write is probably less. On Linux it's typically 30 seconds before the OS considers a write important to get out to disk, longest case; if you've already filled a lot of RAM with writes it can be substantially less. I guess the obvious question is whether Windows "doesn't need" more shared memory than that, or whether it "can't effectively use" more memory than that. It's probably can't effectively use. We know for a fact that applications where blocks regularly accumulate high usage counts and have repeat read/writes to them, which includes pgbench, benefit in several easy to measure ways from using larger amounts of database buffer cache. There's just plain old less churn of buffers going in and out of there. The alternate explanation of "Windows is just so much better at read/write caching that you should give it most of the RAM anyway" doesn't really sound as probable as the more commonly proposed theory "Windows doesn't handle large blocks of shared memory well". Note that there's no discussion of the why behind this is in the commit you just did, just the description of what happens. The reasons why are left undefined, which I feel is appropriate given we really don't know for sure. Still waiting for somebody to let loose the Visual Studio profiler and measure what's causing the degradation at larger sizes. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] shared_buffers documentation
Robert Haas wrote: It seems intuitive to me that setting shared_buffers too small will also cause a performance problem, especially for write-heavy workloads, but I'm less sure I can clearly explain why. More text to add: When the server needs to allocate more space for reading or writing blocks, and the next available space available is a block that's been modified but not used recently, that block will be written out to the operating system. With large settings for shared_buffers, that prefers evicting blocks that are used infrequently from the cache. The main downside to tuning in that direction is that all recently modified blocks not already written must be flushed to disk during each checkpoint, which can cause large amounts of disk writes grouped together. But if shared_buffers is set too low instead, and therefore only a portion of the active working set can be kept in the buffer cache at once, that can cause the same block to be written out more frequently than is optimal. And I'm curious why the correct setting is different on Windows than it is on other platforms. Can anyone shed some light on this? No one has ever come up with a good explanation for why this is other than "Windows doesn't seem to like large amounts of shared memory". But we've seen it show up in too many benchmarks to dismiss.Dave and Greg Stark did benchmarks focused on this: http://archives.postgresql.org/pgsql-hackers/2008-12/msg3.php that Magnus concurred with last time I tried to dig for more info about this specific subject. And the last time I remember this caming up it was with someone who suggested 8MB (!) worked best on their Windows system: http://archives.postgresql.org/pgsql-general/2009-12/msg00475.php -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Rogue TODO list created
Bruce Momjian wrote: http://wiki.postgresql.org/wiki/User:Simon Well, unless Simon wants to keep it for some reason, it should be removed, and if kept, renamed. Simon? I already retitled the copy left on the personal page and deleted the one that was causing the confusion. I doubt anyone will accidentally consider official a page labeled "Simon's Work in Progress: Prioritised Todo" that's attached to User:Simon, that nothing links to, and that doesn't show up on the first set of results if you search for "todo" either. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Rogue TODO list created
Bruce Momjian wrote: What is "Prioritised" Todo? It looks like a copy of the TODO list that was created on March 23, 2010, and only you and Simon have modified it: http://wiki.postgresql.org/index.php?title=Prioritised_Todo&action=history Well, the updates I made to that one were strictly an accident; I didn't notice I was editing the forked version. I have put everything I did in that session back onto the right one. The "Prioritised Todo" wasn't linked to anywhere that you'd find it except via a bit of bad late night searching like I did. I'm not sure what Simon was tinkering with there, but having fallen victim to it myself I agree having it there with that name is not a great choice. I moved that bit of work in progress he was doing to http://wiki.postgresql.org/wiki/User:Simon and deleted the one with the confusing name. Sorry about propagating my own confusion to others. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] shared_buffers documentation
Kevin Grittner wrote: I wonder if we should add any hints telling people what they might see as problems if they are too far one way or the other. (Or does that go beyond the scope of what makes sense in TFM?) It's hard to figure that out. One of the talks I'm doing at PGCon next month is focusing on how to monitor things when increasing shared_buffers and the related checkpoint parameters, so that you don't make things worse. It's going to take a solid 45 minutes to cover that, and a section of the manual covering this bit of trivial would be a few pages long and hard to follow. Maybe I'll get that in shape to insert into TFM eventually, but it's a bit bleeding edge to put into there now. Trying to explain it live to other people a couple of times should make it clearer how to describe what I do. As for updating the size recommendations, the text at http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server has been beaten into the status quo by a number of people. Here's what might make sense from there to insert into the docs, removing the bits referring to older versions, rewriting a bit for manual tone, and noting the checkpoint issues: If you have a system with 1GB or more of RAM, a reasonable starting value for shared_buffers on a dedicated database server is 25% of the memory in your system. If you have less RAM, you'll have to account more carefully for how much memory the operating system is taking up, allocating a fraction of the free memory instead. There are some workloads where even larger settings for shared_buffers are effective. But given the way PostgreSQL also relies on the operating system cache, it's unlikely you'll find using more than 40% of RAM to work better than a smaller amount. On Windows, large values for shared_buffers aren't as effective. You may find better results keeping the setting relatively low and using the OS cache more instead. The useful size range for shared_buffers on Windows systems is generally from 64MB to 512MB of RAM. Larger settings for shared_buffers usually require a corresponding increase in checkpoint_segments, in order to spread out writing large quantities of changed or new data in the cache over a longer period of time. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] GSoC - proposal - Materialized Views in PostgreSQL
Josh Berkus wrote: I just worry about any feature which doesn't get as far as a user-visible implementation. If someone doesn't do the rest of the parts soon, such features tend to atrophy because nobody is using them. While they're limited, there are complexly viable prototype quality implementations possible here without a large amount of work to get them started. I'm not worried too much about this feature being unused. As I was just reminded when assembling an page on the wiki about it: http://wiki.postgresql.org/wiki/Materalized_Views it's currently ranked #1--by a large margin--on the UserVoice feature request survey that Peter kicked off. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] GSoC - proposal - Materialized Views in PostgreSQL
Josh Berkus wrote: What would be the use case for (1) by itself? There isn't any use case for just working on the infrastructure, just like there's no use case for "Syntax for partitioning" on its own. That why people rarely work on that part of these problems--it's boring and produces no feature of value on its own. I believe that in both cases, attempts to build the more complicated parts, ones that don't first address some of the core infrastructure first, will continue to produce only prototypes. I don't want to see Materialized Views wander down the same path as partitioning, where lots of people produce "fun parts" patches, while ignoring the grunt work of things like production quality catalog support for the feature. I think Pavel's proposal got that part right by starting with the grammar and executor setup trivia. And Robert's comments about the details in that area it's easy to forget about hit the mark too. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] GSoC - proposal - Materialized Views in PostgreSQL
Josh Berkus wrote: There are basically 2 major parts for materialized views: A) Planner: Getting the query planner to swap in the MatView for part of a query automatically for query plan portions which the MatView supports; B) Maintenance: maintaining the MatView data according to the programmed scheme (synch, asynch, periodic). I'm run more into problems where it's perfectly fine to specify using the materialized view directly in the query, but keeping that view up to date usefully was the real problem. The whole idea of getting a MV used automatically is valuable, but far down the roadmap as I see it. Not everyone would agree of course, and your description does suggest a better way to organize a high-level summary though; here's a first cut: 1) Creation of materalized view Current state: using "CREATE TABLE AS" or similar mechanism, maintain manually Optimal: "CREATE MATERIALIZED VIEW" grammar, metadata to store MV data, dump/reload support 2) Updating materialized views Current state: periodically create new snapshots, or maintain using triggers Optimal: Built-in refresh via multiple strategies, with minimal locking as to improve concurrent access 3) Using materialized views in the planner Current state: specify the manually created MV in queries that can use it Optimal: Automatically accelerate queries that could be satisfied by substituting available MVs With (1) being what I think is the only GSoC sized subset here. I'm not saying someone can't jump right into (3), using the current implementations for (1) and (2) that are floating around out there. I just think it would end up wasting a fair amount of work on prototypes that don't work quite the same way as the eventual fully integrated version. You certainly can start working on (3) without a fully fleshed out implementation of (2), I don't know that it makes sense to work on before (1) though. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] GSoC - proposal - Materialized Views in PostgreSQL
Robert Haas wrote: I also think that you're underestimating the number of problems that will have to be solved to get this done. It's going to take some significant work - both design work and coding work - to figure out how this should integrate into the rest of the system. (What should be the value of pg_class.relkind? Where should the node representation of the snapshot query be stored? And did we handle all of those OID dependencies correctly?) I don't think I'm underestimating all that, but I suspect Pavel is by a considerable amount. This is why I've been suggesting that a GSoC scope here might just be wrestling with this area of the problem for the whole summer--not even getting into updates beyond a completely trivial implementation, if any at all. Things like "handle OID dependencies" are definitely not on the fun side of the development work that people tend to think about in advance. Where I can see this possibly falling down (other than being just too much work for a relative PostgreSQL novice to get it done in one summer) is if there are concerns about it being incompatible with incrementally-updated views. I imagine that we're going to want to eventually support both, so we need to make sure that this implementation doesn't box us into a corner. Exactly my concern; comitting this part without knowing how that's later going to fit into place strikes me the sort of the thing this project doesn't like to do. The alternate approach of starting with the update machinery is less likely IMHO to get stuck wondering if there's a future blind spot coming or not, since you'd be building from the bottom up starting with the hardest parts. From the rest of your comments, I'm comfortable that you're in sync with the not necessarily obvious risky spots here I wanted to raise awareness of. It's unreasonable to expect we'll have exactly the same priorities here, and I doubt it's useful to debate how I perceive the merit of various development subsets here compared to yourself. I don't think it's really important whether anyone agrees with me or not about exactly the value of a full table lock implementation. The main thing I'm concerned about is just that it's noted as a known risky part, one that could end up blocking the project's ability to commit even a subset of the proposed patch here. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] GSoC - proposal - Materialized Views in PostgreSQL
Robert Haas wrote: It's not obvious to me that a brief full-table lock wouldn't be acceptable for an initial implementation. Obviously it wouldn't be suitable for every use case but since we're talking about manually refreshed views that was bound to be true anyway. There already is an initial implementation of sorts. There are a couple of ways you can build these right now, so any new development has to look like it will end with good odds of being an improvement over what's already available before it's worth putting development resources into. As a rough idea of what people want these for in the field, based on what I've seen requests for, imagine that someone has a 1TB table they're materializing a view on in order to get at least a 10:1, and hopefully close to a 100:1, speedup on viewing summary data. Now, picture what happens if you have someone doing a sequential scan on the MV, which is still quite big, the updater process lines up to grab an exclusive lock when it's done, and now a second user wanting to read a single row quickly comes along behind it. Given a full-table lock implementation, that scenario is unlikely to play out with the second user getting a fast response. They'll likely sit in a lock queue for some potentially long period of time instead, waiting for the active seq scan to finish then the update to happen. You have to build it that way or a steady stream of people reading could block out updates forever. To be frank, that makes for a materalized view implementation of little value over what you can currently do as far as I'm concerned. It might be interesting as a prototype, but that's not necessarily going to look like what's needed to do this for real at all. I'm not a big fan of dumping work into projects when you can see exactly how it's going to fail before you even get started. As I see if, if you know where it's going to fall down, you don't need to build a prototype as an exercise to show you how to build it--you should work on that part first instead. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] GSoC - proposal - Materialized Views in PostgreSQL
Heikki Linnakangas wrote: Your proposal basically describes doing 1, in a limited fashion where the view is not updated automatically, but only when the DBA runs a command to refresh it. I'm not sure if that's useful enough on its own, writing "CREATE MATERIALIZED VIEW ... SELECT ..." doesn't seem any easier than just writing "CREATE TABLA AS ...". But if you can do something about 2, or even a very limited part of 1, keeping the view up-to-date automatically, it becomes much more useful. You've hit upon the core issue here. You can build materialized views right now using "CREATE TABLE AS". You can even update them by creating a new table the same way, with a new name, and doing the LOCK/RENAME/DROP shuffle--what people used to do for rewriting bloated tables before there was CLUSTER. The first step in the proposal here is essentially syntax to give an easier UI for that. That's an interesting step, but recognize that it doesn't actually provide anything you can't do already. If you then note that doing any sort of incremental update to the view is a hard problem, and that a lot of the useful cases for materialized views involve tables where it's impractical to recreate the whole thing anyway, you'll inevitably find yourself deeply lost in the minutia of how to handle the updates. It's really the core problem in building what people expect from a materialized view implementation in a serious database. Chipping away at the other pieces around it doesn't move the feature that far forward, even if you get every single one of them except incremental updates finished, because everything else combined is still not that much work in comparison to the issues around updates. There certainly are a fair number of subproblems you can break out of here. I just think it's important to recognize that the path that leads to a useful GSoC project and the one that gives a production quality materialized view implementation may not have that much in common, and to manage expectations on both sides accordingly. If Pavel thinks he's going to end up being able to say "I added materialized views to PostgreSQL" at the end of the summer, that's going to end in disappointment. And if people think this project plan will lead to being able to claim PostgreSQL now has this feature, that's also not going to go well. If the scope is "add initial grammar and rewriting moving toward a future materialized view feature", which the underlying implementation noted as a stub prototype, that might work out OK. This is why I likened it to the work on "Syntax for partitioning", which has a similarly focused subgoal structure. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] GSoC - proposal - Materialized Views in PostgreSQL
a way that will be useful (and by that I mean committable quality) until there's a better way to handle updates than writing a whole new table and grabbing a full relation lock to switch to it. To do a good job just on the first two steps should take at least a whole summer anyway--there's a whole stack of background research needed I haven't seen anyone do yet, and that isn't on your plan yet. There is a precedent for taking this approach. After getting stalled trying to add the entirety of easy partitioning to PostgreSQL, the current scope has been scaled back to just trying to get the syntax and on-disk structure right, then finish off the implementation. See http://wiki.postgresql.org/wiki/Table_partitioning to get an idea how that's been broken into those two major chunks. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] GSOC PostgreSQL partitioning issue
An introduction to the current state of work in progress for adding improved partitioning features to PostgreSQL is documented at http://wiki.postgresql.org/wiki/Table_partitioning If you can find a small, targeted piece of that overall plan that builds on the work already done, and is in the direction of the final goal here, you may be able to make useful progress in a few months time. This area is extremely well explored already. There are 13 mailing list threads you'll need to browse through carefully just to have enough background that you're likely to build something new, rather than just wandering down a path that's already been followed but leads to a dead end. You have picked a PostgreSQL feature that is dramatically more difficult than it appears to be, and I wouldn't expect you'll actually finish even a fraction of your goals in a summer of work. You're at least in plentiful company--most students do the same. As a rule, if you see a feature on our TODO list that looks really useful and fun to work on, it's only still there because people have tried multiple times to build it completely but not managed to do so because it's harder than it appears. This is certainly the case with improving the partitioning support that's built in to the database. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SELECT constant; takes 15x longer on 9.0?
Merlin Moncure wrote: On Tue, Apr 6, 2010 at 3:58 PM, Tom Lane wrote: Greg has the right idea: show debug_assertions. why not the entire set of configure options? Given that the standard way to find those is pg_config, there's a couple of reasons why not to prefer that, on top of those Tom already mentioned: 1) pg_config isn't in the standard PostgreSQL package set in some distributions (it's sometimes in the -devel package), so it may not be available; debug_assertions is always there if you have psql. For my goals, which include benchmarking scripts I often distribute to other people, that matters. 2) It's easy to get pg_config output from your client that doesn't actually match the running server, particularly when developing. That's been the source of more than one of the times I was running a debug build on the server but didn't notice it, and therefore would have produced worthless performance numbers. Given that the main slowdowns from having assertions turned on are server side, whether or not the local client running things like psql have them turned on or not doesn't worry me as much. 3) It's a little easier to check the value of "show" in a script to confirm you're not running a bad build than to parse the output from pg_config. Here's the recipe I use for shell scripts: #!/bin/sh DEBUG=`psql -At -c "show debug_assertions"` if [ "$DEBUG" = "on" ] ; then echo "Debug build - aborting performance test" exit 1 fi Pushing this data into something like version() would solve the first two items above, while making the issue of how to parse the results in a test client even harder, given there's already too much junk in one big string there. You couldn't make the above check much simpler, which makes it hard to justify any alternative approach to grab this data. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us
Re: [HACKERS] SELECT constant; takes 15x longer on 9.0?
David E. Wheeler wrote: By stupidly having configured with --enable-cassert --enable-debug without realizing it. I've just rebuilt without them and run the tests again using the default postgresql.conf and I'm back down to 57s and 46s over two runs. Every performance test I run, regardless of where the binaries come from or how I thought they were built, starts like this: postgres=# show debug_assertions; debug_assertions -- off (1 row) It's a really good habit to get into, or even enforce in your testing script if practical. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Alpha release this week?
Josh Berkus wrote: I started with pgbench performance comparisons: http://wiki.postgresql.org/wiki/Pgbenchtesting I'd already created http://wiki.postgresql.org/wiki/Regression_Testing_with_pgbench for this purpose, and it looks like you started where I ended that, more or less, which is good because you didn't duplicate anything I'd already written. I just recently finished a full exploration of how the multi-threaded pgbench ends up working in practice, and will update/merge those two as part of that once I get the full data published where people can look at it. I've given up on expecting ad-hoc pgbench testing done without an extremely clear methodology to produce a lot of data, it's tough to get useful results out of that without a clear plan to follow for finding useful data points. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] enable_joinremoval
Tom Lane wrote: The problem with this line of thought is that it imagines you can look at worked-out alternative plans. You can't, because the planner doesn't pursue rejected alternatives that far (and you'd not want to wait long enough for it to do so...) Not on any production system, sure. I know plenty of people who would gladly let a rejected plan enumerator run for *a day* on their development box if it let them figure out exactly why the costing on the plan they expected ended up higher than the plan they actually get. While I know you don't run into this, regular people can easily spend a week on one such problem without gaining even that much insight, given the current level of instrumentation and diagnostic tools available. "Read the source" and "ask Tom" are both effective ways to resolve that but have their limits. (Not because of you, of course--my bigger problem are people who just can't share their plans with the lists for privacy or security reasons) -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] enable_joinremoval
Robert Haas wrote: The query planner is a great piece of code but it is not so transparently simple that it doesn't need debugging or instrumentation, and "why did the planner do X" has got to be one of our top ten most-frequently asked questions. Debugging and instrumentation are two slightly different issues. There is a lot more instrumentation needed in the query optimizer before people have better odds of understanding what's going on in this part of the database. Recent features like pg_stat_statements and auto_explain are just the first round of what people really want here. Now that we can get the explain data out in usable formats (XML, JSON, YAML) for a tool to manage them, the thing at the top of my list in this area for 9.1 is to track down the rumored patch that exports information about the rejected plans considered and get that comitted. That always seems what I want to look at for answering the question "why this plan instead of what I was expecting?" Stepping away from that, from the debugging perspective it seems one way to answer the question "is this unexpected behavior being caused by the new join removal code or not?" is to provide a way to toggle it off and see what changes. Much like enable_seqscan, just because we don't ever want people to use it in production doesn't necessarily mean it's a bad idea to expose it. Also, given that this is a complicated feature, I think it's reasonable to ask whether allowing it to be turned off is the right thing just from the pragmatic basis that it provides a, ahem, backup plan in case there's unexpected difficulty with it in the field. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] GSoC Query
gaurav gupta wrote: My idea is to add a functionality of Auto tuning and Auto Indexing/ Reindexing in DB languages. Ah, the classic request to start with the tuning and index wizards. Really fun to work on, always lots of interest in it. Here's the thing: projects in this area attract endless interest. If it were possible to write something useful in a couple of months, we'd have a hundred such programs fighting for attention. So the fact that we actually have zero of them should tell you something about the actual difficultly level of the work. You could spend the whole summer just reading research papers on this topic and maybe catch up to the late 90's by the end. Here's the usual advice I give to students looking to make a useful contribution to any mature development project: the more boring the work sounds, the more likely it is you'll actually do something people can use. It's easy to find people who want to work on fun projects--so easy that they've all been done already. What's left is either much harder than it looks, or kind of dull to do. The idea behind intentionally picking a boring one is that you're more likely to get one that's unfinished for that reason, rather than because it's actually a year or two of work to complete. Or, in the case you're asking about, a decade or three if you were to start from scratch and were really smart. If you started working on this now rather than stopping to follow the research already done you might catch up to http://portal.acm.org/citation.cfm?id=810505 in a couple of months. Similarly using the no. of select hits on a table we can check that if maximum no. of times it is on a non-index field we can index on that field to make select faster. It's impractical to figure out where indexes should go at without simulating what the optimizer would then do with them against a sample set of queries. You can't do anything useful just with basic statistics about the tables. I would recommend http://msdn.microsoft.com/en-us/library/aa226167(SQL.70).aspx as a good, practical introduction to the topic of what it takes to figure out where indexes go at, from someone who came up with a reasonable solution to that problem. You can find a list of the underlying research they cite (and an idea what has been done since then) at http://portal.acm.org/citation.cfm?id=673646 -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Patch for 9.1: initdb -C option
David Christensen wrote: Enclosed is a patch to add a -C option to initdb to allow you to easily append configuration directives to the generated postgresql.conf file for use in programmatic generation. We had a patch not quite make it for 9.0 that switched over the postgresql.conf file to make it easy to scan a whole directory looking for configuration files: http://archives.postgresql.org/message-id/9837222c0910240641p7d75e2a4u2cfa6c1b5e603...@mail.gmail.com The idea there was to eventually reduce the amount of postgresql.conf hacking that initdb and other tools have to do. Your patch would add more code into a path that I'd like to see reduced significantly. That implementation would make something easy enough for your use case too (below untested but show the general idea): $ for cluster in 1 2 3 4 5 6; do initdb -D data$cluster ( cat < data$cluster/conf.d/99clustersetup done This would actually work just fine for what you're doing right now if you used ">> data$cluster/postgresql.conf" for that next to last line there. There would be duplicates, which I'm guessing is what you wanted to avoid with this patch, but the later values set for the parameters added to the end would win and be the active ones. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] last_statrequest is in the future
Tom Lane wrote: Should we redesign the stats signaling logic to work around this, or just hope we can nag kernel people into fixing it? Even if there was something to be done in kernel space, how many years from now would it be before it made this problem go away for the majority of near future 9.0 users? We've been seeing a fairly regular stream of "pgstat wait timeout" reports come in. The one I reported was from recent hardware and a very mainstream Linux setup. I'm not real optimistic that this one can get punted toward the OS and get anything done about it in time to head off problems in the field. This particularly pathologic case with jaguar is great because it's made it possible to nail down how to report the problem. I don't think it's possible to make a strong conclusion about how to resolve this just from that data though. What we probably need is for your additional logging code to catch this again on some systems that are not so obviously broken, to get a better idea what a normal (rather than extreme) manifestation looks like. How much skew is showing up, whether those do in fact correspond with the wait timeouts, that sort of thing. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Ragged latency log data in multi-threaded pgbench
Takahiro Itagaki wrote: Greg Smith wrote: Attached is an updated version that I think is ready to commit. Only changes are docs--I rewrote those to improve the wording some. Thanks for the correction. Applied. By the way: the pgbench.sgml that you committed looks like it passed through a system that added a CR to every line in it. Probably not the way you intended to commit that. So far I've done over 40 hours of pgbench runtime worth of testing (>500 runs) using the patched version without any issues, the code itself continues to act fine. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Getting to beta1
Robert Haas wrote: On Thu, Mar 18, 2010 at 4:28 PM, Marc G. Fournier wrote: On Thu, 18 Mar 2010, Joshua D. Drake wrote: On Thu, 2010-03-18 at 10:18 -0700, Josh Berkus wrote: Or, let's put it another way: I've made my opinion clear in the past that I think that we ought to ship with a minimal postgresql.conf with maybe 15 items in it. +1 +1 ... but, why the 'top 15'? why not just those that are uncommented to start with, and leave those that are commented out as 'in the docs' ... ? +1 to either proposal. If this is turning into a vote: -1 from me for any work on this until http://wiki.postgresql.org/wiki/PostgreSQL_9.0_Open_Items is cleared. It boggles my mind that anyone could have a different prioritization right now. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Getting to beta1
Joshua D. Drake wrote: As usual, the postgresql.conf is entirely too full. We should ship with the top 15. Maybe, but what we should do is ship, and then talk about this again when it's appropriate--earlier in the release cycle. Let me try and cut this one off before it generates a bunch of traffic by summarizing where this is stuck at. We started this release with a good plan for pulling off a major postgresql.conf trimming effort that I still like a lot ( http://wiki.postgresql.org/wiki/PgCon_2009_Developer_Meeting#Auto-Tuning ) The first step was switching over to a directory-based structure that allowed being all things to all people just by selecting which of the files provided you put into there. We really need the things initdb touches to go into a separate file, rather than the bloated sample, in a way that it's easy to manage; if you just drop files into a directory and the server reads them all that's the easiest route. Extending to include the top 15 or whatever other subset people want is easy after that. Now, that didn't go anywhere in this release due to development focus constraints, but I'm willing to take "has what we can advertise as built-in replication" as a disappointing but acceptable substitute in lieu of that. (rolls eyes) I think it will fit nicely into the "9.1 adds the polish" theme already gathering around the replication features being postponed to the next release. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Ragged latency log data in multi-threaded pgbench
Takahiro Itagaki wrote: The log filenames are "pgbench_log.." for each thread, but the first thread (including single-threaded) still uses "pgbench_log." for the name because of compatibility. Attached is an updated version that I think is ready to commit. Only changes are docs--I rewrote those to improve the wording some. The code looked and tested fine to me. I just added support for the new format to pgbench-tools and am back to happily running large batches of tests using it again. I confirmed a few things: -On my CentOS system, the original problem is masked if you have "--enable-thread-safety" on; the multi-threaded output shows up without any broken lines into the single file. As I suspected it's only the multi-process implementation that shows the issue here. Since Tom points out that's luck rather than something that should be relied upon, I don't think that actually changes what to do here, it just explains why this wasn't obvious in earlier testing--normally I have thread safety on nowadays. -Patch corrects the problem. I took a build without thread safety on, demonstrated the issue with its pgbench. Apply the patch, rebuild just pgbench, run again; new multiple log files have no issue. -It's easy to convert existing scripts to utilize the new multiple log format. Right now the current idiom you're forced into using when running pgbench scripts is to track the PID it's run as, then use something like: mv pgbench_log.${PID} pgbench.log To convert to a stable filename for later processing. Now you just use something like this instead: cat pgbench_log.${PID}* > pgbench.log rm -f pgbench_log.${PID}* And that works fine. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c index 0019db4..28a8c84 100644 --- a/contrib/pgbench/pgbench.c +++ b/contrib/pgbench/pgbench.c @@ -131,11 +131,9 @@ int fillfactor = 100; #define ntellers 10 #define naccounts 10 -FILE *LOGFILE = NULL; - bool use_log; /* log transaction latencies to a file */ - -int is_connect; /* establish connection for each transaction */ +bool is_connect; /* establish connection for each transaction */ +int main_pid; /* main process id used in log filename */ char *pghost = ""; char *pgport = ""; @@ -183,6 +181,7 @@ typedef struct */ typedef struct { + int tid; /* thread id */ pthread_t thread; /* thread handle */ CState *state; /* array of CState */ int nstate; /* length of state[] */ @@ -741,7 +740,7 @@ clientDone(CState *st, bool ok) /* return false iff client should be disconnected */ static bool -doCustom(CState *st, instr_time *conn_time) +doCustom(CState *st, instr_time *conn_time, FILE *log_file) { PGresult *res; Command **commands; @@ -778,7 +777,7 @@ top: /* * transaction finished: record the time it took in the log */ - if (use_log && commands[st->state + 1] == NULL) + if (log_file && commands[st->state + 1] == NULL) { instr_time now; instr_time diff; @@ -791,12 +790,12 @@ top: #ifndef WIN32 /* This is more than we really ought to know about instr_time */ - fprintf(LOGFILE, "%d %d %.0f %d %ld %ld\n", + fprintf(log_file, "%d %d %.0f %d %ld %ld\n", st->id, st->cnt, usec, st->use_file, (long) now.tv_sec, (long) now.tv_usec); #else /* On Windows, instr_time doesn't provide a timestamp anyway */ - fprintf(LOGFILE, "%d %d %.0f %d 0 0\n", + fprintf(log_file, "%d %d %.0f %d 0 0\n", st->id, st->cnt, usec, st->use_file); #endif } @@ -857,7 +856,7 @@ top: INSTR_TIME_ACCUM_DIFF(*conn_time, end, start); } - if (use_log && st->state == 0) + if (log_file && st->state == 0) INSTR_TIME_SET_CURRENT(st->txn_begin); if (commands[st->state]->type == SQL_COMMAND) @@ -1833,7 +1832,7 @@ main(int argc, char **argv) } break; case 'C': -is_connect = 1; +is_connect = true; break; case 's': scale_given = true; @@ -1955,6 +1954,12 @@ main(int argc, char **argv) exit(1); } + /* + * save main process id in the global variable because process id will be + * changed after fork. + */ + main_pid = (int) getpid(); + if (nclients > 1) { state = (CState *) realloc(state, sizeof(CState) * nclients); @@ -1980,20 +1985,6 @@ main(int argc, char **argv) } } - if (use_log) - { - char logpath[64]; - - snprintf(logpath, 64, "pgbench_log.%d", (int) getpid()); - LOGFILE = fopen(logpath, "w"); - - if (LOGFILE == NULL) - { - fprintf(stderr, "Couldn't open logfile \"%s\": %s", logpath, strerror(errno
Re: [HACKERS] Ragged latency log data in multi-threaded pgbench
Takahiro Itagaki wrote: 1. Use explicit locks. The lock primitive will be pthread_mutex for multi-threaded implementations or semaphore for multi-threaded ones. 2. Use per-thread log files. File names would be "pgbench_log..". I'm concerned that the locking itself will turn into a new pgbench bottleneck, just as we're clearing the point where it's not for the first time in a while. And that sounds like it has its own potential risks/complexity involved. I could live with per-thread log files. I think my pgbench-tools is the main consumer of these latency logs floating around right now, I just pushed a 9.0 update to handle the multiple workers option today that discovered this). It doesn't make any difference to what I'm doing how many file I have to process. Just a few lines of extra shell code for me to pull the rest into the import. That seems like the simplest solution that's guaranteed to work, just push the problem onto the client side instead where it's easier to deal with. Unless someone feels strongly that these have to be interleaved into one file, based on Andrew's suggestion that this is a hard problem to get right and Tom's suggestion that this might even extend into the proper threaded version too, I think a log file per worker is the easiest way out of this. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Ragged latency log data in multi-threaded pgbench
Just noticed a problem popping up sometimes with the new multi-threaded pgbench. This is on a Linux RPM build (the alpha4 set) compiled with '--disable-thread-safety'. Still trying to nail down whether that's a requirement for this problem to appear or not. I did most of my review of this feature with it turned on, and haven't been seeing this problem on other systems that are thread safe. Not sure yet if that's cause and effect or coincidence yet. Here's a sample invocation that produces ragged output for me on my one system: pgbench -S -T 5 -c 4 -j 4 -l pgbench The log file produced by this (pgbench_log.) is supposed to consist of a series of lines in the following format: client,trans,latency,filenum,sec,usec It looks like the switch between clients running on separate workers can lead to a mix of their respective lines showing up though. Here's a couple of typical samples, with the bad line in the middle of each set: 1 138 178 0 1268665788 607559 1 139 182 0 1268665788 607751 1 1402 0 2491 0 1268665788 586135 2 1 264 0 1268665788 586463 2 2 192 0 1268665788 586665 1 274 160 0 1268665788 632966 1 275 178 0 1268665788 633154 1 276 184 0 126866578 178 0 1268665788 614015 2 141 190 0 1268665788 614252 2 142 169 0 1268665788 614430 2 274 178 0 1268665788 639218 2 275 175 0 1268665788 639402 2 276 169 0 126866578 171 0 1268665788 626933 0 141 185 0 1268665788 627165 0 142 202 0 1268665788 627377 Looks like sometimes a client is only getting part of its line written out before getting stomped on by the next one. I think one of the assumptions being made about how to safely write to this log file may be broken by the multi-process implementation, which is what you get when thread-safety is not available. Since there should only be 6 fields here, I think you can find whether a given log file has this problem or not like this: cat pgbench_log.x | cut -d " " -f 7 | sort | uniq If anything comes out of that, the latency log file has at least one bad line in it. Similarly, this: cat pgbench_log.x | cut -d " " -f 1 | sort | uniq Should only show the client numbers; here there's some first columns with much bigger numbers too. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Getting to beta1
Dimitri Fontaine wrote: Maybe some more admin level tutorial would be great to have too, such as how to find what's locking, how to monitor table and index usage to determine which indexes to drop, which to create, how to monitor (slaves lag, hitratio, transactions, I/U/D activity, you name it). Wow, that's at least one order of magnitude more ambitious than the actual scope of work on the docs that should be getting focused on for beta right now, perhaps two. Regardless, I already have stubs for the first couple of these sitting on the wiki at http://wiki.postgresql.org/wiki/Category:Administration (locks, monitoring). I know I'd rather see work done on those, where we can continue to improve without doc commits and easily make things available for all versions, until that content is good. Maybe then we can talk about merging some of that back into the main docs. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] machine-readable pg_controldata?
Josh Berkus wrote: Oh, I wasn't proposing doing *anything* for 9.0. I wanted to get something on the TODO list for 9.1. As far as I'm concerned, 9.0 is closed to new ideas. We have enough bugs to fix as it is. I didn't get that initially from how you characterized this as "past time" to add. It's at http://wiki.postgresql.org/wiki/Todo#Point-In-Time_Recovery_.28PITR.29 now. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] machine-readable pg_controldata?
Heikki Linnakangas wrote: Then again, if you don't use the copy in shared memory but just open the pg_control file and read it in the UDF, you could implement this as a pgfoundry module that works with older versions too. This is the direction I'd prefer to see this go in a 9.0 context. It's easy enough to build a fully functional version that lives works via the same proposed final UDF interface, just with the extra step of reading the file. Get that working, and you just added a useful module supporting all the way back to 8.2 (I think--not sure if there's been any other changes that would break this) that people would love to have. Once it's done, the UI is solid, all the data is known to be exposed in the right way it turns out people wanted it to be, then do the simple conversion it to grab from shared memory instead and add it as an official 9.1 feature. I'm not feeling any pressure that this is a must-fix item for the 9.0 release freeze--as warts here go, this is a both a small one and one that doesn't have to be fixed in core, so two strikes against it being critical. I would rather have the ability to tweak on this for a few months to get everything right, while being able to expose regular updates outside of core, than to commit "this is the best we've got so far" just under the wire for 9.0 without necessarily enough time to do it well. The few messages that have shown up here already have made left me with the optinion that just getting the requirements and preferred implementation nailed down here is going to take a few rounds of development to work out to everyone's satisfaction. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] machine-readable pg_controldata?
Magnus Hagander wrote: Huh? It's fixed with, you don't need regexps for that. Just split the string apart. Taking options for single fields might have a better usecase, of course :-) I do find it a bit hard to imagine that any program capable of shelling out to call pg_controldata and doing something with the output would hit a major hurdle parsing the format that's already there. Moving toward single fields I could see as being better for some cases, but going all the way to XML/JSON is a step backwards from the plain text format as far as I'm concerned. Anything that can parse one of those complicated formats should be able to trivially chew the existing one already. Seriously: I have bash scripts that parse that thing. Even better would be the ability to get everything which is in pg_controldata currently as part of a system view in a running PostgreSQL; I can get most of them, but certainly not all. +1 for having all the information available from inside the backend, if that's technically possible (which I assume it should be) I revisit this every time I write yet another user-space parser and ask myself why I haven't exposed it in the server yet. The primary answer so far has always been "because you can't execute a query on the standby while it's in recovery", making half the stuff I wanted the data far (e.g. standby lag monitoring like http://www.kennygorman.com/wordpress/?p=249 ) unable to use that interface anyway. Now that Hot Standby cracks that objection, it's worth talking about for a minute. pg_controldata itself just reads the file in directly and dumps the data. There is a copy of it kept around all the time in shared memory though (ControlFile in xlog.c), protected by a LWLock. At a high level you can imagine a new function in xlog.c that acquires that lock, copies the block into a space the backend allocated for saving it, releases the lock, and then returns the whole structure. Then just wrap some number of superuser-only UDFs around it (I'd guess nobody wants regular ones able to hold a lock on ControlFile) and you've exposed the results to user-space. Two questions before I'd volunteer to write that: 1) How do you handle the situation where the pg_controldata is invalid? "Not read in yet" and "CRC is bad" are the two most obvious ones that can happen. Return a null for every field, try and guess (the way pg_resetxlog does), don't return a row of output at all, or throw an error? Each of these has slightly different implications for how admin code that will do something with these values will have to be structured. 2) While it's easy to say "I only want one or two of these values" and expose a whole set of UDFs to grab them individually (perhaps wrapping into a system view via that popular approach), I am concerned that people are going to call any single-value versions provided one at a time and get an inconsistent set. I think the only reasonable interface to this would not return a single field, it would pop out all of them so you got a matching set from the point in time the lock was held. And if that's the case, I'm not sure of the most reasonable UI is. Just return a whole row with a column for each field in the file, and then people can select out just the ones they want? (That's probably the right one) Produce the mess as a series of rows with (name,value) pairs? Put them into an array? Have re-raised these concerns to myself, this is usually the point in this exercise where I go "screw it, I'll just parse pg_controldata again instead" and do that instead. This is happening so much lately that I think Josh's suggestion it's just unworkable to keep going via that model forever has merit though. I find it hard to mark this 9.0 territory though, given the data is not actually difficult to grab--and that trail is already well blazed, nothing new in this version. In short: I'd vote for TODO item and would happily write myself for 9.1 given reasonable agreement on the questions raised above, -1 for doing anything about it right now though. Given both the existence of completely reasonable workarounds and the existence of much more serious blocker problems sitting on the roadmap to release, can't get real excited about this as the thing to worry about right now. Same reason I ignored the idea when Joshua Tolley brought it up last month: http://archives.postgresql.org/message-id/4b69caeb.9513f30a.731a.3...@mx.google.com -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us
Re: [HACKERS] Re: Hot Standby query cancellation and Streaming Replication integration
Bruce Momjian wrote: Right now you can't choose "master bloat", but you can choose the other two. I think that is acceptable for 9.0, assuming the other two don't have the problems that Tom foresees. I was wrong. You can choose "master bloat" with vacuum_defer_cleanup_age, but only crudely because it is measured in xids and the master defers no matter what queries are running on the slave... OK with you finding the situation acceptable, so long as it's an informed decision. From how you're writing about this, I'm comfortable you (and everybody else still involved here) have absorbed the issues enough that we're all talking about the same thing now. Since there are a couple of ugly user-space hacks possible for prioritizing "master bloat", and nobody is stepping up to work on resolving this via my suggestion involving better SR integration, seems to me heated discussion of code changes has come to a resolution of sorts I (and Simon, just checked) can live with. Sounds like we have three action paths here: -Tom already said he was planning a tour through the HS/SR code, I wanted that to happen with him aware of this issue. -Josh will continue doing his testing, also better informed about this particular soft spot. -I'll continue test-case construction for the problems here there are still concerns about (pathologic max_standby_delay and b-tree split issues being the top two on that list), and keep sharing particularly interesting ones here to help everyone else's testing. If it turns out any of those paths leads to a must-fix problem that doesn't have an acceptable solution, at least the idea of this as a "plan B" is both documented and more widely understood then when I started ringing this particular bell. I just updated the Open Items list: http://wiki.postgresql.org/wiki/PostgreSQL_9.0_Open_Items to officially put myself on the hook for the following HS related documentation items that have come up recently, aiming to get them all wrapped up in time before or during early beta: -Update Hot Standby documentation: clearly explain relationships between the 3 major setup trade-offs, "buffer cleanup lock", notes on which queries are killed once max_standby_delay is reached, measuring XID churn on master for setting vacuum_defer_cleanup_age -Clean up archive_command docs related to recent "/bin/true" addition. Given that's where I expect people who run into the pg_stop_backup warning message recently added will end up at, noting its value for escaping from that particular case might be useful too. To finish airing my personal 9.0 TODO list now that I've gone this far, I'm also still working on completing the following patches that initial versions have been submitted of, was close to finishing both before getting side-tracked onto this larger issue: -pgbench > 4000 scale bug fix: http://archives.postgresql.org/message-id/4b621ba3.7090...@2ndquadrant.com -Improving the logging/error reporting/no timestamp issues in pg_standby re-raised recently by Selena: http://archives.postgresql.org/message-id/2b5e566d1001250945oae17be8n6317f827e3bd7...@mail.gmail.com If nobody else claims them as something they're working on before, I suspect I'll then move onto building some of the archiver UI improvements discussed most recently as part of the "pg_stop_backup does not complete" thread, despite Heikki having crushed my dreams of a simple solution to those by pointing out the shared memory memory limitation involved. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us
Re: [HACKERS] building postgres-A4.tex-pdf crushed
Oleg Bartunov wrote: just to inform, that building of postgres-A4.tex-pdf (8.4) crushed on my Ubuntu 9.10 machine. No problem with CVS HEAD. Alvaro and I are both having the opposite problem on Ubuntu 9.04: 8.4 works fine, but CVS HEAD dumps core: http://archives.postgresql.org/message-id/20100215141242.ga2...@alvh.no-ip.org I wonder if someone more familiar with doing git bisects than me might track down the exact commit that swapped the behavior here. Since this looks like an awful openjade bug regardless, not sure what we could do about it; would be nice to know exactly how it happened though. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_stop_backup does not complete
Simon Riggs wrote: On Tue, 2010-03-02 at 13:13 +, Greg Stark wrote: Why do we disallow turning off archive_mode anyways? Because it is needed for safety and nobody has got around to coding the idea of turning it on/off during normal running, which is possible, with appropriate care. It's actually made it pretty high up on the list of desired features for some of the replication projects: http://wiki.postgresql.org/wiki/ClusterFeatures#Start.2Fstop_archiving_at_runtime Since that is one of the easier items on that list to actually knock off (probably an order of magnitude so than the average feature there), it's completely feasible somebody will do so for 9.1. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_stop_backup does not complete
Fujii Masao wrote: We would be easily able to calculate the last archived log file from the existence of archive status files. Right, but you have to actually scan the whole archive directory to figure that out, and I'd rather not see that code get duplicated somewhere else when it's already inside the archive_command logic. If it just shared that info with the rest of the system instead this would be trivial to discover. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_stop_backup does not complete
Fujii Masao wrote: On Fri, Feb 26, 2010 at 2:47 AM, Bruce Momjian wrote: Postgres 9.0 will be the first release to mention /bin/true as a way of turning off archiving in extraordinary circumstances: http://developer.postgresql.org/pgdocs/postgres/runtime-config-wal.html Setting archive_mode to a command that does nothing but return true, e.g. /bin/true, "return true" seems ambiguous for me. How about writing clearly "return a zero exit status" instead? This is a good catch, and I have a work in progress update to that doc section that fixes that wording, as well as rearranging the recent additions a bit. Really that whole "/bin/true" big needs to go after the example. A very brief intro to what "exit status" means on various platforms might be in order too. I'm adjusting all that to read better, once I'm happy with it I'll submit a doc patch in the next week or two with the final result. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us
Re: [HACKERS] Re: Hot Standby query cancellation and Streaming Replication integration
Robert Haas wrote: I just read through the current documentation and it doesn't really seem to explain very much about how HS decides which queries to kill. Can someone try to flesh that out a bit? I believe it just launches on a mass killing spree once things like max_standby_delay expire. This I want to confirm via testing (can simulate with a mix of long and short running pgbench queries) and then intend to update the docs to clarify. It also uses the term "buffer cleanup lock", which doesn't seem to be used anywhere else in the documentation (though it does appear in the source tree, including README.HOT). This loose end was already noted in my last docs update. I wrote an initial description, but Bruce and I decided to leave out until something more thorough could be put together. This is also on my docs cleanup list, will get to it somewhere along the beta timeline. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: Hot Standby query cancellation and Streaming Replication integration
Bruce Momjian wrote: Joachim Wieland wrote: 1) With the current implementation they will see better performance on the master and more aggressive vacuum (!), since they have less long-running queries now on the master and autovacuum can kick in and clean up with less delay than before. On the other hand their queries on the standby might fail and they will start thinking that this HS+SR feature is not as convincing as they thought it was... I assumed they would set max_standby_delay = -1 and be happy. The admin in this situation might be happy until the first time the primary fails and a failover is forced, at which point there is an unbounded amount of recovery data to apply that was stuck waiting behind whatever long-running queries were active. I don't know if you've ever watched what happens to a pre-8.2 cold standby when you start it up with hundreds or thousands of backed up WAL files to process before the server can start, but it's not a fast process. I watched a production 8.1 standby get >4000 files behind once due to an archive_command bug, and it's not something I'd like to ever chew my nails off to again. If your goal was HA and you're trying to bring up the standby, the server is down the whole time that's going on. This is why no admin who prioritizes HA would consider 'max_standby_delay = -1' a reasonable setting, and those are the sort of users Joachim's example was discussing. Only takes one rogue query that runs for a long time to make the standby so far behind it's useless for HA purposes. And you also have to ask yourself "if recovery is halted while waiting for this query to run, how stale is the data on the standby getting?". That's true for any large setting for this parameter, but using -1 for the unlimited setting also gives the maximum possible potential for such staleness. 'max_standby_delay = -1' is really only a reasonable idea if you are absolutely certain all queries are going to be short, which we can't dismiss as an unfounded use case so it has value. I would expect you have to also combine it with a matching reasonable statement_timeout to enforce that expectation to make that situation safer. In any of the "offload batch queries to the failover standby" situations, it's unlikely an unlimited value for this setting will be practical. Perhaps you set max_standby_delay to some number of hours, to match your expected worst-case query run time and reduce the chance of cancellation. Not putting a limit on it at all is a situation no DBA with healthy paranoia is going to be happy with the potential downside of in a HA environment, given that both unbounded staleness and recovery time are then both possible. The potential of a failed long-running query is much less risky than either of those. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us
Re: [HACKERS] Re: Hot Standby query cancellation and Streaming Replication integration
Josh Berkus wrote: However, this leaves aside Greg's point about snapshot age and successive queries; does anyone dispute his analysis? Simon? There's already a note on the Hot Standby TODO about unexpectly bad max_standby_delay behavior being possible on an idle system, with no suggested resolution for it besides better SR integration. The issue Greg Stark has noted is another variation on that theme. It's already on my list of theorized pathological but as yet undemonstrated concerns that Simon and I identified, the one I'm working through creating a test cases to prove/disprove. I'm past "it's possible..." talks at this point though as not to spook anyone unnecessarily, and am only raising things I can show concrete examples of in action. White box testing at some point does require pausing one's investigation of what's in the box and getting on with the actual testing instead. The only real spot where my opinion diverges here that I have yet to find any situation where 'max_standby_delay=-1' makes any sense to me. When I try running my test cases with that setting, the whole system just reacts far too strangely. My first patch here is probably going to be adding more visibility into the situation when queries are blocking replication forever, because I think the times I find myself at "why is the system hung right now?" are when that happens and it's not obvious as an admin what's going on. Also, the idea that a long running query on the standby could cause an unbounded delay in replication is so foreign to my sensibilities that I don't ever include it in the list of useful solutions to the problems I'm worried about. The option is there, not disputing that it makes sense for some people because there seems some demand for it, just can't see how it fits into any of the use-cases I'm concerned about. I haven't said anything about query retry mainly because I can't imagine any way it's possible to build it in time for this release, so whether it's eventually feasible or not doesn't enter into what I'm worried about right now. In any case, I would prioritize that behind work on preventing the most common situations that cause cancellations in the first place, until those are handled so well that retry is the most effective improvement left to consider. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: Hot Standby query cancellation and Streaming Replication integration
Josh Berkus wrote: And I think we can measure bloat in a pgbench test, no? When I get a chance, I'll run one for a couple hours and see the difference that cleanup_age makes. The test case I attached at the start of this thread runs just the UPDATE to the tellers table. Running something similar that focuses just on UPDATEs to the pgbench_accounts table, without the rest of the steps done by the standard test, is the fastest route to bloat. The standard test will do it too, just does a lot of extra stuff too that doesn't impact results (SELECT, INSERT) so it wastes some resources compared to a targeted bloater script. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: Hot Standby query cancellation and Streaming Replication integration
Josh Berkus wrote: Well, we could throw this on the user if we could get them some information on how to calculate that number. For example, some way for them to calculate the number of XIDs per minute via a query, and then set vacuum_defer_cleanup_age appropriately on the master. Sure, it's clunky, but we've already warned people that 9.0 will be clunky and hard to administer. And it's no worse than setting FSM_pages used to be. However, first we need to test that setting vacuum_defer_cleanup_age actually benefits query cancel issues. Proving that setting works as expected is already on my test case grid, seems fine in my limited testing so far. I've started looking into ways to monitor XID churn in a way for setting it better. I'll take care of providing all that in my next test case update. My intent here is to take the ideas outlined in my "Hot Standby Tradeoffs" blog post and turn that into a new documentation section making it more clear where the problem steps are, regardless of what else happens here. And I need some concrete example of XID burn rate measurement to finish that job. The main problem with setting vacuum_defer_cleanup_age high isn't showing it works, it's a pretty simple bit of code. It's when you recognize that it penalizes all cleanup all the time, whether or not the standby is actually executing a long-running query or not, that you note the second level of pain in increasing it. Returning to the idea of "how is this different from a site already in production?", it may very well be the case that a site that sets vacuum_defer_cleanup_age high enough to support off-peak batch reporting cannot tolerate how that will impact vacuums during their peak time of day. The XID export implementation sidesteps that issue by only making the vacuum delay increase when queries that require it are running, turning this back into a standard "what's the best time of day to run my big reports?" issue that people understand how to cope with already. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: Hot Standby query cancellation and Streaming Replication integration
Josh Berkus wrote: First, from the nature of the arguments, we need to eventually have both versions of SR: delay-based and xmin-pub. And it would be fantastic if Greg Smith and Tom Lane could work on xmin-pub to see if we can get it ready as well. As I see it, the main technical obstacle here is that a subset of a feature already on the SR roadmap needs to get built earlier than expected to pull this off. I don't know about Tom, but I have no expectation it's possible for me to get up to speed on that code fast enough to contribute anything there. I expect the thing I'd be most productive at as far as moving the release forward is to continue testing this pair of features looking for rough edges, which is what I have planned for the next month. I'm not even close to finished with generating test cases specifically probing for bad behavior suspected after a look the implementation details--this is just what I came up with in my first week of that. Count me in for more testing, but out for significant development here. It's not what I've got my time allocated for because it's not where I think I'll be most productive. 2) A more usable vacuum_defer_cleanup_age. If it was feasible for a user to configure the master to not vacuum records less than, say, 5 minutes dead, then that would again offer the choice to the user of slightly degraded performance on the master (acceptable) vs. lots of query cancel (unacceptable). I'm going to test Greg's case with vacuum_cleanup_age used fairly liberally to see if this approach has merit. I've been down that road and it leads quickly to the following question: "how can I tell how old in time-based units an xid is?" If there were an easy answer to that question, vacuum_defer_cleanup_age would already be set in time units. It's the obvious UI to want, it's just not obvious how to build it internally. Maybe I missed something, but my guess is that vacuum_defer_cleanup_age is already as good as it's going to get. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: Hot Standby query cancellation and Streaming Replication integration
Joachim Wieland wrote: Instead, I assume that most people who will grab 9.0 and use HS+SR do already have a database with a certain query profile. Now with HS+SR they will try to put the most costly and longest read-only queries to the standby but in the end will run the same number of queries with the same overall complexity. This is a nice summary of the primary use-case I am trying to optimize usability for, because I know for a fact there's a stack of pent-up requests for exactly this form of improvement from existing warm standby users. And your subsequent discussion of how administrators will react in each of the possible configurations here matches my own concerns. I would highly recommend anyone who feels this is not a critical feature to fix carefully read Joachim's message from an advocacy perspective, that's a better user-oriented prediction than mine of exactly how this is going to play out in the field post-release. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot Standby query cancellation and Streaming Replication integration
Greg Stark wrote: On Sun, Feb 28, 2010 at 5:28 AM, Greg Smith wrote: The idea of the workaround is that if you have a single long-running query to execute, and you want to make sure it doesn't get canceled because of a vacuum cleanup, you just have it connect back to the master to keep an open snapshot the whole time. Also, I'm not sure this actually works. When your client makes this additional connection to the master it's connecting at some transaction in the future from the slave's point of view. The master could have already vacuumed away some record which the snapshot the client gets on the slave will have in view. Right, and there was an additional comment in the docs alluding to some sleep time on the master that intends to try and improve thins. If you knew how long archive_timeout was you could try to sleep longer than it to try and increase your odds of avoiding an ugly spot. But there are race conditions galore possible here, particularly if your archiver or standby catchup is backlogged. Still it's a handy practical trick even if it isn't 100% guaranteed to work. But I don't think it provides the basis for something we can bake in. Agreed on both counts, which is why it's in the current docs as a workaround people can consider, but not what I've been advocating as the right way to proceed. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: Hot Standby query cancellation and Streaming Replication integration
Robert Haas wrote: It seems to me that if we're forced to pass the xmin from the slave back to the master, that would be a huge step backward in terms of both scalability and performance, so I really hope it doesn't come to that. Not forced to--have the option of. There are obviously workloads where you wouldn't want this. At the same time, I think there are some pretty common ones people are going to expect HS+SR to work on transparently where this would obviously be the preferred trade-off to make, were it available as one of the options. The test case I put together shows an intentionally pathological but not completely unrealistic example of such a workload. I wish I understood better exactly what you mean by "the notion of synchronizing the WAL stream against slave queries" and why you don't think it will work. Can you elaborate? There's this constant WAL stream coming in from the master to the slave. Each time the slave is about to apply a change from that stream, it considers "will this disrupt one of the queries I'm already executing?". If so, it has to make a decision about what to do; that's where the synchronization problem comes from. The current two options are "delay applying the change", at which point the master and standby will drift out of sync until the query ends and it can catch back up, or "cancel the query". There are tunables for each of these, and they all seem to work fine (albeit without too much testing in the field yet). My concern is that the tunable that tries to implement the other thing you might want to optimize for--"avoid letting the master generate WAL entires that are the most likely ones to conflict"--just isn't very usable in its current form. Tom and I don't see completely eye to eye on this, in that I'm not so sure the current behaviors are "fundamentally wrong and we will never be able to make [them] work". If that's really the case, you may not ever get the scalability/performance results you're hoping for from this release, and really we're all screwed if those are the only approaches available. What I am sure of is that a SR-based xmin passing approach is simpler, easier to explain, more robust for some common workloads, and less likely to give surprised "wow, I didn't think *that* would cancel my standby query" reports from the field than any way you can configure Hot Standby alone right now. And since I never like to bet against Tom's gut feel, having it around as a "plan B" in case he's right about an overwhelming round of bug reports piling up against the max_standby_delay etc. logic doesn't hurt either. I spent a little time today seeing if there was any interesting code I might steal from the early "synchrep" branch at http://git.postgresql.org/gitweb?p=users/fujii/postgres.git;a=summary , but sadly when I tried to rebase that against the master to separate out just the parts unique to it the merge conflicts were overwhelming. I hate getting beaten by merge bitrot even when Git is helping. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot Standby query cancellation and Streaming Replication integration
Bruce Momjian wrote: "The first option is to connect to the primary server and keep a query active for as long as needed to run queries on the standby. This guarantees that a WAL cleanup record is never generated and query conflicts do not occur, as described above. This could be done using contrib/dblink and pg_sleep(), or via other mechanisms." I am unclear how you would easily advance the snapshot as each query completes on the slave. The idea of the workaround is that if you have a single long-running query to execute, and you want to make sure it doesn't get canceled because of a vacuum cleanup, you just have it connect back to the master to keep an open snapshot the whole time. That's basically the same idea that vacuum_defer_cleanup_age implements, except you don't have to calculate a value--you just hold open the snapshot to do it. When that query ended, its snapshot would be removed, and then the master would advance to whatever the next latest one is. Nothing fancier than that. The only similarity is that if you made every query that happened on the standby do that, it would effectively be the same behavior I'm suggesting could be available via the standby->master xmin publication. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: Performance Patches Was: [HACKERS] Lock Wait Statistics (next commitfest)
Mark Kirkwood wrote: While I completely agree that the submitter should be required to supply a test case and their results, so the rest of us can try to reproduce said improvement - rejecting the patch out of hand is a bit harsh I feel - Hey, they may just have forgotten to supply these things! I didn't put any strong wording in the Wiki, I was just mentioning my personal position is far less tolerant of this than the current project policy. What I added was: "If the patch is intended to improve performance, it's a good idea to include some reproducible tests to demonstrate the improvement. If a reviewer cannot duplicate your claimed performance improvement in a short period of time, it's very likely your patch will be bounced. Do not expect that a reviewer is going to find your performance feature so interesting that they will build an entire test suite to prove it works. You should have done that as part of patch validation, and included the necessary framework for testing with the submission." Finding a reviewer for a performance patch and getting them up to speed to evaluate any submitted patch is time intensive, and it really sucks from the perspective of the CF manager and any reviewer who is handed a messy one. The intention was not to cut people off without warning them. The position I would advocate as being a fair one is that if you don't provide a test case for a performance improvement patch, you can't then expect that you'll be assigned a reviewer by the CF manager either until that's corrected. And if time for the CF runs out before you do that, you're automatically moved to "returned with feedback"--specifically, "write us a test case". -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot Standby query cancellation and Streaming Replication integration
Josh Berkus wrote: Hey, take it easy! I read Stark's post as tongue-in-cheek, which I think it was. Yeah, I didn't get that. We've already exchanged mutual off-list apologies for the misunderstanding in both directions, I stopped just short of sending flowers. I did kick off this discussion with noting a clear preference this not wander into any personal finger-pointing. And I am far too displeased with the technical situation here to have much of a sense of humor left about it either. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Lock Wait Statistics (next commitfest)
Mark Kirkwood wrote: I don't mean to be ungrateful about the actual reviews at all - and I did value the feedback received (which I hope was reasonably clear in the various replies I sent). I sense a bit of attacking the messenger in your tone... I thought there was a moderately big difference between the reality of the review you got and how you were characterizing it, and I was just trying to provide some perspective on how bad a true "bit of review" only would have worked. Since I saw you disclaimed that wording with a smiley I know it wasn't intending to be ungrateful, and I didn't intend to shoot the messenger. Apologies if my tone grazed you though. In any case, process feedback noted and assimilated into recommended practice: I just added a section about WIP patches to http://wiki.postgresql.org/wiki/Submitting_a_Patch#Patch_submission While I was in there I also added some more notes on my personal top patch submission peeve, patches whose purpose in life is to improve performance that don't come with associated easy to run test cases, including a sample of that test running on a system that shows the speedup clearly. If I were in charge I just would make it standard project policy to reject any performance patch without those characteristics immediately. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot Standby query cancellation and Streaming Replication integration
Josh Berkus wrote: Now that I think about it, the xmin thing really doesn't seem conceptually difficult. If the slave just opens a 2nd, special query connection back to the master and publishes its oldest xmin there, as far as the master is concerned, it's just another query backend. Could it be that easy? Something just like that is in fact already suggested as a workaround in the Hot Standby manual: "The first option is to connect to the primary server and keep a query active for as long as needed to run queries on the standby. This guarantees that a WAL cleanup record is never generated and query conflicts do not occur, as described above. This could be done using contrib/dblink and pg_sleep(), or via other mechanisms." And the idea of doing it mainly in client land has its attractions. The main reason I wandered toward asking about it in the context of SR is that there's already this open "Standby delay on idle system" issue with Hot Standby, and the suggested resolution for that problem involves publishing keep-alive data with timestamps over SR. While all these problems and potential solutions have been floating around for a long time, as you pointed out, the little flash of insight I had here was that it's possible to bundle these two problems together with a combined keep-alive timestamp+xmin message that goes in both directions. That removes one serious Hot Standby issue altogether, and adds an additional conflict avoidance mechanism for people who want to enable it, all with something that needs to get done sooner or later anyway for sync rep. The part I still don't have good visibility on is how much of the necessary SR infrastructure needed to support this communications channel is already available in some form. I had though the walsender on the master was already receiving messages sometimes from the walreceiver on the standby, but I'm getting the impression from Heikki's comments that this not the case at all yet. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Lock Wait Statistics (next commitfest)
Gokulakannan Somasundaram wrote: Statspack works by the following way -a) it takes a copy of important catalog tables(pg_ tables) which store the information like wait statistics against wait events, i/o statistics cumulative against each SQL_Hash( and SQL_Text), whether a particular plan went for hard parse/ soft parse(because of plan caching) and the status of different in-memory data structures etc. This is actually missing the real work that went into building this feature into Oracle. Before it was possible to build statspack, first they went to the trouble of noting every source of this form of latency--lock waits, I/O statistics and waits, buffer pool waits--and instrumented every single one of them internally. Basically, every time something that might wait for a resource you later wanted to monitor the wait for happens, a start/end timestamp for that wait is noted, and ultimately the difference between them noting how long the event took is stored into the database. That's the data you must have collected at some point in order to get the summary. Meanwhile, PostgreSQL development is resistant to making any changes in this direction under the theory that a) it adds a lot of code complexity and b) constant calls to get the current system time are too expensive on some platforms to do them all the time. Until those two things are sorted out, what the high-level interface to the direction you're suggesting looks like doesn't really matter. DTrace support has managed to clear both of those hurdles due to its optional nature, perceived low overhead, and removing *storing* all the events generated to something that happens outside of the database. I agree with you that something like statspack is the direction PostgreSQL eventually needs to go, but it's going to be an uphill battle the whole time to get it built. The objections will be that it will add too much overhead at the lowest levels, where the data needed to support it is collected at. Once that is cleared, the high-level interface is easy to build. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Lock Wait Statistics (next commitfest)
Mark Kirkwood wrote: Greg Smith wrote: Returned with feedback in October after receiving a lot of review, no updated version submitted since then: https://commitfest.postgresql.org/action/patch_view?id=98 Hmm - I would say a bit of review rather than a lot :-) It looks like you got useful feedback from at least three people, and people were regularly looking at your patch in some form for about three months. That's a lot of review. In many other open-source projects, your first patch would have been rejected after a quick look as unsuitable and that would have been the end of things for you. I feel lucky every time I get a volunteer to spend time reading my work and suggesting how it could be better; your message here doesn't seem to share that perspective. I'd also like to take the opportunity to express a little frustration about the commitfest business - really all I wanted was the patch *reviewed* as WIP - it seemed that in order to do that I needed to enter it into the various commitfests... then I was faced with comments to the effect that it was not ready for commit so should not have been entered into a commifest at all... sigh, a bit of an enthusiasm killer I'm afraid... To lower your frustration level next time, make sure to label the e-mail and the entry on the CommitFest app with the magic abbreviation "WIP" and this shouldn't be so much of an issue. The assumption for patches is that someone submitted them as commit candidates, and therefore they should be reviewed to that standard, unless clearly labeled otherwise. You briefly disclaimed yours as not being in that category in the initial text of your first message, but it was easy to miss that, particularly once it had been >8 months from when that messages showed up and it was still being discussed. If you wanted to pick this back up again, I'd think that a look at what's been happening with the lock_timeout GUC patch would be informative--I'd think that has some overlap with the sort of thing you were trying to do. FYI, I thought your patch was useful, but didn't spent time on it because it's not ambitious enough. I would like to see statistics on a lot more types of waiting than just locks, and keep trying to find time to think about that big problem rather than worrying about the individual pieces of it. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Lock Wait Statistics (next commitfest)
Bruce Momjian wrote: What happened to this patch? Returned with feedback in October after receiving a lot of review, no updated version submitted since then: https://commitfest.postgresql.org/action/patch_view?id=98 -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot Standby query cancellation and Streaming Replication integration
Aidan Van Dyk wrote: Would we (ya, the royal we) be willing to say that if you want the benifit of removing the MVCC overhead of long-running queries you need to run PITR backup/archive recovery, and if you want SR, you get a closed-loop master-follows-save-xmin behaviour? To turn that question around a little, I think it's reasonable to say that closed-loop master-follows-slave-xmin behavior is only practical to consider implementing with SR--and even there, it should be optional rather than required until there's more field experience on the whole thing. Whether it's the default or not could take a bit of debate to sort out too. If you think of it in those terms, the idea that "you need to run PITR backup/archive recovery" to not get that behavior isn't an important distinction anymore. If you run SR with the option enabled you could get it, any other setup and you won't. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot Standby query cancellation and Streaming Replication integration
Greg Stark wrote: But if they move from having a plain old PITR warm standby to having one they can run queries on they might well assume that the big advantage of having the standby to play with is precisely that they can do things there that they have never been able to do on the master previously without causing damage. Just not having the actual query running on the master is such a reduction in damage that I think it's delivering the essence of what people are looking for regardless. That it might be possible in some cases to additionally avoid the overhead that comes along with any long-running query is a nice bonus, and it's great the design allows for that possibility. But if that's only possible with risk, heavy tweaking, and possibly some hacks, I'm not sure that's making the right trade-offs for everyone. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot Standby query cancellation and Streaming Replication integration
Joshua D. Drake wrote: On Sat, 27 Feb 2010 00:43:48 +, Greg Stark wrote: I want my ability to run large batch queries without any performance or reliability impact on the primary server. +1 I can use any number of other technologies for high availability. Remove "must be an instant-on failover at the same time" from the requirements and you don't even need 9.0 to handle that, this has been a straightforward to solve problem since 8.2. It's the combination of HA and queries that make things hard to do. If you just want batch queries on another system without being concerned about HA at the same time, the first option is to just fork the base backup and WAL segment delivery to another server and run queries there. Some simple filesystem snapshot techniques will also suffice to handle it all on the same standby. Stop warm standby recovery, snapshot, trigger the server, run your batch job; once finished, rollback to the snapshot, grab the latest segment files, and resume standby catchup. Even the lame Linux LVM snapshot features can handle that job--one of my coworkers has the whole thing scripted even this is so common. And if you have to go live because there's a failover, you're back to the same "cold standby" situation a large max_standby_delay puts you at, so it's not even very different from what you're going to get in 9.0 if this is your priority mix. The new version is just lowering the operational complexity involved. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot Standby query cancellation and Streaming Replication integration
Greg Stark wrote: Eh? That's not what I meant at all. Actually it's kind of the exact opposite of what I meant. Sorry about that--I think we just hit one of those language usage drift bits of confusion. "Sit in the corner" has a very negative tone to it in US English and I interpreted your message badly as a result. A Google search for images using that phrase will quickly show you what I mean. What I meant was that your description of the "High Availability first and foremost" is only one possible use case. Simon in the past expressed the same single-minded focus on that use case. It's a perfectly valid use case and I would probably agree if we had to choose just one it would be the most important. Sure, there are certainly others, and as much as possible more flexibility here is a good thing. What I was suggesting is that if the only good way to handle long-running queries has no choice but to sacrifice high-availability, which is is the situation if max_standby_delay is the approach you use, then the most obvious users for this feature are not being well served by that situation. I would guess a large portion of the users looking forward to Hot Standby are in the "have an underutilized high-availability standby I'd like to use for offloading long running reports", and if there is no way to serve them well this feature is missing the mark a bit. You really can't do any better without better master/standby integration though, and as pointed out a couple of times here that was considered and just not followed through on yet. I'm increasingly concerned that nothing else will really do though. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot Standby query cancellation and Streaming Replication integration
Heikki Linnakangas wrote: One such landmine is that the keepalives need to flow from client to server while the WAL records are flowing from server to client. We'll have to crack that problem for synchronous replication too, but I think that alone is a big enough problem to make this 9.1 material. This seems to be the real sticking point then, given that the xmin/PGPROC side on the master seems logically straightforward. For some reason I thought the sync rep feature had the reverse message flow already going, and that some other sort of limitation just made it impractical to merge into the main codebase this early. My hope was that just this particular part could get cherry-picked out of there, and that it might even have been thought about already in that context given the known HS keepalive "serious issue". If there was a solution or partial solution in progress to that floating around, my thought was that just piggybacking this extra xid info on top of it would be easy enough. If there's not already a standby to primary communications backchannel implementation available that can be harvested from that work, your suggestion that this may not be feasible at all for 9.0 seems like a more serious concern than I had thought it was going to be. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot Standby query cancellation and Streaming Replication integration
Bruce Momjian wrote: Well, I think the choice is either you delay vacuum on the master for 8 hours or pile up 8 hours of WAL files on the slave, and delay application, and make recovery much slower. It is not clear to me which option a user would prefer because the bloat on the master might be permanent. But if you're running the 8 hour report on the master right now, aren't you already exposed to a similar pile of bloat issues while it's going? If I have the choice between "sometimes queries will get canceled" vs. "sometimes the master will experience the same long-running transaction bloat issues as in earlier versions even if the query runs on the standby", I feel like leaning toward the latter at least leads to a problem people are used to. This falls into the principle of least astonishment category to me. Testing the final design for how transactions get canceled here led me to some really unexpected situations, and the downside for a mistake is "your query is lost". Had I instead discovered that sometimes long-running transactions on the standby can ripple back to cause a maintenance slowdown on the master, that's not great. But it would not have been so surprising, and it won't result in lost query results. I think people will expect that their queries cancel because of things like DDL changes. And the existing knobs allow inserting some slack for things like locks taking a little bit of time to acquire sometimes. What I don't think people will see coming is that a routine update on an unrelated table is going to kill a query they might have been waiting hours for the result of, just because that update crossed an autovacuum threshold for the other table and introduced a dead row cleanup. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot Standby query cancellation and Streaming Replication integration
Greg Stark wrote: Well you can go sit in the same corner as Simon with your high availability servers. I want my ability to run large batch queries without any performance or reliability impact on the primary server. Thank you for combining a small personal attack with a selfish commentary about how yours is the only valid viewpoint. Saves me a lot of trouble replying to your messages, can just ignore them instead if this is how you're going to act. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot Standby query cancellation and Streaming Replication integration
Bruce Momjian wrote: 5 Early cleanup of data still visible to the current query's snapshot #5 could be handled by using vacuum_defer_cleanup_age on the master. Why is vacuum_defer_cleanup_age not listed in postgresql.conf? I noticed that myself and fired off a corrective patch to Simon yesterday, he said it was intentional but not sure why that is yet. We'll sort that out. You are correct that my suggestion is targeting primarily #5 on this list. There are two problems with the possible solutions using that parameter though: -vacuum_defer_cleanup_age is set in a unit that people cannot be expected to work in--transactions ids. The UI is essentially useless, and there's no obvious way how to make a better one. The best you can do will still be really fragile. -If you increase vacuum_defer_cleanup_age, it's active all the time. You're basically making every single transaction that could be cleaned up pay for the fact that a query *might* be running on the standby it needs to avoid. You can think of the idea of passing an xmin back from the standby as being like an auto-tuning vacuum_defer_cleanup_age. It's 0 when no standby queries are running, but grows in size to match longer ones. And you don't have to have to know anything to set it correctly; just toggle on the proposed "feedback xid from the standby" feature and you're safe. Expecting that anyone will ever set vacuum_defer_cleanup_age correctly in the field in its current form is pretty unreasonable I think. Since there's no timestamp-based memory of past xid activity, it's difficult to convert it to that form instead, and I think something in terms of time is what people would like to set this in. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot Standby query cancellation and Streaming Replication integration
Bruce Momjian wrote: Doesn't the system already adjust the delay based on the length of slave transactions, e.g. max_standby_delay. It seems there is no need for a user switch --- just max_standby_delay really high. The first issue is that you're basically saying "I don't care about high availability anymore" when you increase max_standby_delay to a high value. Want to offload an 8 hour long batch report every day to the standby? You can do it with max_standby_delay=8 hours. But the day your master crashes 7 hours into that, you're in for a long wait before your standby is available while it replays all the queued up segments. Your 'hot standby' has actually turned into the old form of 'cold standby' just when you need it to be responsive. This is also the reason why the whole "pause recovery" idea is a fruitless path to wander down. The whole point of this feature is that people have a secondary server available for high-availability, *first and foremost*, but they'd like it to do something more interesting that leave it idle all the time. The idea that you can hold off on applying standby updates for long enough to run seriously long reports is completely at odds with the idea of high-availability. The second major problem is that the day the report actually takes 8.1 hours instead, because somebody else ran another report that slowed you down a little, you're screwed if that's something you depend on being available--it just got canceled only *after* wasting 8 hours of reporting resource time. max_standby_delay is IMHO only useful for allowing non-real-time web-app style uses of HS (think "Facebook status updates"), where you say "I'm OK giving people slightly out of date info sometimes if it lets me split the query load over two systems". Set max_standby_delay to a few seconds or maybe a minute, enough time to service a typical user query, make your app tolerate the occasional failed query and only send big ones to the master, and you've just scaled up all the small ones. Distributed queries with "eventual consistency" on all nodes is where many of the web app designs are going, and this feature is a reasonable match for that use case. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: Hot Standby query cancellation and Streaming Replication integration
Tom Lane wrote: I don't see a "substantial additional burden" there. What I would imagine is needed is that the slave transmits a single number back --- its current oldest xmin --- and the walsender process publishes that number as its transaction xmin in its PGPROC entry on the master. That is exactly the core idea I was trying to suggest in my rambling message. Just that small additional bit of information transmitted and published to the master via that route, and it's possible to optimize this problem in a way not available now. And it's a way that I believe will feel more natural to some users who may not be well served by any of the existing tuning possibilities. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] A thought: should we run pgindent now?
Bruce Momjian wrote: The diff is here: http://momjian.us/tmp/pgindent.diff and I checked into CVS a copy of the typedef list I used from Andrew Dunstan. Is it worthwhile to consider writing up a "how to run pgindent like Bruce does" page on the wiki? I've been scared off of that topic before because it always seemed like it was perilous and there were missing pieces. I think we'd get some uptake from contributors doing this themselves before submitting patches if it were made easy enough for them to do. I know I'd do it just to get rid of the accidental tab/space errors that always manage to sneak into non-trivial work, if I could automate the whole thing as a pass to do just before generating a final diff to submit. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Hot Standby query cancellation and Streaming Replication integration
rious issue you can run into (that itself would be great to eliminate): "Requires keep-alives with timestamps to be added to sync rep feature" If those keep-alives flowed in both directions, and included both timestamps *and* xid visibility information, the master could easily be configured to hold open xid snapshots needed for long running queries on the standby when that was necessary. I might be missing an implementation detail here, but from a high level it seems like you could make the walreceiver on the master publish the information about where the standby has advanced to as a bit of ProcArray xmin data. Then the master could only advance past where the standby says it cannot need visibility behind anymore. This is a much more elegant solution than any of the hacks available so far. It would turn Hot Standby + Streaming Replication into a system that stepped out of the way of the worst of the technical limitations of HS alone. The master would easily and automatically avoid advancing past where the queries running on the standby needed visibility back to, essentially the same way cleanup is blocked during a long-running query on the primary--except with the actual main query work offloaded to the standby, the idea all along. I don't know how difficult the keepalive feature was expected to be, and there's certainly plenty of potential landmines in this whole xid export idea. How to handle situations where the standby goes away for a while, such as a network outage, so that it doesn't block the master from ever cleaning up dead tuples is a concern. I wouldn't expect that to be too serious of a blocker, given that if the standby isn't talking it probably isn't running queries you need to worry about canceling either. Not sure where else this can fall down, and unfortunately I don't know nearly enough about the SR code to help myself with implementing this feature. (I think Simon is in a similar position--it's just not what we've been staring at the last few months). But I do know that the current Hot Standby implementation is going to be frustrating to configure correctly for people. If it's possible to make most of that go away just by doing some final integration between it and Streaming Replication that just wasn't practical to accomplish until now, I think it's worth considering how to make that happen before the final 9.0 release. I really hope this discussion can say focused on if and how it's possible to improve this area, with the goal being to deliver a product everyone can be proud of with the full feature set that makes this next release a killer one. The features that have managed to all get into this release already are fantastic, everyone who contributed should be proud of that progress, and it's encouraging that the alpha4 date was nailed. It would be easy to descend into finger-pointing for why exactly this particular problem is only getting more visibility now, or into schedule-oriented commentary suggesting it must be ignored because it's too late to do anything about it. I hope everyone appreciates wandering that way will not help make PostgreSQL 9.0 a better release. This issue is so easy to encounter, and looks so bad when it happens, that I feel it could easily lead to an embarrassing situation for the community if something isn't done about it before release. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us hs-demo.tar.gz Description: GNU Zip compressed data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_stop_backup does not complete
Greg Stark wrote: In an ideal world it would be best if pg_stop_backup could actually print the error status of the archiving command. Is there any way for it to get ahold of the fact that the archiving is failing? This is in the area I mentioned I'd proposed a patch to improve not too long ago. The archiver doesn't tell anyone anything about what it's doing right now, or even save its state information. I made a proposal for making the bit it's currently working on (or just finished, or both) visible not too long ago: http://archives.postgresql.org/message-id/4b4fea18.5080...@2ndquadrant.com The main content for that was tracking disk space, which wandered into a separate discussion, but it would be easy enough to use the information that intends to export ("what archive file is currently being processed?") and print that in the error message too. Makes it easy enough for people to infer the command is failing if the same segment number shows up every time in that message. I didn't finish that only because the CF kicked off and I switched out of new development to review. Since this class of error keeps popping up, I could easily finish that patch off by next week and see if it helps here. I thought it was a long overdue bit of monitoring to add to the database anyway, just never had the time to work on it before. And do we have closure on whether a "fast" shutdown is hanging? Or was that actually a smart shutdown? When I tested this myself, a smart shutdown hung every time, while a fast one blew right through the problem--matching what's described in the manual. Josh suggested at one point he might have seen a situation where fast shutdown wasn't sufficient to work around this and an immediate one was required. Certainly possible that happened for an as yet unknown reason--I've seen plenty of situations where fast shutdown didn't work--but I haven't been able to replicate it. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_stop_backup does not complete
Tom Lane wrote: The one thing I'm undecided about is whether we want the immediate NOTICE, as opposed to dialing down the time till the first WARNING to something like 5 or 10 seconds. I think the main argument for the latter approach would be to avoid log-spam in normal operation I though about that for a minute, but didn't think pg_stop_backup is a common enough operation that anyone will complain that it's a little more verbose in its logging now. I know when I was new to this, I used to wonder just what it was busy doing just after executing this command when it hung there for a while sometimes, and would have welcomed this extra bit of detail--preferably immediately, not even after a 5 or 10 second delay. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_stop_backup does not complete
Tom Lane wrote: The value of the HINT I think would be to make them (a) not afraid to hit control-C and (b) aware of the fact that their archiver has got a problem. Agreed on both points. Patch attached that implements something similar to Josh's wording, tweaking the original warning too. Here's what it looks like when you run into the bad situation (which I easily simulated with "archive_command='/bin/false'") from the client's perspective: gsm...@meddle:~/pgwork/src/master/src$ psql -c "select pg_start_backup('test')" pg_start_backup - 0/520 (1 row) gsm...@meddle:~/pgwork/src/master/src$ psql psql (9.0devel) Type "help" for help. gsmith=# select pg_stop_backup(); NOTICE: pg_stop_backup cleanup done, waiting for required segments to archive WARNING: pg_stop_backup still waiting for all required segments to archive (60 seconds elapsed) HINT: Confirm your archive_command is executing successfully. pg_stop_backup can be aborted safely, but the resulting backup will not be usable. ^CCancel request sent ERROR: canceling statement due to user request And this is the sort of thing that shows up in the logs with default logging behavior while all this is happening; you don't see the NOTICE, but the WARNING and HINT are both there which I think is good: LOG: archive command failed with exit code 1 DETAIL: The failed archive command was: /bin/false WARNING: transaction log file "0001" could not be archived: too many failures WARNING: pg_stop_backup still waiting for all required segments to archive (60 seconds elapsed) HINT: Confirm your archive_command is executing successfully. pg_stop_backup can be aborted safely, but the resulting backup will not be usable. Does this solve the logging side of this? You can still make a case for a more forceful pg_stop_backup, this seems to at least remove much of the mystery and frustration from the whole exercise. This patch plus a little documentation suggesting how to recover from this issue might be enough. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index ca088b0..c09ede9 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -8125,6 +8125,9 @@ pg_stop_backup(PG_FUNCTION_ARGS) BackupHistoryFileName(histfilename, ThisTimeLineID, _logId, _logSeg, startpoint.xrecoff % XLogSegSize); + ereport(NOTICE, + (errmsg("pg_stop_backup cleanup done, waiting for required segments to archive"))); + seconds_before_warning = 60; waits = 0; @@ -8139,8 +8142,10 @@ pg_stop_backup(PG_FUNCTION_ARGS) { seconds_before_warning *= 2; /* This wraps in >10 years... */ ereport(WARNING, - (errmsg("pg_stop_backup still waiting for archive to complete (%d seconds elapsed)", - waits))); + (errmsg("pg_stop_backup still waiting for all required segments to archive (%d seconds elapsed)", + waits), + errhint("Confirm your archive_command is executing successfully. " + "pg_stop_backup can be aborted safely, but the resulting backup will not be usable."))); } } -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_stop_backup does not complete
Josh Berkus wrote: Thing is, if archive_command is failing, then the backup is useless regardless until it's fixed. And sending the archives to /dev/null (the fix you're essentially recommending above) doesn't make the backup any more useful. That's not what I said to do first. If it's possible to fix your archive_command, and it never returned bad "I'm saying success but I didn't really do the right thing" information to the server--it just failed--this situation is completely recoverable with no damage to the backup. Just fix the archive_command, reload the configuration, and the queue of archived files will flow and eventually your consistent backup completes. This it the only behavior someone who is trying to recover from a mistake in production is likely to find acceptable, and as Simon has pointed out that is what the current situation is optimized for. Only in the situation where the archive_command was so bad that it returned the wrong data to the server--saying the segment was saved but it really wasn't--did I suggest that you might as well change archive_command to go nowhere. Because in that case, your backup is already screwed, you lost an essential piece of it. As far your comment about treating this like it's a problem specific to you, did you miss the part where I pointed out I was just expressing concerns about poor visiblity into this area ("what is the archiver doing?") recently? I'm well aware this path is full of difficult to escape from holes. We just need to be careful not do something that screws over production users in the name of reducing the learning curve. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_stop_backup does not complete
Josh Berkus wrote: pg_stop_backup() doesn't complete until all the WAL segments needed to restore from the backup are archived. If archive_command is failing, that never happens. OK, so we need a way out of that cycle if the user is issuing pg_stop_backup because they *already know* that archive_command is failing. Right now, there's no way out other than a fast shutdown, which is a bit user-hostile. gsmith=# select name,context from pg_settings where name like 'archive%'; name | context -+ archive_command | sighup archive_mode| postmaster archive_timeout | sighup I expect for your particular bad situation, you can replace the archive_command with a corrected one, use "pg_ctl reload" to send a SIGHUP to make that fix active, and escape from this. That's the only right way out of this situation. You can't just abort a backup someone has asked for just because archives are failing and allow the server to shutdown cleanly in this situation. That's the wrong thing to do for production setups; the last thing you want for a system with archiving issues is to be stopped normally if it's interfering with an explicit admin requested backup. Not necessarily any reason that backup even needs to fail, and no reason for the server to get restarted in this situation at all. If the archive_command never returned false information, and in fact just returned a valid error code, all of the segments needed to make the backup consistent will be queued up waiting for the problem to be fixed. Put the fixed archive_command in place, and you're off and running again. If that's impossible, because the archive_command was really screwed up, we can just tell people to swap to an archive_command that just returns success, and let the queued up segments to be archived all get tossed away. That backup will be bad, they fix the archive_command, send SIGHUP, and start over with a new backup. There's some doc patches that could guide how to handle this situation better for sure, but I don't see any code changes needed. Everything working as designed, optimized for production use at the expense of some confusion on how to recover if you configure things badly. I suggested a patch a few weeks ago to make "what is the archiver doing?" behavior easier to monitor, got the impression people felt it was redundant given SR was the preferred path moving forward and eventually this whole archive_command bit would be going away. I could revive that work if you feel this is such a bad issue that we need a better way to watch what the archiver is doing. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] NOTIFY/LISTEN on read-only slave?
Bruce Momjian wrote: Oops, I did cleanup on the HS docs all day today in response to a doc patch that was posted in December. How extensive are your changes? Not very--wording improvements, typos, some technical clarification after quizzing Simon on internals that were described in a fuzzy way. I'll just wait until I see your commit and then rebase on top of that, no big deal. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] NOTIFY/LISTEN on read-only slave?
Greg Smith wrote: Tom Lane wrote: The explanation is wrong, but it's still disallowed. What's the actual reason for the restriction then? I did a whole proofreading round on the HS documentation the other day and am working on a patch to clean up everything I found, can add better notes about this to it. Will answer my own question now, from the commit log: "There is not yet any facility for HS slaves to receive notifications generated on the master, although such a thing is possible in future." I'll include that detail in the doc patch I'm working on. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] NOTIFY/LISTEN on read-only slave?
Tom Lane wrote: Bruce Momjian writes: Our documentation says listen/notify will return an error if executed on the hot standby server: o LISTEN, UNLISTEN, NOTIFY since they currently write to system tables With the listen/notify system now implemented in memory, is this still true? The explanation is wrong, but it's still disallowed. What's the actual reason for the restriction then? I did a whole proofreading round on the HS documentation the other day and am working on a patch to clean up everything I found, can add better notes about this to it. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] psycopg2 license changed
Federico Di Gregorio wrote: Even if tests and examples code aren't almost never distributed except in the psycopg2 source package? A couple of other people contributed to the tests: if you really feel like it is so important I'll contact them and ask their permission to use the LGPL3 + exception (the contribution was without the exception) or remove the code (we won't lose much.) I understand that from a technical perspective these are all different bits. But the sort of people who get stressed about licenses might not, and that's why it's always better to have a simple, standard, unified license that covers the entire chunk of software you're packaging. If the examples show up in the source package, that means the source package has two licenses instead of one, and that's a bad thing. It's not a huge issue, I'm just afraid that if you don't get this nailed down now there's just going to another round of this tedious license investigation in the future one day. I'd think it's better for you and everyone else in the long run to just completely unify the license. And if takes another release for the examples to get that license change, I think that's OK. I wouldn't hold up the big work here--getting your next release out with the big LGPL3 switch for the main code--over this bit of trivia. I just think it's a potential future headache you should try to remove when you can. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Problem with 8.4 stats collector high load
Jakub Ouhrabka wrote: I've found similar reports but with older versions of postgres: http://old.nabble.com/100--of-CPU-utilization-postgres-process-tt27302021.html Those all looked like a FreeBSD issue, doubt it's related to yours. The pgstat.stat is ~20MB. There are 650 databases, 140GB total. default_statistics_target = 1000 The system is running Proxmox linux distribution. PostgreSQL is in OpenVZ container. With this many databases and this high of a statistics target, running in a VM, suspecting autovacuum seems reasonable. You might want to try setting log_autovacuum_min_duration=0 in the postgresql.conf, restarting or signalling (pg_ctl reload) the server, and watching just what it's doing. You might need to reduce how aggressively that runs, or limit the higher target to only the tables that need it, to get this under control. You're really pushing what you can do in a VM with this many databases of this size. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Explain buffers display units.
Greg Stark wrote: We do *not* display raw block numbers anywhere else. Generally I think we should have a policy of outputing human-readable standard units of memory whenever displaying a memory quantity. Actually I thought we already had that policy, hence things like... The first counter example I thought of is log_checkpoints which looks like this: LOG: checkpoint complete: wrote 133795 buffers (25.5%); 0 transaction log file(s) added, 0 removed, 98 recycled; write=112.281 s, sync=108.809 s, total=221.166 s Probably the XML schema should include the units as an attribute for each tag so tools don't have to hard-code knowledge about what unit each tag is in. I don't know if it's practical at this point, but it might be helpful for the truly machine-targeted output formats to include specifically BLCKSZ somewhere in their header--just so there's a universal way to interpret the output even if the user tuned that. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us
Re: [HACKERS] psycopg2 license changed
Federico Di Gregorio wrote: I just wanted all interested people know that psycopg2 2.0.14 to be released in the next few days will be under the LGPL3 + OpenSSL exception (example code and tests under the LGPL3 alone because they are never linked to OpenSSL). Great news and I look forward to the release. One small thing to consider: having more than one license can turn into a cost to users of your software who are required to have each license reviewed for legal issues, and I'd think that maintaining two has some cost for you too. If it's possible for you to fold all these into a single license, that would really be a lot nicer. Being able to say "psycopg2 is LGPL3 + OpenSSL exception", period, is much easier for people to deal with than having two licenses and needing to include the description you gave above for explanation. Having to educate a lawyer on how linking works, so they understand the subtle distinction for why the two licenses exist, is no fun at all. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Explain buffers display units.
Greg Stark wrote: We can always continue tweak the details of the format such as adding spaces before the units to make it similar to the pg_size_pretty(). I'm not sure I like the idea of making it exactly equivalent because pg_size_pretty() doesn't print any decimals so it's pretty imprecise for smaller values. That's a reasonable position; I'd be fine with upgrading the requirements for a text scraping app to handle either "8 kB" or "1.356 kB" if it wanted to share some code to consume either type of info, if all you did was throw a space in there. I'd suggest either removing the PB units support from your implementation, or adding it to pg_size_pretty, just to keep those two routines more like one another in terms of what they might produce as output given the same scale of input. Also, a quick comment in the new code explaining what you just said above might be helpful, just to preempt a similar "how is this different from pg_size_pretty?" question from popping up again one day. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [FWD] About "Our CLUSTER implementation is pessimal" patch
Leonardo F wrote: Could at least the message: http://archives.postgresql.org/pgsql-hackers/2010-02/msg00766.php be added to the TODO page, under "Improve CLUSTER performance by sorting to reduce random I/O" ? It would be sad if the patch got lost... You should read http://wiki.postgresql.org/wiki/Submitting_a_Patch and follow the instructions there. As outlined in the "Submission timing" section, you're asking about something during the wrong time to be doing so--that's why you're not getting any real feedback. Add your patch to the next CommitFest by linking to your message at https://commitfest.postgresql.org/ and it won't get lost--it will get assigned a reviewer at the time when the time comes to look at completely new patches again. -- Greg Smith2ndQuadrant Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Explain buffers display units.
Greg Stark wrote: b) Used units of memory -- I formatted them with 3 significant digits (unless the unit is bytes or kB where that would be silly). It's just what looked best to my eye. How does this compare with what comes out of pg_size_pretty (src/backend/utils/adt/dbsize.c)? I already have code floating around that parses the output from pg_size_pretty when I'm slurping in things from PostgreSQL, and it's not immediately obvious to me what having a format that's similar to but not quite the same as that one is buying here. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Documentation build issues on Debian/Ubuntu
I can't seem to build the PDF version of the documentation on any of my Ubuntu 9.04 systems, and wonder if there's anything that can/should should get done about it. The problem happens like this: gsm...@gsmith-desktop:~/pgwork/doc/src/sgml$ make postgres-US.pdf openjade -D . -D . -c /usr/share/sgml/docbook/stylesheet/dsssl/modular/catalog -d ./stylesheet.dsl -t tex -V tex-backend -i output-print -i include-index -V texpdf-output -V '%paper-type%'=USletter -o postgres-US.tex-pdf postgres.sgml openjade:./stylesheet.dsl:621:2:E: flow object not accepted by port; only display flow objects accepted make: *** [postgres-US.tex-pdf] Segmentation fault make: *** Deleting file `postgres-US.tex-pdf' These "flow object not accepted by port" errors showing up and then causing a crash go back a long ways: http://archives.postgresql.org/pgsql-docs/2003-12/msg00024.php There used to be a warning in the docs about not building with openjade 1.4, which is certainly what I've got here: $ openjade --version openjade:I: "OpenJade" version "1.4devel" openjade:I: "OpenSP" version "1.5.2" The problems here don't seem limited to this project; I see a similar report of the same style of crash against some Linux kernel docbooks at http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=521148 That wanders toward suggesting it's a 32/64 bit issue, but that's not true here--I get the same crash on both architectures. Any suggestions? -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers