Re: [HACKERS] Configuring synchronous replication
On Mon, 2010-09-20 at 22:42 +0100, Thom Brown wrote:
> On 20 September 2010 22:14, Robert Haas wrote:
> > Well, if you need to talk to "all the other standbys" and see who has
> > the furthest-advanced xlog pointer, it seems like you have to have a
> > list somewhere of who they all are.
>
> When they connect to the master to get the stream, don't they in
> effect, already talk to the primary with the XLogRecPtr being relayed?
> Can the connection IP, port, XLogRecPtr and request time of the
> standby be stored from this communication to track the states of each
> standby? They would in effect be registering upon WAL stream
> request... and no doubt this is a horrifically naive view of how it
> works.

It's not viable to record information at the chunk level in that way, but the overall idea is fine. We can track who was connected and how to access their LSNs. They don't need to be registered ahead of time on the master to do that; they can register and deregister each time they connect.

This discussion is reminiscent of the discussion we had when Fujii first suggested that the standby should connect to the master. At first I thought "don't be stupid, the master needs to connect to the standby!". It stood everything I had thought about on its head, and that hurt, but there was no logical reason to oppose. We could have used standby registration on the master to handle that, but we didn't. I'm happy that we have a more flexible system as a result.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Training and Services

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
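The "register and deregister each time they connect" idea above can be sketched as a small in-memory registry on the master. This is purely illustrative Python, not walsender code; all names here (StandbyRegistry, report_lsn, furthest_advanced) are invented for the sketch:

```python
# Hypothetical sketch of "register on connect": the master keeps a small
# in-memory registry of connected standbys and their last-reported WAL
# positions, with no prior configuration needed. Fields and names are
# illustrative, not PostgreSQL internals.

class StandbyRegistry:
    def __init__(self):
        self.standbys = {}  # (ip, port) -> last reported xlog pointer

    def register(self, ip, port):
        # Called when a standby connects and requests the WAL stream.
        self.standbys[(ip, port)] = None

    def report_lsn(self, ip, port, lsn):
        # Called as the standby relays its XLogRecPtr back to the master.
        self.standbys[(ip, port)] = lsn

    def deregister(self, ip, port):
        # Called when the standby disconnects.
        self.standbys.pop((ip, port), None)

    def furthest_advanced(self):
        # "Who has the furthest-advanced xlog pointer" among those connected.
        live = {k: v for k, v in self.standbys.items() if v is not None}
        return max(live, key=live.get) if live else None
```

This matches Simon's point: the registry is rebuilt from connections as they come and go, so nothing needs to be declared on the master ahead of time.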
Re: [HACKERS] Any reason why the default_with_oids GUC is still there?
On 21/09/10 04:18, Josh Berkus wrote: ... or did we just forget to remove it? Backwards-compatibility? ;-) There hasn't been any pressing reason to remove it. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Pg_upgrade performance
On 21/09/10 16:14, Mark Kirkwood wrote:
> I've been having a look at this guy, trying to get a handle on how much
> down time it will save. As a quick check, I tried upgrading a cluster
> with one non-default db containing a scale 100 pgbench schema:
>
> - pg_upgrade : 57 s
> - pgdump/pg_restore : 154 s
>
> So, a reasonable saving all up - but I guess still a sizable chunk of
> downtime in the case of a big database to copy the user relation files.
> I notice there is a "link" option that would be quicker I guess - would
> it make sense to have a "move" option too? (perhaps with pg_upgrade
> writing an "un-move" script to move them back just in case).

Replying to this - looking more carefully at what the --link option does, it is clear that this is in fact covered. Sorry for the (my) confusion. For completeness, with this option the upgrade is substantially faster:

- pg_upgrade (link): 12 s

regards

Mark
Re: [HACKERS] .gitignore files, take two
Robert Haas writes: > I suppose you already know my votes, but here they are again just in case. > ... > Centralize. > ... > All the build products in a normal build. I don't understand your preference for this together with a centralized ignore file. That will be completely unmaintainable IMNSHO. A centralized file would work all right if it's limited to the couple dozen files that are currently listed in .cvsignore's, but I can't see doing it that way if it has to list every executable and .so built anywhere in the tree. You'd get merge conflicts from completely-unrelated patches, not to mention the fundamental action-at-a-distance nastiness of a top-level file that knows about everything going on in every part of the tree. To put it another way: would you expect anyone to take it seriously if you proposed moving all the "make clean" rules into the top-level Makefile? That's pretty much exactly what this would be. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] .gitignore files, take two
I suppose you already know my votes, but here they are again just in case. On Tue, Sep 21, 2010 at 12:00 AM, Tom Lane wrote: > 1. Whether to keep the per-subdirectory ignore files (which CVS > insisted on, but git doesn't) or centralize in a single ignore file. Centralize. > 2. Whether to have the ignore files ignore common cruft such as > editor backup files, or only "expected" build product files. I don't care too much about this. A mild preference for just the expected build product files, but then that's heavily influenced by my choice of editor, which doesn't leave such cruft around permanently. > 3. What are the ignore filesets *for*, in particular should they list > just the derived files expected in a distribution tarball, or all the > files in the set of build products in a normal build? All the build products in a normal build. One of the infelicities of git is that 'git status' shows the untracked files at the bottom. So if you have lots of unignored stuff floating around, the information about which files you've actually changed or added to the index scrolls right off the screen. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] libpq changes for synchronous replication
On Mon, Sep 20, 2010 at 11:55 PM, Heikki Linnakangas wrote:
> It doesn't feel right to always accept PQputCopyData in COPY OUT mode,
> though. IMHO there should be a new COPY IN+OUT mode.
>
> It should be pretty safe to add a CopyInOutResponse message to the protocol
> without a protocol version bump. Thoughts on that?

Or we could check the "replication" field in PGconn, and accept PQputCopyData in COPY OUT mode only if it indicates TRUE? This is much simpler, but maybe less versatile.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Re: [HACKERS] Shutting down server from a backend process, e.g. walreceiver
On Tue, Sep 21, 2010 at 9:48 AM, fazool mein wrote:
> Hi,
>
> I want to shut down the server under certain conditions that can be checked
> inside a backend process. For instance, while running symmetric replication,
> if the primary dies, I want the walreceiver to detect that and shut down
> the standby. The reason for shutdown is that I want to execute some other
> stuff before I start the standby as a primary. Creating a trigger file
> doesn't help as it converts the standby into primary at run time.
>
> Using proc_exit() inside walreceiver only terminates the walreceiver
> process, which postgres starts again. The other way I see is using
> ereport(PANIC, ...). Is there some other way to shutdown the main server
> from within a backend process?

Are you going to change the source code? If yes, you might be able to do that by making walreceiver send the shutdown signal to the postmaster. If no, I think that a straightforward approach is to use a clusterware like Pacemaker. That is, you need to make the clusterware periodically check the master, and shut the standby down when it detects that the master has crashed.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
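The "send the shutdown signal to the postmaster" suggestion could, for example, work by reading the postmaster's pid from the data directory's postmaster.pid file and signaling it. This is an illustrative sketch, not walreceiver code; the helper names are invented, and the file path in the comment is an example:

```python
# Illustrative sketch (not PostgreSQL source): a child process can find
# the postmaster's pid in $PGDATA/postmaster.pid (first line) and signal
# it. SIGTERM requests a "smart" shutdown; SIGINT a "fast" one.
import os
import signal

def read_postmaster_pid(pidfile_contents):
    # The first line of postmaster.pid is the postmaster's process id.
    return int(pidfile_contents.splitlines()[0])

def request_shutdown(pid, fast=False):
    # Hypothetical helper: ask the postmaster to shut down.
    os.kill(pid, signal.SIGINT if fast else signal.SIGTERM)
```

Whether signaling one's own postmaster from a backend is acceptable is exactly the design question raised in this thread; the sketch only shows the mechanics.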
Re: [HACKERS] Path question
On Mon, Sep 20, 2010 at 10:57:00PM -0400, Robert Haas wrote:
> 2010/9/3 Hans-Jürgen Schönig :
> > On Sep 2, 2010, at 1:20 AM, Robert Haas wrote:
> >> I agree. Explicit partitioning may open up some additional
> >> optimization possibilities in certain cases, but Merge Append is
> >> more general and extremely valuable in its own right.
> >
> > we have revised greg's wonderful work and ported the entire thing
> > to head. it solves the problem of merge_append. i did some
> > testing earlier on today and it seems most important cases are
> > working nicely.
>
> First, thanks for merging this up to HEAD. I took a look through
> this patch tonight, and the previous reviews thereof that I was able
> to find, most notably Tom's detailed review on 2009-07-26. I'm not
> sure whether or not it's accidental that this didn't get added to
> the CF,

It's because I missed putting it in, an oversight I've corrected. If we need to bounce it on to the next one, them's the breaks.

> [points elided]
>
> 7. I think there's some basic code cleanup needed here, also: comment
> formatting, variable naming, etc.

Hans-Jürgen, will you be able to get to this in the next couple of days?

Cheers,
David.
--
David Fetter http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
[HACKERS] Pg_upgrade performance
I've been having a look at this guy, trying to get a handle on how much down time it will save. As a quick check, I tried upgrading a cluster with one non-default db containing a scale 100 pgbench schema:

- pg_upgrade : 57 s
- pgdump/pg_restore : 154 s

So, a reasonable saving all up - but I guess still a sizable chunk of downtime in the case of a big database to copy the user relation files. I notice there is a "link" option that would be quicker I guess - would it make sense to have a "move" option too? (perhaps with pg_upgrade writing an "un-move" script to move them back just in case).

Regards

Mark
[HACKERS] .gitignore files, take two
Back here I asked what we were going to do about .gitignore files: http://archives.postgresql.org/pgsql-hackers/2010-08/msg01232.php The thread died off when the first git conversion attempt crashed and burned; but not before it became apparent that we didn't have much consensus. It seemed that there was lack of agreement as to: 1. Whether to keep the per-subdirectory ignore files (which CVS insisted on, but git doesn't) or centralize in a single ignore file. 2. Whether to have the ignore files ignore common cruft such as editor backup files, or only "expected" build product files. It was pointed out that exclusion rules could be configured locally to one's own repository, so one possible answer to issue #2 is to have only a minimal ignore-set embodied in .gitignore files, and let people who prefer to ignore more stuff set that up in local preferences. Although this point wasn't really brought up during that thread, it's also the case that the existing implementation is far from consistent about ignoring build products. We really only have .cvsignore entries for files that are not in CVS but are meant to be present in distribution tarballs. CVS will, of its own accord, ignore certain build products such as .o files; but it doesn't ignore executables for instance. So unless you do a "make distclean" before "cvs update", you will get notices about non-ignored files. That never bothered me particularly but I believe it annoys some other folks. So really there is a third area of disagreement: 3. What are the ignore filesets *for*, in particular should they list just the derived files expected in a distribution tarball, or all the files in the set of build products in a normal build? We need to get some consensus on this now. Comments? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
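To make the centralize-vs-distribute trade-off concrete, the two shapes under discussion look roughly like this. The file contents are invented for illustration, not the actual ignore lists:

```
# Option A: one top-level .gitignore, listing cruft and every build
# product from the root (hypothetical entries):
*.o
*.so
/src/backend/postgres
/src/bin/psql/psql

# Option B: per-subdirectory files, e.g. a hypothetical
# src/bin/psql/.gitignore containing only:
/psql
```

Option B keeps each ignore rule next to the Makefile that produces the file, which is the maintainability argument raised downthread.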
Re: [HACKERS] Git conversion status
Magnus Hagander writes:
> Ok, I've pushed a new repository to both gitmaster and the
> postgresql-migration.git mirror, that has this setting.
> NOTE! Do a complete wipe of your repository before you clone this
> again - it's a completely new repo that will have different SHA1s.

AFAICT this version is good: it passes comparisons against all the historical tarballs I have, as well as against my checked-out copies of branch tips. History looks sane as best I can tell, too. I'm ready to sign off on this.

NOTE: Magnus told me earlier that the new repository isn't ready to accept commits, so committers please hold your fire till he gives the all-clear. It looks okay to clone this and start working locally, though.

For the archives' sake, below are the missing historical tags that match available tarballs, plus re-instantiation of the Release_2_0 and Release_2_0_0 tags on non-manufactured commits. I will push these up to the repo once it's open for pushing.

regards, tom lane

git tag PG95-1_08 bf3473c468b1938f782fdcc208bd62c4b061daa3
# commit bf3473c468b1938f782fdcc208bd62c4b061daa3 refs/heads/Release_1_0_3
# Author: Marc G. Fournier
# Date: Fri Oct 4 20:38:49 1996 +

git tag PG95-1_09 1b5e30e615eacae651a3cd12aa6b5c44d398b919
# commit 1b5e30e615eacae651a3cd12aa6b5c44d398b919 refs/heads/Release_1_0_3
# Author: Marc G. Fournier
# Date: Thu Oct 31 20:25:56 1996 +

git tag REL6_1 0acf9c9b28433120ca96a3a1c03222bfe45c8932
# commit 0acf9c9b28433120ca96a3a1c03222bfe45c8932 refs/tags/release-6-3
# Author: Bruce Momjian
# Date: Fri Jun 13 14:08:48 1997 +

git tag REL6_1_1 b6d983559a2d2a6bd0b03b7b7f59a63a4c3f4918
# commit b6d983559a2d2a6bd0b03b7b7f59a63a4c3f4918 refs/tags/release-6-3
# Author: Bruce Momjian
# Date: Mon Jul 21 22:29:41 1997 +

git tag REL6_2 d663f1c83944cf8934f549ff879b51364f1a60ad
# commit d663f1c83944cf8934f549ff879b51364f1a60ad refs/tags/release-6-3
# Author: Bruce Momjian
# Date: Thu Oct 2 18:32:58 1997 +

git tag REL6_2_1 8a1a39c39079ebc26f1bb55ad1ed2a11c2d36045
# commit 8a1a39c39079ebc26f1bb55ad1ed2a11c2d36045 refs/tags/release-6-3
# Author: Bruce Momjian
# Date: Sat Oct 18 16:59:06 1997 +

git tag REL6_3 b1c7c31e07b9284843d85bbe71a327a1ca13be63
# commit b1c7c31e07b9284843d85bbe71a327a1ca13be63 refs/tags/release-6-3
# Author: Marc G. Fournier
# Date: Mon Mar 2 14:54:59 1998 +

git tag REL6_3_2 b542fa1a6e838d3e32857cdfbe8aeff940a91c74
# commit b542fa1a6e838d3e32857cdfbe8aeff940a91c74 refs/tags/REL6_5
# Author: Marc G. Fournier
# Date: Sat Apr 18 18:32:44 1998 +

git tag REL6_4_2 3be6c6eb73922fb872a6251cb45cb89d8822744f
# commit 3be6c6eb73922fb872a6251cb45cb89d8822744f refs/heads/REL6_4
# Author: Bruce Momjian
# Date: Sun Jan 3 06:50:17 1999 +

git tag REL6_5 275a1d054e72b35bfd98c9731e51b2961ab8dbf5
# commit 275a1d054e72b35bfd98c9731e51b2961ab8dbf5 refs/tags/REL6_5
# Author: Tom Lane
# Date: Mon Jun 14 17:49:06 1999 +

git tag REL6_5_1 c7092a8e8fe67e556f5c7b2f1336453b2ebecbeb
# commit c7092a8e8fe67e556f5c7b2f1336453b2ebecbeb refs/heads/REL6_5_PATCHES
# Author: Bruce Momjian
# Date: Mon Jul 19 05:08:23 1999 +

git tag REL6_5_2 d5d33e2ee453656d607ad6b1036f0091d29de25a
# commit d5d33e2ee453656d607ad6b1036f0091d29de25a refs/heads/REL6_5_PATCHES
# Author: Tom Lane
# Date: Tue Sep 14 22:33:35 1999 +

git tag REL6_5_3 ef26b944b12ce52b14101512c39cf7a42ca970a6
# commit ef26b944b12ce52b14101512c39cf7a42ca970a6 refs/heads/REL6_5_PATCHES
# Author: Bruce Momjian
# Date: Thu Nov 4 16:22:41 1999 +

git tag REL7_0_2 e261306b439e8286f8e8d7dcb6871c485df581c8
# commit e261306b439e8286f8e8d7dcb6871c485df581c8 refs/heads/REL7_0_PATCHES
# Author: Bruce Momjian
# Date: Mon Jun 5 17:02:27 2000 +

git tag REL7_0_3 6835ca629877b9470f206cbea36c21aac9cdd493
# commit 6835ca629877b9470f206cbea36c21aac9cdd493 refs/heads/REL7_0_PATCHES
# Author: Marc G. Fournier
# Date: Sun Nov 12 07:31:36 2000 +

git tag REL7_1 741604dd84dbbd58368a0206f73de259cb6718f4
# commit 741604dd84dbbd58368a0206f73de259cb6718f4 refs/tags/REL7_2_BETA1
# Author: Marc G. Fournier
# Date: Fri Apr 13 21:21:33 2001 +

git tag REL7_1_1 ed6586063813cb4c9263254bb60b514cd12427e9
# commit ed6586063813cb4c9263254bb60b514cd12427e9 refs/tags/REL7_1_2
# Author: Marc G. Fournier
# Date: Sat May 5 20:23:57 2001 +

git tag REL7_1_2 0b471cc338777b84f3510b124aeaa7de75572848
# commit 0b471cc338777b84f3510b124aeaa7de75572848 refs/heads/REL7_1_STABLE
# Author: Thomas G. Lockhart
# Date: Tue May 22 14:46:46 2001 +

git tag REL7_1_3 8c78169c4a766376317b2255572820dfcc52470e
# co
Re: [HACKERS] Path question
2010/9/3 Hans-Jürgen Schönig :
> On Sep 2, 2010, at 1:20 AM, Robert Haas wrote:
>> I agree. Explicit partitioning may open up some additional optimization
>> possibilities in certain cases, but Merge Append is more general and
>> extremely valuable in its own right.
>
> we have revised greg's wonderful work and ported the entire thing to head.
> it solves the problem of merge_append. i did some testing earlier on today
> and it seems most important cases are working nicely.

First, thanks for merging this up to HEAD. I took a look through this patch tonight, and the previous reviews thereof that I was able to find, most notably Tom's detailed review on 2009-07-26. I'm not sure whether or not it's accidental that this didn't get added to the CF, but here's an attempt to enumerate the things that seem like they need to be fixed. The quotes labeled "TGL" are from the aforementioned review by Tom.

1. The code in set_append_rel_pathlist() that accumulates the pathkeys of all sub-paths is, as it says, and as previously discussed, O(n^2). In a previous email on this topic, Tom suggested one possible approach for this problem: choose the largest child relation and call it the leader, and consider only the pathkeys for that relation rather than all of them. I think it would be nice if we can find a way to be a bit smarter, though, because that won't necessarily always find the best path. One idea I had is to choose some arbitrary limit on how long the all_pathkeys list is allowed to become and iterate over the children from largest to smallest, stopping early if you hit that limit. But thinking about it a little more, can't we just adjust the way we do this so that it's not O(n^2)? It seems we're only concerned with equality here, so what about using a hash table? We could hash the PathKey pointers in each list, but not the lists or listcells obviously.

2. TGL: "you need an explicit flag to say 'we'll do a merge', not just rely on whether the path has pathkeys." This makes sense and doesn't seem difficult.

3. TGL: "Speaking of sorting, it's not entirely clear to me how the patch ensures that all the child plans produce the necessary sort keys as output columns, and especially not how it ensures that they all get produced in the *same* output columns. This might accidentally manage to work because of the "throwaway" call to make_sort_from_pathkeys(), but at the very least that's a misleading comment." I'm not sure what needs to be done about this; I'm going to look at this further.

4. TGL: "In any case, I'm amazed that it's not failing regression tests all over the place with those critical tests in make_sort_from_pathkeys lobotomized by random #ifdef FIXMEs. Perhaps we need some more regression tests...". Obviously, we need to remove that lobotomy and insert the correct fix for whatever problem it was trying to solve. Adding some regression tests seems wise, too.

5. TGL: "In the same vein, the hack to 'short circuit' the append stuff for a single child node is simply wrong, because it doesn't allow for column number variances. Please remove it." This seems like straightforward cleanup, and maybe another candidate for a regression test. (Actually, I notice that the patch has NO regression tests at all, which surely can't be right for something of this type, though admittedly since we didn't have EXPLAIN (COSTS OFF) when this was first written it might have been difficult to write anything meaningful at the time.)

6. The dummy call to cost_sort() seems like a crock; what that function does doesn't seem particularly relevant to the cost of the merge operation. Just off the top of my head, it looks like the cost of the merge step will be roughly O(lg n) * the cost of comparing two tuples * the total number of tuples from all child paths. In practice it might be less, because once some of the paths run out of tuples the number of comparisons will drop, I think. But the magnitude of that effect seems difficult to predict, and may be rather small, so perhaps we should just ignore it.

7. I think there's some basic code cleanup needed here, also: comment formatting, variable naming, etc.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
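The merge-cost estimate in point 6 is easy to sanity-check outside the planner: a binary-heap k-way merge does about log2(k) comparisons per emitted tuple, fewer once some inputs are exhausted. A toy model (plain Python, nothing to do with cost_sort() itself) that counts comparisons:

```python
# Toy model of a k-way merge over pre-sorted child inputs, counting
# element comparisons to illustrate the O(log k) per-tuple estimate.
# This is a back-of-envelope check, not planner code.
import heapq

class CountingKey:
    comparisons = 0  # shared counter across all keys

    def __init__(self, v):
        self.v = v

    def __lt__(self, other):
        CountingKey.comparisons += 1
        return self.v < other.v

def merge_sorted(paths):
    # paths: a list of individually sorted lists, like sorted child plans.
    wrapped = [[CountingKey(v) for v in p] for p in paths]
    return [k.v for k in heapq.merge(*wrapped)]
```

Running this over a few inputs and dividing the counter by the output length gives a number near log2(number of inputs), which is the heuristic quoted in the review.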
[HACKERS] Any reason why the default_with_oids GUC is still there?
... or did we just forget to remove it? -- -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] bg worker: general purpose requirements
On Mon, Sep 20, 2010 at 1:45 PM, Markus Wanner wrote: > Hm.. I see. So in other words, you are saying > min_spare_background_workers isn't flexible enough in case one has > thousands of databases but only uses a few of them frequently. Yes, I think that is true. > I understand that reasoning and the wish to keep the number of GUCs as > low as possible. I'll try to drop the min_spare_background_workers from > the bgworker patches. OK. At least for me, what is important is not only how many GUCs there are but how likely they are to require tuning and how easy it will be to know what the appropriate value is. It seems fairly easy to tune the maximum number of background workers, and it doesn't seem hard to tune an idle timeout, either. Both of those are pretty straightforward trade-offs between, on the one hand, consuming more system resources, and on the other hand, better throughput and/or latency. On the other hand, the minimum number of workers to keep around per-database seems hard to tune. If performance is bad, do I raise it or lower it? And it's certainly not really a hard minimum because it necessarily bumps up against the limit on overall number of workers if the number of databases grows too large; one or the other has to give. I think we need to look for a way to eliminate the maximum number of workers per database, too. Your previous point about not wanting one database to gobble up all the available slots makes sense, but again, it's not obvious how to set this sensibly. If 99% of your activity is in one database, you might want to use all the slots for that database, at least until there's something to do in some other database. I feel like the right thing here is for the number of workers for any given database to fluctuate in some natural way that is based on the workload. If one database has all the activity, it gets all the slots, at least until somebody else needs them. Of course, you need to design the algorithm so as to avoid starvation... 
> The rest of the bgworker infrastructure should behave pretty much like
> what you have described. Parallelism in starting bgworkers could be a
> nice improvement, especially if we kill the min_spare_background_workers
> mechanism.

Works for me.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
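The behavior Robert describes upthread — worker counts that fluctuate with the workload, one busy database able to take all the slots, but no database with pending work starved outright — can be sketched as a simple allocation rule. This is an illustrative model only, not the bgworker patch; the function and its policy are invented for the sketch:

```python
# Sketch of workload-driven worker allocation (illustrative only): give
# each database slots in proportion to its pending work, but never
# starve a database that has work waiting.
def allocate_workers(pending, total_slots):
    # pending: dict of database name -> number of queued jobs
    active = {db: n for db, n in pending.items() if n > 0}
    if not active:
        return {}
    total = sum(active.values())
    # proportional share, floored, with a minimum of one slot each
    alloc = {db: max(1, total_slots * n // total) for db, n in active.items()}
    # trim if the one-slot minimums pushed us over the global limit
    while sum(alloc.values()) > total_slots:
        busiest = max(alloc, key=lambda db: alloc[db])
        alloc[busiest] -= 1
    return alloc
```

With one active database it gets every slot; with skewed load the split follows the skew. Real starvation-avoidance would also need aging or queuing, which this deliberately omits.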
[HACKERS] Shutting down server from a backend process, e.g. walreceiver
Hi,

I want to shut down the server under certain conditions that can be checked inside a backend process. For instance, while running symmetric replication, if the primary dies, I want the walreceiver to detect that and shut down the standby. The reason for shutdown is that I want to execute some other stuff before I start the standby as a primary. Creating a trigger file doesn't help as it converts the standby into primary at run time.

Using proc_exit() inside walreceiver only terminates the walreceiver process, which postgres starts again. The other way I see is using ereport(PANIC, ...). Is there some other way to shut down the main server from within a backend process?

Thanks.
Re: [HACKERS] Configuring synchronous replication
On Mon, Sep 20, 2010 at 5:42 PM, Thom Brown wrote:
> On 20 September 2010 22:14, Robert Haas wrote:
>> Well, if you need to talk to "all the other standbys" and see who has
>> the furthest-advanced xlog pointer, it seems like you have to have a
>> list somewhere of who they all are.
>
> When they connect to the master to get the stream, don't they in
> effect, already talk to the primary with the XLogRecPtr being relayed?
> Can the connection IP, port, XLogRecPtr and request time of the
> standby be stored from this communication to track the states of each
> standby? They would in effect be registering upon WAL stream
> request... and no doubt this is a horrifically naive view of how it
> works.

Sure, but the point is that we may want DISCONNECTED slaves to affect master behavior in a variety of ways (master retains WAL for when they reconnect, master waits for them to connect before acking commits, master shuts down if they're not there, master tries to stream WAL backwards from them before entering normal running). I just work here, but it seems to me that such things will be easier if the master has an explicit notion of what's out there. Can we make it all work without that? Possibly, but I think it will be harder to understand.

With standby registration, you can DECLARE the behavior you want. You can tell the master "replicate synchronously to Bob". And that's it. Without standby registration, what's being proposed is basically that you can tell the master "replicate synchronously to one server" and you can tell Bob "you are a server to which the master can replicate synchronously" and you can tell the other servers "you are not a server to which Bob can replicate synchronously". That works, but to me it seems less straightforward.

And that's actually a relatively simple example. Suppose you want to tell the master "keep enough WAL for Bob to catch up when he reconnects, but if he gets more than 1GB behind, forget about him". I'm sure someone can devise a way of making that work without standby registration, too, but I'm not too sure off the top of my head what it will be. With standby registration, you can just write something like this in standbys.conf (syntax invented):

[bob]
wal_keep_segments=64

I feel like that's really nice and simple.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
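Since the invented standbys.conf syntax above is INI-like, per-standby settings could in principle be read with a stock parser. A minimal sketch (no such file or loader exists in PostgreSQL as of this thread; the function name is invented):

```python
# Sketch of reading the hypothetical standbys.conf syntax invented
# upthread: one [section] per standby name, key=value settings inside.
import configparser
import io

def load_standby_settings(text):
    cp = configparser.ConfigParser()
    cp.read_file(io.StringIO(text))
    # one dict of settings per registered standby name
    return {name: dict(cp[name]) for name in cp.sections()}
```

The point of the example is only that "registration" buys a single place where per-standby policy like wal_keep_segments can live and be validated.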
Re: [HACKERS] Serializable snapshot isolation error logging
Dan S wrote: > Well I guess one would like some way to find out which statements > in the involved transactions are the cause of the serialization > failure and what programs they reside in. Unless we get the conflict list optimization added after the base patch, you might get anywhere from one to three of the two to three transactions involved in the serialization failure. We can also report the position they have in the "dangerous structure" and mention that there are other, unidentified, transactions participating in the conflict. Once I get through with the issue I'm working on based on Heikki's observations, I'll take a look at this. > Also which relations were involved, the sql-statements may contain > many relations but just one or a few might be involved in the > failure, right ? The conflicts would have occurred on specific relations, but we don't store all that -- it would be prohibitively expensive. What we track is that transaction T0's read couldn't see the write from transaction T1. Once you know that, SSI doesn't require that you know which or how many relations were involved in that -- you've established that T0 must logically come before T1. That in itself is no problem, of course. But if you also establish that T1 must come before TN (where TN might be T0 or a third transaction), you've got a "pivot" at T1. You're still not dead in the water yet, but if that third logical transaction actually *commits* first, you're probably in trouble. The only way out is that if T0 is not TN, T0 is read only, and TN did *not* commit before T0 got its snapshot, you're OK. Where it gets complicated is that in the algorithm in the paper, which we are following for the initial commit attempt, each transaction keeps one "conflictIn" and one "conflictOut" pointer for checking all this. 
If you already have a conflict with one transaction and then detect a conflict of the same type with another, you change the conflict pointer to a self-reference -- which means you conflict with *all* other concurrent transactions in that direction. You also lose the ability to report all the transactions involved in the conflict.

> The tuples involved if available.
>
> I don't know how helpful it would be to know the pages involved
> might be, I certainly wouldn't know what to do with that info.

That information would only be available on the *read* side. We count on MVCC data on the *write* side, and I'm not aware of any way for a transaction to list everything it's written. Since we're not recording the particular points of conflict between transactions, there's probably not a lot of point in listing it anyway -- there might be a conflict on any number of tuples out of a great many read or written.

> All this is of course to be able to guess at which statements to
> modify or change execution order of, take an explicit lock on and
> so on to reduce serialization failure rate.

I understand the motivation, but the best this technique is likely to be able to provide is the transactions involved, and that's not always going to be complete unless we convert those single-transaction conflict fields to lists.

> If holding a list of the involved transactions turns out to be
> expensive, maybe one should be able to turn it on by a GUC only
> when you have a problem and need the extra information to track it
> down.

That might be doable. If we're going to add such a GUC, though, it should probably be considered a tuning GUC, with the "list" setting recommended for debugging problems. Of course, if you change it from "field" to "list" the problem might disappear. Hmmm. Unless we also had a "debug" setting which kept track of the list but ignored it for purposes of detecting the dangerous structures described above.
Of course, you will always know what transaction was canceled. That does give you something to look at. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
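The single-pointer scheme Kevin describes — one conflict-in and one conflict-out pointer per transaction, collapsed to a self-reference on a second conflict in the same direction — can be modeled in a few lines. This is a toy model of the bookkeeping only, not the SSI patch; names like note_rw_conflict and is_pivot are invented here:

```python
# Toy model of the paper's single-pointer conflict tracking: a "pivot"
# is a transaction with both an inbound and an outbound rw-conflict.
# A second conflict in the same direction becomes a self-reference,
# which is exactly how the identity of the other transactions is lost.
SELF = "self"

class Txn:
    def __init__(self, name):
        self.name = name
        self.conflict_in = None   # some T0 that must logically precede us
        self.conflict_out = None  # some TN that we must logically precede

def note_rw_conflict(reader, writer):
    # reader's read could not see writer's write:
    # reader logically comes before writer.
    reader.conflict_out = writer if reader.conflict_out is None else SELF
    writer.conflict_in = reader if writer.conflict_in is None else SELF

def is_pivot(t):
    # T0 -> t -> TN: the "dangerous structure" candidate described above.
    return t.conflict_in is not None and t.conflict_out is not None
```

Once a pointer collapses to SELF, an error report can only say "other concurrent transactions" rather than naming them — which is the reporting limitation this subthread is about, and what converting the fields to lists would fix.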
Re: [HACKERS] Configuring synchronous replication
On 20 September 2010 22:14, Robert Haas wrote:
> Well, if you need to talk to "all the other standbys" and see who has
> the furthest-advanced xlog pointer, it seems like you have to have a
> list somewhere of who they all are.

When they connect to the master to get the stream, don't they, in effect, already talk to the primary with the XLogRecPtr being relayed? Can the connection IP, port, XLogRecPtr and request time of the standby be stored from this communication to track the states of each standby? They would in effect be registering upon WAL stream request... and no doubt this is a horrifically naive view of how it works.

--
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935
Re: [HACKERS] Configuring synchronous replication
On Mon, Sep 20, 2010 at 4:10 PM, Dimitri Fontaine wrote: > Robert Haas writes: >> So the "wait forever" case is, in my opinion, >> sufficient to demonstrate that we need it, but it's not even my >> primary reason for wanting to have it. > > You're talking about standby registration on the master. You can solve > this case without it, because when a slave is not connected it's not > giving any feedback (vote, weight, ack) to the master. All you have to > do is have the quorum setup in a way that disconnecting your slave means > you can't reach the quorum any more. Have it SIGHUP and you can even > choose to fix the setup, rather than fix the standby. I suppose that could work. >> The most important reason why I think we should have standby >> registration is for simplicity of configuration. Yes, it adds another >> configuration file, but that configuration file contains ALL of the >> information about which standbys are synchronous. Without standby >> registration, this information will inevitably be split between the >> master config and the various slave configs and you'll have to look at >> all the configurations to be certain you understand how it's going to >> end up working. > > So, here, we have two quite different things to be concerned > about. First is the configuration, and I say that managing a distributed > setup will be easier for the DBA. Yeah, I disagree with that, but I suppose it's a question of opinion. > Then there's how to obtain a nice view about the distributed system, > which again we can achieve from the master without manually registering > the standbys. After all, the information you want needs to be there. I think that without standby registration it will be tricky to display information like "the last time that standby foo was connected". Yeah, you could set a standby name on the standby server and just have the master remember details for every standby name it's ever seen, but then how do you prune the list? 
Heikki mentioned another application for having a list of the current standbys only (rather than "every standby that has ever existed") upthread: you can compute the exact amount of WAL you need to keep around. >> As a particular manifestation of this, and as >> previously argued and +1'd upthread, the ability to change the set of >> standbys to which the master is replicating synchronously without >> changing the configuration on the master or any of the existing slaves >> seems dangerous. > > Well, you still need to open the HBA for the new standby to be able to > connect, and to somehow take a base backup, right? We're not exactly > transparent there, yet, are we? Sure, but you might have that set relatively open on a trusted network. >> Another reason why I think we should have standby registration is to >> eventually allow the "streaming WAL backwards" configuration >> which has previously been discussed. IOW, you could stream the WAL to >> the slave in advance of fsync-ing it on the master. After a power >> failure, the machines in the cluster can talk to each other and figure >> out which one has the furthest-advanced WAL pointer and stream from >> that machine to all the others. This is an appealing configuration >> for people using sync rep because it would allow the fsyncs to be done >> in parallel rather than sequentially as is currently necessary - but >> if you're using it, you're certainly not going to want the master to >> enter normal running without waiting to hear from the slave. > > I love the idea. > > Now it seems to me that all you need here is the master sending one more > piece of information with each WAL "segment", the currently fsync'ed position, > which pre-9.1 is implied as being the current LSN from the stream, > right? I don't see how that would help you. > Here I'm not sure to follow you in details, but it seems to me > registering the standbys is just another way of achieving the same. 
> To be honest, I don't really understand how it helps implement your idea. Well, if you need to talk to "all the other standbys" and see who has the furthest-advanced xlog pointer, it seems like you have to have a list somewhere of who they all are. Maybe there's some way to get this to work without standby registration, but I don't really understand the resistance to the idea, and I fear it's going to do nothing good for our reputation for ease of use (or lack thereof). The idea of making this all work without standby registration strikes me as akin to the notion of having someone decide whether they're running a three-legged race by checking whether their leg is currently tied to someone else's leg. You can probably make that work by patching around the various failure cases, but why isn't it simpler to just tell the poor guy "Hi, Joe. You're running a three-legged race with Jane today. Hans and Juanita will be following you across the field, too, but don't worry about whether they're keeping up."? -- Robert Haas EnterpriseDB: ht
Re: [HACKERS] Git conversion status
On Mon, Sep 20, 2010 at 20:05, Magnus Hagander wrote: > On Mon, Sep 20, 2010 at 7:57 PM, Magnus Hagander wrote: >> On Mon, Sep 20, 2010 at 19:49, Tom Lane wrote: >>> Magnus Hagander writes: On Mon, Sep 20, 2010 at 19:34, Tom Lane wrote: > Please fix and re-run. >>> Uh, what the heck. I ran the exact same command as last time.. Hmm: Stefan rebooted the machine in between, I wonder if that changed something. >>> >>> I'm not sure we ever checked that. My comparisons against the tarballs >>> were done from my own run of the conversion script. I'm using C locale >>> here, probably you aren't? >> >> Correct, I'm in en_US. I'm trying a "cvs export" in "C" now to see exactly >> what changes. >> Hmm >> >> Nope, doesn't seem to change. I just set my LANG=C, and ran a "cvs export". >> but it comes back with "-" in the dates, so it seems to not care about that. >> >> ("locale" clearly shows it's changed everything to C though) >> >> Is there a cvs setting for this somewhere that you know of? > > Think I found it. > > debian applies a patch to change it. If I set DateStyle=old in > CVSROOT/config, cvs export behaves sanely. I'll re-run with that > setting. Ok, I've pushed a new repository to both gitmaster and the postgresql-migration.git mirror, that has this setting. NOTE! Do a complete wipe of your repository before you clone this again - it's a completely new repo that will have different SHA1s. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Configuring synchronous replication
Hi, I'm somewhat sorry to have to play this game, as I sure don't feel smarter by composing this email. Quite the contrary. Robert Haas writes: > So the "wait forever" case is, in my opinion, > sufficient to demonstrate that we need it, but it's not even my > primary reason for wanting to have it. You're talking about standby registration on the master. You can solve this case without it, because when a slave is not connected it's not giving any feedback (vote, weight, ack) to the master. All you have to do is have the quorum setup in a way that disconnecting your slave means you can't reach the quorum any more. Have it SIGHUP and you can even choose to fix the setup, rather than fix the standby. So no need for registration here, it's just another way to solve the problem. Not saying it's better or worse, just another. Now we could have a summary function on the master showing all the known slaves, their last time of activity, their known current setup, etc, all from the master, but read-only. Would that be useful enough? > The most important reason why I think we should have standby > registration is for simplicity of configuration. Yes, it adds another > configuration file, but that configuration file contains ALL of the > information about which standbys are synchronous. Without standby > registration, this information will inevitably be split between the > master config and the various slave configs and you'll have to look at > all the configurations to be certain you understand how it's going to > end up working. So, here, we have two quite different things to be concerned about. First is the configuration, and I say that managing a distributed setup will be easier for the DBA. Then there's how to obtain a nice view about the distributed system, which again we can achieve from the master without manually registering the standbys. After all, the information you want needs to be there. 
> As a particular manifestation of this, and as > previously argued and +1'd upthread, the ability to change the set of > standbys to which the master is replicating synchronously without > changing the configuration on the master or any of the existing slaves > seems dangerous. Well, you still need to open the HBA for the new standby to be able to connect, and to somehow take a base backup, right? We're not exactly transparent there, yet, are we? > Another reason why I think we should have standby registration is to > eventually allow the "streaming WAL backwards" configuration > which has previously been discussed. IOW, you could stream the WAL to > the slave in advance of fsync-ing it on the master. After a power > failure, the machines in the cluster can talk to each other and figure > out which one has the furthest-advanced WAL pointer and stream from > that machine to all the others. This is an appealing configuration > for people using sync rep because it would allow the fsyncs to be done > in parallel rather than sequentially as is currently necessary - but > if you're using it, you're certainly not going to want the master to > enter normal running without waiting to hear from the slave. I love the idea. Now it seems to me that all you need here is the master sending one more piece of information with each WAL "segment", the currently fsync'ed position, which pre-9.1 is implied as being the current LSN from the stream, right? Here I'm not sure to follow you in details, but it seems to me registering the standbys is just another way of achieving the same. To be honest, I don't really understand how it helps implement your idea. Regards, -- Dimitri Fontaine PostgreSQL DBA, Architecte -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Git conversion status
Tom Lane wrote: > Bruce Momjian writes: > > Tom Lane wrote: > >> This is not even close to matching the tarballs :-(. Seems to be a > >> locale problem: the diffs look like > >> > >> 1c1 > >> < /* $PostgreSQL: pgsql/contrib/citext/citext.sql.in,v 1.3 2008/09/05 > >> 18:25:16 tgl Exp $ */ > >> --- > > /* $PostgreSQL: pgsql/contrib/citext/citext.sql.in,v 1.3 2008-09-05 > > 18:25:16 tgl Exp $ */ > > > As a curiosity, I do prefer the dashed dates. I have had a number of > > cases where I have to change dashes to slashes when passing ISO dates as > > parameters to CVS. Shame they improve it just as we are leaving CVS. > > Yeah. It appears that this was prompted by a desire to match ISO style > going forward. I wouldn't be against that necessarily if we were > keeping the keywords and not getting rid of them. But since we are > going to get rid of them going forward, I think what we want this > conversion to do is match what's in the historical tarballs. Agreed, no question. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Serializable snapshot isolation error logging
Well I guess one would like some way to find out which statements in the involved transactions are the cause of the serialization failure and what programs they reside in. Also which relations were involved; the sql-statements may contain many relations but just one or a few might be involved in the failure, right? The tuples involved if available. I don't know how helpful knowing the pages involved might be; I certainly wouldn't know what to do with that info. All this is of course to be able to guess at which statements to modify or change execution order of, take an explicit lock on and so on to reduce serialization failure rate. If holding a list of the involved transactions turns out to be expensive, maybe one should be able to turn it on by a GUC only when you have a problem and need the extra information to track it down. Best Regards Dan S 2010/9/20 Kevin Grittner > Dan S wrote: > > > I wonder if the SSI implementation will give some way of detecting > > the cause of a serialization failure. > > Something like the deadlock detection maybe where you get the > > sql-statements involved. > > I've been wondering what detail to try to include. There will often > be three transactions involved in an SSI serialization failure, > although the algorithm we're using (based on the referenced papers) > may only know about one or two of them at the point of failure, > because conflicts with multiple other transactions get collapsed to > a self-reference. (One "optimization" I want to try is to maintain > a list of conflicts rather than doing the above -- in which case we > could always show all three transactions; but we may run out of time > for that, and even if we don't, the decreased rollbacks might not > pay for the cost of maintaining such a list.) 
> > The other information we would have would be the predicate locks > held by whatever transactions we know about at the point of > cancellation, based on what reads they've done; however, we wouldn't > know about the writes done by those transactions, or which of the > reads resulted in conflicts. > > So, given the above, any thoughts on what we *should* show? > > -Kevin >
Re: [HACKERS] Git conversion status
Peter Eisentraut writes: > On mån, 2010-09-20 at 15:09 -0400, Tom Lane wrote: >> I wouldn't be against that necessarily if we were >> keeping the keywords and not getting rid of them. But since we are >> going to get rid of them going forward, I think what we want this >> conversion to do is match what's in the historical tarballs. > Stupid question: Why don't you get rid of the key words beforehand? That *definitely* wouldn't match the tarballs. One of the base requirements we set at the beginning of the whole SCM conversion discussion was that we be able to reproduce the historical release tarballs as nearly as possible. Now, if there were some reason that we couldn't match $PostgreSQL$ tags at all, I'd have grumbled and accepted it. But we're 99.44% of the way there, and I don't see some Debian maintainer's idea of how things ought to work as a reason for not being 100% of the way there. What I got the last time I did this locally, and expect to see when we have the final conversion, is an exact match for every tarball 8.0.0 and later. Earlier than that we have discrepancies because some files are now in Attic, and/or the cvsroot path moved around, and/or the project's module name moved around. That sort of thing I've resigned myself to just grumbling about. But if we can have an exact match for everything from 8.0.0 forward, we should not give that up for trivial reasons. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Git conversion status
On 09/20/2010 09:06 PM, Tom Lane wrote: Stefan Kaltenbrunner writes: http://lists.nongnu.org/archive/html/info-cvs/2004-07/msg00106.html is what I'm referring to and what the debian people provided a patch to work around (starting with 1:1.12.9-17 in 2005) - not sure why you are not seeing it... Hm, that is talking about the output of "cvs log". It doesn't say anything one way or the other about what gets put into $Header$ keyword expansions. A look into the 1.12.13 source code says that dates in keywords are always printed with this: sprintf (buf, "%04d/%02d/%02d %02d:%02d:%02d", year, mon, mday, hour, min, sec); (see printable_date in src/rcs.c). So I'm still of the opinion that debian fixed that which wasn't broken. I tried searching the nongnu archives and found this: http://lists.nongnu.org/archive/html/info-cvs/2004-03/msg00359.html which leads me to think that the upstream developers considered and ultimately rejected moving to ISO style in keyword expansion. Probably the debian maintainer decided he knew better and changed it anyway; there seems to be a lot of that going around among debian packagers. wow - now that I look closer it seems you are right... The patch in debian against the upstream package (see: http://ftp.de.debian.org/debian/pool/main/c/cvs/cvs_1.12.13-12.diff.gz) has this hunk:

--- cvs-1.12.13-old/src/rcs.c	2006-02-26 23:03:04.0 +0800
+++ cvs-1.12.13/src/rcs.c	2006-02-26 23:03:05.0 +0800
@@ -33,6 +33,8 @@
 # endif
 #endif
 
+int datesep = '-';
+
 /* The RCS -k options, and a set of enums that must match the array.
    These come first so that we can use enum kflag in function
    prototypes. */
@@ -3537,8 +3539,8 @@
 			&sec);
     if (year < 1900)
 	year += 1900;
-    sprintf (buf, "%04d/%02d/%02d %02d:%02d:%02d", year, mon, mday,
-	     hour, min, sec);
+    sprintf (buf, "%04d%c%02d%c%02d %02d:%02d:%02d", year, datesep, mon,
+	     datesep, mday, hour, min, sec);
     return xstrdup (buf);
 }

so they broke that in early 2006 and nobody noticed so far... 
Stefan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Git conversion status
On mån, 2010-09-20 at 15:09 -0400, Tom Lane wrote: > I wouldn't be against that necessarily if we were > keeping the keywords and not getting rid of them. But since we are > going to get rid of them going forward, I think what we want this > conversion to do is match what's in the historical tarballs. Stupid question: Why don't you get rid of the key words beforehand? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Git conversion status
Bruce Momjian writes: > Tom Lane wrote: >> This is not even close to matching the tarballs :-(. Seems to be a >> locale problem: the diffs look like >> >> 1c1 >> < /* $PostgreSQL: pgsql/contrib/citext/citext.sql.in,v 1.3 2008/09/05 >> 18:25:16 tgl Exp $ */ >> --- > /* $PostgreSQL: pgsql/contrib/citext/citext.sql.in,v 1.3 2008-09-05 18:25:16 > tgl Exp $ */ > As a curiosity, I do prefer the dashed dates. I have had a number of > cases where I have to change dashes to slashes when passing ISO dates as > parameters to CVS. Shame they improve it just as we are leaving CVS. Yeah. It appears that this was prompted by a desire to match ISO style going forward. I wouldn't be against that necessarily if we were keeping the keywords and not getting rid of them. But since we are going to get rid of them going forward, I think what we want this conversion to do is match what's in the historical tarballs. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Git conversion status
Stefan Kaltenbrunner writes: > http://lists.nongnu.org/archive/html/info-cvs/2004-07/msg00106.html > is what I'm referring to and what the debian people provided a patch to > work around (starting with 1:1.12.9-17 in 2005) - not sure why you are > not seeing it... Hm, that is talking about the output of "cvs log". It doesn't say anything one way or the other about what gets put into $Header$ keyword expansions. A look into the 1.12.13 source code says that dates in keywords are always printed with this: sprintf (buf, "%04d/%02d/%02d %02d:%02d:%02d", year, mon, mday, hour, min, sec); (see printable_date in src/rcs.c). So I'm still of the opinion that debian fixed that which wasn't broken. I tried searching the nongnu archives and found this: http://lists.nongnu.org/archive/html/info-cvs/2004-03/msg00359.html which leads me to think that the upstream developers considered and ultimately rejected moving to ISO style in keyword expansion. Probably the debian maintainer decided he knew better and changed it anyway; there seems to be a lot of that going around among debian packagers. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Git conversion status
Tom Lane wrote: > Magnus Hagander writes: > > Since there haven't been any commits in cvs during the day, the test > > conversion I created after lunch should be identical to a new one I'd > > run now, so let's use that one :-) > > This is not even close to matching the tarballs :-(. Seems to be a > locale problem: the diffs look like > > 1c1 > < /* $PostgreSQL: pgsql/contrib/citext/citext.sql.in,v 1.3 2008/09/05 > 18:25:16 tgl Exp $ */ > --- > > /* $PostgreSQL: pgsql/contrib/citext/citext.sql.in,v 1.3 2008-09-05 > > 18:25:16 tgl Exp $ */ > > Please fix and re-run. As a curiosity, I do prefer the dashed dates. I have had a number of cases where I have to change dashes to slashes when passing ISO dates as parameters to CVS. Shame they improve it just as we are leaving CVS. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Git conversion status
On 09/20/2010 08:33 PM, Tom Lane wrote: Stefan Kaltenbrunner writes: On 09/20/2010 08:21 PM, Tom Lane wrote: Well, I'm testing with an unmodified copy of 1.12.13, and I got output matching our historical tarballs. So I'm blaming debian for this one. As far as I know magnus is using a debian based CVS server for his testing so that would certainly be 1.12.x - are you too? No server anywhere: I'm reading from a local repository which is a tarball copy of the one on cvs.postgresql.org. 1.12.13 is the only version in question. (I believe Magnus is not using a server either; the cvs2git documentation says that it will only work from a local repo, and even if that's not true I shudder to think how long it would take over a network.) http://lists.nongnu.org/archive/html/info-cvs/2004-07/msg00106.html is what I'm referring to and what the debian people provided a patch to work around (starting with 1:1.12.9-17 in 2005) - not sure why you are not seeing it... Stefan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Git conversion status
Stefan Kaltenbrunner writes: > On 09/20/2010 08:21 PM, Tom Lane wrote: >> Well, I'm testing with an unmodified copy of 1.12.13, and I got output >> matching our historical tarballs. So I'm blaming debian for this one. > As far as I know magnus is using a debian based CVS server for his > testing so that would certainly be 1.12.x - are you too? No server anywhere: I'm reading from a local repository which is a tarball copy of the one on cvs.postgresql.org. 1.12.13 is the only version in question. (I believe Magnus is not using a server either; the cvs2git documentation says that it will only work from a local repo, and even if that's not true I shudder to think how long it would take over a network.) regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Git conversion status
On 09/20/2010 08:21 PM, Tom Lane wrote: Stefan Kaltenbrunner writes: On 09/20/2010 08:05 PM, Magnus Hagander wrote: debian applies a patch to change it. If I set DateStyle=old in CVSROOT/config, cvs export behaves sanely. I'll re-run with that setting. actually as I understand it the behaviour changed in cvs 1.12.x and debian applied a patch to provide the old output for backwards compatibility... Well, I'm testing with an unmodified copy of 1.12.13, and I got output matching our historical tarballs. So I'm blaming debian for this one. not sure - if I read the CVS changelog the "new style" output only triggers if both the server AND the client are > 1.12.x (for some value of x on both). As far as I know magnus is using a debian based CVS server for his testing so that would certainly be 1.12.x - are you too? Stefan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Git conversion status
On Monday 20 September 2010 20:22:55 Tom Lane wrote: > Andres Freund writes: > > On Monday 20 September 2010 20:15:50 Tom Lane wrote: > >> BTW, while poking around in this morning's attempt I noticed > >> .git/description, containing > >> > >> Unnamed repository; edit this file 'description' to name the repository. > >> > >> No idea if this is shown anywhere or if there is any practical way to > >> change it once the repo's been published. Might be an idea to stick > >> something in there. > > > > It's mostly used for display in gitweb and can be changed anytime. > > Hm, I might've misinterpreted its semantics. Is that file copied by > "git clone", or is it something that's unique to each physical > repository? Unique to each "physical repository" (like everything in .git - unless you count the cloned 'objects'). Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Git conversion status
Magnus Hagander writes: > On Mon, Sep 20, 2010 at 20:15, Tom Lane wrote: >> BTW, while poking around in this morning's attempt I noticed >> .git/description, containing >> >> Unnamed repository; edit this file 'description' to name the repository. > That said, where was it set to that? A locally initialized repo, or on > the clone? That's what I found in the result of git clone ssh://g...@gitmaster.postgresql.org/postgresql.git If git clone isn't meant to copy it, then this is a non-issue. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Git conversion status
Andres Freund writes: > On Monday 20 September 2010 20:15:50 Tom Lane wrote: >> BTW, while poking around in this morning's attempt I noticed >> .git/description, containing >> >> Unnamed repository; edit this file 'description' to name the repository. >> >> No idea if this is shown anywhere or if there is any practical way to >> change it once the repo's been published. Might be an idea to stick >> something in there. > It's mostly used for display in gitweb and can be changed anytime. Hm, I might've misinterpreted its semantics. Is that file copied by "git clone", or is it something that's unique to each physical repository? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Git conversion status
Stefan Kaltenbrunner writes: > On 09/20/2010 08:05 PM, Magnus Hagander wrote: >> debian applies a patch to change it. If I set DateStyle=old in >> CVSROOT/config, cvs export behaves sanely. I'll re-run with that >> setting. > actually as I understand it the behaviour changed in cvs 1.12.x and > debian applied a patch to provide the old output for backwards > compatibility... Well, I'm testing with an unmodified copy of 1.12.13, and I got output matching our historical tarballs. So I'm blaming debian for this one. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Serializable snapshot isolation error logging
Dan S wrote: > I wonder if the SSI implementation will give some way of detecting > the cause of a serialization failure. > Something like the deadlock detection maybe where you get the > sql-statements involved. I've been wondering what detail to try to include. There will often be three transactions involved in an SSI serialization failure, although the algorithm we're using (based on the referenced papers) may only know about one or two of them at the point of failure, because conflicts with multiple other transactions get collapsed to a self-reference. (One "optimization" I want to try is to maintain a list of conflicts rather than doing the above -- in which case we could always show all three transactions; but we may run out of time for that, and even if we don't, the decreased rollbacks might not pay for the cost of maintaining such a list.) The other information we would have would be the predicate locks held by whatever transactions we know about at the point of cancellation, based on what reads they've done; however, we wouldn't know about the writes done by those transactions, or which of the reads resulted in conflicts. So, given the above, any thoughts on what we *should* show? -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Git conversion status
On Mon, Sep 20, 2010 at 20:15, Tom Lane wrote: > BTW, while poking around in this morning's attempt I noticed > .git/description, containing > > Unnamed repository; edit this file 'description' to name the repository. > > No idea if this is shown anywhere or if there is any practical way to > change it once the repo's been published. Might be an idea to stick > something in there. That's, AFAIK, only used for gitweb. That said, where was it set to that? A locally initialized repo, or on the clone? Because I changed it in the repository before I published it I think (I now deleted the whole repo to make room for the new conversion, so I can't double-check that :D) -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Git conversion status
On Monday 20 September 2010 20:15:50 Tom Lane wrote: > BTW, while poking around in this morning's attempt I noticed > .git/description, containing > > Unnamed repository; edit this file 'description' to name the repository. > > No idea if this is shown anywhere or if there is any practical way to > change it once the repo's been published. Might be an idea to stick > something in there. It's mostly used for display in gitweb and can be changed anytime. Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Git conversion status
BTW, while poking around in this morning's attempt I noticed .git/description, containing Unnamed repository; edit this file 'description' to name the repository. No idea if this is shown anywhere or if there is any practical way to change it once the repo's been published. Might be an idea to stick something in there. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Git conversion status
On 09/20/2010 08:05 PM, Magnus Hagander wrote: On Mon, Sep 20, 2010 at 7:57 PM, Magnus Hagander wrote: On Mon, Sep 20, 2010 at 19:49, Tom Lane wrote: Magnus Hagander writes: On Mon, Sep 20, 2010 at 19:34, Tom Lane wrote: Please fix and re-run. Uh, what the heck. I ran the exact same command as last time.. Hmm: Stefan rebooted the machine in between, I wonder if that changed something. I'm not sure we ever checked that. My comparisons against the tarballs were done from my own run of the conversion script. I'm using C locale here, probably you aren't? Correct, I'm in en_US. I'm trying a "cvs export" in "C" now to see exactly what changes. Hmm Nope, doesn't seem to change. I just set my LANG=C, and ran a "cvs export". but it comes back with "-" in the dates, so it seems to not care about that. ("locale" clearly shows it's changed everything to C though) Is there a cvs setting for this somewhere that you know of? Think I found it. debian applies a patch to change it. If I set DateStyle=old in CVSROOT/config, cvs export behaves sanely. I'll re-run with that setting. actually as I understand it the behaviour changed in cvs 1.12.x and debian applied a patch to provide the old output for backwards compatibility... Stefan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Do we need a ShmList implementation?
On 09/20/2010 08:06 PM, Kevin Grittner wrote: > Obviously, if there were a dynamic way to add to the entries as > needed, there would be one less setting (hard-coded or GUC) to worry > about getting right. Too low means transactions need to be > canceled. Too high means you're wasting space which could otherwise > go to caching. And of course, the optimal number could change from > day to day or hour to hour. Yeah, same problem as with lots of the other users of shared memory. It certainly makes sense to decouple the two projects, so you'll have to pick some number that sounds good to you now. Regards Markus -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Git conversion status
On Mon, Sep 20, 2010 at 20:07, Tom Lane wrote: > Magnus Hagander writes: >> debian applies a patch to change it. > > [ rolls eyes... ] Thank you, debian. Indeed. For the archives, that's DateFormat=old, not DateStyle. Oops. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
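For the archives as well, the corrected setting would sit in the repository's administrative config file. A sketch, assuming the Debian-patched cvs that introduced the new date format:

```
# CVSROOT/config (Debian-patched cvs)
# Revert keyword-expansion dates to the old slashed format
# (2008/09/05 rather than 2008-09-05), so $PostgreSQL$ lines
# match the historical tarballs.
DateFormat=old
```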
Re: [HACKERS] Git conversion status
Magnus Hagander writes: > debian applies a patch to change it. [ rolls eyes... ] Thank you, debian. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Do we need a ShmList implementation?
Markus Wanner wrote: > I'm wondering how you want to implement the memory allocation part Based on the feedback I've received, it appears that the only sane way to do that in the current shared memory environment is to allocate a fixed size of memory to hold these entries on postmaster startup. To minimize the chance that we'll be forced to cancel running transactions to deal with the limit, it will need to be sized to some multiple of max_connections. Obviously, if there were a dynamic way to add to the entries as needed, there would be one less setting (hard-coded or GUC) to worry about getting right. Too low means transactions need to be canceled. Too high means you're wasting space which could otherwise go to caching. And of course, the optimal number could change from day to day or hour to hour. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Git conversion status
On Mon, Sep 20, 2010 at 7:57 PM, Magnus Hagander wrote: > On Mon, Sep 20, 2010 at 19:49, Tom Lane wrote: >> Magnus Hagander writes: >>> On Mon, Sep 20, 2010 at 19:34, Tom Lane wrote: Please fix and re-run. >> >>> Uh, what the heck. I ran the exact same command as last time.. Hmm: >>> Stefan rebooted the machine in between, I wonder if that changed >>> something. >> >> I'm not sure we ever checked that. My comparisons against the tarballs >> were done from my own run of the conversion script. I'm using C locale >> here, probably you aren't? > > Correct, I'm in en_US. I'm trying a "cvs export" in "C" now to see exactly > what changes. > Hmm > > Nope, doesn't seem to change. I just set my LANG=C, and ran a "cvs export", > but it comes back with "-" in the dates, so it seems to not care about that. > > ("locale" clearly shows it's changed everything to C though) > > Is there a cvs setting for this somewhere that you know of? Think I found it. debian applies a patch to change it. If I set DateStyle=old in CVSROOT/config, cvs export behaves sanely. I'll re-run with that setting. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Git conversion status
Magnus Hagander writes: > Correct, I'm in en_US. I'm trying a "cvs export" in "C" now to see > exactly what changes. > Hmm > Nope, doesn't seem to change. I just set my LANG=C, and ran a "cvs > export", but it comes back with "-" in the dates, so it seems to not > care about that. I thought "cvs export" removed keywords entirely ... try a checkout instead. Also, are you sure you don't have any LC_xxx variables set? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Git conversion status
On Mon, Sep 20, 2010 at 19:49, Tom Lane wrote: > Magnus Hagander writes: >> On Mon, Sep 20, 2010 at 19:34, Tom Lane wrote: >>> Please fix and re-run. > >> Uh, what the heck. I ran the exact same command as last time.. Hmm: >> Stefan rebooted the machine in between, I wonder if that changed >> something. > > I'm not sure we ever checked that. My comparisons against the tarballs > were done from my own run of the conversion script. I'm using C locale > here, probably you aren't? Correct, I'm in en_US. I'm trying a "cvs export" in "C" now to see exactly what changes. Hmm. Nope, doesn't seem to change. I just set my LANG=C, and ran a "cvs export", but it comes back with "-" in the dates, so it seems to not care about that. ("locale" clearly shows it's changed everything to C though.) Is there a cvs setting for this somewhere that you know of? -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Serializable snapshot isolation error logging
Hi! I wonder if the SSI implementation will give some way of detecting the cause of a serialization failure. Something like the deadlock detection, maybe, where you get the SQL statements involved. Best Regards Dan S
Re: [HACKERS] Do we need a ShmList implementation?
On 09/20/2010 06:09 PM, Kevin Grittner wrote: > Yeah, I mostly followed that thread. If such a feature was present, > it might well make sense to use it for this; however, I've got > enough trouble selling the SSI technology without making it > dependent on something else which was clearly quite controversial, > and which seemed to have some technical hurdles of its own left to > clear. :-/ Okay, well understandable. I'm wondering how you want to implement the memory allocation part, though. > At the point where there is an implementation which is accepted by > the community, I'll certainly take another look. Fair enough, thanks. Regards Markus Wanner -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Git conversion status
Magnus Hagander writes: > On Mon, Sep 20, 2010 at 19:34, Tom Lane wrote: >> Please fix and re-run. > Uh, what the heck. I ran the exact same command as last time.. Hmm: > Stefan rebooted the machine in between, I wonder if that changed > something. I'm not sure we ever checked that. My comparisons against the tarballs were done from my own run of the conversion script. I'm using C locale here, probably you aren't? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] bg worker: general purpose requirements
Robert, On 09/20/2010 06:57 PM, Robert Haas wrote: > Gee, that doesn't seem slow enough to worry about to me. If we > suppose that you need 2 * CPUs + spindles processes to fully load the > system, that means you should be able to ramp up from zero to > consuming every available system resource in under a second; except > perhaps on a system with a huge RAID array, which might need 2 or 3 > seconds. If you parallelize the worker startup, as you suggest, I'd > think you could knock quite a bit more off of this, but why all the > worry about startup latency? Once the system is chugging along, none > of this should matter very much, I would think. If you need to > repeatedly kill off some workers bound to one database and start some > new ones to bind to a different database, that could be sorta painful, > but if you can actually afford to keep around the workers for all the > databases you care about, it seems fine. Hm.. I see. So in other words, you are saying min_spare_background_workers isn't flexible enough in case one has thousands of databases but only uses a few of them frequently. I understand that reasoning and the wish to keep the number of GUCs as low as possible. I'll try to drop the min_spare_background_workers from the bgworker patches. The rest of the bgworker infrastructure should behave pretty much like what you have described. Parallelism in starting bgworkers could be a nice improvement, especially if we kill the min_spare_background_workers mechanism. > Neat stuff. Thanks. Markus Wanner -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Git conversion status
On Mon, Sep 20, 2010 at 19:34, Tom Lane wrote: > Magnus Hagander writes: >> Since there haven't been any commits in cvs during the day, the test >> conversion I created after lunch should be identical to a new one I'd >> run now, so let's use that one :-) > > This is not even close to matching the tarballs :-(. Seems to be a > locale problem: the diffs look like > > 1c1 > < /* $PostgreSQL: pgsql/contrib/citext/citext.sql.in,v 1.3 2008/09/05 > 18:25:16 tgl Exp $ */ > --- >> /* $PostgreSQL: pgsql/contrib/citext/citext.sql.in,v 1.3 2008-09-05 18:25:16 >> tgl Exp $ */ > > Please fix and re-run. Uh, what the heck. I ran the exact same command as last time.. Hmm: Stefan rebooted the machine in between, I wonder if that changed something. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] compile/install of git
On 09/20/2010 01:16 PM, Mark Wong wrote:
> On Mon, Sep 20, 2010 at 9:42 AM, Andrew Dunstan wrote:
>> On 09/20/2010 12:24 PM, Mark Wong wrote:
>>> On Sat, Sep 18, 2010 at 7:59 AM, Bruce Momjian wrote:
>>>> Well, I can run tests for folks before they apply a patch and "red" the
>>>> build farm. I can also research fixes easier because I am using the OS,
>>>> rather than running blind tests. I am just telling you what people told me.
>>> I've been slowly trying to rebuild something that was in use at the OSDL
>>> to test patches. I just proofed something that I think works with the git
>>> repository: http://207.173.203.223:5000/patch/show/48
>>> If you click on the PASS or FAIL text, it will display the SHA1, author
>>> and commit message that the patch was applied to. Think this will be useful?
>> The issue has always been how much we want to ask people to trust code
>> that is not committed. My answer is "not at all." Reviewers and committers
>> will presumably eyeball the code before trying to compile/run it, but any
>> automated system of code testing for uncommitted code is way too risky,
>> IMNSHO.
> I was hoping this would be more of a reviewing tool, not something that
> would be an excuse for someone to not try running with a patch. For
> example, if patch doesn't apply, configure, or build the output is
> captured and can be referenced. Also specifically in Bruce's example if
> there is enough concern about making the buildfarm red I thought this
> could help in these few specific aspects. But maybe I don't understand
> the scope of testing Bruce is referring to. :)

The whole point of the buildfarm is to identify quickly any platform-dependent problems. Committers can't be expected to have access to the whole range of platforms we support, so as long as they make sure that things are working well on their systems they should be able to rely on the buildfarm to cover the others. But that also means that the buildfarm should contain instances of all the supported platforms.

I don't think we should be afraid of sending the buildfarm red. If we do, it's an indication that it's doing its job. If you're a committer and you haven't made it go red a few times, you're either very lucky or not very active. Making it go red isn't a problem. Leaving it red is, but we've really been pretty darn good about that.

Having someone act in effect as an informal buildfarm member is less than satisfactory, IMNSHO. For one thing, it is likely to be less timely about notifying us of problems than the automated system. And it's also much less likely to catch problems on the back branches. So if you want platform X supported (even BSD/OS, regardless of the fact that it's way out of date), the first thing you should do is set up a buildfarm member for it.

cheers

andrew

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Git conversion status
Magnus Hagander writes: > Since there haven't been any commits in cvs during the day, the test > conversion I created after lunch should be identical to a new one I'd > run now, so let's use that one :-) This is not even close to matching the tarballs :-(. Seems to be a locale problem: the diffs look like 1c1 < /* $PostgreSQL: pgsql/contrib/citext/citext.sql.in,v 1.3 2008/09/05 18:25:16 tgl Exp $ */ --- > /* $PostgreSQL: pgsql/contrib/citext/citext.sql.in,v 1.3 2008-09-05 18:25:16 > tgl Exp $ */ Please fix and re-run. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Git conversion status
Magnus Hagander wrote: > Hi! > > CVS has been frozen, and all commit access locked out. > > Since there haven't been any commits in cvs during the day, the test > conversion I created after lunch should be identical to a new one I'd > run now, so let's use that one :-) > > So I've moved it in place. It's on > http://git.postgresql.org/gitweb?p=postgresql-migration.git. Git > access available at > git://git.postgresql.org/git/postgresql-migration.git. > > Committers can (and should! please test!) clone from git clone > ssh://g...@gitmaster.postgresql.org/postgresql.git. > > Please do *NOT* commit or push anything to this repository yet though: > The repo is there - all the scripts to manage it are *not*. So don't > commit until I confirm that it is. > > But please clone and verify the stuff we have now. Git clone worked just fine. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] compile/install of git
On Mon, Sep 20, 2010 at 9:42 AM, Andrew Dunstan wrote: > > > On 09/20/2010 12:24 PM, Mark Wong wrote: >> >> On Sat, Sep 18, 2010 at 7:59 AM, Bruce Momjian wrote: >>> >>> Well, I can run tests for folks before they apply a patch and "red" the >>> build farm. I can also research fixes easier because I am using the OS, >>> rather than running blind tests. I am just telling you what people told >>> me. >> >> I've been slowly trying to rebuild something that was in use at the >> OSDL to test patches. I just proofed something that I think works >> with the git repository: >> >> http://207.173.203.223:5000/patch/show/48 >> >> If you click on the PASS or FAIL text, it will display the SHA1, >> author and commit message that the patch was applied to. Think this >> will be useful? > > > The issue has always been how much we want to ask people to trust code that > is not committed. My answer is "not at all." Reviewers and committers will > presumably eyeball the code before trying to compile/run it, but any > automated system of code testing for uncommitted code is way too risky, > IMNSHO. I was hoping this would be more of a reviewing tool, not something that would be an excuse for someone to not try running with a patch. For example, if patch doesn't apply, configure, or build the output is captured and can be referenced. Also specifically in Bruce's example if there is enough concern about making the buildfarm red I thought this could help in these few specific aspects. But maybe I don't understand the scope of testing Bruce is referring to. :) Regards, Mark -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] bg worker: general purpose requirements
On Mon, Sep 20, 2010 at 11:30 AM, Markus Wanner wrote: > Well, Apache pre-forks 5 processes in total (by default, that is, for > high volume webservers a higher MinSpareServers setting is certainly not > out of question). While bgworkers currently needs to fork > min_spare_background_workers processes per database. > > AIUI, that's the main problem with the current architecture. Assuming that "the main problem" refers more or less to the words "per database", I agree. >>> I haven't measured the actual time it takes, but given the use case of a >>> connection pool, I so far thought it's obvious that this process takes too >>> long. >> >> Maybe that would be a worthwhile exercise... > > On my laptop I'm measuring around 18 bgworker starts per second, i.e. > roughly 50 ms per bgworker start. That's certainly just a ball-park figure.. Gee, that doesn't seem slow enough to worry about to me. If we suppose that you need 2 * CPUs + spindles processes to fully load the system, that means you should be able to ramp up from zero to consuming every available system resource in under a second; except perhaps on a system with a huge RAID array, which might need 2 or 3 seconds. If you parallelize the worker startup, as you suggest, I'd think you could knock quite a bit more off of this, but why all the worry about startup latency? Once the system is chugging along, none of this should matter very much, I would think. If you need to repeatedly kill off some workers bound to one database and start some new ones to bind to a different database, that could be sorta painful, but if you can actually afford to keep around the workers for all the databases you care about, it seems fine. >> How do you accumulate the change sets? > > Logical changes get collected at the heapam level. They get serialized > and streamed (via imessages and a group communication system) to all > nodes. Application of change sets is highly parallelized and should be > pretty efficient. 
Commit ordering is decided by the GCS to guarantee > consistency across all nodes, conflicts get resolved by aborting the > later transaction. Neat stuff. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] work_mem / maintenance_work_mem maximums
Greetings, After watching a database import go abysmally slow on a pretty beefy box with tons of RAM, I got annoyed and went to hunt down why in the world PG wasn't using but a bit of memory. Turns out to be a well known and long-standing issue: http://www.mail-archive.com/pgsql-hackers@postgresql.org/msg101139.html Now, we could start by fixing guc.c to correctly have the max value for these be MaxAllocSize/1024, for starters, then at least our users would know when they set a higher value it's not going to be used. That, in my mind, is a pretty clear bug fix. Of course, that doesn't help us poor data-warehousing bastards with 64G+ machines. Sooo.. I don't know much about what the limit is or why it's there, but based on the comments, I'm wondering if we could just move the limit to a more 'sane' place than the-function-we-use-to-allocate. If we need a hard limit due to TOAST, let's put it there, but I'm hopeful we could work out a way to get rid of this limit in repalloc and that we can let sorts and the like (uh, index creation) use what memory the user has decided it should be able to. Thanks, Stephen
[HACKERS] Git conversion status
Hi! CVS has been frozen, and all commit access locked out. Since there haven't been any commits in cvs during the day, the test conversion I created after lunch should be identical to a new one I'd run now, so let's use that one :-) So I've moved it in place. It's on http://git.postgresql.org/gitweb?p=postgresql-migration.git. Git access available at git://git.postgresql.org/git/postgresql-migration.git. Committers can (and should! please test!) clone from git clone ssh://g...@gitmaster.postgresql.org/postgresql.git. Please do *NOT* commit or push anything to this repository yet though: The repo is there - all the scripts to manage it are *not*. So don't commit until I confirm that it is. But please clone and verify the stuff we have now. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Do we need a ShmList implementation?
Tom Lane wrote: > There's nothing vestigial about SHM_QUEUE --- it's used by the > lock manager. But it's intended to link together structs whose > existence is managed by somebody else. Yep, that's exactly my problem. > I'm not excited about inventing an API with just one use-case; > it's unlikely that you actually end up with anything generally > useful. (SHM_QUEUE seems like a case in point...) Especially > when there are so many other constraints on what shared memory is > usable for. You might as well just do this internally to the > SERIALIZABLEXACT management code. Fair enough. I'll probably abstract it within the SSI patch anyway, just because it will keep the other code cleaner where the logic is necessarily kinda messy anyway, and I think it'll reduce the chance of weird memory bugs. I just won't get quite so formal about the interface. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Do we need a ShmList implementation?
On Mon, 2010-09-20 at 12:35 -0400, Tom Lane wrote: > "Kevin Grittner" writes: > > Simon Riggs wrote: > >> My understanding is that we used to have that and it was removed > >> for the reasons Heikki states. There are still vestigial bits > >> still in code. > > There's nothing vestigial about SHM_QUEUE --- it's used by the lock > manager. Yes, I was talking about an implementation that allocated memory as well. There are sections of IFDEF'd out code there... -- Simon Riggs www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Training and Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] compile/install of git
On 09/20/2010 12:24 PM, Mark Wong wrote:
> On Sat, Sep 18, 2010 at 7:59 AM, Bruce Momjian wrote:
>> Well, I can run tests for folks before they apply a patch and "red" the
>> build farm. I can also research fixes easier because I am using the OS,
>> rather than running blind tests. I am just telling you what people told me.
> I've been slowly trying to rebuild something that was in use at the OSDL
> to test patches. I just proofed something that I think works with the git
> repository: http://207.173.203.223:5000/patch/show/48
> If you click on the PASS or FAIL text, it will display the SHA1, author
> and commit message that the patch was applied to. Think this will be useful?

The issue has always been how much we want to ask people to trust code that is not committed. My answer is "not at all." Reviewers and committers will presumably eyeball the code before trying to compile/run it, but any automated system of code testing for uncommitted code is way too risky, IMNSHO.

cheers

andrew

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] compile/install of git
On Mon, Sep 20, 2010 at 12:24 PM, Mark Wong wrote: > On Sat, Sep 18, 2010 at 7:59 AM, Bruce Momjian wrote: >> Andrew Dunstan wrote: >>> >>> >>> On 09/18/2010 10:22 AM, Bruce Momjian wrote: >>> > Dave Page wrote: >>> >> On Fri, Sep 17, 2010 at 10:02 PM, Bruce Momjian wrote: >>> >>> FYI, I have compiled/installed git 1.7.3.rc2 on my BSD/OS 4.3.1 machine >>> >>> with the attached minor changes. >>> >> I thought you were replacing that old thing with pile of hardware that >>> >> Matthew was putting together? >>> > Matthew was busy this summer so I am going to try to get some of his >>> > time by January to switch to Ubuntu. And some people are complaining we >>> > will lose a BSD test machine once I switch. >>> > >>> >>> Test machines belong in the buildfarm. And why would they complain about >>> losing a machine running a totally out of date and unsupported OS? Maybe >>> you should run BeOS instead. >> >> Well, I can run tests for folks before they apply a patch and "red" the >> build farm. I can also research fixes easier because I am using the OS, >> rather than running blind tests. I am just telling you what people told >> me. > > I've been slowly trying to rebuild something that was in use at the > OSDL to test patches. I just proofed something that I think works > with the git repository: > > http://207.173.203.223:5000/patch/show/48 > > If you click on the PASS or FAIL text, it will display the SHA1, > author and commit message that the patch was applied to. Think this > will be useful? Seems interesting. You might need to take precautions against someone uploading a trojan, though. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Do we need a ShmList implementation?
"Kevin Grittner" writes: > Simon Riggs wrote: >> My understanding is that we used to have that and it was removed >> for the reasons Heikki states. There are still vestigial bits >> still in code. There's nothing vestigial about SHM_QUEUE --- it's used by the lock manager. But it's intended to link together structs whose existence is managed by somebody else. >> Not exactly impressed with the SHM_QUEUE stuff though, so I >> appreciate the sentiment that Kevin expresses. > So, if I just allocated a fixed memory space to provide an API > similar to my previous post, does that sound reasonable to you? I'm not excited about inventing an API with just one use-case; it's unlikely that you actually end up with anything generally useful. (SHM_QUEUE seems like a case in point...) Especially when there are so many other constraints on what shared memory is usable for. You might as well just do this internally to the SERIALIZABLEXACT management code. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Do we need a ShmList implementation?
On 20/09/10 19:04, Kevin Grittner wrote:
> Heikki Linnakangas wrote:
>> In the SSI patch, you'd also need a way to insert an existing struct
>> into a hash table. You currently work around that by using a hash
>> element that contains only the hash key, and a pointer to the
>> SERIALIZABLEXACT struct. It isn't too bad I guess, but I find it a
>> bit confusing.
> Hmmm... Mucking with the hash table implementation to accommodate
> that seems like it's a lot of work and risk for pretty minimal
> benefit. Are you sure it's worth it?

No, I'm not sure at all.

>> Well, we generally try to avoid dynamic structures in shared memory,
>> because shared memory can't be resized.
> But don't HTAB structures go beyond their estimated sizes as needed?

Yes, but not in a very smart way. The memory allocated for hash table elements is never free'd. So if you use up all the "slush fund" shared memory for SIREAD locks, it can't be used for anything else anymore, even if the SIREAD locks are later released.

>> Any chance of collapsing together entries of already-committed
>> transactions in the SSI patch, to put an upper limit on the number of
>> shmem list entries needed? If you can do that, then a simple array
>> allocated at postmaster startup will do fine.
> I suspect it can be done, but I'm quite sure that any such scheme
> would increase the rate of serialization failures. Right now I'm
> trying to see how much I can do to *decrease* the rate of
> serialization failures, so I'm not eager to go there. :-/

I see. It's worth spending some mental power on; an upper limit would make life a lot easier. It doesn't matter much if it's 2*max_connections or 100*max_connections, as long as it's finite.

> If it is necessary, the most obvious way to manage this is just to
> force cancellation of the oldest running serializable transaction and
> running ClearOldPredicateLocks(), perhaps iterating, until we free an
> entry to service the new request.

Hmm, that's not very appealing either. But perhaps it's still better than not letting any new transactions begin. We could say "snapshot too old" in the error message :-).

-- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] libpq changes for synchronous replication
On Fri, 2010-09-17 at 18:22 +0900, Fujii Masao wrote: > On Fri, Sep 17, 2010 at 5:09 PM, Heikki Linnakangas > wrote: > > That said, there's a few small things that can be progressed regardless of > > the details of synchronous replication. There's the changes to trigger > > failover with a signal, and it seems that we'll need some libpq changes to > > allow acknowledgments to be sent back to the master regardless of the rest > > of the design. We can discuss those in separate threads in parallel. > > Agreed. The attached patch introduces new function which is used > to send ACK back from walreceiver. The function sends a message > to XLOG stream by calling PQputCopyData. Also I allowed PQputCopyData > to be called even during COPY OUT. Does this differ from Zoltan's code? -- Simon Riggs www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Training and Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Do we need a ShmList implementation?
Simon Riggs wrote: > My understanding is that we used to have that and it was removed > for the reasons Heikki states. There are still vestigial bits > still in code. > > Not exactly impressed with the SHM_QUEUE stuff though, so I > appreciate the sentiment that Kevin expresses. So, if I just allocated a fixed memory space to provide an API similar to my previous post, does that sound reasonable to you? For the record, my intention would be to hide the SHM_QUEUE structures in this API -- an entry would be just the structure you're interested in working with. If practical, I would prefer for ShmList to be a pointer to an opaque structure; users of this shouldn't really be exposed to or depend upon the implementation. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
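To make the shape of that concrete: one way the opaque-handle API Kevin describes could look is a fixed-size pool with an embedded free list, so entries can be acquired and released without ever growing the segment. This is only a hedged sketch; malloc stands in for carving space out of the shared-memory segment at postmaster startup, and every name here is hypothetical, not from any patch:

```c
#include <stdlib.h>

/* Opaque to callers; entries are fixed-size slots carved out up front,
 * the way a shared memory segment would be sized at postmaster startup. */
typedef struct ShmList ShmList;

struct ShmList
{
    size_t  entry_size;
    int     capacity;
    char   *slots;       /* capacity * entry_size bytes */
    int    *next_free;   /* free-list links, -1 terminates */
    int     free_head;
};

/* In a real patch this would carve space out of the shmem segment. */
ShmList *
ShmListCreate(size_t entry_size, int capacity)
{
    ShmList *list = malloc(sizeof(ShmList));
    int      i;

    list->entry_size = entry_size;
    list->capacity = capacity;
    list->slots = malloc(entry_size * capacity);
    list->next_free = malloc(sizeof(int) * capacity);
    /* Chain every slot onto the free list: 0 -> 1 -> ... -> -1 */
    for (i = 0; i < capacity; i++)
        list->next_free[i] = (i + 1 < capacity) ? i + 1 : -1;
    list->free_head = 0;
    return list;
}

/* Returns NULL when the fixed pool is exhausted -- the case where the
 * caller would have to cancel an old transaction to make room. */
void *
ShmListAlloc(ShmList *list)
{
    int slot = list->free_head;

    if (slot < 0)
        return NULL;
    list->free_head = list->next_free[slot];
    return list->slots + (size_t) slot * list->entry_size;
}

/* Push the slot straight back onto the free list for immediate reuse. */
void
ShmListRelease(ShmList *list, void *entry)
{
    int slot = (int) (((char *) entry - list->slots) / list->entry_size);

    list->next_free[slot] = list->free_head;
    list->free_head = slot;
}
```

Unlike a shared HTAB, whose element memory is never returned once used, releasing a slot here makes it immediately reusable; sizing the capacity as some multiple of max_connections matches what Kevin describes upthread, and the SHM_QUEUE linkage could stay hidden inside the entries.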
Re: [HACKERS] compile/install of git
On Sat, Sep 18, 2010 at 7:59 AM, Bruce Momjian wrote: > Andrew Dunstan wrote: >> >> >> On 09/18/2010 10:22 AM, Bruce Momjian wrote: >> > Dave Page wrote: >> >> On Fri, Sep 17, 2010 at 10:02 PM, Bruce Momjian wrote: >> >>> FYI, I have compiled/installed git 1.7.3.rc2 on my BSD/OS 4.3.1 machine >> >>> with the attached minor changes. >> >> I thought you were replacing that old thing with a pile of hardware that >> >> Matthew was putting together? >> > Matthew was busy this summer so I am going to try to get some of his >> > time by January to switch to Ubuntu. And some people are complaining we >> > will lose a BSD test machine once I switch. >> > >> >> Test machines belong in the buildfarm. And why would they complain about >> losing a machine running a totally out of date and unsupported OS? Maybe >> you should run BeOS instead. > > Well, I can run tests for folks before they apply a patch and "red" the > build farm. I can also research fixes more easily because I am using the OS, > rather than running blind tests. I am just telling you what people told > me. I've been slowly trying to rebuild something that was in use at the OSDL to test patches. I just proofed something that I think works with the git repository: http://207.173.203.223:5000/patch/show/48 If you click on the PASS or FAIL text, it will display the SHA1, author and commit message that the patch was applied to. Think this will be useful? Mark -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Do we need a ShmList implementation?
On Mon, 2010-09-20 at 18:37 +0300, Heikki Linnakangas wrote: > > SHM_QUEUE objects provide the infrastructure for maintaining a > > shared memory linked list, but they don't do anything about the > > allocation and release of the space for the objects. So it occurs > > to me that I'm using an HTAB for this collection because it provides > > the infrastructure for managing the memory for the collection, > > rather than because I need hash lookup. :-( It works, but that > > hardly seems optimal. > > > Have I missed something we already have which could meet that need? > > Well, we generally try to avoid dynamic structures in shared memory, > because shared memory can't be resized. So, you'd typically use an array > with a fixed number of elements. One could even argue that we > specifically *don't* want to have the kind of infrastructure you > propose, to discourage people from writing patches that need dynamic > shmem structures. My understanding is that we used to have that and it was removed for the reasons Heikki states. There are still vestigial bits in the code. Not exactly impressed with the SHM_QUEUE stuff though, so I appreciate the sentiment that Kevin expresses. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Training and Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] libpq changes for synchronous replication
Heikki Linnakangas writes: > It doesn't feel right to always accept PQputCopyData in COPY OUT mode, > though. IMHO there should be a new COPY IN+OUT mode. Yeah, I was going to make the same complaint. Breaking basic error-checking functionality in libpq is not very acceptable. > It should be pretty safe to add a CopyInOutResponse message to the > protocol without a protocol version bump. Thoughts on that? Not if it's something that an existing application might see. If it can only happen in replication mode it's OK. Personally I think this demonstrates that piggybacking replication data transfer on the COPY protocol was a bad design to start with. It's probably time to split them apart. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Do we need a ShmList implementation?
Markus Wanner wrote: > On 09/20/2010 05:12 PM, Kevin Grittner wrote: >> SHM_QUEUE objects provide the infrastructure for maintaining a >> shared memory linked list, but they don't do anything about the >> allocation and release of the space for the objects. > > Did you have a look at my dynshmem stuff? It tries to solve the > problem of dynamic allocation from shared memory. Not just for > lists, but very generally. Yeah, I mostly followed that thread. If such a feature was present, it might well make sense to use it for this; however, I've got enough trouble selling the SSI technology without making it dependent on something else which was clearly quite controversial, and which seemed to have some technical hurdles of its own left to clear. :-/ At the point where there is an implementation which is accepted by the community, I'll certainly take another look. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Do we need a ShmList implementation?
Heikki Linnakangas wrote: > In the SSI patch, you'd also need a way to insert an existing > struct into a hash table. You currently work around that by using > a hash element that contains only the hash key, and a pointer to > the SERIALIZABLEXACT struct. It isn't too bad I guess, but I find > it a bit confusing. Hmmm... Mucking with the hash table implementation to accommodate that seems like it's a lot of work and risk for pretty minimal benefit. Are you sure it's worth it? Perhaps better commenting around the SERIALIZABLEXID structure to indicate it's effectively used as a non-primary-key index into the other collection? > Well, we generally try to avoid dynamic structures in shared > memory, because shared memory can't be resized. But don't HTAB structures go beyond their estimated sizes as needed? I was trying to accommodate the situation where one collection might not be anywhere near its limit, but some other collection has edged past. Unless I'm misunderstanding things (which is always possible), the current HTAB implementation takes advantage of the "slush fund" of unused space to some degree. I was just trying to maintain the same flexibility with the list. I was thinking of returning a size based on the *maximum* allowed allocations from the estimated size function, and actually limiting it to that size. So it wasn't so much a matter of grabbing more than expected, but leaving something for the hash table slush if possible. Of course I was also thinking that this would allow one to be a little bit more generous with the maximum, as it might have benefit elsewhere... > So, you'd typically use an array with a fixed number of elements. That's certainly a little easier, if you think it's better. > Any chance of collapsing together entries of already-committed > transactions in the SSI patch, to put an upper limit on the number > of shmem list entries needed? If you can do that, then a simple > array allocated at postmaster startup will do fine.
I suspect it can be done, but I'm quite sure that any such scheme would increase the rate of serialization failures. Right now I'm trying to see how much I can do to *decrease* the rate of serialization failures, so I'm not eager to go there. :-/ If it is necessary, the most obvious way to manage this is just to force cancellation of the oldest running serializable transaction and running ClearOldPredicateLocks(), perhaps iterating, until we free an entry to service the new request. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Do we need a ShmList implementation?
Kevin, On 09/20/2010 05:12 PM, Kevin Grittner wrote: > SHM_QUEUE objects provide the infrastructure for maintaining a > shared memory linked list, but they don't do anything about the > allocation and release of the space for the objects. Did you have a look at my dynshmem stuff? It tries to solve the problem of dynamic allocation from shared memory. Not just for lists, but very generally. Regards Markus Wanner -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Do we need a ShmList implementation?
On 20/09/10 18:12, Kevin Grittner wrote: On the Serializable Snapshot Isolation thread, Heikki pointed out a collection of objects in an HTAB which didn't really need its key on VirtualTransactionId, but there isn't really any other useful key, either. One of these objects may live and die, seeing use from multiple processes, without ever getting a TransactionId assigned; and it needs to be in a collection in shared memory the whole time. This suggests to me that some sort of list would be better. In the SSI patch, you'd also need a way to insert an existing struct into a hash table. You currently work around that by using a hash element that contains only the hash key, and a pointer to the SERIALIZABLEXACT struct. It isn't too bad I guess, but I find it a bit confusing. SHM_QUEUE objects provide the infrastructure for maintaining a shared memory linked list, but they don't do anything about the allocation and release of the space for the objects. So it occurs to me that I'm using an HTAB for this collection because it provides the infrastructure for managing the memory for the collection, rather than because I need hash lookup. :-( It works, but that hardly seems optimal. Have I missed something we already have which could meet that need? Well, we generally try to avoid dynamic structures in shared memory, because shared memory can't be resized. So, you'd typically use an array with a fixed number of elements. One could even argue that we specifically *don't* want to have the kind of infrastructure you propose, to discourage people from writing patches that need dynamic shmem structures. Any chance of collapsing together entries of already-committed transactions in the SSI patch, to put an upper limit on the number of shmem list entries needed? If you can do that, then a simple array allocated at postmaster startup will do fine. 
-- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] What happened to the is_ family of functions proposal?
On 20 September 2010 16:54, Andrew Dunstan wrote: > > > On 09/20/2010 10:29 AM, Colin 't Hart wrote: >> >> Hi, >> >> Back in 2002 these were proposed, what happened to them? >> >> http://archives.postgresql.org/pgsql-sql/2002-09/msg00406.php > > > 2002 is a long time ago. > I think to_date is the wrong gadget to use here. You should probably be using > the date input routine and trapping any data exception. e.g.: > > test_date := date_in(textout(some_text)); > > In plpgsql you'd put that inside a begin/exception/end block that traps > SQLSTATE '22000' which is the class covering data exceptions. So it's not possible using pure SQL unless one writes a function? Are the is_ family of functions still desired? Also, where are the to_ conversions done? Thanks, Colin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] bg worker: general purpose requirements
Hi, On 09/18/2010 05:21 AM, Robert Haas wrote: > Wow, 100 processes??! Really? I guess I don't actually know how large > modern proctables are, but on my MacOS X machine, for example, there > are only 75 processes showing up right now in "ps auxww". My Fedora > 12 machine has 97. That's including a PostgreSQL instance in the > first case and an Apache instance in the second case. So 100 workers > seems like a ton to me. Well, Apache pre-forks 5 processes in total (by default, that is; for high volume webservers a higher MinSpareServers setting is certainly not out of the question), while bgworkers currently needs to fork min_spare_background_workers processes per database. AIUI, that's the main problem with the current architecture. >> I haven't measured the actual time it takes, but given the use case of a >> connection pool, I so far thought it's obvious that this process takes too >> long. > > Maybe that would be a worthwhile exercise... On my laptop I'm measuring around 18 bgworker starts per second, i.e. roughly 50 ms per bgworker start. That's certainly just a ball-park figure. One could parallelize the communication channel between the coordinator and postmaster, so as to be able to start multiple bgworkers in parallel, but the initial latency remains. It's certainly quick enough for autovacuum. But equally certainly not acceptable for Postgres-R, where latency is the worst enemy in the first place. For autonomous transactions and parallel querying, I'd also say that I'd rather not like to have such latency. > I think the kicker here is the idea of having a certain number of > extra workers per database. Agreed, but I don't see any better way. Short of a re-connecting feature. > So > if you knew you only had 1 database, keeping around 2 or 3 or 5 or > even 10 workers might seem reasonable, but since you might have 1 > database or 1000 databases, it doesn't. Keeping 2 or 3 or 5 or 10 > workers TOTAL around could be reasonable, but not per-database.
As > Tom said upthread, we don't want to assume that we're the only thing > running on the box and are therefore entitled to take up all the > available memory/disk/process slots/whatever. And even if we DID feel > so entitled, there could be hundreds of databases, and it certainly > doesn't seem practical to keep 1000 workers around "just in case". Agreed. Looks like Postgres-R has a slightly different focus, because if you need multi-master replication, you probably don't have 1000s of databases and/or lots of other services on the same machine. > I don't know whether an idle Apache worker consumes more or less > memory than an idle PostgreSQL worker, but another difference between > the Apache case and the PostgreSQL case is that presumably all those > backend processes have attached shared memory and have ProcArray > slots. We know that code doesn't scale terribly well, especially in > terms of taking snapshots, and that's one reason why high-volume > PostgreSQL installations pretty much require a connection pooler. I > think the sizes of the connection pools I've seen recommended are > considerably smaller than 100, more like 2 * CPUs + spindles, or > something like that. It seems like if you actually used all 100 > workers at the same time performance might be pretty awful. Sounds reasonable, yes. > I was taking a look at the Mammoth Replicator code this week > (parenthetical note: I couldn't figure out where mcp_server was or how > to set it up) and it apparently has a limitation that only one > database in the cluster can be replicated. I'm a little fuzzy on how > Mammoth works, but apparently this problem of scaling to large numbers > of databases is not unique to Postgres-R. Postgres-R is able to replicate multiple databases. Maybe not thousands, but still designed for it. > What is the granularity of replication? Per-database? Per-table? Currently per-cluster (i.e. all your databases at once). > How do you accumulate the change sets? 
Logical changes get collected at the heapam level. They get serialized and streamed (via imessages and a group communication system) to all nodes. Application of change sets is highly parallelized and should be pretty efficient. Commit ordering is decided by the GCS to guarantee consistency across all nodes, conflicts get resolved by aborting the later transaction. > Some kind of bespoke hook, WAL scanning, ...? No hooks, please! ;-) Regards Markus Wanner -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Do we need a ShmList implementation?
On the Serializable Snapshot Isolation thread, Heikki pointed out a collection of objects in an HTAB which didn't really need its key on VirtualTransactionId, but there isn't really any other useful key, either. One of these objects may live and die, seeing use from multiple processes, without ever getting a TransactionId assigned; and it needs to be in a collection in shared memory the whole time. This suggests to me that some sort of list would be better. SHM_QUEUE objects provide the infrastructure for maintaining a shared memory linked list, but they don't do anything about the allocation and release of the space for the objects. So it occurs to me that I'm using an HTAB for this collection because it provides the infrastructure for managing the memory for the collection, rather than because I need hash lookup. :-( It works, but that hardly seems optimal. Have I missed something we already have which could meet that need? If not, how would people feel about a ShmList implementation? A quick first draft for the API (which can almost certainly be improved, so don't be shy) is: ShmList ShmInitList(const char *name, Size entrySize, int initialEntryAlloc, int maxExtensions); Size ShmListEstimateSize(ShmList list); void *CreateShmListEntry(ShmList list); void ReleaseShmListEntry(ShmList list, void *entry); int ShmListSize(ShmList list); void *ShmListFirst(ShmList list); void *ShmListNext(ShmList list, void *entry); I see this as grabbing the initial allocation, filling it with zeros, and then creating a linked list of available entries. Internally the entries would be a SHM_QUEUE structure followed by space for the entrySize passed on init. A "create entry" call would remove an entry from the available list, link it into the collection, and return a pointer to the structure. Releasing an entry would remove it from the collection list, zero it, and link it to the available list. Hopefully the rest is fairly self-evident -- if not, let me know. Thoughts?
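[Editor's note: the mechanics proposed above (grab a fixed allocation, zero it, thread the slots onto an available list, and move slots between the available list and the collection on create/release) can be illustrated with a small user-space sketch. This is only an illustration of the free-list idea under simplified assumptions: it uses plain pointers and malloc in ordinary memory, drops the name/extension parameters, and does no locking, whereas a real version would carve the pool out of shared memory, use SHM_QUEUE links, and hold an LWLock. None of the names below are PostgreSQL API.]

```c
#include <stdlib.h>
#include <string.h>
#include <stddef.h>

/* Doubly-linked circular list node; the caller's payload follows it. */
typedef struct ShmListNode
{
    struct ShmListNode *prev;
    struct ShmListNode *next;
} ShmListNode;

typedef struct ShmList
{
    size_t      entrySize;      /* payload size requested by the caller */
    int         nInUse;
    ShmListNode used;           /* head of the collection list */
    ShmListNode avail;          /* head of the free list */
    char       *pool;           /* the single fixed allocation */
} ShmList;

static void node_init_head(ShmListNode *h) { h->prev = h->next = h; }

static void
node_insert(ShmListNode *head, ShmListNode *n)
{
    n->next = head->next;
    n->prev = head;
    head->next->prev = n;
    head->next = n;
}

static void
node_remove(ShmListNode *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
}

/* Grab the whole region up front and thread every slot onto "avail". */
static ShmList *
ShmInitList(size_t entrySize, int maxEntries)
{
    size_t      slot = sizeof(ShmListNode) + entrySize;
    ShmList    *list = malloc(sizeof(ShmList));

    if (list == NULL)
        return NULL;
    slot = (slot + 15) & ~(size_t) 15;      /* keep each slot aligned */
    list->entrySize = entrySize;
    list->nInUse = 0;
    node_init_head(&list->used);
    node_init_head(&list->avail);
    list->pool = calloc((size_t) maxEntries, slot);     /* zero-filled */
    if (list->pool == NULL)
    {
        free(list);
        return NULL;
    }
    for (int i = 0; i < maxEntries; i++)
        node_insert(&list->avail, (ShmListNode *) (list->pool + i * slot));
    return list;
}

/* Move a slot from the free list to the collection; NULL when exhausted. */
static void *
CreateShmListEntry(ShmList *list)
{
    ShmListNode *n = list->avail.next;

    if (n == &list->avail)
        return NULL;
    node_remove(n);
    node_insert(&list->used, n);
    list->nInUse++;
    return (char *) n + sizeof(ShmListNode);    /* caller sees only payload */
}

/* Unlink from the collection, zero the payload, return to the free list. */
static void
ReleaseShmListEntry(ShmList *list, void *entry)
{
    ShmListNode *n = (ShmListNode *) ((char *) entry - sizeof(ShmListNode));

    node_remove(n);
    memset(entry, 0, list->entrySize);
    node_insert(&list->avail, n);
    list->nInUse--;
}
```

Zeroing on release matches the proposal's "filling it with zeros" invariant, so a freshly created entry is always in a known-blank state.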
-Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] bg worker: general purpose requirements
On 09/18/2010 05:43 AM, Tom Lane wrote: > The part of that that would worry me is open files. PG backends don't > have any compunction about holding open hundreds of files. Apiece. > You can dial that down but it'll cost you performance-wise. Last > I checked, most Unix kernels still had limited-size FD arrays. Thank you very much, that's a helpful hint. I did some quick testing and managed to fork up to around 2000 backends, at which point my (laptop) system got unresponsive. To be honest, that really surprises me. (I had to increase the SHM and SEM kernel limits to be able to start Postgres with that many processes at all. Obviously, Linux doesn't seem to like that... on a second test I got a kernel panic) > And as you say, ProcArray manipulations aren't going to be terribly > happy about large numbers of idle backends, either. Very understandable, yes. Regards Markus Wanner -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] libpq changes for synchronous replication
On 17/09/10 12:22, Fujii Masao wrote: On Fri, Sep 17, 2010 at 5:09 PM, Heikki Linnakangas wrote: That said, there's a few small things that can be progressed regardless of the details of synchronous replication. There's the changes to trigger failover with a signal, and it seems that we'll need some libpq changes to allow acknowledgments to be sent back to the master regardless of the rest of the design. We can discuss those in separate threads in parallel. Agreed. The attached patch introduces a new function which is used to send an ACK back from walreceiver. The function sends a message to the XLOG stream by calling PQputCopyData. Also I allowed PQputCopyData to be called even during COPY OUT. Oh, that's simple. It doesn't feel right to always accept PQputCopyData in COPY OUT mode, though. IMHO there should be a new COPY IN+OUT mode. It should be pretty safe to add a CopyInOutResponse message to the protocol without a protocol version bump. Thoughts on that? -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] What happened to the is_ family of functions proposal?
On 09/20/2010 10:29 AM, Colin 't Hart wrote: Hi, Back in 2002 these were proposed, what happened to them? http://archives.postgresql.org/pgsql-sql/2002-09/msg00406.php 2002 is a long time ago. Also I note: co...@ruby:~/workspace/eyedb$ psql psql (8.4.4) Type "help" for help. colin=> select to_date('731332', 'YYMMDD'); to_date 1974-02-01 (1 row) colin=> The fact that this wraps would seem to me to make the implementation of is_date() difficult. I think to_date is the wrong gadget to use here. You should probably be using the date input routine and trapping any data exception. e.g.: test_date := date_in(textout(some_text)); In plpgsql you'd put that inside a begin/exception/end block that traps SQLSTATE '22000' which is the class covering data exceptions. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
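[Editor's note: Andrew's suggestion does require defining a function, which answers Colin's follow-up question about pure SQL. A sketch of a hypothetical is_date() along the lines he describes, using a cast to invoke the date input routine and the data_exception condition name (the documented condition class covering SQLSTATE class 22); the function name and details are illustrative, not a built-in:]

```sql
-- Hypothetical helper, not a built-in function: true only when the
-- string is accepted by the date input routine under the current DateStyle.
CREATE OR REPLACE FUNCTION is_date(s text) RETURNS boolean AS $$
BEGIN
    PERFORM s::date;            -- invokes the date input routine
    RETURN true;
EXCEPTION
    WHEN data_exception THEN    -- condition class for SQLSTATE 22xxx
        RETURN false;
END;
$$ LANGUAGE plpgsql;
```

With that in place, a query like WHERE is_date(col) can filter character strings for valid dates, though it accepts exactly what date_in accepts under the session's DateStyle, including the wrapping behavior Colin observed.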
[HACKERS] What happened to the is_ family of functions proposal?
Hi, Back in 2002 these were proposed, what happened to them? http://archives.postgresql.org/pgsql-sql/2002-09/msg00406.php Also I note: co...@ruby:~/workspace/eyedb$ psql psql (8.4.4) Type "help" for help. colin=> select to_date('731332', 'YYMMDD'); to_date 1974-02-01 (1 row) colin=> The fact that this wraps would seem to me to make the implementation of is_date() difficult. I'm trying to query character strings for valid dates but can't see how to do this quickly... but for that discussion I will move to pgsql-general :-) Cheers, Colin
Re: [HACKERS] Serializable Snapshot Isolation
I wrote: > Heikki Linnakangas wrote: > >> ISTM you never search the SerializableXactHash table using a hash >> key, except the one call in CheckForSerializableConflictOut, but >> there you already have a pointer to the SERIALIZABLEXACT struct. >> You only re-find it to make sure it hasn't gone away while you >> trade the shared lock for an exclusive one. If we find another >> way to ensure that, ISTM we don't need SerializableXactHash at >> all. My first thought was to forget about VirtualTransactionId >> and use TransactionId directly as the hash key for >> SERIALIZABLEXACT. The problem is that a transaction doesn't have >> a transaction ID when RegisterSerializableTransaction is called. >> We could leave the TransactionId blank and only add the >> SERIALIZABLEXACT struct to the hash table when an XID is >> assigned, but there's no provision to insert an existing struct >> into a hash table in the current hash table API. >> >> So, I'm not sure of the details yet, but it seems like it could >> be made simpler somehow.. > > After tossing it around in my head for a bit, the only thing that > I see (so far) which might work is to maintain a *list* of > SERIALIZABLEXACT objects in memory rather than using a hash > table. The recheck after releasing the shared lock and acquiring > an exclusive lock would then go through SerializableXidHash. I > think that can work, although I'm not 100% sure that it's an > improvement. I'll look it over in more detail. I'd be happy to > hear your thoughts on this or any other suggestions. I haven't come up with any better ideas. Pondering this one, it seems to me that a list would be better than a hash table if we had a list which would automatically allocate and link new entries, and would maintain a list of available entries for (re)use.
I wouldn't want to sprinkle such an implementation in with predicate locking and SSI code, but if there is a feeling that such a thing would be worth having in shmqueue.c or some new file which uses the SHM_QUEUE structure to provide an API for such functionality, I'd be willing to write that and use it in the SSI code. Without something like that, I have so far been unable to envision an improvement along the lines Heikki is suggesting here. Thoughts? -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Configuring Text Search parser?
Hi. I'm trying to migrate an application off an existing Full Text Search engine and onto PostgreSQL .. one of my main (remaining) headaches is the fact that PostgreSQL treats _ as a separation character whereas the existing behaviour is to "not split". That means: testdb=# select ts_debug('database_tag_number_999'); ts_debug -- (asciiword,"Word, all ASCII",database,{english_stem},english_stem,{databas}) (blank,"Space symbols",_,{},,) (asciiword,"Word, all ASCII",tag,{english_stem},english_stem,{tag}) (blank,"Space symbols",_,{},,) (asciiword,"Word, all ASCII",number,{english_stem},english_stem,{number}) (blank,"Space symbols",_,{},,) (uint,"Unsigned integer",999,{simple},simple,{999}) (7 rows) The incoming data, by design, contains a set of tags which include _ and are expected to be one "lexeme". I've tried patching my way out of this using this patch. $ diff -w -C 5 src/backend/tsearch/wparser_def.c.orig src/backend/tsearch/wparser_def.c *** src/backend/tsearch/wparser_def.c.orig 2010-09-20 15:58:37.06460 +0200 --- src/backend/tsearch/wparser_def.c 2010-09-20 15:58:41.193335577 +0200 *** *** 967,986 --- 967,988 static const TParserStateActionItem actionTPS_InNumWord[] = { {p_isEOF, 0, A_BINGO, TPS_Base, NUMWORD, NULL}, {p_isalnum, 0, A_NEXT, TPS_InNumWord, 0, NULL}, {p_isspecial, 0, A_NEXT, TPS_InNumWord, 0, NULL}, + {p_iseqC, '_', A_NEXT, TPS_InNumWord, 0, NULL}, {p_iseqC, '@', A_PUSH, TPS_InEmail, 0, NULL}, {p_iseqC, '/', A_PUSH, TPS_InFileFirst, 0, NULL}, {p_iseqC, '.', A_PUSH, TPS_InFileNext, 0, NULL}, {p_iseqC, '-', A_PUSH, TPS_InHyphenNumWordFirst, 0, NULL}, {NULL, 0, A_BINGO, TPS_Base, NUMWORD, NULL} }; static const TParserStateActionItem actionTPS_InAsciiWord[] = { {p_isEOF, 0, A_BINGO, TPS_Base, ASCIIWORD, NULL}, {p_isasclet, 0, A_NEXT, TPS_Null, 0, NULL}, + {p_iseqC, '_', A_NEXT, TPS_Null, 0, NULL}, {p_iseqC, '.', A_PUSH, TPS_InHostFirstDomain, 0, NULL}, {p_iseqC, '.', A_PUSH, TPS_InFileNext, 0, NULL}, {p_iseqC, '-', A_PUSH, TPS_InHostFirstAN,
0, NULL}, {p_iseqC, '-', A_PUSH, TPS_InHyphenAsciiWordFirst, 0, NULL}, {p_iseqC, '@', A_PUSH, TPS_InEmail, 0, NULL}, *** *** 995,1004 --- 997,1007 static const TParserStateActionItem actionTPS_InWord[] = { {p_isEOF, 0, A_BINGO, TPS_Base, WORD_T, NULL}, {p_isalpha, 0, A_NEXT, TPS_Null, 0, NULL}, {p_isspecial, 0, A_NEXT, TPS_Null, 0, NULL}, + {p_iseqC, '_', A_NEXT, TPS_Null, 0, NULL}, {p_isdigit, 0, A_NEXT, TPS_InNumWord, 0, NULL}, {p_iseqC, '-', A_PUSH, TPS_InHyphenWordFirst, 0, NULL}, {NULL, 0, A_BINGO, TPS_Base, WORD_T, NULL} }; This will obviously break other peoples applications, so my questions would be: If this should be made configurable.. how should it be done? As a sidenote... Xapian doesn't split on _ .. Lucene does. Thanks. -- Jesper -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Configuring synchronous replication
On Mon, Sep 20, 2010 at 8:50 AM, Simon Riggs wrote: > Please respond to the main point: Following some thought and analysis, > AFAICS there is no sensible use case that requires standby registration. I disagree. You keep analyzing away the cases that require standby registration, but I don't believe that they're not real. Aidan Van Dyk's case upthread of wanting to make sure that the standby is up and replicating synchronously before the master starts processing transactions seems perfectly legitimate to me. Sure, it's paranoid, but so what? We're all about paranoia, at least as far as data loss is concerned. So the "wait forever" case is, in my opinion, sufficient to demonstrate that we need it, but it's not even my primary reason for wanting to have it. The most important reason why I think we should have standby registration is for simplicity of configuration. Yes, it adds another configuration file, but that configuration file contains ALL of the information about which standbys are synchronous. Without standby registration, this information will inevitably be split between the master config and the various slave configs and you'll have to look at all the configurations to be certain you understand how it's going to end up working. As a particular manifestation of this, and as previously argued and +1'd upthread, the ability to change the set of standbys to which the master is replicating synchronously without changing the configuration on the master or any of the existing slaves seems dangerous. Another reason why I think we should have standby registration is to eventually allow the "streaming WAL backwards" configuration which has previously been discussed. IOW, you could stream the WAL to the slave in advance of fsync-ing it on the master. After a power failure, the machines in the cluster can talk to each other and figure out which one has the furthest-advanced WAL pointer and stream from that machine to all the others.
This is an appealing configuration for people using sync rep because it would allow the fsyncs to be done in parallel rather than sequentially as is currently necessary - but if you're using it, you're certainly not going to want the master to enter normal running without waiting to hear from the slave. Just to be clear, that is a list of three independent reasons any one of which I think is sufficient for wanting standby registration. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Configuring synchronous replication
On 20/09/10 15:50, Simon Riggs wrote: On Mon, 2010-09-20 at 15:16 +0300, Heikki Linnakangas wrote: On 20/09/10 12:17, Simon Riggs wrote: err... what is the difference between a timeout and stonith? STONITH ("Shoot The Other Node In The Head") means that the other node is somehow disabled so that it won't unexpectedly come back alive. A timeout means that the slave hasn't been seen for a while, but it might reconnect just after the timeout has expired. You've edited my reply to change the meaning of what was a rhetorical question, as well as completely ignoring the main point of my reply. Please respond to the main point: Following some thought and analysis, AFAICS there is no sensible use case that requires standby registration. Ok, I had completely missed your point then. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Configuring synchronous replication
On Mon, 2010-09-20 at 15:16 +0300, Heikki Linnakangas wrote: > On 20/09/10 12:17, Simon Riggs wrote: > > err... what is the difference between a timeout and stonith? > > STONITH ("Shoot The Other Node In The Head") means that the other node > is somehow disabled so that it won't unexpectedly come back alive. A > timeout means that the slave hasn't been seen for a while, but it might > reconnect just after the timeout has expired. You've edited my reply to change the meaning of what was a rhetorical question, as well as completely ignoring the main point of my reply. Please respond to the main point: Following some thought and analysis, AFAICS there is no sensible use case that requires standby registration. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Training and Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Configuring synchronous replication
On 20/09/10 12:17, Simon Riggs wrote: err... what is the difference between a timeout and stonith? STONITH ("Shoot The Other Node In The Head") means that the other node is somehow disabled so that it won't unexpectedly come back alive. A timeout means that the slave hasn't been seen for a while, but it might reconnect just after the timeout has expired. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
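To make the distinction above concrete, here is a minimal sketch (all class and method names are invented for illustration, not taken from any real codebase): a timeout is weak evidence of failure because the standby may reconnect right after it expires, while STONITH is strong evidence because the node has been fenced and cannot come back unexpectedly.

```python
import time

class StandbyState:
    """Tracks one standby; contrasts timeout expiry with STONITH fencing."""

    def __init__(self, timeout_s):
        self.timeout_s = timeout_s
        self.last_seen = time.monotonic()
        self.fenced = False           # set only after a successful STONITH

    def heartbeat(self):
        self.last_seen = time.monotonic()

    def timed_out(self):
        # Weak evidence: the standby might reconnect just after this
        # starts returning True.
        return time.monotonic() - self.last_seen > self.timeout_s

    def stonith(self):
        # Strong evidence: the node is fenced (e.g. powered off) and
        # is guaranteed not to come back on its own.
        self.fenced = True

    def safely_dead(self):
        return self.fenced            # a timeout alone never qualifies
```

The key point mirrored from the thread: `timed_out()` can flip back to alive, `safely_dead()` cannot.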
Re: [HACKERS] pg_comments
On Mon, Sep 20, 2010 at 1:07 AM, Tom Lane wrote: > Robert Haas writes: >> In view of the foregoing problems, I'd like to propose adding a new >> system view, tentatively called pg_comments, which lists all of the >> comments for everything in the system in such a way that it's >> reasonably possible to do further filtering of the output in ways >> that you might care about; and which also gives objects the names and >> types in a format that matches what the COMMENT command will accept as >> input. Patch attached. > > Unless you propose to break psql's hard-won backwards compatibility, > this isn't going to accomplish anything towards making describe.c > simpler or shorter. Also, it seems to me that what you've mostly done > is to move complexity from describe.c (where the query can be fixed > easily if it's found to be broken) to system_views.sql (where it cannot > be changed without an initdb). Those are legitimate gripes, but... > How about improving the query in-place in describe.c instead? ...I still don't care much for this option. It doesn't do anything to ease the difficulty of ad-hoc queries, which I think is important (and seems likely to be even more important for security labels - because people who use that feature at all are going to label the heck out of everything, whereas comments are never strictly necessary), and it isn't useful for clients other than psql. Most of this code hasn't been touched since 2002, despite numerous, relevant changes since then. You could take this as support for your position that we need the ability to fix future bugs without initdb, but my reading of it is that that code is just too awful to be easily maintained and so no one has bothered. (It also supports my previous contention that we need a way to make minor system catalog updates without forcing initdb, but that's a problem for another day.)
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
Re: [HACKERS] Report: removing the inconsistencies in our CVS->git conversion
On Sun, Sep 19, 2010 at 18:52, Tom Lane wrote: > Andrew Dunstan writes: >> On 09/19/2010 12:25 PM, Tom Lane wrote: >>> # We don't want to change line numbers, so we simply reduce the keyword >>> # string to the file pathname part. For example, >>> # $PostgreSQL: pgsql/src/port/unsetenv.c,v 1.12 2010/09/07 14:10:30 momjian >>> Exp $ >>> # becomes >>> # $PostgreSQL: pgsql/src/port/unsetenv.c,v 1.12 2010/09/07 14:10:30 momjian >>> Exp $ > >> These before and after lines look identical to me. > > Sigh ... obviously didn't finish editing the comment :-( > Of course the last line should read > > # src/port/unsetenv.c I've applied those to my repo, and am now re-running a final conversion before we do the "live one". -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [HACKERS] Configuring synchronous replication
On Mon, 2010-09-20 at 09:27 +0300, Heikki Linnakangas wrote: > On 18/09/10 22:59, Robert Haas wrote: > > On Sat, Sep 18, 2010 at 4:50 AM, Simon Riggs wrote: > >> Waiting might sound attractive. In practice, waiting will make all of > >> your connections lock up and it will look to users as if their master > >> has stopped working as well. (It has!). I can't imagine why anyone would > >> ever want an option to select that; it's the opposite of high > >> availability. Just sounds like a serious footgun. > > > > Nevertheless, it seems that some people do want exactly that behavior, > > no matter how crazy it may seem to you. > > Yeah, I agree with both of you. I have a hard time imagining a situation > where you would actually want that. It's not high availability, it's > high durability. When a transaction is acknowledged as committed, you > know it's never ever going to disappear even if a meteor strikes the > current master server within the next 10 milliseconds. In practice, > people want high availability instead. > > That said, the timeout option also feels a bit wishy-washy to me. With a > timeout, acknowledgment of a commit means "your transaction is safely > committed in the master and slave. Or not, if there was some glitch with > the slave". That doesn't seem like a very useful guarantee; if you're > happy with that why not just use async replication? > > However, the "wait forever" behavior becomes useful if you have a > monitoring application outside the DB that decides when enough is enough > and tells the DB that the slave can be considered dead. So "wait > forever" actually means "wait until I tell you that you can give up". > The monitoring application can STONITH to ensure that the slave stays > down, before letting the master proceed with the commit. err... what is the difference between a timeout and stonith? None. We still proceed without the slave in both cases after the decision point.
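The behaviours under discussion (automatic timeout, wait forever until the standby acks, wait forever until an operator or external clusterware releases the waiters) can be sketched roughly as follows. This is a hypothetical illustration; all names are invented and it simulates the commit-wait logic with a condition variable rather than describing any actual patch.

```python
import threading

class SyncRepWaiter:
    """Simulated commit-wait: timeout_s > 0 gives up automatically;
    timeout_s == 0 means 'wait forever' until either the standby acks
    or someone explicitly releases the waiting sessions."""

    def __init__(self):
        self.cond = threading.Condition()
        self.acked = False
        self.released = False

    def standby_ack(self):
        # Called when the standby confirms the commit record.
        with self.cond:
            self.acked = True
            self.cond.notify_all()

    def release_waiters(self):
        # The "user accessible function" / external-clusterware hook
        # that stops sessions from waiting for a dead standby.
        with self.cond:
            self.released = True
            self.cond.notify_all()

    def wait_for_ack(self, timeout_s):
        with self.cond:
            deadline = None if timeout_s == 0 else timeout_s
            # timeout=None blocks indefinitely: the "wait forever" option.
            self.cond.wait_for(lambda: self.acked or self.released,
                               timeout=deadline)
            return self.acked   # False => proceeded without confirmation
```

Note how the timeout case returns `False`, i.e. the "committed... or not" guarantee Heikki objects to, while the wait-forever case only returns once an ack or an explicit release arrives.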
In all cases, we would clearly have a user-accessible function to stop particular sessions, or all sessions, from waiting for a standby to return. You would have 3 choices: * set automatic timeout * set wait forever and then wait for manual resolution * set wait forever and then trust to external clusterware Many people have asked for timeouts and I agree it's probably the easiest thing to do if you just have 1 standby. > With that in mind, we have to make sure that a transaction that's > waiting for acknowledgment of the commit from a slave is woken up if the > configuration changes. There's a misunderstanding here of what I've said and it's a subtle one. My patch supports a timeout of 0, i.e. wait forever. Which means I agree that functionality is desired and should be included. This operates by saying that if a currently-connected standby goes down we will wait until the timeout. So I agree all 3 choices should be available to users. Discussion has been about what happens to ought-to-have-been-connected standbys. Heikki had argued we need standby registration because if a server *ought* to have been there, yet isn't currently there when we wait for sync rep, we would still wait forever for it to return. To do this you require standby registration. But there is a hidden issue there: If you care about high availability AND sync rep you have two standbys. If one goes down, the other is still there. In general, if you want high availability on N servers then you have N+1 standbys. If one goes down, the other standbys provide the required level of durability and we do not wait. So the only case where standby registration is required is where you deliberately choose to *not* have N+1 redundancy and yet still require all N standbys to acknowledge. That is a suicidal config and nobody would sanely choose that. It's not a large or useful use case for standby reg. (But it does raise the question again of whether we need quorum commit). 
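The N+1 argument above reduces to a one-line quorum check. The sketch below is purely illustrative (the function name and parameters are invented, not part of any proposed GUC interface): with a quorum of N acks and N+1 deployed standbys, losing a single standby still leaves a quorum, so commits never block on the missing server; only the "suicidal config", where the quorum equals the number of standbys, forces a wait.

```python
def must_wait_for_missing_standby(connected_standbys, quorum_commit):
    """Return True only when fewer standbys are available to ack
    than the quorum demands - i.e. the commit would have to block."""
    return connected_standbys < quorum_commit

# N = 2 acks required, N+1 = 3 standbys deployed, one just went down:
# two standbys remain, quorum is still met, no waiting.
#
# "Suicidal config": 2 acks required, exactly 2 standbys deployed,
# one goes down: the commit must block until it returns.
```

This is the sense in which standby registration is only needed for configurations that trade away their own availability.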
My take is that if the above use case occurs it is because one standby has just gone down and the system is, for a hopefully short period, in a degraded state, and the service responds to that. So in my proposal, if a standby is not there *now* we don't wait for it. Which cuts out a huge bag of code, specification and suchlike that isn't required to support sane use cases. More stuff to get wrong and regret in later releases. The KISS principle, just like we apply in all other cases. If we did have standby registration, then I would implement it in a table, not in an external config file. That way when we performed a failover the data would be accessible on the new master. But I don't suggest we have CREATE/ALTER STANDBY syntax. We already have CREATE/ALTER SERVER if we wanted to do it in SQL. If we did that, ISTM we should choose functions. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Training and Services
Re: [HACKERS] pgxs docdir question
Tom Lane writes: > Devrim GÜNDÜZ writes: >> Where does PGXS makefile get /usr/share/doc/pgsql/contrib directory >> from? > >> While building 3rd party RPMs using PGXS, even if I specify docdir in >> Makefile, README.* files are installed to this directory, which breaks >> parallel installation path as of 9.0+ > > Maybe you need to fool with MODULEDIR. See > http://archives.postgresql.org/pgsql-committers/2010-01/msg00025.php Well, it's been working fine in Debian without that for a long time now. I've taken the liberty of CCing Martin Pitt, because I don't have the time to look at exactly how things are done in his Debian packaging there. http://bazaar.launchpad.net/%7Epitti/postgresql/common/files https://code.launchpad.net/postgresql Regards, -- dim
Re: [HACKERS] Configuring synchronous replication
Hi, On 09/17/2010 01:56 PM, Fujii Masao wrote: And standby registration is required when we support "wait forever when synchronous standby isn't connected at the moment" option that Heikki explained upthread. That requirement can be reduced to say that the master only needs to know how many synchronous standbys *should* be connected. IIUC that's pretty much exactly the quorum_commit GUC that Simon proposed, because it doesn't make sense to have more synchronous standbys connected than quorum_commit (as Simon pointed out downthread). I'm unsure about what's better, the full list (giving a good overview, but more to configure) or the single sum GUC (being very flexible and closer to how things work internally). But that seems to be a UI question exclusively. Regarding the "wait forever" option: I don't think continuing is a viable alternative, as it silently ignores the requested level of persistence. The only alternative I can see is to abort with an error. As far as comparison is allowed, that's what Postgres-R currently does if there's no majority of nodes. It allows emitting an error message and helpful hints, as opposed to letting the admin figure out what and where it's hanging. Not throwing false errors has the same requirements as "waiting forever", so that's an orthogonal issue, IMO. Regards Markus Wanner
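The three possible reactions weighed above when the synchronous quorum is not met can be lined up in a small hypothetical sketch (policy names and the exception are invented for illustration; this describes no actual PostgreSQL or Postgres-R interface): silently continuing weakens the requested persistence, aborting surfaces the problem to the client, and waiting defers it to an admin or clusterware.

```python
class QuorumNotMet(Exception):
    """Raised when a commit cannot reach its synchronous quorum."""

def on_commit_without_quorum(policy):
    if policy == "continue":
        # Silently delivers weaker persistence than the client asked for.
        return "committed-without-guarantee"
    if policy == "error":
        # The Postgres-R-style reaction: fail loudly with a helpful hint.
        raise QuorumNotMet("synchronous standby quorum not reached; "
                           "check standby connectivity")
    if policy == "wait":
        # Block until quorum returns or an operator intervenes.
        return "blocked-until-quorum-or-admin"
    raise ValueError("unknown policy: %r" % (policy,))
```

The "error" branch is what makes the failure visible instead of letting the admin discover a silently hung or silently weakened commit.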
Re: [HACKERS] Postgres Licensing
You seem to be working for EnterpriseDB, which is a company specializing in Postgres. So how does EnterpriseDB sell the Advanced Server? By modifying it, I guess! So that is something similar to what I want to do. Getting a few dollars for some hard work is not bad for me. Plus I love to find new things, so it would be fun as well. I would surely include the PostgreSQL licence in the product (if I succeed) and sell it to a few people who are looking for some specific features and are pissed off with the great MySQL. Thanks to both Dave and Heikki :) -Vaibhav (*_*) On Sun, Sep 19, 2010 at 11:58 PM, Heikki Linnakangas < heikki.linnakan...@enterprisedb.com> wrote: > On 20/09/10 09:48, Vaibhav Kaushal wrote: > >> 1. PostgreSQL can be distributed freely according to the license terms. >> Can >> it be sold (for a price) without changing anything in the source? >> > > Yes. > > You will have a hard time finding anyone to buy it, though, because you can > download it for free from the PostgreSQL website. > > > 2. Does the license restrict me from adding my closed source additions to >> the project and then sell the product? I want to add in a few files here >> and >> there which would be closed source in nature, while all the changes made >> to >> the original files will be open, and then sell the modified database with >> a >> dual license. Is this possible? >> > > In general, yes. I don't know what exactly you mean by the dual license, > but you are free to mix proprietary code with the PostgreSQL sources, and > sell or distribute for free the combined product with or without sources. > The only requirement of the PostgreSQL license is that all copies must > include the copyright notices and the license text. > > (Disclaimer: I am not a lawyer) > > -- > Heikki Linnakangas > EnterpriseDB http://www.enterprisedb.com >
Re: [HACKERS] Postgres Licensing
On Mon, Sep 20, 2010 at 7:48 AM, Vaibhav Kaushal wrote: > Maybe this is the wrong place to ask the question. Still, answer me if > someone can or please redirect me to some place where it can be answered. My > questions are: > > 1. PostgreSQL can be distributed freely according to the license terms. Can > it be sold (for a price) without changing anything in the source? Yes. > 2. Does the license restrict me from adding my closed source additions to > the project and then sell the product? I want to add in a few files here and > there which would be closed source in nature, while all the changes made to > the original files will be open, and then sell the modified database with a > dual license. Is this possible? You should check with your own counsel of course (I am not a lawyer), but essentially the licence allows you to produce derivative closed-source products and release them under different licences as long as the terms of the original licence are met (which basically means you can't sue UC Berkeley, or remove the original licence/copyright notices). > Maybe you guys are hard core OSS enthusiasts and may flame me. I request > not to and please consider my question. We like people building cool stuff with our code - and like the freedom to do so that our licence allows. -- Dave Page Blog: http://pgsnake.blogspot.com Twitter: @pgsnake EnterpriseDB UK: http://www.enterprisedb.com The Enterprise Postgres Company