Re: [HACKERS] arrays as pl/perl input arguments [PATCH]
On Thu, Jan 13, 2011 at 12:06:33AM -0700, Alex Hunsaker wrote:
> I had supposed that it would be possible to do the string conversion lazily, ie, only if the string value was actually demanded.

Yep, in fact if we wanted we could even die (or throw an exception, in other-language speak :) ) when the string value is demanded. I played with this a little and it is fairly easy to make a variable such that $a is the string representation and $a[0] the first value of the array. The problem is that you can't pass such a variable into a subroutine. I was thinking, however, that if the parameters of the function have names you can use, then you can make it work. $_[0] would still go the old way, but the named parameters could be the array.

== cut ==
#!/usr/bin/perl -w
use strict;
no strict 'vars';

package MyClass;

sub TIESCALAR {
    my $class = shift;
    my $self  = shift;
    return bless $self, $class;
}

sub FETCH {
    my $self = shift;
    return join(',', @$self);
}

my @a = (1, 2);
tie $a, MyClass, \@a;
print "\$a='$a'\n";
print "\$a[0]='$a[0]'\n";
== cut ==

-- Martijn van Oosterhout klep...@svana.org http://svana.org/kleptog/
Patriotism is when love of your own people comes first; nationalism, when hate for people other than your own comes first. - Charles de Gaulle
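For readers less familiar with Perl's tie mechanism, the lazy-stringification idea above can be sketched in Python as well (illustrative only, not part of any proposed patch): the value behaves as an array, and its string form is computed only when it is actually demanded - which is also the point where one could raise an error instead of converting.

```python
class LazyArray:
    """Array-like value whose string form is built only on demand."""

    def __init__(self, values):
        self.values = values

    def __getitem__(self, i):
        # Element access never triggers stringification.
        return self.values[i]

    def __str__(self):
        # String conversion happens lazily, only when demanded;
        # this is also where one could raise instead of converting.
        return ",".join(str(v) for v in self.values)

a = LazyArray([1, 2])
assert a[0] == 1            # array access, no string conversion
assert str(a) == "1,2"      # string form computed here, on demand
```

Unlike the Perl tie approach, a Python object like this survives being passed into a function unchanged, which is exactly the property the tied scalar lacks.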
[HACKERS] Bug in walreceiver
Hi,

When the master shuts down or crashes, there seems to be a case where walreceiver exits without flushing WAL which has already been written. This might lead the startup process to replay un-flushed WAL and break the Write-Ahead Logging rule.

walreceiver.c:

    /* Wait a while for data to arrive */
    if (walrcv_receive(NAPTIME_PER_CYCLE, &type, &buf, &len))
    {
        /* Accept the received data, and process it */
        XLogWalRcvProcessMsg(type, buf, len);

        /* Receive any more data we can without sleeping */
        while (walrcv_receive(0, &type, &buf, &len))
            XLogWalRcvProcessMsg(type, buf, len);

        /*
         * If we've written some records, flush them to disk and let the
         * startup process know about them.
         */
        XLogWalRcvFlush();
    }

The problematic case happens when the latter walrcv_receive emits ERROR. In this case, the WAL received by the former walrcv_receive is not guaranteed to have been flushed yet. The attached patch ensures that all WAL received is flushed to disk before walreceiver exits. This patch should be backported to 9.0, I think.

Regards,

-- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center

flush_before_walreceiver_exit_v1.patch Description: Binary data

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
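The shape of the proposed fix is a flush on every exit path, including the error path. A sketch of that control flow in Python (illustrative only; the real code is C in walreceiver.c, and the helper names here are made up):

```python
def receive_loop(receive, process, flush):
    """Mirror of the proposed fix: whatever has been written must be
    flushed before the receiver exits, even if receive() errors out."""
    try:
        while True:
            msg = receive()
            if msg is None:          # no more data without sleeping
                break
            process(msg)             # write WAL to disk (not yet flushed)
    finally:
        flush()                      # flush written-but-unflushed WAL on any exit

# Toy demonstration: the second receive fails, like the latter
# walrcv_receive emitting ERROR, yet the first record still gets flushed.
written, flushed = [], []

def failing_receive():
    if not written:
        return b"rec1"
    raise IOError("connection to master lost")

try:
    receive_loop(failing_receive, written.append,
                 lambda: flushed.extend(written))
except IOError:
    pass

assert flushed == [b"rec1"]          # rec1 was flushed despite the error
```

Without the try/finally (the pre-patch structure), the flush at the bottom of the loop body is simply never reached when the error occurs, which is the window the patch closes.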
Re: [HACKERS] Bug in walreceiver
On 13.01.2011 10:28, Fujii Masao wrote:
> When the master shuts down or crashes, there seems to be a case where walreceiver exits without flushing WAL which has already been written. This might lead the startup process to replay un-flushed WAL and break the Write-Ahead Logging rule.

Hmm, that can happen at a crash even with no replication involved. If you kill -9 postmaster, and some WAL had been written but not fsync'd, on crash recovery we will happily recover the unsynced WAL. We could prevent that by fsyncing all WAL before applying it - presumably fsyncing a file that has already been flushed is quick. But is it worth the trouble?

> walreceiver.c:
>
>     /* Wait a while for data to arrive */
>     if (walrcv_receive(NAPTIME_PER_CYCLE, &type, &buf, &len))
>     {
>         /* Accept the received data, and process it */
>         XLogWalRcvProcessMsg(type, buf, len);
>
>         /* Receive any more data we can without sleeping */
>         while (walrcv_receive(0, &type, &buf, &len))
>             XLogWalRcvProcessMsg(type, buf, len);
>
>         /*
>          * If we've written some records, flush them to disk and let the
>          * startup process know about them.
>          */
>         XLogWalRcvFlush();
>     }
>
> The problematic case happens when the latter walrcv_receive emits ERROR. In this case, the WAL received by the former walrcv_receive is not guaranteed to have been flushed yet. The attached patch ensures that all WAL received is flushed to disk before walreceiver exits. This patch should be backported to 9.0, I think.

Yeah, we probably should do that, even though it doesn't completely close the window that unsynced WAL is replayed.

-- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
[HACKERS] Bug in pg_dump
The example from Tom Lane below results in a database which is not possible to correctly dump using pg_dump. The view v1 strangely becomes a table in the dump output?! It's probably a quite useless database to dump in the first place, but that is no excuse to generate an invalid dump; it would be better to throw an error, complain that the database cannot be dumped consistently, and refuse to dump, or something like that.

regression=# \d
        List of relations
 Schema | Name | Type  |  Owner
--------+------+-------+----------
 public | tt   | table | postgres
 public | v1   | view  | postgres
 public | v2   | view  | postgres
(3 rows)

ubuntu@ubuntu:/crypt/postgresql-9.1alpha3/src/bin/pg_dump$ ./pg_dump regression | grep -v -E '^--' | grep -E '^.+$' | grep -v SET
CREATE OR REPLACE PROCEDURAL LANGUAGE plpgsql;
ALTER PROCEDURAL LANGUAGE plpgsql OWNER TO ubuntu;
CREATE TABLE tt (
    f1 integer,
    f2 integer
);
ALTER TABLE public.tt OWNER TO postgres;
CREATE TABLE v1 (
    f1 integer,
    f2 integer
);
ALTER TABLE public.v1 OWNER TO postgres;
CREATE VIEW v2 AS SELECT v1.f1, v1.f2 FROM v1;
ALTER TABLE public.v2 OWNER TO postgres;
COPY tt (f1, f2) FROM stdin;
\.
CREATE RULE _RETURN AS ON SELECT TO v1 DO INSTEAD SELECT v2.f1, v2.f2 FROM v2;
REVOKE ALL ON SCHEMA public FROM PUBLIC;
REVOKE ALL ON SCHEMA public FROM ubuntu;
GRANT ALL ON SCHEMA public TO ubuntu;
GRANT ALL ON SCHEMA public TO PUBLIC;

2011/1/12 Tom Lane t...@sss.pgh.pa.us:
> regression=# create table tt(f1 int, f2 int);
> CREATE TABLE
> regression=# create view v1 as select * from tt;
> CREATE VIEW
> regression=# create view v2 as select * from v1;
> CREATE VIEW
> regression=# create or replace view v1 as select * from v2;
> CREATE VIEW
> regression=# drop view v1;
> ERROR: cannot drop view v1 because other objects depend on it
> DETAIL: view v2 depends on view v1
> HINT: Use DROP ... CASCADE to drop the dependent objects too.
> regression=# drop view v2;
> ERROR: cannot drop view v2 because other objects depend on it
> DETAIL: view v1 depends on view v2
> HINT: Use DROP ... CASCADE to drop the dependent objects too.
>
> This isn't particularly *useful*, maybe, but it's hardly impossible. And if we analyzed function dependencies in any detail, circular dependencies among functions would be possible (and useful).
>
> regards, tom lane

-- Best regards, Joel Jacobson Glue Finance
Re: [HACKERS] SSI patch version 8
On 13.01.2011 02:01, Kevin Grittner wrote:
> Anssi Kääriäinen anssi.kaariai...@thl.fi wrote:
>> So, count(*) queries are more than twice as slow compared to the old serializable transaction isolation level.
>
> I got this down from more than twice the run time to running 33% longer through remembering the last relation for which a search for a predicate lock held by the current transaction found a match at the coarsest (relation) level. It's a bit of a hack and 33% isn't very impressive, even for a worst case (and this is one type of worst case) -- especially given how often people use SELECT count(*) FROM table_x as a performance test. :-( I can see a way to improve on this if there's a low-cost way to determine from within the heapam.c:heapgettup_pagemode function whether it's returning tuples for a table scan. It seems likely that this is somehow contained in the HeapScanDesc structure, but I'm not seeing it. Can anyone point me in the right direction, or tell me that this avenue is a dead end?

Pardon my ignorance, but where exactly is the extra overhead coming from? Searching for a predicate lock?

-- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: pg_ctl failover Re: [HACKERS] Latches, signals, and waiting
On 13.01.2011 04:29, Itagaki Takahiro wrote: On Thu, Jan 13, 2011 at 00:14, Fujii Masao masao.fu...@gmail.com wrote: pg_ctl failover ? At the moment, the location of the trigger file is configurable, but if we accept a constant location like $PGDATA/failover pg_ctl could do the whole thing, create the file and send signal. pg_ctl on Windows already knows how to send the signal via the named pipe signal emulation. The attached patch implements the above-mentioned pg_ctl failover. I have three comments: - Will we call it failover? We will use the command also in switchover operations. pg_ctl promote might be more neutral, but users might find it hard to connect promote with the replication feature. I agree that failover or even switchover is too specific. You might want to promote a server even if you keep the old master still running, if you're creating a temporary copy of the master repository for testing purposes etc. +1 for promote. People unfamiliar with the replication stuff might not immediately understand that it's related to replication, but they wouldn't have any use for the option anyway. It should be clear to anyone who needs it. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] SQL/MED - file_fdw
On Fri, 7 Jan 2011 10:57:17 +0900 Itagaki Takahiro itagaki.takah...@gmail.com wrote: On Mon, Dec 20, 2010 at 20:42, Itagaki Takahiro itagaki.takah...@gmail.com wrote: I added comments and moved some setup code for COPY TO to BeginCopyTo() for maintainability. CopyTo() still contains parts of the initialization, but I've not touched it yet because we don't need the rearrangement for now. I updated the COPY FROM API patch. - GetCopyExecutorState() is removed because FDWs will use their own context.

I rebased the file_fdw patch onto the recent copy_export patch, and have some comments. The patch just rearranges the COPY FROM code to export those functions. It also modifies some of the COPY TO code internally for readability.

- BeginCopyFrom(rel, filename, attnamelist, options)
- EndCopyFrom(cstate)
- NextCopyFrom(cstate, OUT values, OUT nulls, OUT tupleOid)
- CopyFromErrorCallback(arg)

This API set seems to be enough to implement file_fdw using COPY routines. But EndCopyFrom() seems not to be able to release memory which is allocated in BeginCopy() and BeginCopyFrom(). I found this behavior by executing a query which generates a nested loop plan (outer 100 rows * inner 10 rows), and in the end postgres grows to 300MB+ from 108MB (VIRT of top command). The attached patch would avoid this leak by adding a per-copy context to CopyState. This might be overkill, and a ResetCopyFrom() might be reasonable instead. Anyway, I couldn't find any performance degradation with this patch (tested on my Linux box).
== # csv_accounts and csv_branches are generated by: 1) pgbench -i -s 10 2) COPY pgbench_accounts to '/path/to/accounts.csv' WITH CSV; 3) COPY pgbench_branches to '/path/to/branches.csv' WITH CSV; Original (There is no memory swap during measurement) postgres=# explain analyze select * from csv_accounts b, csv_branches t where t.bid = b.bid; QUERY PLAN - Nested Loop (cost=0.00..11717.01 rows=1 width=200) (actual time=0.300..100833.057 rows=100 loops=1) Join Filter: (b.bid = t.bid) - Foreign Scan on csv_accounts b (cost=0.00..11717.00 rows=1 width=100) (actual time=0.148..4437.595 rows=100 loops=1) - Foreign Scan on csv_branches t (cost=0.00..0.00 rows=1 width=100) (actual time=0.014..0.039 rows=10 loops=100) Total runtime: 102882.308 ms (5 rows) Patched, Using per-copy context to release memory postgres=# explain analyze select * from csv_accounts b, csv_branches t where t.bid = b.bid; QUERY PLAN - Nested Loop (cost=0.00..11717.01 rows=1 width=200) (actual time=0.226..100931.864 rows=100 loops=1) Join Filter: (b.bid = t.bid) - Foreign Scan on csv_accounts b (cost=0.00..11717.00 rows=1 width=100) (actual time=0.085..4439.777 rows=100 loops=1) - Foreign Scan on csv_branches t (cost=0.00..0.00 rows=1 width=100) (actual time=0.015..0.039 rows=10 loops=100) Total runtime: 102684.276 ms (5 rows) == This memory leak would not be problem when using from COPY command because it handles only one CopyState in a query, and it will be cleaned up with parent context. Some items to be considered: - BeginCopyFrom() could receive filename as an option instead of a separated argument. If do so, file_fdw would be more simple, but it's a change only for file_fdw. COPY commands in the core won't be improved at all. ISTM that current design would be better. - NextCopyFrom() returns values/nulls arrays rather than a HeapTuple. I expect the caller store the result into tupletableslot with ExecStoreVirtualTuple(). 
It is designed for performance, but if the caller always needs a materialized HeapTuple, HeapTuple is better as the result type. I tried to add tableoid to TupleTableSlot as tts_tableoid, but it seems to make code such as slot_getattr() and other stuff tricky. How about implementing this using materialized tuples to avoid unnecessary (at least for functionality) changes? I would like to postpone this virtual-tuple optimization to the next development cycle because it would not affect the interface heavily. I'll post a materialized-tuple version of the foreign_scan patch soon. Regards, -- Shigeru Hanada 20110113-copy_context.patch Description: Binary data
[HACKERS] Possible bug in pg_settings/pg_depend
Are multiple identical entries in pg_depend possible? If so, how do they occur, and what is the purpose of representing exactly the same dependency two times in pg_depend? I expected the following query not to return any rows, but it did:

glue=# select count(*), * from pg_depend group by classid, objid, objsubid, refclassid, refobjid, refobjsubid, deptype having count(*) > 1;
 count | classid | objid | objsubid | refclassid | refobjid | refobjsubid | deptype
-------+---------+-------+----------+------------+----------+-------------+---------
     2 |    2618 | 11015 |        0 |       1259 |    11012 |           1 | n
(1 row)

Debug data:

glue=# select version();
 version
-------------------------------------------------------------------------------------------------------------
 PostgreSQL 9.1alpha3 on x86_64-unknown-linux-gnu, compiled by GCC gcc (Ubuntu 4.4.3-4ubuntu5) 4.4.3, 64-bit
(1 row)

glue=# select pg_describe_object(2618,11015,0);
 pg_describe_object
--------------------------------------
 rule pg_settings_u on view pg_settings
(1 row)

glue=# select pg_describe_object(1259,11012,1);
 pg_describe_object
--------------------------------
 view pg_settings column name
(1 row)

-- Best regards, Joel Jacobson Glue Finance
Re: [HACKERS] Add function dependencies
Tom Lane t...@sss.pgh.pa.us writes: That seems pretty silly/broken. You should only be touching *direct* dependencies of the extension, IMO. If there's something that's missed by that algorithm, the way to fix it is to add more direct dependencies at extension creation time; not to start a tree walk that is pretty nearly guaranteed to land on things that don't belong to the extension. Well the current patch is walking the tree because that's what I need for listing extension's objects (in \dx ext), e.g. I want to follow from an opclass to its functions in that context. Now I reused this walker for ALTER EXTENSION SET SCHEMA, where it could well be that I don't need walking down the dependency tree. Will think about it and try it (very localised change). Thanks for comments. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Bug in walreceiver
On Thu, Jan 13, 2011 at 5:59 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: On 13.01.2011 10:28, Fujii Masao wrote: When the master shuts down or crashes, there seems to be the case where walreceiver exits without flushing WAL which has already been written. This might lead startup process to replay un-flushed WAL and break a Write-Ahead-Logging rule. Hmm, that can happen at a crash even with no replication involved. If you kill -9 postmaster, and some WAL had been written but not fsync'd, on crash recovery we will happily recover the unsynced WAL. Right. If postmaster restarts immediately after kill -9, WAL which has not reached to the disk might be replayed. Then if the server crashes when min recovery point indicates such an unsynced WAL, the database would get corrupted. As you say, that is not just about replication. But that is more likely to happen in the standby because unsynced WAL appears while recovery is in progress. This is one of reasons why walreceiver doesn't let the startup process know that new WAL has arrived before flushing it, I think. So I believe that the patch is somewhat worth applying. BTW, another good point of the patch is that we can track the last WAL receive location correctly. Since WalRcv-receivedUpto is updated after WAL flush, if the patch is not applied, the location of WAL received just before walreceiver exits might not be saved in WalRcv-receivedUpto. We could prevent that by fsyncing all WAL before applying it - presumably fsyncing a file that has already been flushed is quick. But is it worth the trouble? No. It looks overkill though it would completely prevent the problem. Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Error code for terminating connection due to conflict with recovery
On Thu, Jan 13, 2011 at 2:13 AM, Tatsuo Ishii is...@postgresql.org wrote: Ok. Here is the patch for this. I use 40P02, instead of 40004. Please add this to the currently open CommitFest: https://commitfest.postgresql.org/action/commitfest_view/open -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Bug in pg_describe_object, patch v2
On Wed, Jan 12, 2011 at 7:47 PM, Tom Lane t...@sss.pgh.pa.us wrote: Andreas Karlsson andr...@proxel.se writes: Here is a very simple change of the patch to make the output look more like the syntax of ALTER OPERATOR FAMILY to improve consistency. IMO, what this patch needs is to not output the types unless they are actually different from the default (which can be inferred from the AM type and the function arguments). That would fix my concern about it emitting information that is 99.44% useless. I guess we could do that, but I don't understand how you're supposed to infer them, which means probably a lot of other people won't either. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] libpq documentation cleanups (repost 3)
On Wed, Jan 12, 2011 at 8:54 PM, Bruce Momjian br...@momjian.us wrote: I am also attaching a few more of Leslie's changes that I think are useful. The first clarifies a confusion Leslie had about the fact that return is referencing the return value of the function and not the value returned in the pointer. Hmm. Well, if that's the confusion, I don't think inserting the words by the function is the right way to fix it - it certainly isn't returned by anything else. You could change it to say It is also possible for *errmsg to be NULL even when the return value is also NULL; this indicates... The second change is, I think, better wording. OK. The third moves the deprecated text to the start of the function description. Leslie pointed out that that is how we do it for other libpq functions, so we should move it for consistency. That seems to me to read pretty awkwardly. You could perhaps strike the chunk and the whole first paragraph and simply write PQoidStatus is an older, deprecated version of PQoidValue. It returns its result as a character string rather than an Oid, and is not thread-safe. and then cut directly to the synopsis. That would be consistent with what we've done elsewhere; moving just that one sentence is not. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Bug in pg_dump
On 2011-01-13 11:31 AM +0200, Joel Jacobson wrote: The example from Tom Lane below results in a database which is not possible to correctly dump using pg_dump. The view v1 strangely becomes a table in the dump output?! CREATE RULE _RETURN AS ON SELECT TO v1 DO INSTEAD SELECT v2.f1, v2.f2 FROM v2; This statement turns the table into a view. Regards, Marko Tiikkaja -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_primary_conninfo
On Wed, Jan 12, 2011 at 11:52 PM, Fujii Masao masao.fu...@gmail.com wrote: So I'm thinking to make ProcessConfigFile() parse not only postgresql.conf but also recovery.conf rather than move all the recovery parameters to postgresql.conf. Comments? +1. Actually moving the settings can be done later in about 5 seconds if we all agree it's a good idea, but let's not get bogged down in that now. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Bug in pg_dump
* Joel Jacobson wrote: The example from Tom Lane below results in a database which is not possible to correctly dump using pg_dump. The view v1 strangely becomes a table in the dump output?! This is no bug, it's a feature (tm). pg_dump is clever enough to detect the circular dependency and break it open by creating v1 in two steps. A view in PostgreSQL is simply an empty table with an ON SELECT DO INSTEAD rule named _RETURN on it. pg_dump first creates the empty table, then view v2 depending on that table, and finally the _RETURN rule turning v1 into a view and reintroducing the circular dependency. -- Christian -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: pg_ctl failover Re: [HACKERS] Latches, signals, and waiting
On Thu, Jan 13, 2011 at 5:00 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: On 13.01.2011 04:29, Itagaki Takahiro wrote: On Thu, Jan 13, 2011 at 00:14, Fujii Masaomasao.fu...@gmail.com wrote: pg_ctl failover ? At the moment, the location of the trigger file is configurable, but if we accept a constant location like $PGDATA/failover pg_ctl could do the whole thing, create the file and send signal. pg_ctl on Window already knows how to send the signal via the named pipe signal emulation. The attached patch implements the above-mentioned pg_ctl failover. I have three comments: - Will we call it failover? We will use the command also in switchover operations. pg_ctl promote might be more neutral, but users might be hard to imagine replication feature from promote. I agree that failover or even switchover is too specific. You might want promote a server even if you keep the old master still running, if you're creating a temporary copy of the master repository for testing purposes etc. +1 for promote. People unfamiliar with the replication stuff might not immediately understand that it's related to replication, but they wouldn't have any use for the option anyway. It should be clear to anyone who needs it. I agree. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Fixing GIN for empty/null/full-scan cases
Tom Lane t...@sss.pgh.pa.us writes: David E. Wheeler da...@kineticode.com writes: On Jan 12, 2011, at 4:35 PM, Tom Lane wrote: No, what we need is a decent extension package manager ;-) Yeah. Maybe you can do that this weekend? Or, I dunno, while you sleep tonight? Supposedly it's in the queue for the upcoming CF :-) Hehe, and some provision have been made to support upgrading from 9.0 to 9.1 too: http://pgsql.tapoueh.org/extensions/doc/html/extend-extension.html#AEN50748 But that won't solve the dump-from-9.0 and restore-into-9.1 by itself, the only way for us to solve that problem that I can think of would be to backpatch a new feature. Do it the old-way until you upgrade from 9.1 to later might be our answer here. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Walreceiver fsyncs excessively
While testing Fujii-san's patch to flush any already-written WAL on error in walreceiver, I added a debugging elog to XLogWalRcvFlush() to print out how far it has written and flushed. I saw an unexpected pattern while the standby catches up with the master:

LOG: streaming replication successfully connected to primary
LOG: flushing flush=0/0 write=0/1E02
LOG: flushing flush=0/1E02 write=0/1E04
LOG: flushing flush=0/1E04 write=0/1E06
LOG: flushing flush=0/1E06 write=0/1E08
LOG: flushing flush=0/1E08 write=0/1E0A
LOG: flushing flush=0/1E0A write=0/1E0C
LOG: flushing flush=0/1E0C write=0/1E0E
LOG: flushing flush=0/1E0E write=0/1E10
LOG: flushing flush=0/1E10 write=0/1E12
LOG: flushing flush=0/1E12 write=0/1E14

The master sends the WAL at full speed, but walreceiver always fsyncs it in 128 kB chunks. That's excessive; it should be able to read and write to disk the whole WAL segment before flushing. There's a little flaw in the walreceiver logic that tries to read all the available WAL before flushing and sleeping. The way libpqrcv_receive is written, when it's called with timeout==0 it will not call PQconsumeInput. So what happens is that when the walreceiver main loop calls libpqrcv_receive() in a loop to fetch all WAL that's available without blocking, it actually only reads the WAL that's in the libpq receive buffer - it will not read the WAL that's in the TCP read buffer. The attached patch fixes libpqrcv_receive() so that it calls PQconsumeInput() before concluding that there's no data available. The excessive fsyncing can lead to very bad performance, so this needs to be applied to 9.0 too.
-- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com

diff --git a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
index 5aac85d..9e8504b 100644
--- a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
+++ b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
@@ -351,28 +351,33 @@ libpqrcv_receive(int timeout, unsigned char *type, char **buffer, int *len)
 		PQfreemem(recvBuf);
 	recvBuf = NULL;
 
-	/*
-	 * If the caller requested to block, wait for data to arrive. But if this
-	 * is the first call after connecting, don't wait, because there might
-	 * already be some data in libpq buffer that we haven't returned to
-	 * caller.
-	 */
-	if (timeout > 0 && !justconnected)
+	/* Try to receive a CopyData message */
+	rawlen = PQgetCopyData(streamConn, &recvBuf, 1);
+	if (rawlen == 0)
 	{
-		if (!libpq_select(timeout))
-			return false;
+		/*
+		 * No data available yet. If the caller requested to block, wait for
+		 * more data to arrive. But if this is the first call after connecting,
+		 * don't wait, because there might already be some data in libpq buffer
+		 * that we haven't returned to caller.
+		 */
+		if (timeout > 0 && !justconnected)
+		{
+			if (!libpq_select(timeout))
+				return false;
+		}
+		justconnected = false;
 
 		if (PQconsumeInput(streamConn) == 0)
 			ereport(ERROR,
 					(errmsg("could not receive data from WAL stream: %s",
 							PQerrorMessage(streamConn))));
-	}
-	justconnected = false;
 
-	/* Receive CopyData message */
-	rawlen = PQgetCopyData(streamConn, &recvBuf, 1);
-	if (rawlen == 0)			/* no data available yet, then return */
-		return false;
+		/* Now that we've consumed some input, try again */
+		rawlen = PQgetCopyData(streamConn, &recvBuf, 1);
+		if (rawlen == 0)
+			return false;
+	}
 	if (rawlen == -1)			/* end-of-streaming or error */
 	{
 		PGresult   *res;
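The essence of the fix above is to drain everything that is available without blocking - both the data libpq has already buffered and whatever is still sitting in the TCP read buffer - before concluding there is nothing left, so that a single flush can cover it all. A toy sketch of that idea in Python (illustrative only; the names are made up, and the real code is the C diff above):

```python
def drain_available(app_buffer, consume_socket):
    """Return all messages available without blocking: first those
    already buffered in the application layer, then whatever can be
    pulled from the socket's read buffer."""
    out = list(app_buffer)
    app_buffer.clear()
    # This second step is what the pre-patch code skipped when called
    # with timeout==0: it never consumed the socket's pending input,
    # so each pass saw only one small chunk and fsynced it separately.
    out.extend(consume_socket())
    return out

app = [b"A"]                       # already in the library's buffer
sock = [b"B", b"C"]                # still in the TCP read buffer

got = drain_available(app, lambda: [sock.pop(0) for _ in range(len(sock))])
assert got == [b"A", b"B", b"C"]   # one flush can now cover all three
```

With the full batch in hand, the receiver writes everything and issues one flush per drained batch, instead of one flush per 128 kB chunk.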
Re: pg_ctl failover Re: [HACKERS] Latches, signals, and waiting
On Thu, Jan 13, 2011 at 7:00 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: +1 for promote. People unfamiliar with the replication stuff might not immediately understand that it's related to replication, but they wouldn't have any use for the option anyway. It should be clear to anyone who needs it. I did s/failover/promote. Here is the updated patch. - pg_ctl should unlink failover_files when it failed to send failover signals. Done. And, I changed some descriptions about trigger in high-availability.sgml and recovery-config.sgml. Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center pg_ctl_failover_v2.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Walreceiver fsyncs excessively
On Thu, Jan 13, 2011 at 9:01 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: While testing Fujii-san's patch to flush any already-written WAL on error in walreceiver, I added a debugging elog to XLogWalRcvFlush() to print out how far it has written and flushed. I saw an unexpected pattern while the standby catches up with the master: LOG: streaming replication successfully connected to primary LOG: flushing flush=0/0 write=0/1E02 LOG: flushing flush=0/1E02 write=0/1E04 LOG: flushing flush=0/1E04 write=0/1E06 LOG: flushing flush=0/1E06 write=0/1E08 LOG: flushing flush=0/1E08 write=0/1E0A LOG: flushing flush=0/1E0A write=0/1E0C LOG: flushing flush=0/1E0C write=0/1E0E LOG: flushing flush=0/1E0E write=0/1E10 LOG: flushing flush=0/1E10 write=0/1E12 LOG: flushing flush=0/1E12 write=0/1E14 The master sends the WAL at full-speed, but walreceiver always fsyncs it in 128 kB chunks. That's excessive, it should be able to read and write to disk the whole WAL segment before flushing. There's a little flaw in the walreceiver logic that tries to read all the available WAL before flushing and sleeping. The way libpqrcv_receive is written, when it's called with timeout==0 it will not call PQconsumeInput. So what happens is that when walreceiver main loop calls libpqrcv_receive() in a loop to fetch all WAL that's available without blocking, it actually only reads the WAL that's in the libpq receive buffer - it will not read the WAL that's in the TCP read buffer. Attached patch fixes libpqrcv_receive() so that it calls PQconsumeInput() before concluding that there's no data available. The excessive fsyncing can lead to very bad performance, so this needs to be appled to 9.0 too. Seems good. Can we remove the justconnected flag, thanks to the patch? 
Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Warning compiling pg_dump (MinGW, Windows XP)
Hello, Pgsql-hackers. I'm getting these warnings: pg_dump.c: In function 'dumpSequence': pg_dump.c:11449:2: warning: unknown conversion type character 'l' in format pg_dump.c:11449:2: warning: too many arguments for format pg_dump.c:11450:2: warning: unknown conversion type character 'l' in format pg_dump.c:11450:2: warning: too many arguments for format Line numbers may not be the same as in the official sources, because I've made some changes. But the lines are: snprintf(bufm, sizeof(bufm), INT64_FORMAT, SEQ_MINVALUE); snprintf(bufx, sizeof(bufx), INT64_FORMAT, SEQ_MAXVALUE); In my opinion configure failed for MinGW+Windows in this case. Am I right? Can someone give me a hint how to avoid this? Thanks in advance -- With best wishes, Pavel mailto:pa...@gf.microolap.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Add function dependencies
Thanks to the new pg_stat_xact_user_functions and pg_stat_xact_user_tables views in 9.1, it will be possible to automatically sample which functions use which functions/tables, to generate a nice directional graph of the dependency tree based on recent real-life activity, excluding any relations/functions not in use anymore. It's actually a feature not to include these, as they make the graph a lot more complicated. If you want a graph of the activity during Mondays between 2:30pm and 2:31pm, such a graph could easily be generated, or if you want it for 30 days (which would probably include a lot more edges in the graph), it can also be generated. :-) It would be quite easy to automatically inject some small code snippets at the top and bottom of each user function, to get the diff of select * from pg_stat_xact_user_functions and pg_stat_xact_user_tables between the entry point and the exit point of each function. It would be a lot nicer if it were possible to automatically let PostgreSQL sample such data for you, providing nice system views with information on the sampled data per function, allowing you to query it and ask, - Which functions has function public.myfunc(int) called, and which tables has it inserted/selected/updated/deleted from since the last time I reset the statistics? Just an idea... -- Best regards, Joel Jacobson Glue Finance -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] kill -KILL: What happens?
Folks, I've noticed over the years that we give people dire warnings never to send a KILL signal to the postmaster, but I'm unsure as to what are potential consequences of this, as in just exactly how this can result in problems. Is there some reference I can look to for explanations of the mechanism(s) whereby the damage occurs? Cheers, David. -- David Fetter da...@fetter.org http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SSI patch version 8
Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Pardon my ignorance, but where exactly is the extra overhead coming from? Searching for a predicate lock? Right. As each tuple is read we need to ensure that there is a predicate lock to cover it. Since finer-grained locks can be combined into coarser-grained locks, we need to start with the fine-grained and move toward checking the coarser grains, to avoid missing a lock during promotion. So for each tuple we calculate a hash, find a partition, lock it, and look up the tuple as a lock target. When that's not found we do the same thing for the page. When that's not found we do the same thing for the relation. But we acquired a relation lock up front, when we determined that this would be a heap scan, so we could short-circuit this whole thing if within the heapgettup_pagemode function we could determine that this was a scan of the whole relation. The profiling also showed that it was spending an obscene amount of time calculating hash values (over 10% of total run time!). I'm inclined to think that a less sophisticated algorithm (like just adding oid, page, and tuple offset numbers) would generate very *real* savings with the downside being a very hypothetical *possible* cost of longer chains in the HTAB. But that's a separate issue, best settled on the basis of benchmarks rather than theoretical discussions. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SSI patch version 8
Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: where exactly is the extra overhead coming from? Keep in mind that this is a sort of worst case scenario. The data is fully cached in shared memory and we're doing a sequential pass just counting the rows. In an earlier benchmark (which I should re-do after all this refactoring), random access queries against a fully cached data set only increased run time by 1.8%. Throw some disk access into the mix, and the overhead is likely to get lost in the noise. But, as I said, count(*) seems to be the first thing many people try as a benchmark, and this is a symptom of a more general issue, so I'd like to find a good solution. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SSI patch version 8
On 13.01.2011 16:51, Kevin Grittner wrote: Right. As each tuple is read we need to ensure that there is a predicate lock to cover it. Since finer-grained locks can be combined into coarser-grained locks we need to start with the fine grained and move toward checking the coarser grains, to avoid missing a lock during promotion. So for each tuple we calculate a hash, find a partition, lock it, and lookup the tuple as a lock target. When that's not found we do the same thing for the page. When that's not found we do the same thing for the relation. But we acquired a relation lock up front, when we determined that this would be a heap scan, so we could short-circuit this whole thing if within the heapgettup_pagemode function we could determine that this was a scan of the whole relation. That sounds simple enough. Add a boolean field to HeapScanDesc, rs_relpredicatelocked, and set it when you acquire the relation lock. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SSI patch version 8
Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: That sounds simple enough. Add a boolean field to HeapScanDesc, rs_relpredicatelocked, and set it when you acquire the relation lock. I'll take a look at doing that. Thanks! -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] kill -KILL: What happens?
David Fetter da...@fetter.org writes: I've noticed over the years that we give people dire warnings never to send a KILL signal to the postmaster, but I'm unsure as to what are potential consequences of this, as in just exactly how this can result in problems. Is there some reference I can look to for explanations of the mechanism(s) whereby the damage occurs? There's no risk of data corruption, if that's what you're thinking of. It's just that you're then looking at having to manually clean up the child processes and then restart the postmaster; a process that is not only tedious but does offer the possibility of screwing yourself. In particular the risk is that someone clueless enough to do this would next decide that removing $PGDATA/postmaster.pid, rather than killing all the existing children, is the quickest way to get the postmaster restarted. Once he's done that, his data will shortly be hosed beyond recovery, because now he has two noncommunicating sets of backends massaging the same files via separate sets of shared buffers. The reason this sequence of events doesn't seem improbable is that the error you get when you try to start a new postmaster, if there are still old backends running, is FATAL: pre-existing shared memory block (key 5490001, ID 15609) is still in use HINT: If you're sure there are no old server processes still running, remove the shared memory block or just delete the file postmaster.pid. Maybe we should rewrite that HINT --- while it's *possible* that removing the shmem block or deleting postmaster.pid is the right thing to do, it's not exactly *likely*. I think we need to put a bit more emphasis on the If ... part. Like If you are prepared to swear on your mother's grave that there are no old server processes still running, consider removing postmaster.pid. But first check for existing processes again. 
(BTW, I notice that this interlock against starting a new postmaster appears to be broken in HEAD, which is likely not unrelated to the fact that the contents of postmaster.pid seem to be totally bollixed :-() regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] system views for walsender activity
On Wed, Jan 12, 2011 at 03:03, Robert Haas robertmh...@gmail.com wrote: On Tue, Jan 11, 2011 at 7:24 AM, Magnus Hagander mag...@hagander.net wrote: No, do this at top if (walsnd-state == state) return; Keep spinlocks when actually setting it. I think this is safe... Aha. Thanks for the pointers, pfa a new version. ...but I think you also need to take the spinlock when reading the value. Even when it can only ever be set by one process (the owning walsender), and the variable is atomic (as it should be, since it's a single enum/int)? Anyway, it should be as simple as copying it out to a local variable when it's already in the spinlock and then use that, right? -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SSI patch version 8
Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: On 13.01.2011 16:51, Kevin Grittner wrote: But we acquired a relation lock up front, when we determined that this would be a heap scan, so we could short-circuit this whole thing if within the heapgettup_pagemode function we could determine that this was a scan of the whole relation. That sounds simple enough. Add a boolean field to HeapScanDesc, rs_relpredicatelocked, and set it when you acquire the relation lock. Heikki, I can't thank you enough. The fix is here: http://git.postgresql.org/gitweb?p=users/kgrittn/postgres.git;a=commitdiff;h=64ca508a0e2fa9c21dc76a5d6a5f549c27f511fa The timings are now: begin transaction isolation level repeatable read; Time: 324.938 ms Time: 228.045 ms Time: 227.963 ms begin transaction isolation level serializable; Time: 311.954 ms Time: 311.928 ms Time: 311.848 ms begin transaction isolation level serializable, read only; Time: 227.471 ms Time: 228.137 ms Time: 227.778 ms begin transaction isolation level serializable, read only, deferrable; Time: 227.899 ms Time: 249.772 ms Time: 228.026 ms begin transaction isolation level repeatable read; Time: 231.173 ms Time: 245.041 ms Time: 228.149 ms I'm surprised the difference is still that high as a percentage, and will investigate, but this seems survivable. When I do the math, the difference comes out to 83.885 nanoseconds per row. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Walreceiver fsyncs excessively
On 13.01.2011 14:34, Fujii Masao wrote: On Thu, Jan 13, 2011 at 9:01 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Attached patch fixes libpqrcv_receive() so that it calls PQconsumeInput() before concluding that there's no data available. The excessive fsyncing can lead to very bad performance, so this needs to be appled to 9.0 too. Seems good. Can we remove the justconnected flag, thanks to the patch? Yes, good point. Committed with justconnected removed. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] psql crashes on encoding mismatch
2011/1/13 Tom Lane t...@sss.pgh.pa.us: Hitoshi Harada umi.tan...@gmail.com writes: I found a crash case (assertion failure) when running psql -f utf8_encoded_script.sql against client_encoding = shift_jis in postgresql.conf. Though encoding mismatch is obviously user's fault, a crash doesn't explain anything to him. I'm not too impressed with this patch: it seems like the most it will accomplish is to move the failure to some other, equally obscure, place --- because you'll still have a string that's invalidly encoded. Moreover, if you've got wrongly encoded data, it wouldn't be hard at all for it to mess up psql's lexing; consider cases such as a character-that's-not-as-long-as-we-think just in front of a quote mark. Shouldn't we instead try to verify the multibyte encoding somewhere upstream of here? I had thought of it before going into the patch, too. However, the fact that psql (fe-misc.c) doesn't have PQverifymb() although it has PQmblen() implied to me that encoding verification should perhaps be done on the server side. I might be too ignorant to imagine the lexing problem of your quote mark, but my crash sample has multibyte characters in a sql comment, which is ignored in the server parsing. If we decided that this case raises an error, wouldn't some existing applications be broken? I can imagine they are in the same encoding-mismatch situation and run without the problem I found by chance. Just for reference I attach the case sql file. To reproduce it: 1. initdb 2. edit client_encoding = shift_jis in postgresql.conf 3. start postgres 4. psql -f case_utf8.sql Note: the line break should be LF as the file stands. CR-LF cannot reproduce the problem. Regards, -- Hitoshi Harada case_utf8.sql Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] arrays as pl/perl input arguments [PATCH]
On Thu, Jan 13, 2011 at 01:06, Martijn van Oosterhout klep...@svana.org wrote: On Thu, Jan 13, 2011 at 12:06:33AM -0700, Alex Hunsaker wrote: I had supposed that it would be possible to do the string conversion lazily, ie, only if the string value was actually demanded. Yep, In-fact if we wanted we could even die (or throw an exception in other language speak :) ) when the string value is demanded. I played with this a little and it is fairly easy to make a variable such that $a is the string representation and $a[0] the first value of the array. The problem is that you can't pass such a variable into a subroutine. [ snip ] my @a=(1,2); tie $a, MyClass, \@a; print "\$a='$a'\n"; print "\$a[0]='$a[0]'\n"; Erm... the reason you can't seem to pass it to any subroutines is that it's actually two variables: $a and @a. When you print "$a\n" you are using the tied scalar that wraps @a; and when you print "$a[0]\n" you are accessing the array directly. I think you just had an unfortunate variable name; otherwise strict would have complained appropriately. :) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] libpq documentation cleanups (repost 3)
Robert Haas wrote: On Wed, Jan 12, 2011 at 8:54 PM, Bruce Momjian br...@momjian.us wrote: I am also attaching a few more of Leslie's changes that I think are useful. ?The first clarifies a confusion Leslie had about the fact that return is referencing the return value of the function and not the value returned in the pointer. Hmm. Well, if that's the confusion, I don't think inserting the words by the function is the right way to fix it - it certainly isn't returned by anything else. You could change it to say It is also possible for *errmsg to be NULL even when the return value is also NULL; this indicates... The second change is, I think, better wording. OK. The third moves the deprecated text to the start of the function description. ?Leslie pointed out that that is how we do it for other libpq functions, so we should move it for consistency. That seems to me to read pretty awkwardly. You could perhaps strike the chunk and the whole first paragraph and simply write PQoidStatus is an older, deprecated version of PQoidValue. It returns its result as a character string rather than an Oid, and is not thread-safe. and then cut directly to the synopsis. That would be consistent with what we've done elsewhere; moving just that one sentence is not. OK, I have made the adjustments you mentioned with my own wording (attached and applied). Let me know of any more needed adjustments. Thanks. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml index 58e593d..fe661b8 100644 --- a/doc/src/sgml/libpq.sgml +++ b/doc/src/sgml/libpq.sgml @@ -972,8 +972,8 @@ PQconninfoOption *PQconninfoParse(const char *conninfo, char **errmsg); If literalerrmsg/ is not symbolNULL/, then literal*errmsg/ is set to symbolNULL/ on success, else to a functionmalloc/'d error string explaining the problem. 
(It is also possible for literal*errmsg/ to be - set to symbolNULL/ even when symbolNULL/ is returned; this indicates an out-of-memory - situation.) + set to symbolNULL/ and the function to return symbolNULL/; + this indicates an out-of-memory condition.) /para para @@ -1352,7 +1352,7 @@ ConnStatusType PQstatus(const PGconn *conn); para See the entry for functionPQconnectStartParams/, functionPQconnectStart/ and functionPQconnectPoll/ with regards to other status codes that - might be seen. + might be returned. /para /listitem /varlistentry @@ -3163,23 +3163,15 @@ Oid PQoidValue(const PGresult *res); listitem para - Returns a string with the OID of the inserted row, if the - acronymSQL/acronym command was an commandINSERT/command - that inserted exactly one row, or a commandEXECUTE/command of - a prepared statement consisting of a suitable - commandINSERT/command. (The string will be literal0/ if - the commandINSERT/command did not insert exactly one row, or - if the target table does not have OIDs.) If the command was not - an commandINSERT/command, returns an empty string. + This function is deprecated in favor of + functionPQoidValue/function and is not thread-safe. + It returns a string with the OID of the inserted row, while + functionPQoidValue/function returns the OID value. synopsis char *PQoidStatus(const PGresult *res); /synopsis /para - para - This function is deprecated in favor of - functionPQoidValue/function. It is not thread-safe. - /para /listitem /varlistentry /variablelist -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] kill -KILL: What happens?
On Thu, Jan 13, 2011 at 10:41:28AM -0500, Tom Lane wrote: David Fetter da...@fetter.org writes: I've noticed over the years that we give people dire warnings never to send a KILL signal to the postmaster, but I'm unsure as to what are potential consequences of this, as in just exactly how this can result in problems. Is there some reference I can look to for explanations of the mechanism(s) whereby the damage occurs? There's no risk of data corruption, if that's what you're thinking of. It's just that you're then looking at having to manually clean up the child processes and then restart the postmaster; a process that is not only tedious but does offer the possibility of screwing yourself. Does this mean that there's no cross-platform way to ensure that killing a process results in its children's timely (i.e. before damage can occur) death? That such a way isn't practical from a performance point of view? In particular the risk is that someone clueless enough to do this would next decide that removing $PGDATA/postmaster.pid, rather than killing all the existing children, is the quickest way to get the postmaster restarted. Once he's done that, his data will shortly be hosed beyond recovery, because now he has two noncommunicating sets of backends massaging the same files via separate sets of shared buffers. Right. The reason this sequence of events doesn't seem improbable is that the error you get when you try to start a new postmaster, if there are still old backends running, is FATAL: pre-existing shared memory block (key 5490001, ID 15609) is still in use HINT: If you're sure there are no old server processes still running, remove the shared memory block or just delete the file postmaster.pid. Maybe we should rewrite that HINT --- while it's *possible* that removing the shmem block or deleting postmaster.pid is the right thing to do, it's not exactly *likely*. I think we need to put a bit more emphasis on the If ... part. 
Like If you are prepared to swear on your mother's grave that there are no old server processes still running, consider removing postmaster.pid. But first check for existing processes again. Maybe the hint could give an OS-tailored way to check this... (BTW, I notice that this interlock against starting a new postmaster appears to be broken in HEAD, which is likely not unrelated to the fact that the contents of postmaster.pid seem to be totally bollixed :-() D'oh! Well, I hope knowing it's a problem gives some kind of glimmer as to how to solve it :) Is this worth writing tests for? Cheers, David. -- David Fetter da...@fetter.org http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] C++ keywords in headers (was Re: [GENERAL] #include funcapi.h)
Tom Lane wrote: Alvaro Herrera alvhe...@commandprompt.com writes: Excerpts from Tom Lane's message of lun dic 27 13:54:56 -0300 2010: [ lightbulb ] ... although we could improve that quite a bit if we processed each .h file separately instead of insisting on smashing everything into one compilation. Let me go try that. FWIW I have this patch lingering about that I wrote months ago, to check for header problems (not C++ stuff, just things like forgetting to include some necessary header in some other header). Since it needs a lot of polish (needs to ignore certain headers, and avoid leave lingering files around), I didn't commit it; and I haven't updated it to the new Make recursive stuff, either. src/tools/pginclude/ already contains several scripts for this sort of thing. Bruce runs them by hand occasionally, although I just found out that he's evidently not run the does-each-header-compile-standalone test in awhile. It would probably pay to automate these. It is true I have not run those tests in a while. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] system views for walsender activity
On Thu, Jan 13, 2011 at 11:08 AM, Magnus Hagander mag...@hagander.net wrote: On Wed, Jan 12, 2011 at 03:03, Robert Haas robertmh...@gmail.com wrote: On Tue, Jan 11, 2011 at 7:24 AM, Magnus Hagander mag...@hagander.net wrote: No, do this at top if (walsnd-state == state) return; Keep spinlocks when actually setting it. I think this is safe... Aha. Thanks for the pointers, pfa a new version. ...but I think you also need to take the spinlock when reading the value. Even when it can only ever be set by one process (the owning walsender), and the variable is atomic (as it should be, since it's a single enum/int)? The fact that it can only be modified by one process makes it safe for *that process* to read it without taking the lock, but another process that wants to read it still needs the lock, I believe - otherwise you might get a slightly stale value. That's probably not a *huge* deal in this case, but I think it'd be better to get it right because people tend to copy these sorts of things elsewhere, and it'd be bad if it got copied into some place more critical. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_depend explained
On Wed, Jan 12, 2011 at 09:09:31PM +0100, Joel Jacobson wrote: (sorry for top posting, No worries. iPhone + drunk) A dangerous combination indeed. I hear water, NSAIDs and time can help with the hangover ;) pg_depend_before is a select * from pg_depend before creating the test db model Please put a self-contained example on the snippets page, and please also to check that it actually runs before doing so. You'd mangled some aliases in the query you sent, which leads me to believe you hadn't actually tried running it. Cheers, David. -- David Fetter da...@fetter.org http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] system views for walsender activity
On Thu, Jan 13, 2011 at 18:43, Robert Haas robertmh...@gmail.com wrote: On Thu, Jan 13, 2011 at 11:08 AM, Magnus Hagander mag...@hagander.net wrote: On Wed, Jan 12, 2011 at 03:03, Robert Haas robertmh...@gmail.com wrote: On Tue, Jan 11, 2011 at 7:24 AM, Magnus Hagander mag...@hagander.net wrote: No, do this at top if (walsnd-state == state) return; Keep spinlocks when actually setting it. I think this is safe... Aha. Thanks for the pointers, pfa a new version. ...but I think you also need to take the spinlock when reading the value. Even when it can only ever be set by one process (the owning walsender), and the variable is atomic (as it should be, since it's a single enum/int)? The fact that it can only be modified by one process makes it safe for *that process* to read it without taking the lock, but another process that wants to read it still needs the lock, I believe - otherwise you might get a slightly stale value. That's probably not a *huge* deal in this case, but I think it'd be better to get it right because people tend to copy these sorts of things elsewhere, and it'd be bad if it got copied into some place more critical. ok, thanks for the pointers - fix applied. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] kill -KILL: What happens?
David Fetter da...@fetter.org writes: On Thu, Jan 13, 2011 at 10:41:28AM -0500, Tom Lane wrote: It's just that you're then looking at having to manually clean up the child processes and then restart the postmaster; a process that is not only tedious but does offer the possibility of screwing yourself. Does this mean that there's no cross-platform way to ensure that killing a process results in its children's timely (i.e. before damage can occur) death? That such a way isn't practical from a performance point of view? The simple, easy, cross-platform solution is this: don't kill -9 the postmaster. Send it one of the provisioned shutdown signals and let it kill its children for you. At least on Unix I don't believe there is any other solution. You could try looking at ps output but there's a fundamental race condition, ie the postmaster could spawn another child just before you kill it, whereupon the child is reassigned to init and there's no longer a good way to tell that it came from that postmaster. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] kill -KILL: What happens?
Tom Lane t...@sss.pgh.pa.us wrote: At least on Unix I don't believe there is any other solution. You could try looking at ps output but there's a fundamental race condition, ie the postmaster could spawn another child just before you kill it, whereupon the child is reassigned to init and there's no longer a good way to tell that it came from that postmaster. Couldn't you run `ps auxf` and kill any postgres process which is not functioning as postmaster (those are pretty easy to distinguish) and which isn't the child of such a process? Is there ever a reason to allow such an orphan to run? -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Possible bug in pg_settings/pg_depend
Joel Jacobson j...@gluefinance.com writes: Are multiple identical entries in pg_depend possible? Yes, probably. It's certainly possible to have the same linkage occur with different deptypes. We don't try hard to avoid dups because they don't matter. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Bug in pg_describe_object, patch v2
Robert Haas robertmh...@gmail.com writes: On Wed, Jan 12, 2011 at 7:47 PM, Tom Lane t...@sss.pgh.pa.us wrote: IMO, what this patch needs is to not output the types unless they are actually different from the default (which can be inferred from the AM type and the function arguments). That would fix my concern about it emitting information that is 99.44% useless. I guess we could do that, but I don't understand how you're supposed to infer them, which means probably a lot of other people won't either. Read the CREATE OPERATOR CLASS source code. (It likely would be best to refactor that a bit so it would expose some way to obtain the implied defaults --- I don't think that's done explicitly now, and it's certainly not exported from opclasscmds.c.) regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] kill -KILL: What happens?
Kevin Grittner kevin.gritt...@wicourts.gov writes: Tom Lane t...@sss.pgh.pa.us wrote: At least on Unix I don't believe there is any other solution. You could try looking at ps output but there's a fundamental race condition, ie the postmaster could spawn another child just before you kill it, whereupon the child is reassigned to init and there's no longer a good way to tell that it came from that postmaster. Couldn't you run `ps auxf` and kill any postgres process which is not functioning as postmaster (those are pretty easy to distinguish) and which isn't the child of such a process? Is there ever a reason to allow such an orphan to run? That's not terribly hard to do by hand, especially since the cautious DBA could also do things like checking a process' CWD to verify which postmaster it had belonged to. I can't see automating it though. We already have a perfectly good solution to the automated shutdown problem. regards, tom lane
Re: [HACKERS] kill -KILL: What happens?
Tom Lane t...@sss.pgh.pa.us wrote: I can't see automating it though. We already have a perfectly good solution to the automated shutdown problem. Oh, I totally agree with that. I somehow thought we'd gotten off into how someone could recover after shooting their foot. -Kevin
Re: [HACKERS] Possible bug in pg_settings/pg_depend
2011/1/13 Tom Lane t...@sss.pgh.pa.us: Yes, probably. It's certainly possible to have the same linkage occur with different deptypes. We don't try hard to avoid dups because they don't matter. With different deptypes, yes, but in this case there were two linkages of the same deptype. It just seems a bit strange that I found only one such case in the entire database; it smells like some kind of bug, but it might not be. I just thought it was worth investigating a bit, but if you're sure about it I of course trust you. -- Best regards, Joel Jacobson Glue Finance E: j...@gluefinance.com T: +46 70 360 38 01 Postal address: Glue Finance AB Box 549 114 11 Stockholm Sweden Visiting address: Glue Finance AB Birger Jarlsgatan 14 114 34 Stockholm Sweden
Re: [HACKERS] kill -KILL: What happens?
On Jan 13, 2011, at 19:00, Tom Lane wrote: At least on Unix I don't believe there is any other solution. You could try looking at ps output but there's a fundamental race condition, ie the postmaster could spawn another child just before you kill it, whereupon the child is reassigned to init and there's no longer a good way to tell that it came from that postmaster. Maybe I'm totally confused, but ... couldn't normal backends call PostmasterIsAlive and exit if not, just like the startup process, the stats collector, autovacuum, bgwriter, walwriter, walreceiver, walsender and the wal archiver already do? I assumed they do, but now that I grepped the code it seems they don't. best regards, Florian Pflug
Re: [HACKERS] pg_depend explained
2011/1/13 David Fetter da...@fetter.org: Please put a self-contained example on the snippets page, and please also check that it actually runs before doing so. You'd mangled some aliases in the query you sent, which leads me to believe you hadn't actually tried running it. I actually hadn't really solved the problem at the time I wrote my last email; it turned out I had to do things a bit differently to avoid running into problems with corner cases. I will put together a self-contained example like you suggested and get back shortly :-) -- Best regards, Joel Jacobson Glue Finance
Re: [HACKERS] Streaming base backups
On Wed, Jan 12, 2011 at 10:39, Fujii Masao masao.fu...@gmail.com wrote: On Mon, Jan 10, 2011 at 11:09 PM, Magnus Hagander mag...@hagander.net wrote: I've committed the backend side of this, without that. Still working on the client, and on cleaning up Heikki's patch for grammar/parser support. Great work! I have some comments: While walsender is sending a base backup, WalSndWakeup should not send the signal to that walsender? True, it's not necessary. How badly does it actually hurt things, though? Given that the walsender running the backup isn't actually waiting on the latch, it doesn't actually send a signal, does it? In sendFile or elsewhere, we should periodically check whether postmaster is alive and whether the flag was set by the signal? That, however, we probably should. At the end of the backup by walsender, it forces a switch to a new WAL file and waits until the last WAL file has been archived. So we should change postmaster so that it doesn't cause the archiver to end before walsender ends when shutdown is requested? Um. I have to admit I'm not entirely following what you mean enough to confirm it, but it *sounds* correct :-) What scenario exactly is the problematic one? Also, when shutdown is requested, the walsender which is streaming WAL should not end before another walsender which is sending a backup ends, to stream the backup-end WAL? Not sure I see the reason for that. If we're shutting down in the middle of the base backup, we don't have any support for continuing that one after we're back up - you have to start over. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [HACKERS] kill -KILL: What happens?
On Thu, Jan 13, 2011 at 12:45:07PM -0600, Kevin Grittner wrote: Tom Lane t...@sss.pgh.pa.us wrote: I can't see automating it though. We already have a perfectly good solution to the automated shutdown problem. Oh, I totally agree with that. I somehow thought we'd gotten off into how someone could recover after shooting their foot. I get that we can't prevent all pilot error, but I was hoping we could bullet-proof this a little more, especially in light of a certain extremely popular server OS's OOM killer's default behavior. Yes, I get that that behavior is crazy, and stupid, and that people should shut it off, but it *is* our problem if we let the postmaster start (or continue) when it's set that way. Cheers, David. -- David Fetter da...@fetter.org http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
Re: [HACKERS] kill -KILL: What happens?
Florian Pflug f...@phlo.org writes: Couldn't normal backends call PostmasterIsAlive and exit if not, just like the startup process, the stats collector, autovacuum, bgwriter, walwriter, walreceiver, walsender and the wal archiver already do? I assumed they do, but now that I grepped the code it seems they don't. That's intentional: they keep going until the user closes the session or someone sends them a signal to do otherwise. The other various background processes have to watch PostmasterIsAlive because there is no session to close. Frankly I'd prefer to get rid of PostmasterIsAlive, not extend its use. It sucks because you don't get a signal on parent death. With the arrival of the latch code, having to check for PostmasterIsAlive frequently is the only reason for an idle background process to consume CPU at all. Another problem with the scheme is that it only works as long as the background process is providing a *non critical* service. Eventually we are probably going to need some way for bgwriter/walwriter to stay alive long enough to service orphaned backends, rather than disappearing instantly if the postmaster goes away. regards, tom lane
Re: [HACKERS] kill -KILL: What happens?
David Fetter da...@fetter.org writes: I get that we can't prevent all pilot error, but I was hoping we could bullet-proof this a little more, especially in light of a certain extremely popular server OS's OOM killer's default behavior. Yes, I get that that behavior is crazy, and stupid, and that people should shut it off, but it *is* our problem if we let the postmaster start (or continue) when it's set that way. Packagers who are paying attention have fixed that ;-) regards, tom lane
Re: [HACKERS] kill -KILL: What happens?
On Thu, Jan 13, 2011 at 2:16 PM, Tom Lane t...@sss.pgh.pa.us wrote: Frankly I'd prefer to get rid of PostmasterIsAlive, not extend its use. It sucks because you don't get a signal on parent death. With the arrival of the latch code, having to check for PostmasterIsAlive frequently is the only reason for an idle background process to consume CPU at all. What we really need is SIGPARENT. I wonder if the Linux folks would consider adding such a thing. Might be useful to others as well. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Possible bug in pg_settings/pg_depend
On Thu, Jan 13, 2011 at 2:04 PM, Joel Jacobson j...@gluefinance.com wrote: 2011/1/13 Tom Lane t...@sss.pgh.pa.us: Yes, probably. It's certainly possible to have the same linkage occur with different deptypes. We don't try hard to avoid dups because they don't matter. with different deptypes, yes, but in this case there were two linkages of the same deptype. Just seems a bit strange I only found one such in the entire database, smells like some kind of bug, but might not be, I dunno, just thought it was worth investigating a bit, but if you're sure about it I of course trust you. Instead of trusting him, you could investigate why it happens. A quick test shows this eliminates both dependencies: drop rule pg_settings_u on pg_settings; It appears that both of the dependencies in question are from that rule and pointing to pg_settings.name, and it looks like that rule mentions the name column of pg_settings twice. With a little further experimentation you can probably tease out whether each of the two mentions produced a separate dependency... my guess is yes. -- Robert Haas
Re: [HACKERS] kill -KILL: What happens?
Robert Haas robertmh...@gmail.com writes: On Thu, Jan 13, 2011 at 2:16 PM, Tom Lane t...@sss.pgh.pa.us wrote: Frankly I'd prefer to get rid of PostmasterIsAlive, not extend its use. It sucks because you don't get a signal on parent death. With the arrival of the latch code, having to check for PostmasterIsAlive frequently is the only reason for an idle background process to consume CPU at all. What we really need is SIGPARENT. I wonder if the Linux folks would consider adding such a thing. Might be useful to others as well. That's pretty much a dead-end idea unfortunately; it would never be portable enough to let us change our system structure to rely on it. Even more to the point, "go away when the postmaster does" isn't really the behavior we want anyway. "Go away when the last backend does" is what we want. I wonder whether we could have some sort of latch-like counter that would count the number of active backends and deliver signals when the count went to zero. However, if the goal is to defend against random applications of SIGKILL, there's probably no way to make this reliable in userspace. Another idea is to have a postmaster minder process that respawns the postmaster when it's killed. The hard part of that is that the minder can't be connected to shared memory (else its OOM cross-section is just as big as the postmaster's), and that makes it difficult for it to tell when all the children have gone away. I suppose it could be coded to just retry every few seconds until success. This doesn't improve the behavior of background processes at all, though. regards, tom lane
Re: [HACKERS] kill -KILL: What happens?
On Thu, Jan 13, 2011 at 2:45 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Thu, Jan 13, 2011 at 2:16 PM, Tom Lane t...@sss.pgh.pa.us wrote: Frankly I'd prefer to get rid of PostmasterIsAlive, not extend its use. It sucks because you don't get a signal on parent death. With the arrival of the latch code, having to check for PostmasterIsAlive frequently is the only reason for an idle background process to consume CPU at all. What we really need is SIGPARENT. I wonder if the Linux folks would consider adding such a thing. Might be useful to others as well. That's pretty much a dead-end idea unfortunately; it would never be portable enough to let us change our system structure to rely on it. Even more to the point, go away when the postmaster does isn't really the behavior we want anyway. Go away when the last backend does is what we want. I'm not convinced. I was thinking that we could simply treat it like SIGQUIT, if it's available. I doubt there's a real use case for continuing to run queries after the postmaster and all the background processes are dead. Expedited death seems like much better behavior. Even checking PostmasterIsAlive() once per query would be reasonable, except that it'd add a system call to check for a condition that almost never holds, which I'm not eager to do. I wonder whether we could have some sort of latch-like counter that would count the number of active backends and deliver signals when the count went to zero. However, if the goal is to defend against random applications of SIGKILL, there's probably no way to make this reliable in userspace. I don't think you can get there 100%. We could, however, make a rule that when a background process fails a PostmasterIsAlive() check, it sends SIGQUIT to everyone it can find in the ProcArray, which would at least ensure a timely exit in most real-world cases. Another idea is to have a postmaster minder process that respawns the postmaster when it's killed. 
The hard part of that is that the minder can't be connected to shared memory (else its OOM cross-section is just as big as the postmaster's), and that makes it difficult for it to tell when all the children have gone away. I suppose it could be coded to just retry every few seconds until success. This doesn't improve the behavior of background processes at all, though. It hardly seems worth it. Given a reliable interlock against multiple postmasters, the real concern is making sure that a half-dead postmaster gets itself all-dead quickly so that the DBA can start up a new one before he gets fired. -- Robert Haas
Re: [HACKERS] Allowing multiple concurrent base backups
On Tue, Jan 11, 2011 at 11:06:18AM -0800, Josh Berkus wrote: It makes it very convenient to set up standbys, without having to worry that you'll conflict e.g. with a nightly backup. I don't imagine people will use streaming base backups for very large databases anyway. Also, imagine that you're provisioning a 10-node replication cluster on EC2. This would make that worlds easier. Hmm, perhaps. My concern is that a naive attempt to do that is going to have 10 base backups happening at the same time, completely slamming the master, and none of them completing in a reasonable time. Is this possible, or is it that simultaneity will buy you hot caches, so backups #2 - #10 all run faster? Ross -- Ross Reedstrom, Ph.D. reeds...@rice.edu Systems Engineer & Admin, Research Scientist phone: 713-348-6166 fax: 713-348-3665 Connexions http://cnx.org Rice University MS-375, Houston, TX 77005 GPG Key fingerprint = F023 82C8 9B0E 2CC6 0D8E F888 D3AE 810E 88F0 BEDE
Re: [HACKERS] kill -KILL: What happens?
Robert Haas robertmh...@gmail.com writes: On Thu, Jan 13, 2011 at 2:45 PM, Tom Lane t...@sss.pgh.pa.us wrote: I wonder whether we could have some sort of latch-like counter that would count the number of active backends and deliver signals when the count went to zero. However, if the goal is to defend against random applications of SIGKILL, there's probably no way to make this reliable in userspace. I don't think you can get there 100%. We could, however, make a rule that when a background process fails a PostmasterIsAlive() check, it sends SIGQUIT to everyone it can find in the ProcArray, which would at least ensure a timely exit in most real-world cases. You're going in the wrong direction there: we're trying to have the system remain sane when the postmaster crashes, not see how quickly it can screw up every remaining session. BTW, in Unix-land we could maybe rely on SysV semaphores' SEM_UNDO feature to keep a trustworthy count of how many live processes there are. But I don't know whether there's anything comparable for Windows. regards, tom lane
Re: [HACKERS] kill -KILL: What happens?
On Thu, Jan 13, 2011 at 2:53 PM, Robert Haas robertmh...@gmail.com wrote: I'm not convinced. I was thinking that we could simply treat it like SIGQUIT, if it's available. I doubt there's a real use case for continuing to run queries after the postmaster and all the background processes are dead. Expedited death seems like much better behavior. Even checking PostmasterIsAlive() once per query would be reasonable, except that it'd add a system call to check for a condition that almost never holds, which I'm not eager to do. If postmaster has a few fds to spare, what about having it open a pipe to every child it spawns? It never has to read or write to it, but the postmaster closing will signal the client's fd. The client just has to pop the fd into whatever normal poll/select event handling it uses to notice when the parent's pipe is closed. A FIFO would allow postmaster to not need as many file handles, and clients reading the FIFO would notice when the writer (postmaster) closes it. a. -- Aidan Van Dyk Create like a god, ai...@highrise.ca command like a king, http://www.highrise.ca/ work like a slave.
Re: [HACKERS] kill -KILL: What happens?
On Thu, Jan 13, 2011 at 3:01 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Thu, Jan 13, 2011 at 2:45 PM, Tom Lane t...@sss.pgh.pa.us wrote: I wonder whether we could have some sort of latch-like counter that would count the number of active backends and deliver signals when the count went to zero. However, if the goal is to defend against random applications of SIGKILL, there's probably no way to make this reliable in userspace. I don't think you can get there 100%. We could, however, make a rule that when a background process fails a PostmasterIsAlive() check, it sends SIGQUIT to everyone it can find in the ProcArray, which would at least ensure a timely exit in most real-world cases. You're going in the wrong direction there: we're trying to have the system remain sane when the postmaster crashes, not see how quickly it can screw up every remaining session. I strongly believe you're in the minority on that one, for the same reasons that I don't think most people would agree with your notion of what should be the default shutdown mode. A database that can't accept new connections is a liability, not an asset. -- Robert Haas
Re: [HACKERS] Allowing multiple concurrent base backups
On Thu, Jan 13, 2011 at 2:19 PM, Ross J. Reedstrom reeds...@rice.edu wrote: On Tue, Jan 11, 2011 at 11:06:18AM -0800, Josh Berkus wrote: It makes it very convenient to set up standbys, without having to worry that you'll conflict e.g. with a nightly backup. I don't imagine people will use streaming base backups for very large databases anyway. Also, imagine that you're provisioning a 10-node replication cluster on EC2. This would make that worlds easier. Hmm, perhaps. My concern is that a naive attempt to do that is going to have 10 base backups happening at the same time, completely slamming the master, and none of them completing in a reasonable time. Is this possible, or is it that simultaneity will buy you hot caches and backups #2 - #10 all run faster? That's going to depend on the situation. If the database fits in memory, then it's just going to work. If it fits on disk, it's less obvious whether it'll be good or bad, but an arbitrary limitation here doesn't serve us well. P.S. Your reply-to header is busted. -- Robert Haas
Re: [HACKERS] kill -KILL: What happens?
On Jan 13, 2011, at 21:01, Aidan Van Dyk wrote: On Thu, Jan 13, 2011 at 2:53 PM, Robert Haas robertmh...@gmail.com wrote: I'm not convinced. I was thinking that we could simply treat it like SIGQUIT, if it's available. I doubt there's a real use case for continuing to run queries after the postmaster and all the background processes are dead. Expedited death seems like much better behavior. Even checking PostmasterIsAlive() once per query would be reasonable, except that it'd add a system call to check for a condition that almost never holds, which I'm not eager to do. If postmaster has a few fds to spare, what about having it open a pipe to every child it spawns. It never has to read/write to it, but postmaster closing will signal the client's fd. The client just has to pop the fd into whatever normal poll/select event handling it uses to notice when the parent's pipe is closed. I just started to experiment with that idea, and wrote a small test program to check if that'd work. I'll post the results when I'm done. best regards, Florian Pflug
Re: [HACKERS] kill -KILL: What happens?
Robert Haas robertmh...@gmail.com wrote: A database that can't accept new connections is a liability, not an asset. +1 I have so far been unable to imagine a use case for the production databases I use where I would prefer to see backends continue after postmaster failure. -Kevin
Re: [HACKERS] kill -KILL: What happens?
Robert Haas robertmh...@gmail.com writes: I strongly believe you're in the minority on that one, for the same reasons that I don't think most people would agree with your notion of what should be the default shutdown mode. A database that can't accept new connections is a liability, not an asset. Killing active sessions when it's not absolutely necessary is not an asset. regards, tom lane
Re: [HACKERS] kill -KILL: What happens?
On Thu, Jan 13, 2011 at 21:37, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: I strongly believe you're in the minority on that one, for the same reasons that I don't think most people would agree with your notion of what should be the default shutdown mode. A database that can't accept new connections is a liability, not an asset. Killing active sessions when it's not absolutely necessary is not an asset. It certainly can be. Consider any connection pooling scenario, which would represent the vast majority of larger deployments today - if you don't kill the sessions, they will never go away. -- Magnus Hagander
Re: [HACKERS] kill -KILL: What happens?
Aidan Van Dyk ai...@highrise.ca writes: If postmaster has a few fds to spare, what about having it open a pipe to every child it spawns. It never has to read/write to it, but postmaster closing will signal the client's fd. The client just has to pop the fd into whatever normal poll/select event handling it uses to notice when the parent's pipe is closed. Hmm. Or more generally: there's one FIFO. The postmaster holds both sides open. Backends hold the write side open. (They can close the read side, but that would just be to free up a FD.) Background children close the write side. Now a background process can use EOF on the read side of the FIFO to tell it that postmaster and all backends have exited. You still don't get a signal, but at least the condition you're testing for is the one we actually want and not an approximation. regards, tom lane
Re: [HACKERS] Allowing multiple concurrent base backups
On 1/13/11 12:11 PM, Robert Haas wrote: That's going to depend on the situation. If the database fits in memory, then it's just going to work. If it fits on disk, it's less obvious whether it'll be good or bad, but an arbitrary limitation here doesn't serve us well. FWIW, if we had this feature right now in 9.0 we (PGX) would be using it. We run into the case of DB in memory, multiple slaves fairly often these days. -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com
Re: [HACKERS] kill -KILL: What happens?
On Thu, Jan 13, 2011 at 09:18:06PM +0100, Florian Pflug wrote: On Jan 13, 2011, at 21:01, Aidan Van Dyk wrote: On Thu, Jan 13, 2011 at 2:53 PM, Robert Haas robertmh...@gmail.com wrote: I'm not convinced. I was thinking that we could simply treat it like SIGQUIT, if it's available. I doubt there's a real use case for continuing to run queries after the postmaster and all the background processes are dead. Expedited death seems like much better behavior. Even checking PostmasterIsAlive() once per query would be reasonable, except that it'd add a system call to check for a condition that almost never holds, which I'm not eager to do. If postmaster has a few fds to spare, what about having it open a pipe to every child it spawns. It never has to read/write to it, but postmaster closing will signal the client's fd. The client just has to pop the fd into whatever normal poll/select event handling it uses to notice when the parent's pipe is closed. I just started to experiment with that idea, and wrote a small test program to check if that'd work. I'll post the results when I'm done. Great! :) Cheers, David. -- David Fetter da...@fetter.org http://fetter.org/
Re: [HACKERS] kill -KILL: What happens?
On Thu, Jan 13, 2011 at 3:37 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: I strongly believe you're in the minority on that one, for the same reasons that I don't think most people would agree with your notion of what should be the default shutdown mode. A database that can't accept new connections is a liability, not an asset. Killing active sessions when it's not absolutely necessary is not an asset. That's a highly arguable point and I certainly don't agree with it. A database with no postmaster and no background processes can't possibly be expected to function in any sort of halfway reasonable way. In particular: 1. No checkpoints will occur, so the time required for recovery will grow longer without bound. 2. All walsenders will exit, so no transactions will be replicated to standbys. 3. Transactions committed asynchronously won't be flushed to disk, and are lost entirely unless enough other WAL activity occurs before the last backend dies to force a WAL write. 4. Autovacuum won't run until the system is properly restarted, and to make matters worse there's no statistics collector, so the information that might trigger a later run will be lost also. 5. At some point, you'll run out of clean buffers, after which performance will start to suck as backends have to do their own writes. 6. At some probably later point, the fsync request queue will fill up, after which performance will go into the toilet. On 9.1devel, this takes less than a minute of moderate activity on my MacOS X machine. All in all, running for any significant period of time in this state is likely a recipe for disaster, even if for some inexplicable reason you don't care about the fact that the system won't accept any new connections. 
-- Robert Haas
Re: [HACKERS] Allowing multiple concurrent base backups
On 13.01.2011 22:57, Josh Berkus wrote: On 1/13/11 12:11 PM, Robert Haas wrote: That's going to depend on the situation. If the database fits in memory, then it's just going to work. If it fits on disk, it's less obvious whether it'll be good or bad, but an arbitrary limitation here doesn't serve us well. FWIW, if we had this feature right now in 9.0 we (PGX) would be using it. We run into the case of DB in memory, multiple slaves fairly often these days. Anyway, here's an updated patch with all the known issues fixed. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index 5b6a230..400e12e 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -60,8 +60,6 @@ /* File path names (all relative to $PGDATA) */ -#define BACKUP_LABEL_FILE backup_label -#define BACKUP_LABEL_OLD backup_label.old #define RECOVERY_COMMAND_FILE recovery.conf #define RECOVERY_COMMAND_DONE recovery.done @@ -338,7 +336,8 @@ typedef struct XLogCtlInsert XLogPageHeader currpage; /* points to header of block in cache */ char *currpos; /* current insertion point in cache */ XLogRecPtr RedoRecPtr; /* current redo point for insertions */ - bool forcePageWrites; /* forcing full-page writes for PITR? */ + int forcePageWrites; /* forcing full-page writes for PITR? */ + bool exclusiveBackup; /* a backup was started with pg_start_backup() */ } XLogCtlInsert; /* @@ -8313,16 +8312,38 @@ pg_start_backup(PG_FUNCTION_ARGS) backupidstr = text_to_cstring(backupid); - startpoint = do_pg_start_backup(backupidstr, fast); + startpoint = do_pg_start_backup(backupidstr, fast, NULL); snprintf(startxlogstr, sizeof(startxlogstr), %X/%X, startpoint.xlogid, startpoint.xrecoff); PG_RETURN_TEXT_P(cstring_to_text(startxlogstr)); } +/* + * do_pg_start_backup is the workhorse of the user-visible pg_start_backup() + * function. 
It creates the necessary starting checkpoint and constructs the
+ * backup label file.
+ *
+ * There are two kinds of backups: exclusive and non-exclusive. An exclusive
+ * backup is started with pg_start_backup(), and there can be only one active
+ * at a time. The backup label file of an exclusive backup is written to
+ * $PGDATA/backup_label, and it is removed by pg_stop_backup().
+ *
+ * A non-exclusive backup is used for the streaming base backups (see
+ * src/backend/replication/basebackup.c). The difference from exclusive backups
+ * is that the backup label file is not written to disk. Instead, its would-be
+ * contents are returned in *labelfile, and the caller is responsible for
+ * including it in the backup archive as 'backup_label'. There can be many
+ * non-exclusive backups active at the same time, and they don't conflict
+ * with exclusive backups either.
+ *
+ * Every successfully started non-exclusive backup must be stopped by calling
+ * do_pg_stop_backup() or do_pg_abort_backup().
+ */
 XLogRecPtr
-do_pg_start_backup(const char *backupidstr, bool fast)
+do_pg_start_backup(const char *backupidstr, bool fast, char **labelfile)
 {
+	bool		exclusive = (labelfile == NULL);
 	XLogRecPtr	checkpointloc;
 	XLogRecPtr	startpoint;
 	pg_time_t	stamp_time;
@@ -8332,6 +8353,7 @@ do_pg_start_backup(const char *backupidstr, bool fast)
 	uint32		_logSeg;
 	struct stat stat_buf;
 	FILE	   *fp;
+	StringInfoData labelfbuf;
 	if (!superuser() && !is_authenticated_user_replication_role())
 		ereport(ERROR,
@@ -8350,6 +8372,12 @@ do_pg_start_backup(const char *backupidstr, bool fast)
 				 errmsg("WAL level not sufficient for making an online backup"),
 				 errhint("wal_level must be set to \"archive\" or \"hot_standby\" at server start.")));
+	if (strlen(backupidstr) > MAXPGPATH)
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				 errmsg("backup label too long (max %d bytes)",
+						MAXPGPATH)));
+
 	/*
 	 * Mark backup active in shared memory.
We must do full-page WAL writes
 	 * during an on-line backup even if not doing so at other times, because
@@ -8368,15 +8396,19 @@ do_pg_start_backup(const char *backupidstr, bool fast)
 	 * ensure adequate interlocking against XLogInsert().
 	 */
 	LWLockAcquire(WALInsertLock, LW_EXCLUSIVE);
-	if (XLogCtl->Insert.forcePageWrites)
+	if (exclusive)
 	{
-		LWLockRelease(WALInsertLock);
-		ereport(ERROR,
-				(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
-				 errmsg("a backup is already in progress"),
-				 errhint("Run pg_stop_backup() and try again.")));
+		if (XLogCtl->Insert.exclusiveBackup)
+		{
+			LWLockRelease(WALInsertLock);
+			ereport(ERROR,
+					(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+					 errmsg("a backup is already in progress"),
+					 errhint("Run pg_stop_backup() and try again.")));
+		}
+		XLogCtl->Insert.exclusiveBackup = true;
 	}
-	XLogCtl->Insert.forcePageWrites = true;
+	XLogCtl->Insert.forcePageWrites++;
 	LWLockRelease(WALInsertLock);
 	/*
@@ -8393,7 +8425,7 @@
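The key concurrency change in the hunk above is turning forcePageWrites from a bool into a counter, with a separate flag for the single allowed exclusive backup. As a standalone illustration of that bookkeeping (the BackupState struct and function names here are illustrative only, not the actual shared-memory layout or locking):

```c
#include <stdbool.h>

/* Illustrative stand-in for the relevant XLogCtlInsert fields. */
typedef struct
{
	int		forcePageWrites;	/* > 0 while any backup is in progress */
	bool	exclusiveBackup;	/* an exclusive pg_start_backup() is active */
} BackupState;

/* Returns false if an exclusive backup is already running. */
static bool
start_backup(BackupState *s, bool exclusive)
{
	if (exclusive)
	{
		if (s->exclusiveBackup)
			return false;		/* "a backup is already in progress" */
		s->exclusiveBackup = true;
	}
	s->forcePageWrites++;		/* each active backup holds one reference */
	return true;
}

static void
stop_backup(BackupState *s, bool exclusive)
{
	if (exclusive)
		s->exclusiveBackup = false;
	s->forcePageWrites--;		/* full-page writes stay forced until zero */
}
```

Any number of non-exclusive (streaming) backups can overlap, and full-page writes remain forced as long as the counter is non-zero; only a second exclusive backup is rejected.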
Re: [HACKERS] reviewers needed!
On Tue, Jan 11, 2011 at 9:17 PM, Robert Haas robertmh...@gmail.com wrote: [ abject plea for reviewers ] So far I have 6 people who have volunteered to be round-robin reviewers, and 7 people who are listed as reviewers on the CF site already. That leaves 45 patches without a reviewer, plus whatever comes in in the next day or so. This is not going to work unless a lot more people pitch in. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] reviewers needed!
So far I have 6 people who have volunteered to be round-robin reviewers, and 7 people who are listed as reviewers on the CF site already. That leaves 45 patches without a reviewer, plus whatever comes in in the next day or so. This is not going to work unless a lot more people pitch in. I'll be joining as an RRR on the 24th. I'm too booked before then. -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com
Re: [HACKERS] kill -KILL: What happens?
On Thu, Jan 13, 2011 at 02:21:44PM -0500, Tom Lane wrote: David Fetter da...@fetter.org writes: I get that we can't prevent all pilot error, but I was hoping we could bullet-proof this a little more, especially in light of a certain extremely popular server OS's OOM killer's default behavior. Yes, I get that that behavior is crazy, and stupid, and that people should shut it off, but it *is* our problem if we let the postmaster start (or continue) when it's set that way. Packagers who are paying attention have fixed that ;-) Are we privileging packaged over unpackaged? Some distro over others? ;) Cheers, David. -- David Fetter da...@fetter.org http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
[HACKERS] auto-sizing wal_buffers
On Thu, Jan 6, 2011 at 11:37 PM, Greg Smith g...@2ndquadrant.com wrote: If it defaulted to 3% of shared_buffers, min 64K max 16MB for the auto setting, it would for the most part become an autotuned parameter. That would make it 0.75 to 1MB at the standard anemic Linux default kernel parameters. Maybe more than some would like, but dropping shared_buffers from 24MB to 23MB to keep this from being ridiculously undersized is probably a win. That percentage would reach 16MB by the time shared_buffers was increased to 533MB, which also seems about right to me. On a really bad setup (brief pause to flip off Apple) with only 4MB to work with total, you'd end up with wal_buffers between 64 and 128K, so very close to the status quo. Code that up, and we could probably even remove the parameter as a tunable altogether. Very few would see a downside relative to any sensible configuration under the current situation, and many people would notice better automagic performance with one less parameter to tweak. Given the recent investigations about the serious downsides of tiny wal_buffers values on new Linux kernels when using open_datasync, a touch more aggression about this setting seems particularly appropriate to consider now. That's been swapped out as the default, but it's still possible people will switch to it. Would anyone like to argue vigorously for or against the above proposal? I'll start: I think this is a good idea. I don't have a strong opinion on whether the exact details Greg proposes above are precisely optimal, but I think they're in the right ballpark. Furthermore, we already have other things that are tuned in somewhat similar ways (e.g. the size of the fsync request queue defaults to the number of shared buffers) so there's precedent for it. It's one less parameter that you have to set to make things just work.
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
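For concreteness, the rule being proposed — 3% of shared_buffers, clamped between 64kB and 16MB — can be sketched as a small helper (hypothetical code, not actual PostgreSQL source; sizes are in bytes):

```c
/* Hypothetical sketch of Greg Smith's autotune rule:
 * wal_buffers = 3% of shared_buffers, clamped to [64kB, 16MB]. */
static long
autotune_wal_buffers(long shared_buffers)
{
	long	wal_buffers = shared_buffers / 100 * 3;

	if (wal_buffers < 64L * 1024)
		wal_buffers = 64L * 1024;			/* floor: 64kB */
	if (wal_buffers > 16L * 1024 * 1024)
		wal_buffers = 16L * 1024 * 1024;	/* ceiling: 16MB */
	return wal_buffers;
}
```

With 24MB of shared_buffers this yields roughly 0.72MB, matching the "0.75 to 1MB" figure quoted above, and the 16MB ceiling is reached at about 533MB of shared_buffers.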
Re: [HACKERS] auto-sizing wal_buffers
On Thu, Jan 13, 2011 at 23:19, Robert Haas robertmh...@gmail.com wrote: Would anyone like to argue vigorously for or against the above proposal? I'll start: I think this is a good idea. [...] It's one less parameter that you have to set to make things just work. +1, I like the idea. Would it still be there to override if necessary?
-- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [HACKERS] auto-sizing wal_buffers
On Thu, Jan 13, 2011 at 5:29 PM, Magnus Hagander mag...@hagander.net wrote: +1, I like the idea. Would it still be there to override if necessary? Depends what people want to do. We could make the default 0kB, and define that to mean auto-tune, or we could remove the parameter altogether. I think I was envisioning the latter, but if people are hesitant to do that we could do the former instead. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] kill -KILL: What happens?
On Jan13, 2011, at 21:42 , Tom Lane wrote: Aidan Van Dyk ai...@highrise.ca writes: If postmaster has a few fds to spare, what about having it open a pipe to every child it spawns. It never has to read/write to it, but postmaster closing will signal the client's fd. The client just has to pop the fd into whatever normal poll/select event handling it uses to notice when the parent's pipe is closed. Hmm. Or more generally: there's one FIFO. The postmaster holds both sides open. Backends hold the write side open. (They can close the read side, but that would just be to free up a FD.) Background children close the write side. Now a background process can use EOF on the read side of the FIFO to tell it that postmaster and all backends have exited. You still don't get a signal, but at least the condition you're testing for is the one we actually want and not an approximation. I was thinking along a similar line, and put together a small test case to prove that this actually works. The attached test program simulates the interactions of a parent process (think postmaster), some utility processes (think walwriter, bgwriter, ...) and some backends. It uses two pairs of fds created with pipe(), called LifeSignParent and LifeSignParentBackends. The writing end of the former is held open only in the parent process, while the writing end of the latter is held open in the parent process and all regular backend processes. Backend processes use select() to monitor the reading end of the LifeSignParent fd pair. Since nothing is ever written to the writing end, the fd becomes readable only when the parent exits, because that is how select() signals EOF. Once that happens the backend exits. The utility processes do the same, but monitor the reading end of LifeSignParentBackends, and thus exit only after the parent and all regular backends have died. Since the lifesign checking uses select(), any place that already uses select can easily check for vanishing life signs.
CHECK_FOR_INTERRUPTS could simply check the life sign once every few seconds. If we want an absolutely reliable signal instead of checking in CHECK_FOR_INTERRUPTS, every backend would need to launch a monitor subprocess which monitors the life sign, and exits once it vanishes. The backend would then get a SIGCHLD once the postmaster dies. Seems like overkill, though. The whole thing won't work on Windows, since even if it's got a pipe() or socketpair() call, with EXEC_BACKEND there's no way of transferring these fds to the child processes. AFAIK, however, Windows has other means with which such life signs can be implemented. For example, I seem to remember that WaitForMultipleObjects() can be used to wait for process-related events. But Windows really isn't my area of expertise... I have tested this on the latest Ubuntu LTS release (10.04.1) as well as Mac OS X 10.6.6, and it seems to work correctly on both systems. I'd be happy to hear from anyone who has access to other systems on whether this works or not. The expected output is
Launched utility 5095
Launched backend 5097
Launched utility 5096
Launched backend 5099
Launched backend 5098
Utility 5095 detected live parent or backend
Backend 5097 detected live parent
Utility 5096 detected live parent or backend
Backend 5099 detected live parent
Backend 5098 detected live parent
Parent exiting
Backend 5097 exiting after parent died
Backend 5098 exiting after parent died
Backend 5099 exiting after parent died
Utility 5096 exiting after parent and backends died
Utility 5095 exiting after parent and backends died
Everything after "Parent exiting" might be interleaved with a shell prompt, of course. best regards, Florian Pflug liveness.c Description: Binary data
Re: [HACKERS] auto-sizing wal_buffers
Depends what people want to do. We could make the default 0kB, and define that to mean auto-tune, or we could remove the parameter altogether. I think I was envisioning the latter, but if people are hesitant to do that we could do the former instead. Unfortunately, we might still need a manual parameter for override because of the interaction between wal_buffers and synchronous_commit=off, since it sets the max size of the unflushed data buffer. Discuss? And the auto setting should be -1, not 0kB. We use -1 for "use default" for several other GUCs. Other than that, I think Greg's numbers are fine, and strongly support having one less thing to tune. -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com
Re: [HACKERS] auto-sizing wal_buffers
On Thu, Jan 13, 2011 at 6:02 PM, Josh Berkus j...@agliodbs.com wrote: Depends what people want to do. We could make the default 0kB, and define that to mean auto-tune, or we could remove the parameter altogether. I think I was envisioning the latter, but if people are hesitant to do that we could do the former instead. Unfortunately, we might still need a manual parameter for override because of the interaction between wal_buffers and synchronous_commit=off, since it sets the max size of the unflushed data buffer. Discuss? Do we have any evidence there's actually a problem in that case, or that a larger value of wal_buffers solves it? I mean, the background writer is going to start a background flush as quickly as it can... And the auto setting should be -1, not 0kB. We use -1 for "use default" for several other GUCs. No can do. Gotta have things in the same units. Other than that, I think Greg's numbers are fine, and strongly support having one less thing to tune. OK. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
[HACKERS] Do magic using pg_depend
https://github.com/gluefinance/pov/blob/master/doc/example_database.sql Please feel free to put any of this on the PostgreSQL wiki as suggested by David Fetter. This is an example of some functionality provided and used by pov (PostgreSQL Object version control system). Most of, if not all, this stuff has already been implemented in pg_dump, but since pov is a SQL-based system it was necessary to implement the same functionality using only sql/plpgsql/plperl. Author: Joel Jacobson, Glue Finance AB, Sweden, j...@gluefinance.com Datestamp: 2011-01-13 23:42 Europe/Stockholm License: MIT (http://www.opensource.org/licenses/mit-license.php) We will learn how to do a lot of PostgreSQL-magic only by using the nice system table pg_depend. Today we will: a) create nice directed graphs of all object dependencies, b) sort all objects in a truly sorted topological creatable order, c) show create/drop commands for most of the objects. Let the show begin! Installation:
$ git clone g...@github.com:gluefinance/pov.git
$ cd pov
$ sh install_example_database.sh
a) Generate directed graph in DOT format.
COPY (SELECT diagraph FROM pov.pg_depend_dot) TO '/tmp/example_database.dot';
Then use dot (http://www.graphviz.org/) to generate graphs in svg, png, or any other format.
dot -oexample_database.png -Tpng example_database.dot
dot -oexample_database.svg -Tsvg example_database.dot
Or view it in the SQL prompt:
test=# select * from pov.pg_depend_dot;
diagraph
-
digraph pg_depend {
"function plpgsql_call_handler() 1255.11599.0" -> "language plpgsql 2612.11602.0" [color=black label="n"]
"function plpgsql_inline_handler(internal) 1255.11600.0" -> "language plpgsql 2612.11602.0" [color=black label="n"]
"function plpgsql_validator(oid) 1255.11601.0" -> "language plpgsql 2612.11602.0" [color=black label="n"]
"function plperl_call_handler() 1255.23562.0" -> "language plperl 2612.23565.0" [color=black label="n"]
"function plperl_inline_handler(internal) 1255.23563.0" -> "language plperl 2612.23565.0" [color=black label="n"]
"function plperl_validator(oid) 1255.23564.0" -> "language plperl 2612.23565.0" [color=black label="n"]
"function f1(integer) 1255.23656.0" -> "view v4 1259.23688.0" [color=black label="n"]
"function f1(integer) 1255.23656.0" -> "constraint t3_id_check on table t3 2606.23673.0" [color=black label="n"]
"table t1 1259.23651.0" -> "table t1 column id 1259.23651.1" [color=yellow label="an"]
"table t1 column id 1259.23651.1" -> "view v1 1259.23676.0" [color=black label="n"]
"table t1 column id 1259.23651.1" -> "constraint t1_pkey on table t1 2606.23655.0" [color=blue label="a"]
"table t1 column id 1259.23651.1" -> "constraint t2_id_fkey on table t2 2606.23664.0" [color=black label="n"]
"sequence s1 1259.23657.0" -> "default for table t3 column id 2604.23672.0" [color=black label="n"]
"table t2 1259.23659.0" -> "table t2 column id 1259.23659.1" [color=yellow label="an"]
"table t2 column id 1259.23659.1" -> "view v2 1259.23680.0" [color=black label="n"]
"table t2 column id 1259.23659.1" -> "constraint t2_pkey on table t2 2606.23663.0" [color=blue label="a"]
"table t2 column id 1259.23659.1" -> "constraint t2_id_fkey on table t2 2606.23664.0" [color=blue label="a"]
"table t3 1259.23669.0" -> "table t3 column id 1259.23669.1" [color=yellow label="an"]
"table t3 column id 1259.23669.1" -> "default for table t3 column id 2604.23672.0" [color=blue label="a"]
"table t3 column id 1259.23669.1" -> "constraint t3_id_check on table t3 2606.23673.0" [color=yellow label="na"]
"table t3 column id 1259.23669.1" -> "constraint t3_pkey on table t3 2606.23675.0" [color=blue label="a"]
"view v1 1259.23676.0" -> "view v1 column id 1259.23676.1" [color=black label="n"]
"view v1 column id 1259.23676.1" -> "view v3 1259.23684.0" [color=black label="n"]
"view v2 1259.23680.0" -> "view v2 column id 1259.23680.1" [color=black label="n"]
"view v2 column id 1259.23680.1" -> "view v3 1259.23684.0" [color=black label="n"]
"view v3 1259.23684.0" -> "view v3 column id1 1259.23684.1" [color=black label="n"]
"view v3 1259.23684.0" -> "view v3 column id2 1259.23684.2" [color=black label="n"]
"view v3 column id1 1259.23684.1" -> "view v4 1259.23688.0" [color=black label="n"]
"view v3 column id2 1259.23684.2" -> "view v4 1259.23688.0" [color=black label="n"]
"constraint t1_pkey on table t1 2606.23655.0" -> "constraint t2_id_fkey on table t2 2606.23664.0" [color=black label="n"]
"schema public 2615.2200.0" -> "function f1(integer) 1255.23656.0" [color=black label="n"]
"schema public 2615.2200.0" -> "table t1 1259.23651.0" [color=black label="n"]
"schema public 2615.2200.0" -> "sequence s1 1259.23657.0" [color=black label="n"]
"schema public 2615.2200.0" -> "table t2 1259.23659.0" [color=black label="n"]
"schema public 2615.2200.0" -> "table t3 1259.23669.0" [color=black label="n"]
"schema public 2615.2200.0" -> "view v1
Re: [HACKERS] auto-sizing wal_buffers
Robert, Unfortunately, we might still need a manual parameter for override because of the interaction between wal_buffers and synchronous_commit=off, since it sets the max size of the unflushed data buffer. Discuss? Do we have any evidence there's actually a problem in that case, or that a larger value of wal_buffers solves it? I mean, the background writer is going to start a background flush as quickly as it can... I don't think anyone has done any testing. However, the setting is there and some users might be convinced that they need it. And the auto setting should be -1, not 0kB. We use -1 for "use default" for several other GUCs. No can do. Gotta have things in the same units. That's certainly not true with, for example, log_temp_files. -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com
Re: [HACKERS] auto-sizing wal_buffers
Robert Haas robertmh...@gmail.com wrote: Would anyone like to argue vigorously for or against the above proposal? Greg's numbers look reasonable to me, and there's nobody I'd trust more to come up with reasonable numbers for this. One less tunable is a good thing, especially since this is designed to scale from someone slapping it on his laptop for a first quick try, all the way up to industrial strength production environments. I guess a manual override doesn't bother me too much, but I am a bit dubious of its value, and there is value in keeping the GUC count down. -Kevin
Re: [HACKERS] kill -KILL: What happens?
On Thu, 2011-01-13 at 11:14 -0800, David Fetter wrote: I get that we can't prevent all pilot error, but I was hoping we could bullet-proof this a little more, especially in light of a certain extremely popular server OS's OOM killer's default behavior. That's a good point. I'm not sure how much action can reasonably be taken, however. Yes, I get that that behavior is crazy, and stupid, and that people should shut it off, but it *is* our problem if we let the postmaster start (or continue) when it's set that way. As an aside, linux has actually changed the heuristic: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=a63d83f427fbce97a6cea0db2e64b0eb8435cd10 Regards, Jeff Davis
Re: [HACKERS] kill -KILL: What happens?
On Thu, Jan 13, 2011 at 03:29:13PM -0800, Jeff Davis wrote: On Thu, 2011-01-13 at 11:14 -0800, David Fetter wrote: I get that we can't prevent all pilot error, but I was hoping we could bullet-proof this a little more, especially in light of a certain extremely popular server OS's OOM killer's default behavior. That's a good point. I'm not sure how much action can reasonably be taken, however. We may find out from Florian's experiments :) Yes, I get that that behavior is crazy, and stupid, and that people should shut it off, but it *is* our problem if we let the postmaster start (or continue) when it's set that way. As an aside, linux has actually changed the heuristic: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=a63d83f427fbce97a6cea0db2e64b0eb8435cd10 Great! In a decade or so, no more servers will be running with an earlier kernel ;) Cheers, David. -- David Fetter da...@fetter.org http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
Re: [HACKERS] Error code for terminating connection due to conflict with recovery
On Thu, Jan 13, 2011 at 2:13 AM, Tatsuo Ishii is...@postgresql.org wrote: Ok. Here is the patch for this. I use 40P02, instead of 40004. Please add this to the currently open CommitFest: https://commitfest.postgresql.org/action/commitfest_view/open Done. Comments are welcome. Unless there's objection, I will commit it this weekend. -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese: http://www.sraoss.co.jp
Re: [HACKERS] Error code for terminating connection due to conflict with recovery
Tatsuo Ishii is...@postgresql.org writes: Please add this to the currently open CommitFest: https://commitfest.postgresql.org/action/commitfest_view/open Done. Comments are welcome. Unless there's objection, I will commit it this weekend. If you're expecting anyone to actually *review* it during the CF, that's a bit premature. regards, tom lane
Re: [HACKERS] arrays as pl/perl input arguments [PATCH]
On Thu, Jan 13, 2011 at 2:06 AM, Martijn van Oosterhout klep...@svana.org wrote: I played with this a little and it is fairly easy to make a variable such that $a is the string representation and $a[0] the first value of the array. The problem is that you can't pass such a variable into a subroutine. I played with this too:

#!/usr/bin/perl -w
use strict;

package Pg::ArrayArg;

use overload '""' => \&as_s, '@{}' => \&as_a;

sub new {
    my $proto = shift;
    my $class = ref $proto || $proto;
    bless { string => shift, array => shift }, $class;
}

sub as_s { shift->{ 'string' }; }
sub as_a { shift->{ 'array' }; }

package main;

my $aa = Pg::ArrayArg->new( '{1,2}', [ 1, 2 ] );
printf "ref = %s\n", ref $aa;
print "string = $aa\n";
printf "string = %s\n", $aa;
printf "array index = (%s, %s)\n", $aa->[ 0 ], $aa->[ 1 ];
printf "array_ref = %s\n", scalar @$aa;
print "regexp test = ";
if ($aa =~ /^{(.*)}$/) {
    print "looks like array\n";
    printf "join of split = %s\n", join ';', split /,/, $1;
} else {
    print "doesn't look like array\n";
}

Suppose one of these compatibility objects is passed into legacy code as $_[0]. The problem is that 'ref $_[0]' will return 'Pg::ArrayArg' instead of what it used to, '' (empty string). Other than that, I think it performs as people would expect. You could even change 'as_s' to generate the string on the fly as requested instead of generating both representations at instantiation. Just my $0.02.
Re: [HACKERS] SSI patch version 8
Anssi Kääriäinen anssi.kaariai...@thl.fi wrote: I think I found a problem. This is using SSI v8. The table definition:
create table test_t (id integer, val1 text, val2 integer);
create index test_idx on test_t(id) where val2 = 1;
insert into test_t (select generate_series(0, 1), 'a', 2);
insert into test_t (select generate_series(0, 10), 'a', 1);
T1:
hot2= begin transaction isolation level serializable;
hot2= select * from test_t where val2 = 1;
hot2= update test_t set val2 = 2 where val2 = 1 and id = 10;
T2:
hot2= begin transaction isolation level serializable;
hot2= select * from test_t where val2 = 1;
hot2= update test_t set val2 = 2 where val2 = 1 and id = 9;
hot2= commit;
T1:
hot2= commit;
I hope you have no objection to having the code you wrote included in the test suite which is part of the patch. Well, if you do, I'll pull it back out and invent something similar... ;-) http://git.postgresql.org/gitweb?p=users/kgrittn/postgres.git;a=commitdiff;h=2502cccbdd5e5d44be469549b91fe49c0554ec3e -Kevin
Re: [HACKERS] arrays as pl/perl input arguments [PATCH]
On Jan 13, 2011, at 4:15 PM, Stephen J. Butler wrote: Suppose one of these compatibility objects is passed into legacy code as $_[0]. The problem is that 'ref $_[0]' will return 'Pg::ArrayArg' instead of what it used to, '' (empty string). Other than that, I think it performs as people would expect. Well, frankly, since up to this patch you *never* got an ARRAY reference argument, who would be calling `ref` on it anyway? You could even change 'as_s' to generate the string on the fly as requested instead of generating both representations at instantiation. Yep. Best, David
Re: [HACKERS] auto-sizing wal_buffers
Robert Haas robertmh...@gmail.com writes: On Thu, Jan 13, 2011 at 5:29 PM, Magnus Hagander mag...@hagander.net wrote: +1, I like the idea. Would it still be there to override if necessary? Depends what people want to do. We could make the default 0kB, and define that to mean auto-tune, or we could remove the parameter altogether. I think I was envisioning the latter, but if people are hesitant to do that we could do the former instead. I think we need to keep the override capability until the autotune algorithm has proven itself in the field for a couple of years. I agree with Josh that a negative value should be used to select the autotune method. regards, tom lane
Re: [HACKERS] Error code for terminating connection due to conflict with recovery
Tatsuo Ishii is...@postgresql.org writes: Please add this to the currently open CommitFest: https://commitfest.postgresql.org/action/commitfest_view/open Done. Comments are welcome. Unless there's objection, I will commit it this weekend. If you're expecting anyone to actually *review* it during the CF, that's a bit premature. No problem waiting longer. I will wait until the end of January for now. -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese: http://www.sraoss.co.jp
Re: [HACKERS] kill -KILL: What happens?
Robert Haas robertmh...@gmail.com writes: On Thu, Jan 13, 2011 at 3:37 PM, Tom Lane t...@sss.pgh.pa.us wrote: Killing active sessions when it's not absolutely necessary is not an asset. That's a highly arguable point and I certainly don't agree with it. Your examples appear to rely on the assumption that background processes exit instantly when the postmaster dies. Which they should not. regards, tom lane
Re: [HACKERS] kill -KILL: What happens?
On Jan14, 2011, at 01:32 , Tom Lane wrote:
> Robert Haas robertmh...@gmail.com writes:
>> On Thu, Jan 13, 2011 at 3:37 PM, Tom Lane t...@sss.pgh.pa.us wrote:
>>> Killing active sessions when it's not absolutely necessary is not an
>>> asset.
>> That's a highly arguable point and I certainly don't agree with it.
>
> Your examples appear to rely on the assumption that background processes
> exit instantly when the postmaster dies. Which they should not.

Even if they stay around, no new connections will be possible once the
postmaster is gone. So this really comes down to what somebody perceives
to be a bigger problem - new connections failing or existing connections
being terminated. I don't believe there's one right answer to that.

Assume postgres is driving a website, and the postmaster crashes shortly
after a pg_dump run started. You probably won't want your website to be
offline while pg_dump is finishing its backup. If, on the other hand, your
data warehousing database is running a multi-hour query, you might prefer
that query to finish, even at the price of not being able to accept new
connections.

So maybe there should be a GUC for this?

best regards,
Florian Pflug
Re: [HACKERS] kill -KILL: What happens?
Florian Pflug f...@phlo.org writes:
> I don't believe there's one right answer to that.

Right. Force-kill presumes there is only one right answer.

> Assume postgres is driving a website, and the postmaster crashes shortly
> after a pg_dump run started. You probably won't want your website to be
> offline while pg_dump is finishing its backup. If, on the other hand,
> your data warehousing database is running a multi-hour query, you might
> prefer that query to finish, even at the price of not being able to
> accept new connections. So maybe there should be a GUC for this?

No need (and rather inflexible anyway). If you don't want an orphaned
backend to continue, you send it SIGTERM.

			regards, tom lane
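Tom's SIGTERM suggestion can be exercised against any process; here is a
minimal shell sketch in which a background `sleep` stands in for an
orphaned backend (with a real backend you would take the PID from
pg_stat_activity or `ps` instead):

```shell
# Stand-in for an orphaned backend: a background process whose PID we know.
sleep 60 &
pid=$!

# Ask it to shut down, as you would an orphaned backend.
kill -TERM "$pid"

# Reap it; an exit status of 143 (128 + 15) means "terminated by SIGTERM".
wait "$pid"
status=$?
echo "exit status: $status"
```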
Re: [HACKERS] kill -KILL: What happens?
On Thu, Jan 13, 2011 at 7:32 PM, Tom Lane t...@sss.pgh.pa.us wrote:
> Robert Haas robertmh...@gmail.com writes:
>> On Thu, Jan 13, 2011 at 3:37 PM, Tom Lane t...@sss.pgh.pa.us wrote:
>>> Killing active sessions when it's not absolutely necessary is not an
>>> asset.
>> That's a highly arguable point and I certainly don't agree with it.
>
> Your examples appear to rely on the assumption that background processes
> exit instantly when the postmaster dies. Which they should not.

But they do.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] kill -KILL: What happens?
On Thu, Jan 13, 2011 at 8:10 PM, Tom Lane t...@sss.pgh.pa.us wrote:
> Florian Pflug f...@phlo.org writes:
>> I don't believe there's one right answer to that.
>
> Right. Force-kill presumes there is only one right answer.
>
>> Assume postgres is driving a website, and the postmaster crashes shortly
>> after a pg_dump run started. You probably won't want your website to be
>> offline while pg_dump is finishing its backup. If, on the other hand,
>> your data warehousing database is running a multi-hour query, you might
>> prefer that query to finish, even at the price of not being able to
>> accept new connections. So maybe there should be a GUC for this?
>
> No need (and rather inflexible anyway). If you don't want an orphaned
> backend to continue, you send it SIGTERM.

It is not easy to make this work in such a way that you can ensure a
clean, automatic restart of PostgreSQL after a postmaster death. Which is
what at least some people want.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] kill -KILL: What happens?
Robert Haas robertmh...@gmail.com writes:
> On Thu, Jan 13, 2011 at 8:10 PM, Tom Lane t...@sss.pgh.pa.us wrote:
>> Florian Pflug f...@phlo.org writes:
>>> So maybe there should be a GUC for this?
>> No need (and rather inflexible anyway). If you don't want an orphaned
>> backend to continue, you send it SIGTERM.
>
> It is not easy to make this work in such a way that you can ensure a
> clean, automatic restart of PostgreSQL after a postmaster death. Which
> is what at least some people want.

True. It strikes me also that the postmaster does provide some services
other than accepting new connections:

* ensuring that everybody gets killed if a backend crashes

* respawning autovac launcher and other processes that might exit
  harmlessly

* is there still any cross-backend signaling that goes through the
  postmaster? We got rid of the sinval case, but I don't recall if
  there's others.

While you could probably live without these in the scenario of "let my
honking big query finish before restarting", you would not want to do
without them in unattended operation.

			regards, tom lane
Re: [HACKERS] kill -KILL: What happens?
On Thu, Jan 13, 2011 at 8:28 PM, Tom Lane t...@sss.pgh.pa.us wrote:
> Robert Haas robertmh...@gmail.com writes:
>> On Thu, Jan 13, 2011 at 8:10 PM, Tom Lane t...@sss.pgh.pa.us wrote:
>>> Florian Pflug f...@phlo.org writes:
>>>> So maybe there should be a GUC for this?
>>> No need (and rather inflexible anyway). If you don't want an orphaned
>>> backend to continue, you send it SIGTERM.
>> It is not easy to make this work in such a way that you can ensure a
>> clean, automatic restart of PostgreSQL after a postmaster death. Which
>> is what at least some people want.
>
> True. It strikes me also that the postmaster does provide some services
> other than accepting new connections:
>
> * ensuring that everybody gets killed if a backend crashes
>
> * respawning autovac launcher and other processes that might exit
>   harmlessly
>
> * is there still any cross-backend signaling that goes through the
>   postmaster? We got rid of the sinval case, but I don't recall if
>   there's others.
>
> While you could probably live without these in the scenario of "let my
> honking big query finish before restarting", you would not want to do
> without them in unattended operation.

Yep. I'm pretty doubtful that you're going to want to do without them
even in that case, but you're surely not going to want to do without them
in unattended operation.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Fixing GIN for empty/null/full-scan cases
Robert Haas wrote:
> On Tue, Jan 4, 2011 at 4:49 PM, Tom Lane t...@sss.pgh.pa.us wrote:
>> Robert Haas robertmh...@gmail.com writes:
>>> On Tue, Jan 4, 2011 at 4:09 PM, Tom Lane t...@sss.pgh.pa.us wrote:
>>>> * Existing GIN indexes are upwards compatible so far as on-disk
>>>> storage goes, but they will of course be missing entries for empty,
>>>> null, or null-containing items. Users who want to do searches that
>>>> should find such items will need to reindex after updating to 9.1.
>>>
>>> This is the only part of this proposal that bothers me a little bit.
>>> It would be nice if the system could determine whether a GIN index is
>>> upgraded from 9.0 or earlier and thus doesn't contain these entries -
>>> and avoid trying to use the index for these sorts of queries in cases
>>> where it might return wrong answers.
>>
>> I don't think it's really worth the trouble. The GIN code has been
>> broken for these types of queries since day one, and yet we've had only
>> maybe half a dozen complaints about it. Moreover there's no practical
>> way to avoid trying to use the index, since in many cases the fact that
>> a query requires a full-index scan isn't determinable at plan time. The
>> best we could really do is throw an error at indexscan start, and that
>> doesn't seem all that helpful. But it probably wouldn't take much code
>> either, if you're satisfied with that answer. (I'm envisioning adding a
>> version ID to the GIN metapage and then checking that before proceeding
>> with a full-index scan.)
>
> I'd be satisfied with that answer. It at least makes it a lot more clear
> when you've got a problem. If this were a more common scenario, I'd
> probably advocate for a better solution, but the one you propose seems
> adequate given the frequency of the problem as you describe it.

What does pg_upgrade need to do about this for 9.1? Just tell people they
might get a GIN error someday?

--
Bruce Momjian br...@momjian.us  http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true.
+
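The reindexing step mentioned at the top of the thread is an ordinary
index rebuild; a minimal sketch, with hypothetical index and table names:

```sql
-- Rebuild a single pre-9.1 GIN index so that empty, null, and
-- null-containing items become searchable (name is hypothetical):
REINDEX INDEX my_gin_index;

-- Or rebuild all indexes on the table in one go:
REINDEX TABLE my_table;
```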