Re: [HACKERS] arrays as pl/perl input arguments [PATCH]
On Thu, Jan 13, 2011 at 12:06:33AM -0700, Alex Hunsaker wrote:
> I had supposed that it would be possible to do the string conversion lazily, ie, only if the string value was actually demanded.

Yep, in fact if we wanted we could even die (or throw an exception, in other-language speak :) ) when the string value is demanded. I played with this a little and it is fairly easy to make a variable such that $a is the string representation and $a[0] the first value of the array. The problem is that you can't pass such a variable into a subroutine. I was thinking, however, that if the parameters of the function have names you can use, then you can make it work. $_[0] would still go the old way, but the named parameters could be the array.

== cut ==
#!/usr/bin/perl -w
use strict;
no strict 'vars';

package MyClass;

sub TIESCALAR {
    my $class = shift;
    my $self  = shift;
    return bless $self, $class;
}

sub FETCH {
    my $self = shift;
    return join(',', @$self);
}

my @a = (1, 2);
tie $a, MyClass, \@a;
print "\$a='$a'\n";
print "\$a[0]='$a[0]'\n";
== cut ==

-- Martijn van Oosterhout klep...@svana.org http://svana.org/kleptog/
Patriotism is when love of your own people comes first; nationalism, when hate for people other than your own comes first. - Charles de Gaulle
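For readers less familiar with Perl's tie mechanism, the lazy-stringification idea above can be sketched in Python as well (illustrative only, not part of any proposed patch): the value behaves as an array, and its string form is computed only when it is actually demanded - which is also the point where one could raise an error instead of converting.

```python
class LazyArray:
    """Array-like value whose string form is built only on demand."""

    def __init__(self, values):
        self.values = values

    def __getitem__(self, i):
        # Element access never triggers stringification.
        return self.values[i]

    def __str__(self):
        # String conversion happens lazily, only when demanded;
        # this is also where one could raise instead of converting.
        return ",".join(str(v) for v in self.values)

a = LazyArray([1, 2])
assert a[0] == 1            # array access, no string conversion
assert str(a) == "1,2"      # string form computed here, on demand
```

Unlike the Perl tie approach, a Python object like this survives being passed into a function unchanged, which is exactly the property the tied scalar lacks.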
[HACKERS] Bug in walreceiver
Hi,

When the master shuts down or crashes, there seems to be a case where walreceiver exits without flushing WAL which has already been written. This might lead the startup process to replay un-flushed WAL and break the Write-Ahead Logging rule.

walreceiver.c:

    /* Wait a while for data to arrive */
    if (walrcv_receive(NAPTIME_PER_CYCLE, &type, &buf, &len))
    {
        /* Accept the received data, and process it */
        XLogWalRcvProcessMsg(type, buf, len);

        /* Receive any more data we can without sleeping */
        while (walrcv_receive(0, &type, &buf, &len))
            XLogWalRcvProcessMsg(type, buf, len);

        /*
         * If we've written some records, flush them to disk and let the
         * startup process know about them.
         */
        XLogWalRcvFlush();
    }

The problematic case happens when the latter walrcv_receive emits ERROR. In this case, the WAL received by the former walrcv_receive is not guaranteed to have been flushed yet. The attached patch ensures that all WAL received is flushed to disk before walreceiver exits. This patch should be backported to 9.0, I think.

Regards,

-- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center

flush_before_walreceiver_exit_v1.patch Description: Binary data

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
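The shape of the proposed fix is a flush on every exit path, including the error path. A sketch of that control flow in Python (illustrative only; the real code is C in walreceiver.c, and the helper names here are made up):

```python
def receive_loop(receive, process, flush):
    """Mirror of the proposed fix: whatever has been written must be
    flushed before the receiver exits, even if receive() errors out."""
    try:
        while True:
            msg = receive()
            if msg is None:          # no more data without sleeping
                break
            process(msg)             # write WAL to disk (not yet flushed)
    finally:
        flush()                      # flush written-but-unflushed WAL on any exit

# Toy demonstration: the second receive fails, like the latter
# walrcv_receive emitting ERROR, yet the first record still gets flushed.
written, flushed = [], []

def failing_receive():
    if not written:
        return b"rec1"
    raise IOError("connection to master lost")

try:
    receive_loop(failing_receive, written.append,
                 lambda: flushed.extend(written))
except IOError:
    pass

assert flushed == [b"rec1"]          # rec1 was flushed despite the error
```

Without the try/finally (the pre-patch structure), the flush at the bottom of the loop body is simply never reached when the error occurs, which is the window the patch closes.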
Re: [HACKERS] Bug in walreceiver
On 13.01.2011 10:28, Fujii Masao wrote:
> When the master shuts down or crashes, there seems to be a case where walreceiver exits without flushing WAL which has already been written. This might lead the startup process to replay un-flushed WAL and break the Write-Ahead Logging rule.

Hmm, that can happen at a crash even with no replication involved. If you kill -9 postmaster, and some WAL had been written but not fsync'd, on crash recovery we will happily recover the unsynced WAL. We could prevent that by fsyncing all WAL before applying it - presumably fsyncing a file that has already been flushed is quick. But is it worth the trouble?

> walreceiver.c:
>
>     /* Wait a while for data to arrive */
>     if (walrcv_receive(NAPTIME_PER_CYCLE, &type, &buf, &len))
>     {
>         /* Accept the received data, and process it */
>         XLogWalRcvProcessMsg(type, buf, len);
>
>         /* Receive any more data we can without sleeping */
>         while (walrcv_receive(0, &type, &buf, &len))
>             XLogWalRcvProcessMsg(type, buf, len);
>
>         /*
>          * If we've written some records, flush them to disk and let the
>          * startup process know about them.
>          */
>         XLogWalRcvFlush();
>     }
>
> The problematic case happens when the latter walrcv_receive emits ERROR. In this case, the WAL received by the former walrcv_receive is not guaranteed to have been flushed yet. The attached patch ensures that all WAL received is flushed to disk before walreceiver exits. This patch should be backported to 9.0, I think.

Yeah, we probably should do that, even though it doesn't completely close the window that unsynced WAL is replayed.

-- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
[HACKERS] Bug in pg_dump
The example from Tom Lane below results in a database which is not possible to correctly dump using pg_dump. The view v1 strangely becomes a table in the dump output?! It's probably a quite useless database to dump in the first place, but that is no excuse to generate an invalid dump; it would be better to throw an error, complain that the database cannot be dumped consistently, and refuse to dump, or something like that.

regression=# \d
        List of relations
 Schema | Name | Type  |  Owner
--------+------+-------+----------
 public | tt   | table | postgres
 public | v1   | view  | postgres
 public | v2   | view  | postgres
(3 rows)

ubuntu@ubuntu:/crypt/postgresql-9.1alpha3/src/bin/pg_dump$ ./pg_dump regression | grep -v -E '^--' | grep -E '^.+$' | grep -v SET
CREATE OR REPLACE PROCEDURAL LANGUAGE plpgsql;
ALTER PROCEDURAL LANGUAGE plpgsql OWNER TO ubuntu;
CREATE TABLE tt (
    f1 integer,
    f2 integer
);
ALTER TABLE public.tt OWNER TO postgres;
CREATE TABLE v1 (
    f1 integer,
    f2 integer
);
ALTER TABLE public.v1 OWNER TO postgres;
CREATE VIEW v2 AS SELECT v1.f1, v1.f2 FROM v1;
ALTER TABLE public.v2 OWNER TO postgres;
COPY tt (f1, f2) FROM stdin;
\.
CREATE RULE _RETURN AS ON SELECT TO v1 DO INSTEAD SELECT v2.f1, v2.f2 FROM v2;
REVOKE ALL ON SCHEMA public FROM PUBLIC;
REVOKE ALL ON SCHEMA public FROM ubuntu;
GRANT ALL ON SCHEMA public TO ubuntu;
GRANT ALL ON SCHEMA public TO PUBLIC;

2011/1/12 Tom Lane t...@sss.pgh.pa.us:
> regression=# create table tt(f1 int, f2 int);
> CREATE TABLE
> regression=# create view v1 as select * from tt;
> CREATE VIEW
> regression=# create view v2 as select * from v1;
> CREATE VIEW
> regression=# create or replace view v1 as select * from v2;
> CREATE VIEW
> regression=# drop view v1;
> ERROR: cannot drop view v1 because other objects depend on it
> DETAIL: view v2 depends on view v1
> HINT: Use DROP ... CASCADE to drop the dependent objects too.
> regression=# drop view v2;
> ERROR: cannot drop view v2 because other objects depend on it
> DETAIL: view v1 depends on view v2
> HINT: Use DROP ... CASCADE to drop the dependent objects too.
>
> This isn't particularly *useful*, maybe, but it's hardly impossible. And if we analyzed function dependencies in any detail, circular dependencies among functions would be possible (and useful).
>
> regards, tom lane

-- Best regards, Joel Jacobson Glue Finance
Re: [HACKERS] SSI patch version 8
On 13.01.2011 02:01, Kevin Grittner wrote:
> Anssi Kääriäinen anssi.kaariai...@thl.fi wrote:
>> So, count(*) queries are more than twice as slow compared to the old serializable transaction isolation level.
>
> I got this down from more than twice the run time to running 33% longer through remembering the last relation for which a search for a predicate lock held by the current transaction found a match at the coarsest (relation) level. It's a bit of a hack and 33% isn't very impressive, even for a worst case (and this is one type of worst case) -- especially given how often people use SELECT count(*) FROM table_x as a performance test. :-( I can see a way to improve on this if there's a low-cost way to determine from within the heapam.c:heapgettup_pagemode function whether it's returning tuples for a table scan. It seems likely that this is somehow contained in the HeapScanDesc structure, but I'm not seeing it. Can anyone point me in the right direction, or tell me that this avenue is a dead end?

Pardon my ignorance, but where exactly is the extra overhead coming from? Searching for a predicate lock?

-- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: pg_ctl failover Re: [HACKERS] Latches, signals, and waiting
On 13.01.2011 04:29, Itagaki Takahiro wrote: On Thu, Jan 13, 2011 at 00:14, Fujii Masao masao.fu...@gmail.com wrote: pg_ctl failover ? At the moment, the location of the trigger file is configurable, but if we accept a constant location like $PGDATA/failover pg_ctl could do the whole thing, create the file and send signal. pg_ctl on Windows already knows how to send the signal via the named pipe signal emulation. The attached patch implements the above-mentioned pg_ctl failover. I have three comments: - Will we call it failover? We will use the command also in switchover operations. pg_ctl promote might be more neutral, but users might find it hard to connect promote with the replication feature. I agree that failover or even switchover is too specific. You might want to promote a server even if you keep the old master still running, if you're creating a temporary copy of the master repository for testing purposes etc. +1 for promote. People unfamiliar with the replication stuff might not immediately understand that it's related to replication, but they wouldn't have any use for the option anyway. It should be clear to anyone who needs it. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] SQL/MED - file_fdw
On Fri, 7 Jan 2011 10:57:17 +0900 Itagaki Takahiro itagaki.takah...@gmail.com wrote: On Mon, Dec 20, 2010 at 20:42, Itagaki Takahiro itagaki.takah...@gmail.com wrote: I added comments and moved some setup code for COPY TO to BeginCopyTo() for maintainability. CopyTo() still contains parts of the initialization, but I've not touched it yet because we don't need the rearrangement for now. I updated the COPY FROM API patch. - GetCopyExecutorState() is removed because FDWs will use their own context.

I rebased the file_fdw patch onto the recent copy_export patch, and have some comments. The patch just rearranges the COPY FROM code to export those functions. It also modifies some of the COPY TO code internally for readability.

- BeginCopyFrom(rel, filename, attnamelist, options)
- EndCopyFrom(cstate)
- NextCopyFrom(cstate, OUT values, OUT nulls, OUT tupleOid)
- CopyFromErrorCallback(arg)

This API set seems to be enough to implement file_fdw using COPY routines. But EndCopyFrom() seems not to be able to release memory which is allocated in BeginCopy() and BeginCopyFrom(). I found this behavior by executing a query which generates a nested loop plan (outer 100 rows * inner 10 rows), and in the end postgres grows to 300MB+ from 108MB (VIRT of top command). The attached patch would avoid this leak by adding a per-copy context to CopyState. This might be overkill, and a ResetCopyFrom() might be reasonable instead. Anyway, I couldn't find any performance degradation with this patch (tested on my Linux box).
== # csv_accounts and csv_branches are generated by: 1) pgbench -i -s 10 2) COPY pgbench_accounts to '/path/to/accounts.csv' WITH CSV; 3) COPY pgbench_branches to '/path/to/branches.csv' WITH CSV; Original (There is no memory swap during measurement) postgres=# explain analyze select * from csv_accounts b, csv_branches t where t.bid = b.bid; QUERY PLAN - Nested Loop (cost=0.00..11717.01 rows=1 width=200) (actual time=0.300..100833.057 rows=100 loops=1) Join Filter: (b.bid = t.bid) - Foreign Scan on csv_accounts b (cost=0.00..11717.00 rows=1 width=100) (actual time=0.148..4437.595 rows=100 loops=1) - Foreign Scan on csv_branches t (cost=0.00..0.00 rows=1 width=100) (actual time=0.014..0.039 rows=10 loops=100) Total runtime: 102882.308 ms (5 rows) Patched, Using per-copy context to release memory postgres=# explain analyze select * from csv_accounts b, csv_branches t where t.bid = b.bid; QUERY PLAN - Nested Loop (cost=0.00..11717.01 rows=1 width=200) (actual time=0.226..100931.864 rows=100 loops=1) Join Filter: (b.bid = t.bid) - Foreign Scan on csv_accounts b (cost=0.00..11717.00 rows=1 width=100) (actual time=0.085..4439.777 rows=100 loops=1) - Foreign Scan on csv_branches t (cost=0.00..0.00 rows=1 width=100) (actual time=0.015..0.039 rows=10 loops=100) Total runtime: 102684.276 ms (5 rows) == This memory leak would not be problem when using from COPY command because it handles only one CopyState in a query, and it will be cleaned up with parent context. Some items to be considered: - BeginCopyFrom() could receive filename as an option instead of a separated argument. If do so, file_fdw would be more simple, but it's a change only for file_fdw. COPY commands in the core won't be improved at all. ISTM that current design would be better. - NextCopyFrom() returns values/nulls arrays rather than a HeapTuple. I expect the caller store the result into tupletableslot with ExecStoreVirtualTuple(). 
It is designed for performance, but if the caller always needs a materialized HeapTuple, HeapTuple is better as the result type. I tried to add tableoid to TupleTableSlot as tts_tableoid, but it seems to make code such as slot_getattr() and other stuff tricky. How about implementing this using materialized tuples to avoid unnecessary (at least for functionality) changes? I would like to postpone this virtual-tuple optimization to the next development cycle because it would not affect the interface heavily. I'll post a materialized-tuple version of the foreign_scan patch soon. Regards, -- Shigeru Hanada 20110113-copy_context.patch Description: Binary data
[HACKERS] Possible bug in pg_settings/pg_depend
Are multiple identical entries in pg_depend possible? If so, how do they occur, and what is the purpose of representing exactly the same dependency two times in pg_depend? I expected the following query not to return any rows, but it did:

glue=# select count(*), * from pg_depend group by classid, objid, objsubid, refclassid, refobjid, refobjsubid, deptype having count(*) > 1;
 count | classid | objid | objsubid | refclassid | refobjid | refobjsubid | deptype
-------+---------+-------+----------+------------+----------+-------------+---------
     2 |    2618 | 11015 |        0 |       1259 |    11012 |           1 | n
(1 row)

Debug data:

glue=# select version();
 version
-------------------------------------------------------------------------------------------------------------
 PostgreSQL 9.1alpha3 on x86_64-unknown-linux-gnu, compiled by GCC gcc (Ubuntu 4.4.3-4ubuntu5) 4.4.3, 64-bit
(1 row)

glue=# select pg_describe_object(2618,11015,0);
 pg_describe_object
--------------------------------------
 rule pg_settings_u on view pg_settings
(1 row)

glue=# select pg_describe_object(1259,11012,1);
 pg_describe_object
--------------------------------
 view pg_settings column name
(1 row)

-- Best regards, Joel Jacobson Glue Finance
Re: [HACKERS] Add function dependencies
Tom Lane t...@sss.pgh.pa.us writes: That seems pretty silly/broken. You should only be touching *direct* dependencies of the extension, IMO. If there's something that's missed by that algorithm, the way to fix it is to add more direct dependencies at extension creation time; not to start a tree walk that is pretty nearly guaranteed to land on things that don't belong to the extension. Well the current patch is walking the tree because that's what I need for listing extension's objects (in \dx ext), e.g. I want to follow from an opclass to its functions in that context. Now I reused this walker for ALTER EXTENSION SET SCHEMA, where it could well be that I don't need walking down the dependency tree. Will think about it and try it (very localised change). Thanks for comments. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Bug in walreceiver
On Thu, Jan 13, 2011 at 5:59 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: On 13.01.2011 10:28, Fujii Masao wrote: When the master shuts down or crashes, there seems to be the case where walreceiver exits without flushing WAL which has already been written. This might lead startup process to replay un-flushed WAL and break a Write-Ahead-Logging rule. Hmm, that can happen at a crash even with no replication involved. If you kill -9 postmaster, and some WAL had been written but not fsync'd, on crash recovery we will happily recover the unsynced WAL. Right. If postmaster restarts immediately after kill -9, WAL which has not reached to the disk might be replayed. Then if the server crashes when min recovery point indicates such an unsynced WAL, the database would get corrupted. As you say, that is not just about replication. But that is more likely to happen in the standby because unsynced WAL appears while recovery is in progress. This is one of reasons why walreceiver doesn't let the startup process know that new WAL has arrived before flushing it, I think. So I believe that the patch is somewhat worth applying. BTW, another good point of the patch is that we can track the last WAL receive location correctly. Since WalRcv-receivedUpto is updated after WAL flush, if the patch is not applied, the location of WAL received just before walreceiver exits might not be saved in WalRcv-receivedUpto. We could prevent that by fsyncing all WAL before applying it - presumably fsyncing a file that has already been flushed is quick. But is it worth the trouble? No. It looks overkill though it would completely prevent the problem. Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Error code for terminating connection due to conflict with recovery
On Thu, Jan 13, 2011 at 2:13 AM, Tatsuo Ishii is...@postgresql.org wrote: Ok. Here is the patch for this. I use 40P02, instead of 40004. Please add this to the currently open CommitFest: https://commitfest.postgresql.org/action/commitfest_view/open -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Bug in pg_describe_object, patch v2
On Wed, Jan 12, 2011 at 7:47 PM, Tom Lane t...@sss.pgh.pa.us wrote: Andreas Karlsson andr...@proxel.se writes: Here is a very simple change of the patch to make the output look more like the syntax of ALTER OPERATOR FAMILY to improve consistency. IMO, what this patch needs is to not output the types unless they are actually different from the default (which can be inferred from the AM type and the function arguments). That would fix my concern about it emitting information that is 99.44% useless. I guess we could do that, but I don't understand how you're supposed to infer them, which means probably a lot of other people won't either. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] libpq documentation cleanups (repost 3)
On Wed, Jan 12, 2011 at 8:54 PM, Bruce Momjian br...@momjian.us wrote: I am also attaching a few more of Leslie's changes that I think are useful. The first clarifies a confusion Leslie had about the fact that return is referencing the return value of the function and not the value returned in the pointer. Hmm. Well, if that's the confusion, I don't think inserting the words by the function is the right way to fix it - it certainly isn't returned by anything else. You could change it to say It is also possible for *errmsg to be NULL even when the return value is also NULL; this indicates... The second change is, I think, better wording. OK. The third moves the deprecated text to the start of the function description. Leslie pointed out that that is how we do it for other libpq functions, so we should move it for consistency. That seems to me to read pretty awkwardly. You could perhaps strike the chunk and the whole first paragraph and simply write PQoidStatus is an older, deprecated version of PQoidValue. It returns its result as a character string rather than an Oid, and is not thread-safe. and then cut directly to the synopsis. That would be consistent with what we've done elsewhere; moving just that one sentence is not. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Bug in pg_dump
On 2011-01-13 11:31 AM +0200, Joel Jacobson wrote: The example from Tom Lane below results in a database which is not possible to correctly dump using pg_dump. The view v1 strangely becomes a table in the dump output?! CREATE RULE _RETURN AS ON SELECT TO v1 DO INSTEAD SELECT v2.f1, v2.f2 FROM v2; This statement turns the table into a view. Regards, Marko Tiikkaja -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_primary_conninfo
On Wed, Jan 12, 2011 at 11:52 PM, Fujii Masao masao.fu...@gmail.com wrote: So I'm thinking to make ProcessConfigFile() parse not only postgresql.conf but also recovery.conf rather than move all the recovery parameters to postgresql.conf. Comments? +1. Actually moving the settings can be done later in about 5 seconds if we all agree it's a good idea, but let's not get bogged down in that now. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Bug in pg_dump
* Joel Jacobson wrote: The example from Tom Lane below results in a database which is not possible to correctly dump using pg_dump. The view v1 strangely becomes a table in the dump output?! This is no bug, it's a feature (tm). pg_dump is clever enough to detect the circular dependency and break it open by creating v1 in two steps. A view in PostgreSQL is simply an empty table with an ON SELECT DO INSTEAD rule named _RETURN on it. pg_dump first creates the empty table, then view v2 depending on that table, and finally the _RETURN rule turning v1 into a view and reintroducing the circular dependency. -- Christian -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: pg_ctl failover Re: [HACKERS] Latches, signals, and waiting
On Thu, Jan 13, 2011 at 5:00 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: On 13.01.2011 04:29, Itagaki Takahiro wrote: On Thu, Jan 13, 2011 at 00:14, Fujii Masaomasao.fu...@gmail.com wrote: pg_ctl failover ? At the moment, the location of the trigger file is configurable, but if we accept a constant location like $PGDATA/failover pg_ctl could do the whole thing, create the file and send signal. pg_ctl on Window already knows how to send the signal via the named pipe signal emulation. The attached patch implements the above-mentioned pg_ctl failover. I have three comments: - Will we call it failover? We will use the command also in switchover operations. pg_ctl promote might be more neutral, but users might be hard to imagine replication feature from promote. I agree that failover or even switchover is too specific. You might want promote a server even if you keep the old master still running, if you're creating a temporary copy of the master repository for testing purposes etc. +1 for promote. People unfamiliar with the replication stuff might not immediately understand that it's related to replication, but they wouldn't have any use for the option anyway. It should be clear to anyone who needs it. I agree. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Fixing GIN for empty/null/full-scan cases
Tom Lane t...@sss.pgh.pa.us writes: David E. Wheeler da...@kineticode.com writes: On Jan 12, 2011, at 4:35 PM, Tom Lane wrote: No, what we need is a decent extension package manager ;-) Yeah. Maybe you can do that this weekend? Or, I dunno, while you sleep tonight? Supposedly it's in the queue for the upcoming CF :-) Hehe, and some provision have been made to support upgrading from 9.0 to 9.1 too: http://pgsql.tapoueh.org/extensions/doc/html/extend-extension.html#AEN50748 But that won't solve the dump-from-9.0 and restore-into-9.1 by itself, the only way for us to solve that problem that I can think of would be to backpatch a new feature. Do it the old-way until you upgrade from 9.1 to later might be our answer here. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Walreceiver fsyncs excessively
While testing Fujii-san's patch to flush any already-written WAL on error in walreceiver, I added a debugging elog to XLogWalRcvFlush() to print out how far it has written and flushed. I saw an unexpected pattern while the standby catches up with the master:

LOG: streaming replication successfully connected to primary
LOG: flushing flush=0/0 write=0/1E02
LOG: flushing flush=0/1E02 write=0/1E04
LOG: flushing flush=0/1E04 write=0/1E06
LOG: flushing flush=0/1E06 write=0/1E08
LOG: flushing flush=0/1E08 write=0/1E0A
LOG: flushing flush=0/1E0A write=0/1E0C
LOG: flushing flush=0/1E0C write=0/1E0E
LOG: flushing flush=0/1E0E write=0/1E10
LOG: flushing flush=0/1E10 write=0/1E12
LOG: flushing flush=0/1E12 write=0/1E14

The master sends the WAL at full speed, but walreceiver always fsyncs it in 128 kB chunks. That's excessive; it should be able to read and write to disk the whole WAL segment before flushing. There's a little flaw in the walreceiver logic that tries to read all the available WAL before flushing and sleeping. The way libpqrcv_receive is written, when it's called with timeout==0 it will not call PQconsumeInput. So what happens is that when the walreceiver main loop calls libpqrcv_receive() in a loop to fetch all WAL that's available without blocking, it actually only reads the WAL that's in the libpq receive buffer - it will not read the WAL that's in the TCP read buffer. The attached patch fixes libpqrcv_receive() so that it calls PQconsumeInput() before concluding that there's no data available. The excessive fsyncing can lead to very bad performance, so this needs to be applied to 9.0 too.
-- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com

diff --git a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
index 5aac85d..9e8504b 100644
--- a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
+++ b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
@@ -351,28 +351,33 @@ libpqrcv_receive(int timeout, unsigned char *type, char **buffer, int *len)
 		PQfreemem(recvBuf);
 	recvBuf = NULL;
 
-	/*
-	 * If the caller requested to block, wait for data to arrive. But if this
-	 * is the first call after connecting, don't wait, because there might
-	 * already be some data in libpq buffer that we haven't returned to
-	 * caller.
-	 */
-	if (timeout > 0 && !justconnected)
+	/* Try to receive a CopyData message */
+	rawlen = PQgetCopyData(streamConn, &recvBuf, 1);
+	if (rawlen == 0)
 	{
-		if (!libpq_select(timeout))
-			return false;
+		/*
+		 * No data available yet. If the caller requested to block, wait for
+		 * more data to arrive. But if this is the first call after connecting,
+		 * don't wait, because there might already be some data in libpq buffer
+		 * that we haven't returned to caller.
+		 */
+		if (timeout > 0 && !justconnected)
+		{
+			if (!libpq_select(timeout))
+				return false;
+		}
+		justconnected = false;
 
 		if (PQconsumeInput(streamConn) == 0)
 			ereport(ERROR,
 					(errmsg("could not receive data from WAL stream: %s",
 							PQerrorMessage(streamConn))));
-	}
-	justconnected = false;
 
-	/* Receive CopyData message */
-	rawlen = PQgetCopyData(streamConn, &recvBuf, 1);
-	if (rawlen == 0)			/* no data available yet, then return */
-		return false;
+		/* Now that we've consumed some input, try again */
+		rawlen = PQgetCopyData(streamConn, &recvBuf, 1);
+		if (rawlen == 0)
+			return false;
+	}
 	if (rawlen == -1)			/* end-of-streaming or error */
 	{
 		PGresult   *res;
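The essence of the fix above is to drain everything that is available without blocking - both the data libpq has already buffered and whatever is still sitting in the TCP read buffer - before concluding there is nothing left, so that a single flush can cover it all. A toy sketch of that idea in Python (illustrative only; the names are made up, and the real code is the C diff above):

```python
def drain_available(app_buffer, consume_socket):
    """Return all messages available without blocking: first those
    already buffered in the application layer, then whatever can be
    pulled from the socket's read buffer."""
    out = list(app_buffer)
    app_buffer.clear()
    # This second step is what the pre-patch code skipped when called
    # with timeout==0: it never consumed the socket's pending input,
    # so each pass saw only one small chunk and fsynced it separately.
    out.extend(consume_socket())
    return out

app = [b"A"]                       # already in the library's buffer
sock = [b"B", b"C"]                # still in the TCP read buffer

got = drain_available(app, lambda: [sock.pop(0) for _ in range(len(sock))])
assert got == [b"A", b"B", b"C"]   # one flush can now cover all three
```

With the full batch in hand, the receiver writes everything and issues one flush per drained batch, instead of one flush per 128 kB chunk.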
Re: pg_ctl failover Re: [HACKERS] Latches, signals, and waiting
On Thu, Jan 13, 2011 at 7:00 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: +1 for promote. People unfamiliar with the replication stuff might not immediately understand that it's related to replication, but they wouldn't have any use for the option anyway. It should be clear to anyone who needs it. I did s/failover/promote. Here is the updated patch. - pg_ctl should unlink failover_files when it failed to send failover signals. Done. And, I changed some descriptions about trigger in high-availability.sgml and recovery-config.sgml. Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center pg_ctl_failover_v2.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Walreceiver fsyncs excessively
On Thu, Jan 13, 2011 at 9:01 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: While testing Fujii-san's patch to flush any already-written WAL on error in walreceiver, I added a debugging elog to XLogWalRcvFlush() to print out how far it has written and flushed. I saw an unexpected pattern while the standby catches up with the master: LOG: streaming replication successfully connected to primary LOG: flushing flush=0/0 write=0/1E02 LOG: flushing flush=0/1E02 write=0/1E04 LOG: flushing flush=0/1E04 write=0/1E06 LOG: flushing flush=0/1E06 write=0/1E08 LOG: flushing flush=0/1E08 write=0/1E0A LOG: flushing flush=0/1E0A write=0/1E0C LOG: flushing flush=0/1E0C write=0/1E0E LOG: flushing flush=0/1E0E write=0/1E10 LOG: flushing flush=0/1E10 write=0/1E12 LOG: flushing flush=0/1E12 write=0/1E14 The master sends the WAL at full-speed, but walreceiver always fsyncs it in 128 kB chunks. That's excessive, it should be able to read and write to disk the whole WAL segment before flushing. There's a little flaw in the walreceiver logic that tries to read all the available WAL before flushing and sleeping. The way libpqrcv_receive is written, when it's called with timeout==0 it will not call PQconsumeInput. So what happens is that when walreceiver main loop calls libpqrcv_receive() in a loop to fetch all WAL that's available without blocking, it actually only reads the WAL that's in the libpq receive buffer - it will not read the WAL that's in the TCP read buffer. Attached patch fixes libpqrcv_receive() so that it calls PQconsumeInput() before concluding that there's no data available. The excessive fsyncing can lead to very bad performance, so this needs to be appled to 9.0 too. Seems good. Can we remove the justconnected flag, thanks to the patch? 
Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Warning compiling pg_dump (MinGW, Windows XP)
Hello, Pgsql-hackers. I'm getting these warnings: pg_dump.c: In function 'dumpSequence': pg_dump.c:11449:2: warning: unknown conversion type character 'l' in format pg_dump.c:11449:2: warning: too many arguments for format pg_dump.c:11450:2: warning: unknown conversion type character 'l' in format pg_dump.c:11450:2: warning: too many arguments for format Line numbers may not be the same as in the official sources, because I've made some changes. But the lines are: snprintf(bufm, sizeof(bufm), INT64_FORMAT, SEQ_MINVALUE); snprintf(bufx, sizeof(bufx), INT64_FORMAT, SEQ_MAXVALUE); In my opinion configure failed for MinGW+Windows in this case. Am I right? Can someone give me a hint how to avoid this? Thanks in advance -- With best wishes, Pavel mailto:pa...@gf.microolap.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Add function dependencies
Thanks to the new pg_stat_xact_user_functions and pg_stat_xact_user_tables views in 9.1, it will be possible to automatically sample which functions use which functions/tables, to generate a nice directional graph of the dependency tree based on recent real-life activity, excluding any relations/functions not in use anymore. It's actually a feature not to include these, as they make the graph a lot more complicated. If you want a graph of the activity during Mondays between 2:30pm and 2:31pm, such a graph could easily be generated, or if you want it for 30 days (which would probably include a lot more edges in the graph), it can also be generated. :-) It would be quite easy to automatically inject some small code snippets at the top and bottom of each user function, to get the diff of select * from pg_stat_xact_user_functions and pg_stat_xact_user_tables between the entry point and the exit point of each function. It would be a lot nicer if it were possible to automatically let PostgreSQL sample such data for you, providing nice system views with information on the sampled data per function, allowing you to query it and ask, - Which functions has function public.myfunc(int) called, and which tables has it inserted/selected/updated/deleted from since the last time I reset the statistics? Just an idea... -- Best regards, Joel Jacobson Glue Finance -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] kill -KILL: What happens?
Folks, I've noticed over the years that we give people dire warnings never to send a KILL signal to the postmaster, but I'm unsure as to what are potential consequences of this, as in just exactly how this can result in problems. Is there some reference I can look to for explanations of the mechanism(s) whereby the damage occurs? Cheers, David. -- David Fetter da...@fetter.org http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SSI patch version 8
Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Pardon my ignorance, but where exactly is the extra overhead coming from? Searching for a predicate lock? Right. As each tuple is read we need to ensure that there is a predicate lock to cover it. Since finer-grained locks can be combined into coarser-grained locks, we need to start with the fine-grained and move toward checking the coarser grains, to avoid missing a lock during promotion. So for each tuple we calculate a hash, find a partition, lock it, and look up the tuple as a lock target. When that's not found we do the same thing for the page. When that's not found we do the same thing for the relation. But we acquired a relation lock up front, when we determined that this would be a heap scan, so we could short-circuit this whole thing if within the heapgettup_pagemode function we could determine that this was a scan of the whole relation. The profiling also showed that it was spending an obscene amount of time calculating hash values (over 10% of total run time!). I'm inclined to think that a less sophisticated algorithm (like just adding oid, page, and tuple offset numbers) would generate very *real* savings with the downside being a very hypothetical *possible* cost of longer chains in the HTAB. But that's a separate issue, best settled on the basis of benchmarks rather than theoretical discussions. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SSI patch version 8
Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: where exactly is the extra overhead coming from? Keep in mind that this is a sort of worst case scenario. The data is fully cached in shared memory and we're doing a sequential pass just counting the rows. In an earlier benchmark (which I should re-do after all this refactoring), random access queries against a fully cached data set only increased run time by 1.8%. Throw some disk access into the mix, and the overhead is likely to get lost in the noise. But, as I said, count(*) seems to be the first thing many people try as a benchmark, and this is a symptom of a more general issue, so I'd like to find a good solution. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SSI patch version 8
On 13.01.2011 16:51, Kevin Grittner wrote: Right. As each tuple is read we need to ensure that there is a predicate lock to cover it. Since finer-grained locks can be combined into coarser-grained locks we need to start with the fine grained and move toward checking the coarser grains, to avoid missing a lock during promotion. So for each tuple we calculate a hash, find a partition, lock it, and lookup the tuple as a lock target. When that's not found we do the same thing for the page. When that's not found we do the same thing for the relation. But we acquired a relation lock up front, when we determined that this would be a heap scan, so we could short-circuit this whole thing if within the heapgettup_pagemode function we could determine that this was a scan of the whole relation. That sounds simple enough. Add a boolean field to HeapScanDesc, rs_relpredicatelocked, and set it when you acquire the relation lock. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SSI patch version 8
Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: That sounds simple enough. Add a boolean field to HeapScanDesc, rs_relpredicatelocked, and set it when you acquire the relation lock. I'll take a look at doing that. Thanks! -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] kill -KILL: What happens?
David Fetter da...@fetter.org writes: I've noticed over the years that we give people dire warnings never to send a KILL signal to the postmaster, but I'm unsure as to what are potential consequences of this, as in just exactly how this can result in problems. Is there some reference I can look to for explanations of the mechanism(s) whereby the damage occurs? There's no risk of data corruption, if that's what you're thinking of. It's just that you're then looking at having to manually clean up the child processes and then restart the postmaster; a process that is not only tedious but does offer the possibility of screwing yourself. In particular the risk is that someone clueless enough to do this would next decide that removing $PGDATA/postmaster.pid, rather than killing all the existing children, is the quickest way to get the postmaster restarted. Once he's done that, his data will shortly be hosed beyond recovery, because now he has two noncommunicating sets of backends massaging the same files via separate sets of shared buffers. The reason this sequence of events doesn't seem improbable is that the error you get when you try to start a new postmaster, if there are still old backends running, is FATAL: pre-existing shared memory block (key 5490001, ID 15609) is still in use HINT: If you're sure there are no old server processes still running, remove the shared memory block or just delete the file postmaster.pid. Maybe we should rewrite that HINT --- while it's *possible* that removing the shmem block or deleting postmaster.pid is the right thing to do, it's not exactly *likely*. I think we need to put a bit more emphasis on the If ... part. Like If you are prepared to swear on your mother's grave that there are no old server processes still running, consider removing postmaster.pid. But first check for existing processes again. 
(BTW, I notice that this interlock against starting a new postmaster appears to be broken in HEAD, which is likely not unrelated to the fact that the contents of postmaster.pid seem to be totally bollixed :-() regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] system views for walsender activity
On Wed, Jan 12, 2011 at 03:03, Robert Haas robertmh...@gmail.com wrote: On Tue, Jan 11, 2011 at 7:24 AM, Magnus Hagander mag...@hagander.net wrote: No, do this at top if (walsnd-state == state) return; Keep spinlocks when actually setting it. I think this is safe... Aha. Thanks for the pointers, pfa a new version. ...but I think you also need to take the spinlock when reading the value. Even when it can only ever be set by one process (the owning walsender), and the variable is atomic (as it should be, since it's a single enum/int)? Anyway, it should be as simple as copying it out to a local variable when it's already in the spinlock and then use that, right? -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SSI patch version 8
Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: On 13.01.2011 16:51, Kevin Grittner wrote: But we acquired a relation lock up front, when we determined that this would be a heap scan, so we could short-circuit this whole thing if within the heapgettup_pagemode function we could determine that this was a scan of the whole relation. That sounds simple enough. Add a boolean field to HeapScanDesc, rs_relpredicatelocked, and set it when you acquire the relation lock. Heikki, I can't thank you enough. The fix is here: http://git.postgresql.org/gitweb?p=users/kgrittn/postgres.git;a=commitdiff;h=64ca508a0e2fa9c21dc76a5d6a5f549c27f511fa The timings are now: begin transaction isolation level repeatable read; Time: 324.938 ms Time: 228.045 ms Time: 227.963 ms begin transaction isolation level serializable; Time: 311.954 ms Time: 311.928 ms Time: 311.848 ms begin transaction isolation level serializable, read only; Time: 227.471 ms Time: 228.137 ms Time: 227.778 ms begin transaction isolation level serializable, read only, deferrable; Time: 227.899 ms Time: 249.772 ms Time: 228.026 ms begin transaction isolation level repeatable read; Time: 231.173 ms Time: 245.041 ms Time: 228.149 ms I'm surprised the difference is still that high as a percentage, and will investigate, but this seems survivable. When I do the math, the difference comes out to 83.885 nanoseconds per row. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Walreceiver fsyncs excessively
On 13.01.2011 14:34, Fujii Masao wrote: On Thu, Jan 13, 2011 at 9:01 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Attached patch fixes libpqrcv_receive() so that it calls PQconsumeInput() before concluding that there's no data available. The excessive fsyncing can lead to very bad performance, so this needs to be appled to 9.0 too. Seems good. Can we remove the justconnected flag, thanks to the patch? Yes, good point. Committed with justconnected removed. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] psql crashes on encoding mismatch
2011/1/13 Tom Lane t...@sss.pgh.pa.us: Hitoshi Harada umi.tan...@gmail.com writes: I found a crash case (assertion failure) when running psql -f utf8_encoded_script.sql against client_encoding = shift_jis in postgresql.conf. Though encoding mismatch is obviously user's fault, a crash doesn't explain anything to him. I'm not too impressed with this patch: it seems like the most it will accomplish is to move the failure to some other, equally obscure, place --- because you'll still have a string that's invalidly encoded. Moreover, if you've got wrongly encoded data, it wouldn't be hard at all for it to mess up psql's lexing; consider cases such as a character-that's-not-as-long-as-we-think just in front of a quote mark. Shouldn't we instead try to verify the multibyte encoding somewhere upstream of here? I had thought of it before going into the patch, too. However, the fact that psql (fe-misc.c) doesn't have PQverifymb() although it has PQmblen() implied to me that encoding verification should perhaps be done on the server side. I might be too ignorant to imagine the lexing problem of your quote mark, but my crash sample has multibyte characters in a sql comment, which is ignored in the server parsing. If we decided that this case raises an error, wouldn't some existing applications be broken? I can imagine they are in the same encoding-mismatch situation and run without the problem I found by chance. Just for reference I attach the case sql file. To reproduce it: 1. initdb 2. edit client_encoding = shift_jis in postgresql.conf 3. start postgres 4. psql -f case_utf8.sql Note: the line break should be LF as the file stands. CR-LF cannot reproduce the problem. Regards, -- Hitoshi Harada case_utf8.sql Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] arrays as pl/perl input arguments [PATCH]
On Thu, Jan 13, 2011 at 01:06, Martijn van Oosterhout klep...@svana.org wrote: On Thu, Jan 13, 2011 at 12:06:33AM -0700, Alex Hunsaker wrote: I had supposed that it would be possible to do the string conversion lazily, ie, only if the string value was actually demanded. Yep, In-fact if we wanted we could even die (or throw an exception in other language speak :) ) when the string value is demanded. I played with this a little and it is fairly easy to make a variable such that $a is the string representation and $a[0] the first value of the array. The problem is that you can't pass such a variable into a subroutine. [ snip ] my @a=(1,2); tie $a, MyClass, \@a; print "\$a='$a'\n"; print "\$a[0]='$a[0]'\n"; Erm... the reason you can't seem to pass it to any subroutines is that it's actually two variables: $a and @a. When you print "$a\n" you are using the tied scalar that wraps @a; and when you print "$a[0]\n" you are accessing the array directly. I think you just had an unfortunate variable name; otherwise strict would have complained appropriately. :) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] libpq documentation cleanups (repost 3)
Robert Haas wrote: On Wed, Jan 12, 2011 at 8:54 PM, Bruce Momjian br...@momjian.us wrote: I am also attaching a few more of Leslie's changes that I think are useful. ?The first clarifies a confusion Leslie had about the fact that return is referencing the return value of the function and not the value returned in the pointer. Hmm. Well, if that's the confusion, I don't think inserting the words by the function is the right way to fix it - it certainly isn't returned by anything else. You could change it to say It is also possible for *errmsg to be NULL even when the return value is also NULL; this indicates... The second change is, I think, better wording. OK. The third moves the deprecated text to the start of the function description. ?Leslie pointed out that that is how we do it for other libpq functions, so we should move it for consistency. That seems to me to read pretty awkwardly. You could perhaps strike the chunk and the whole first paragraph and simply write PQoidStatus is an older, deprecated version of PQoidValue. It returns its result as a character string rather than an Oid, and is not thread-safe. and then cut directly to the synopsis. That would be consistent with what we've done elsewhere; moving just that one sentence is not. OK, I have made the adjustments you mentioned with my own wording (attached and applied). Let me know of any more needed adjustments. Thanks. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml index 58e593d..fe661b8 100644 --- a/doc/src/sgml/libpq.sgml +++ b/doc/src/sgml/libpq.sgml @@ -972,8 +972,8 @@ PQconninfoOption *PQconninfoParse(const char *conninfo, char **errmsg); If literalerrmsg/ is not symbolNULL/, then literal*errmsg/ is set to symbolNULL/ on success, else to a functionmalloc/'d error string explaining the problem. 
(It is also possible for literal*errmsg/ to be - set to symbolNULL/ even when symbolNULL/ is returned; this indicates an out-of-memory - situation.) + set to symbolNULL/ and the function to return symbolNULL/; + this indicates an out-of-memory condition.) /para para @@ -1352,7 +1352,7 @@ ConnStatusType PQstatus(const PGconn *conn); para See the entry for functionPQconnectStartParams/, functionPQconnectStart/ and functionPQconnectPoll/ with regards to other status codes that - might be seen. + might be returned. /para /listitem /varlistentry @@ -3163,23 +3163,15 @@ Oid PQoidValue(const PGresult *res); listitem para - Returns a string with the OID of the inserted row, if the - acronymSQL/acronym command was an commandINSERT/command - that inserted exactly one row, or a commandEXECUTE/command of - a prepared statement consisting of a suitable - commandINSERT/command. (The string will be literal0/ if - the commandINSERT/command did not insert exactly one row, or - if the target table does not have OIDs.) If the command was not - an commandINSERT/command, returns an empty string. + This function is deprecated in favor of + functionPQoidValue/function and is not thread-safe. + It returns a string with the OID of the inserted row, while + functionPQoidValue/function returns the OID value. synopsis char *PQoidStatus(const PGresult *res); /synopsis /para - para - This function is deprecated in favor of - functionPQoidValue/function. It is not thread-safe. - /para /listitem /varlistentry /variablelist -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] kill -KILL: What happens?
On Thu, Jan 13, 2011 at 10:41:28AM -0500, Tom Lane wrote: David Fetter da...@fetter.org writes: I've noticed over the years that we give people dire warnings never to send a KILL signal to the postmaster, but I'm unsure as to what are potential consequences of this, as in just exactly how this can result in problems. Is there some reference I can look to for explanations of the mechanism(s) whereby the damage occurs? There's no risk of data corruption, if that's what you're thinking of. It's just that you're then looking at having to manually clean up the child processes and then restart the postmaster; a process that is not only tedious but does offer the possibility of screwing yourself. Does this mean that there's no cross-platform way to ensure that killing a process results in its children's timely (i.e. before damage can occur) death? That such a way isn't practical from a performance point of view? In particular the risk is that someone clueless enough to do this would next decide that removing $PGDATA/postmaster.pid, rather than killing all the existing children, is the quickest way to get the postmaster restarted. Once he's done that, his data will shortly be hosed beyond recovery, because now he has two noncommunicating sets of backends massaging the same files via separate sets of shared buffers. Right. The reason this sequence of events doesn't seem improbable is that the error you get when you try to start a new postmaster, if there are still old backends running, is FATAL: pre-existing shared memory block (key 5490001, ID 15609) is still in use HINT: If you're sure there are no old server processes still running, remove the shared memory block or just delete the file postmaster.pid. Maybe we should rewrite that HINT --- while it's *possible* that removing the shmem block or deleting postmaster.pid is the right thing to do, it's not exactly *likely*. I think we need to put a bit more emphasis on the If ... part. 
Like If you are prepared to swear on your mother's grave that there are no old server processes still running, consider removing postmaster.pid. But first check for existing processes again. Maybe the hint could give an OS-tailored way to check this... (BTW, I notice that this interlock against starting a new postmaster appears to be broken in HEAD, which is likely not unrelated to the fact that the contents of postmaster.pid seem to be totally bollixed :-() D'oh! Well, I hope knowing it's a problem gives some kind of glimmer as to how to solve it :) Is this worth writing tests for? Cheers, David. -- David Fetter da...@fetter.org http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] C++ keywords in headers (was Re: [GENERAL] #include funcapi.h)
Tom Lane wrote: Alvaro Herrera alvhe...@commandprompt.com writes: Excerpts from Tom Lane's message of lun dic 27 13:54:56 -0300 2010: [ lightbulb ] ... although we could improve that quite a bit if we processed each .h file separately instead of insisting on smashing everything into one compilation. Let me go try that. FWIW I have this patch lingering about that I wrote months ago, to check for header problems (not C++ stuff, just things like forgetting to include some necessary header in some other header). Since it needs a lot of polish (needs to ignore certain headers, and avoid leave lingering files around), I didn't commit it; and I haven't updated it to the new Make recursive stuff, either. src/tools/pginclude/ already contains several scripts for this sort of thing. Bruce runs them by hand occasionally, although I just found out that he's evidently not run the does-each-header-compile-standalone test in awhile. It would probably pay to automate these. It is true I have not run those tests in a while. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] system views for walsender activity
On Thu, Jan 13, 2011 at 11:08 AM, Magnus Hagander mag...@hagander.net wrote: On Wed, Jan 12, 2011 at 03:03, Robert Haas robertmh...@gmail.com wrote: On Tue, Jan 11, 2011 at 7:24 AM, Magnus Hagander mag...@hagander.net wrote: No, do this at top if (walsnd-state == state) return; Keep spinlocks when actually setting it. I think this is safe... Aha. Thanks for the pointers, pfa a new version. ...but I think you also need to take the spinlock when reading the value. Even when it can only ever be set by one process (the owning walsender), and the variable is atomic (as it should be, since it's a single enum/int)? The fact that it can only be modified by one process makes it safe for *that process* to read it without taking the lock, but another process that wants to read it still needs the lock, I believe - otherwise you might get a slightly stale value. That's probably not a *huge* deal in this case, but I think it'd be better to get it right because people tend to copy these sorts of things elsewhere, and it'd be bad if it got copied into some place more critical. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_depend explained
On Wed, Jan 12, 2011 at 09:09:31PM +0100, Joel Jacobson wrote: (sorry for top posting, No worries. iPhone + drunk) A dangerous combination indeed. I hear water, NSAIDs and time can help with the hangover ;) pg_depend_before is a select * from pg_depend before creating the test db model Please put a self-contained example on the snippets page, and please also to check that it actually runs before doing so. You'd mangled some aliases in the query you sent, which leads me to believe you hadn't actually tried running it. Cheers, David. -- David Fetter da...@fetter.org http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] system views for walsender activity
On Thu, Jan 13, 2011 at 18:43, Robert Haas robertmh...@gmail.com wrote: On Thu, Jan 13, 2011 at 11:08 AM, Magnus Hagander mag...@hagander.net wrote: On Wed, Jan 12, 2011 at 03:03, Robert Haas robertmh...@gmail.com wrote: On Tue, Jan 11, 2011 at 7:24 AM, Magnus Hagander mag...@hagander.net wrote: No, do this at top if (walsnd-state == state) return; Keep spinlocks when actually setting it. I think this is safe... Aha. Thanks for the pointers, pfa a new version. ...but I think you also need to take the spinlock when reading the value. Even when it can only ever be set by one process (the owning walsender), and the variable is atomic (as it should be, since it's a single enum/int)? The fact that it can only be modified by one process makes it safe for *that process* to read it without taking the lock, but another process that wants to read it still needs the lock, I believe - otherwise you might get a slightly stale value. That's probably not a *huge* deal in this case, but I think it'd be better to get it right because people tend to copy these sorts of things elsewhere, and it'd be bad if it got copied into some place more critical. ok, thanks for the pointers - fix applied. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] kill -KILL: What happens?
David Fetter da...@fetter.org writes: On Thu, Jan 13, 2011 at 10:41:28AM -0500, Tom Lane wrote: It's just that you're then looking at having to manually clean up the child processes and then restart the postmaster; a process that is not only tedious but does offer the possibility of screwing yourself. Does this mean that there's no cross-platform way to ensure that killing a process results in its children's timely (i.e. before damage can occur) death? That such a way isn't practical from a performance point of view? The simple, easy, cross-platform solution is this: don't kill -9 the postmaster. Send it one of the provisioned shutdown signals and let it kill its children for you. At least on Unix I don't believe there is any other solution. You could try looking at ps output but there's a fundamental race condition, ie the postmaster could spawn another child just before you kill it, whereupon the child is reassigned to init and there's no longer a good way to tell that it came from that postmaster. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] kill -KILL: What happens?
Tom Lane t...@sss.pgh.pa.us wrote: At least on Unix I don't believe there is any other solution. You could try looking at ps output but there's a fundamental race condition, ie the postmaster could spawn another child just before you kill it, whereupon the child is reassigned to init and there's no longer a good way to tell that it came from that postmaster. Couldn't you run `ps auxf` and kill any postgres process which is not functioning as postmaster (those are pretty easy to distinguish) and which isn't the child of such a process? Is there ever a reason to allow such an orphan to run? -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Possible bug in pg_settings/pg_depend
Joel Jacobson j...@gluefinance.com writes: Are multiple identical entries in pg_depend possible? Yes, probably. It's certainly possible to have the same linkage occur with different deptypes. We don't try hard to avoid dups because they don't matter. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Bug in pg_describe_object, patch v2
Robert Haas robertmh...@gmail.com writes: On Wed, Jan 12, 2011 at 7:47 PM, Tom Lane t...@sss.pgh.pa.us wrote: IMO, what this patch needs is to not output the types unless they are actually different from the default (which can be inferred from the AM type and the function arguments). That would fix my concern about it emitting information that is 99.44% useless. I guess we could do that, but I don't understand how you're supposed to infer them, which means probably a lot of other people won't either. Read the CREATE OPERATOR CLASS source code. (It likely would be best to refactor that a bit so it would expose some way to obtain the implied defaults --- I don't think that's done explicitly now, and it's certainly not exported from opclasscmds.c.) regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] kill -KILL: What happens?
Kevin Grittner kevin.gritt...@wicourts.gov writes: Tom Lane t...@sss.pgh.pa.us wrote: At least on Unix I don't believe there is any other solution. You could try looking at ps output but there's a fundamental race condition, ie the postmaster could spawn another child just before you kill it, whereupon the child is reassigned to init and there's no longer a good way to tell that it came from that postmaster. Couldn't you run `ps auxf` and kill any postgres process which is not functioning as postmaster (those are pretty easy to distinguish) and which isn't the child of such a process? Is there ever a reason to allow such an orphan to run? That's not terribly hard to do by hand, especially since the cautious DBA could also do things like checking a process' CWD to verify which postmaster it had belonged to. I can't see automating it though. We already have a perfectly good solution to the automated shutdown problem. regards, tom lane
Re: [HACKERS] kill -KILL: What happens?
Tom Lane t...@sss.pgh.pa.us wrote: I can't see automating it though. We already have a perfectly good solution to the automated shutdown problem. Oh, I totally agree with that. I somehow thought we'd gotten off into how someone could recover after shooting their foot. -Kevin
Re: [HACKERS] Possible bug in pg_settings/pg_depend
2011/1/13 Tom Lane t...@sss.pgh.pa.us: Yes, probably. It's certainly possible to have the same linkage occur with different deptypes. We don't try hard to avoid dups because they don't matter. With different deptypes, yes, but in this case there were two linkages of the same deptype. It just seems a bit strange that I found only one such case in the entire database; it smells like some kind of bug, but it might not be. I just thought it was worth investigating a bit, but if you're sure about it I of course trust you. -- Best regards, Joel Jacobson Glue Finance E: j...@gluefinance.com T: +46 70 360 38 01 Postal address: Glue Finance AB Box 549 114 11 Stockholm Sweden Visiting address: Glue Finance AB Birger Jarlsgatan 14 114 34 Stockholm Sweden
Re: [HACKERS] kill -KILL: What happens?
On Jan 13, 2011, at 19:00, Tom Lane wrote: At least on Unix I don't believe there is any other solution. You could try looking at ps output but there's a fundamental race condition, ie the postmaster could spawn another child just before you kill it, whereupon the child is reassigned to init and there's no longer a good way to tell that it came from that postmaster. Maybe I'm totally confused, but ... couldn't normal backends call PostmasterIsAlive and exit if not, just like the startup process, the stats collector, autovacuum, bgwriter, walwriter, walreceiver, walsender and the wal archiver already do? I assumed they do, but now that I grepped the code it seems they don't. best regards, Florian Pflug
Re: [HACKERS] pg_depend explained
2011/1/13 David Fetter da...@fetter.org: Please put a self-contained example on the snippets page, and please also check that it actually runs before doing so. You'd mangled some aliases in the query you sent, which leads me to believe you hadn't actually tried running it. I actually hadn't really solved the problem at the time I wrote my last email; it turned out I had to do things a bit differently to avoid running into problems with corner cases. I will put together a self-contained example like you suggested and get back shortly :-) -- Best regards, Joel Jacobson Glue Finance
Re: [HACKERS] Streaming base backups
On Wed, Jan 12, 2011 at 10:39, Fujii Masao masao.fu...@gmail.com wrote: On Mon, Jan 10, 2011 at 11:09 PM, Magnus Hagander mag...@hagander.net wrote: I've committed the backend side of this, without that. Still working on the client, and on cleaning up Heikki's patch for grammar/parser support. Great work! I have some comments: While walsender is sending a base backup, WalSndWakeup should not send the signal to that walsender? True, it's not necessary. How badly does it actually hurt things, though? Given that the walsender running the backup isn't actually waiting on the latch, it doesn't actually send a signal, does it? In sendFile or elsewhere, we should periodically check whether postmaster is alive and whether the flag was set by the signal? That, however, we probably should. At the end of the backup by walsender, it forces a switch to a new WAL file and waits until the last WAL file has been archived. So we should change postmaster so that it doesn't cause the archiver to end before walsender ends when shutdown is requested? Um. I have to admit I'm not entirely following what you mean enough to confirm it, but it *sounds* correct :-) What scenario exactly is the problematic one? Also, when shutdown is requested, the walsender which is streaming WAL should not end before another walsender which is sending a backup ends, to stream the backup-end WAL? Not sure I see the reason for that. If we're shutting down in the middle of the base backup, we don't have any support for continuing that one after we're back up - you have to start over. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [HACKERS] kill -KILL: What happens?
On Thu, Jan 13, 2011 at 12:45:07PM -0600, Kevin Grittner wrote: Tom Lane t...@sss.pgh.pa.us wrote: I can't see automating it though. We already have a perfectly good solution to the automated shutdown problem. Oh, I totally agree with that. I somehow thought we'd gotten off into how someone could recover after shooting their foot. I get that we can't prevent all pilot error, but I was hoping we could bullet-proof this a little more, especially in light of a certain extremely popular server OS's OOM killer's default behavior. Yes, I get that that behavior is crazy, and stupid, and that people should shut it off, but it *is* our problem if we let the postmaster start (or continue) when it's set that way. Cheers, David. -- David Fetter da...@fetter.org http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
Re: [HACKERS] kill -KILL: What happens?
Florian Pflug f...@phlo.org writes: Couldn't normal backends call PostmasterIsAlive and exit if not, just like the startup process, the stats collector, autovacuum, bgwriter, walwriter, walreceiver, walsender and the wal archiver already do? I assumed they do, but now that I grepped the code it seems they don't. That's intentional: they keep going until the user closes the session or someone sends them a signal to do otherwise. The other various background processes have to watch PostmasterIsAlive because there is no session to close. Frankly I'd prefer to get rid of PostmasterIsAlive, not extend its use. It sucks because you don't get a signal on parent death. With the arrival of the latch code, having to check for PostmasterIsAlive frequently is the only reason for an idle background process to consume CPU at all. Another problem with the scheme is that it only works as long as the background process is providing a *non critical* service. Eventually we are probably going to need some way for bgwriter/walwriter to stay alive long enough to service orphaned backends, rather than disappearing instantly if the postmaster goes away. regards, tom lane
Re: [HACKERS] kill -KILL: What happens?
David Fetter da...@fetter.org writes: I get that we can't prevent all pilot error, but I was hoping we could bullet-proof this a little more, especially in light of a certain extremely popular server OS's OOM killer's default behavior. Yes, I get that that behavior is crazy, and stupid, and that people should shut it off, but it *is* our problem if we let the postmaster start (or continue) when it's set that way. Packagers who are paying attention have fixed that ;-) regards, tom lane
Re: [HACKERS] kill -KILL: What happens?
On Thu, Jan 13, 2011 at 2:16 PM, Tom Lane t...@sss.pgh.pa.us wrote: Frankly I'd prefer to get rid of PostmasterIsAlive, not extend its use. It sucks because you don't get a signal on parent death. With the arrival of the latch code, having to check for PostmasterIsAlive frequently is the only reason for an idle background process to consume CPU at all. What we really need is SIGPARENT. I wonder if the Linux folks would consider adding such a thing. Might be useful to others as well. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Possible bug in pg_settings/pg_depend
On Thu, Jan 13, 2011 at 2:04 PM, Joel Jacobson j...@gluefinance.com wrote: 2011/1/13 Tom Lane t...@sss.pgh.pa.us: Yes, probably. It's certainly possible to have the same linkage occur with different deptypes. We don't try hard to avoid dups because they don't matter. with different deptypes, yes, but in this case there were two linkages of the same deptype. Just seems a bit strange I only found one such in the entire database, smells like some kind of bug, but might not be, I dunno, just thought it was worth investigating a bit, but if you're sure about it I of course trust you. Instead of trusting him, you could investigate why it happens. A quick test shows this eliminates both dependencies: drop rule pg_settings_u on pg_settings; It appears that both of the dependencies in question are from that rule and pointing to pg_settings.name, and it looks like that rule mentions the name column of pg_settings twice. With a little further experimentation you can probably tease out whether each of the two mentions produced a separate dependency... my guess is yes. -- Robert Haas
Re: [HACKERS] kill -KILL: What happens?
Robert Haas robertmh...@gmail.com writes: On Thu, Jan 13, 2011 at 2:16 PM, Tom Lane t...@sss.pgh.pa.us wrote: Frankly I'd prefer to get rid of PostmasterIsAlive, not extend its use. It sucks because you don't get a signal on parent death. With the arrival of the latch code, having to check for PostmasterIsAlive frequently is the only reason for an idle background process to consume CPU at all. What we really need is SIGPARENT. I wonder if the Linux folks would consider adding such a thing. Might be useful to others as well. That's pretty much a dead-end idea unfortunately; it would never be portable enough to let us change our system structure to rely on it. Even more to the point, "go away when the postmaster does" isn't really the behavior we want anyway. "Go away when the last backend does" is what we want. I wonder whether we could have some sort of latch-like counter that would count the number of active backends and deliver signals when the count went to zero. However, if the goal is to defend against random applications of SIGKILL, there's probably no way to make this reliable in userspace. Another idea is to have a postmaster minder process that respawns the postmaster when it's killed. The hard part of that is that the minder can't be connected to shared memory (else its OOM cross-section is just as big as the postmaster's), and that makes it difficult for it to tell when all the children have gone away. I suppose it could be coded to just retry every few seconds until success. This doesn't improve the behavior of background processes at all, though. regards, tom lane
Re: [HACKERS] kill -KILL: What happens?
On Thu, Jan 13, 2011 at 2:45 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Thu, Jan 13, 2011 at 2:16 PM, Tom Lane t...@sss.pgh.pa.us wrote: Frankly I'd prefer to get rid of PostmasterIsAlive, not extend its use. It sucks because you don't get a signal on parent death. With the arrival of the latch code, having to check for PostmasterIsAlive frequently is the only reason for an idle background process to consume CPU at all. What we really need is SIGPARENT. I wonder if the Linux folks would consider adding such a thing. Might be useful to others as well. That's pretty much a dead-end idea unfortunately; it would never be portable enough to let us change our system structure to rely on it. Even more to the point, go away when the postmaster does isn't really the behavior we want anyway. Go away when the last backend does is what we want. I'm not convinced. I was thinking that we could simply treat it like SIGQUIT, if it's available. I doubt there's a real use case for continuing to run queries after the postmaster and all the background processes are dead. Expedited death seems like much better behavior. Even checking PostmasterIsAlive() once per query would be reasonable, except that it'd add a system call to check for a condition that almost never holds, which I'm not eager to do. I wonder whether we could have some sort of latch-like counter that would count the number of active backends and deliver signals when the count went to zero. However, if the goal is to defend against random applications of SIGKILL, there's probably no way to make this reliable in userspace. I don't think you can get there 100%. We could, however, make a rule that when a background process fails a PostmasterIsAlive() check, it sends SIGQUIT to everyone it can find in the ProcArray, which would at least ensure a timely exit in most real-world cases. Another idea is to have a postmaster minder process that respawns the postmaster when it's killed. 
The hard part of that is that the minder can't be connected to shared memory (else its OOM cross-section is just as big as the postmaster's), and that makes it difficult for it to tell when all the children have gone away. I suppose it could be coded to just retry every few seconds until success. This doesn't improve the behavior of background processes at all, though. It hardly seems worth it. Given a reliable interlock against multiple postmasters, the real concern is making sure that a half-dead postmaster gets itself all-dead quickly so that the DBA can start up a new one before he gets fired. -- Robert Haas
Re: [HACKERS] Allowing multiple concurrent base backups
On Tue, Jan 11, 2011 at 11:06:18AM -0800, Josh Berkus wrote: It makes it very convenient to set up standbys, without having to worry that you'll conflict e.g. with a nightly backup. I don't imagine people will use streaming base backups for very large databases anyway. Also, imagine that you're provisioning a 10-node replication cluster on EC2. This would make that worlds easier. Hmm, perhaps. My concern is that a naive attempt to do that is going to have 10 base backups happening at the same time, completely slamming the master, and none of them completing in a reasonable time. Is this possible, or is it that simultaneity will buy you hot caches, so backups #2 - #10 all run faster? Ross -- Ross Reedstrom, Ph.D. reeds...@rice.edu Systems Engineer & Admin, Research Scientist phone: 713-348-6166 fax: 713-348-3665 Connexions http://cnx.org Rice University MS-375, Houston, TX 77005 GPG Key fingerprint = F023 82C8 9B0E 2CC6 0D8E F888 D3AE 810E 88F0 BEDE
Re: [HACKERS] kill -KILL: What happens?
Robert Haas robertmh...@gmail.com writes: On Thu, Jan 13, 2011 at 2:45 PM, Tom Lane t...@sss.pgh.pa.us wrote: I wonder whether we could have some sort of latch-like counter that would count the number of active backends and deliver signals when the count went to zero. However, if the goal is to defend against random applications of SIGKILL, there's probably no way to make this reliable in userspace. I don't think you can get there 100%. We could, however, make a rule that when a background process fails a PostmasterIsAlive() check, it sends SIGQUIT to everyone it can find in the ProcArray, which would at least ensure a timely exit in most real-world cases. You're going in the wrong direction there: we're trying to have the system remain sane when the postmaster crashes, not see how quickly it can screw up every remaining session. BTW, in Unix-land we could maybe rely on SysV semaphores' SEM_UNDO feature to keep a trustworthy count of how many live processes there are. But I don't know whether there's anything comparable for Windows. regards, tom lane
Re: [HACKERS] kill -KILL: What happens?
On Thu, Jan 13, 2011 at 2:53 PM, Robert Haas robertmh...@gmail.com wrote: I'm not convinced. I was thinking that we could simply treat it like SIGQUIT, if it's available. I doubt there's a real use case for continuing to run queries after the postmaster and all the background processes are dead. Expedited death seems like much better behavior. Even checking PostmasterIsAlive() once per query would be reasonable, except that it'd add a system call to check for a condition that almost never holds, which I'm not eager to do. If postmaster has a few fds to spare, what about having it open a pipe to every child it spawns? It never has to read or write to it, but the postmaster closing will signal the client's fd. The client just has to pop the fd into whatever normal poll/select event handling it uses to notice when the parent's pipe is closed. A FIFO would allow postmaster to not need as many file handles, and clients reading the FIFO would notice when the writer (postmaster) closes it. a. -- Aidan Van Dyk Create like a god, ai...@highrise.ca command like a king, http://www.highrise.ca/ work like a slave.
Re: [HACKERS] kill -KILL: What happens?
On Thu, Jan 13, 2011 at 3:01 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Thu, Jan 13, 2011 at 2:45 PM, Tom Lane t...@sss.pgh.pa.us wrote: I wonder whether we could have some sort of latch-like counter that would count the number of active backends and deliver signals when the count went to zero. However, if the goal is to defend against random applications of SIGKILL, there's probably no way to make this reliable in userspace. I don't think you can get there 100%. We could, however, make a rule that when a background process fails a PostmasterIsAlive() check, it sends SIGQUIT to everyone it can find in the ProcArray, which would at least ensure a timely exit in most real-world cases. You're going in the wrong direction there: we're trying to have the system remain sane when the postmaster crashes, not see how quickly it can screw up every remaining session. I strongly believe you're in the minority on that one, for the same reasons that I don't think most people would agree with your notion of what should be the default shutdown mode. A database that can't accept new connections is a liability, not an asset. -- Robert Haas
Re: [HACKERS] Allowing multiple concurrent base backups
On Thu, Jan 13, 2011 at 2:19 PM, Ross J. Reedstrom reeds...@rice.edu wrote: On Tue, Jan 11, 2011 at 11:06:18AM -0800, Josh Berkus wrote: It makes it very convenient to set up standbys, without having to worry that you'll conflict e.g. with a nightly backup. I don't imagine people will use streaming base backups for very large databases anyway. Also, imagine that you're provisioning a 10-node replication cluster on EC2. This would make that worlds easier. Hmm, perhaps. My concern is that a naive attempt to do that is going to have 10 base backups happening at the same time, completely slamming the master, and none of them completing in a reasonable time. Is this possible, or is it that simultaneity will buy you hot caches and backups #2 - #10 all run faster? That's going to depend on the situation. If the database fits in memory, then it's just going to work. If it fits on disk, it's less obvious whether it'll be good or bad, but an arbitrary limitation here doesn't serve us well. P.S. Your reply-to header is busted. -- Robert Haas
Re: [HACKERS] kill -KILL: What happens?
On Jan 13, 2011, at 21:01, Aidan Van Dyk wrote: On Thu, Jan 13, 2011 at 2:53 PM, Robert Haas robertmh...@gmail.com wrote: I'm not convinced. I was thinking that we could simply treat it like SIGQUIT, if it's available. I doubt there's a real use case for continuing to run queries after the postmaster and all the background processes are dead. Expedited death seems like much better behavior. Even checking PostmasterIsAlive() once per query would be reasonable, except that it'd add a system call to check for a condition that almost never holds, which I'm not eager to do. If postmaster has a few fds to spare, what about having it open a pipe to every child it spawns. It never has to read/write to it, but postmaster closing will signal the client's fd. The client just has to pop the fd into whatever normal poll/select event handling it uses to notice when the parent's pipe is closed. I just started to experiment with that idea, and wrote a small test program to check if that'd work. I'll post the results when I'm done. best regards, Florian Pflug
Re: [HACKERS] kill -KILL: What happens?
Robert Haas robertmh...@gmail.com wrote: A database that can't accept new connections is a liability, not an asset. +1 I have so far been unable to imagine a use case for the production databases I use where I would prefer to see backends continue after postmaster failure. -Kevin
Re: [HACKERS] kill -KILL: What happens?
Robert Haas robertmh...@gmail.com writes: I strongly believe you're in the minority on that one, for the same reasons that I don't think most people would agree with your notion of what should be the default shutdown mode. A database that can't accept new connections is a liability, not an asset. Killing active sessions when it's not absolutely necessary is not an asset. regards, tom lane
Re: [HACKERS] kill -KILL: What happens?
On Thu, Jan 13, 2011 at 21:37, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: I strongly believe you're in the minority on that one, for the same reasons that I don't think most people would agree with your notion of what should be the default shutdown mode. A database that can't accept new connections is a liability, not an asset. Killing active sessions when it's not absolutely necessary is not an asset. It certainly can be. Consider any connection pooling scenario, which would represent the vast majority of larger deployments today - if you don't kill the sessions, they will never go away. -- Magnus Hagander
Re: [HACKERS] kill -KILL: What happens?
Aidan Van Dyk ai...@highrise.ca writes: If postmaster has a few fds to spare, what about having it open a pipe to every child it spawns. It never has to read/write to it, but postmaster closing will signal the client's fd. The client just has to pop the fd into whatever normal poll/select event handling it uses to notice when the parent's pipe is closed. Hmm. Or more generally: there's one FIFO. The postmaster holds both sides open. Backends hold the write side open. (They can close the read side, but that would just be to free up a FD.) Background children close the write side. Now a background process can use EOF on the read side of the FIFO to tell it that postmaster and all backends have exited. You still don't get a signal, but at least the condition you're testing for is the one we actually want and not an approximation. regards, tom lane
Re: [HACKERS] Allowing multiple concurrent base backups
On 1/13/11 12:11 PM, Robert Haas wrote: That's going to depend on the situation. If the database fits in memory, then it's just going to work. If it fits on disk, it's less obvious whether it'll be good or bad, but an arbitrary limitation here doesn't serve us well. FWIW, if we had this feature right now in 9.0 we (PGX) would be using it. We run into the case of DB in memory, multiple slaves fairly often these days. -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com
Re: [HACKERS] kill -KILL: What happens?
On Thu, Jan 13, 2011 at 09:18:06PM +0100, Florian Pflug wrote: On Jan 13, 2011, at 21:01, Aidan Van Dyk wrote: On Thu, Jan 13, 2011 at 2:53 PM, Robert Haas robertmh...@gmail.com wrote: I'm not convinced. I was thinking that we could simply treat it like SIGQUIT, if it's available. I doubt there's a real use case for continuing to run queries after the postmaster and all the background processes are dead. Expedited death seems like much better behavior. Even checking PostmasterIsAlive() once per query would be reasonable, except that it'd add a system call to check for a condition that almost never holds, which I'm not eager to do. If postmaster has a few fds to spare, what about having it open a pipe to every child it spawns. It never has to read/write to it, but postmaster closing will signal the client's fd. The client just has to pop the fd into whatever normal poll/select event handling it uses to notice when the parent's pipe is closed. I just started to experiment with that idea, and wrote a small test program to check if that'd work. I'll post the results when I'm done. Great! :) Cheers, David. -- David Fetter da...@fetter.org http://fetter.org/
Re: [HACKERS] kill -KILL: What happens?
On Thu, Jan 13, 2011 at 3:37 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: I strongly believe you're in the minority on that one, for the same reasons that I don't think most people would agree with your notion of what should be the default shutdown mode. A database that can't accept new connections is a liability, not an asset. Killing active sessions when it's not absolutely necessary is not an asset. That's a highly arguable point and I certainly don't agree with it. A database with no postmaster and no background processes can't possibly be expected to function in any sort of halfway reasonable way. In particular: 1. No checkpoints will occur, so the time required for recovery will grow longer without bound. 2. All walsenders will exit, so no transactions will be replicated to standbys. 3. Transactions committed asynchronously won't be flushed to disk, and are lost entirely unless enough other WAL activity occurs before the last backend dies to force a WAL write. 4. Autovacuum won't run until the system is properly restarted, and to make matters worse there's no statistics collector, so the information that might trigger a later run will be lost also. 5. At some point, you'll run out of clean buffers, after which performance will start to suck as backends have to do their own writes. 6. At some probably later point, the fsync request queue will fill up, after which performance will go into the toilet. On 9.1devel, this takes less than a minute of moderate activity on my MacOS X machine. All in all, running for any significant period of time in this state is likely a recipe for disaster, even if for some inexplicable reason you don't care about the fact that the system won't accept any new connections. 
-- Robert Haas
Re: [HACKERS] Allowing multiple concurrent base backups
On 13.01.2011 22:57, Josh Berkus wrote: On 1/13/11 12:11 PM, Robert Haas wrote: That's going to depend on the situation. If the database fits in memory, then it's just going to work. If it fits on disk, it's less obvious whether it'll be good or bad, but an arbitrary limitation here doesn't serve us well. FWIW, if we had this feature right now in 9.0 we (PGX) would be using it. We run into the case of DB in memory, multiple slaves fairly often these days. Anyway, here's an updated patch with all the known issues fixed. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index 5b6a230..400e12e 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -60,8 +60,6 @@ /* File path names (all relative to $PGDATA) */ -#define BACKUP_LABEL_FILE backup_label -#define BACKUP_LABEL_OLD backup_label.old #define RECOVERY_COMMAND_FILE recovery.conf #define RECOVERY_COMMAND_DONE recovery.done @@ -338,7 +336,8 @@ typedef struct XLogCtlInsert XLogPageHeader currpage; /* points to header of block in cache */ char *currpos; /* current insertion point in cache */ XLogRecPtr RedoRecPtr; /* current redo point for insertions */ - bool forcePageWrites; /* forcing full-page writes for PITR? */ + int forcePageWrites; /* forcing full-page writes for PITR? */ + bool exclusiveBackup; /* a backup was started with pg_start_backup() */ } XLogCtlInsert; /* @@ -8313,16 +8312,38 @@ pg_start_backup(PG_FUNCTION_ARGS) backupidstr = text_to_cstring(backupid); - startpoint = do_pg_start_backup(backupidstr, fast); + startpoint = do_pg_start_backup(backupidstr, fast, NULL); snprintf(startxlogstr, sizeof(startxlogstr), %X/%X, startpoint.xlogid, startpoint.xrecoff); PG_RETURN_TEXT_P(cstring_to_text(startxlogstr)); } +/* + * do_pg_start_backup is the workhorse of the user-visible pg_start_backup() + * function. 
It creates the necessary starting checkpoint and constructs the
+ * backup label file.
+ *
+ * There are two kinds of backups: exclusive and non-exclusive. An exclusive
+ * backup is started with pg_start_backup(), and there can be only one active
+ * at a time. The backup label file of an exclusive backup is written to
+ * $PGDATA/backup_label, and it is removed by pg_stop_backup().
+ *
+ * A non-exclusive backup is used for the streaming base backups (see
+ * src/backend/replication/basebackup.c). The difference from exclusive backups
+ * is that the backup label file is not written to disk. Instead, its would-be
+ * contents are returned in *labelfile, and the caller is responsible for
+ * including it in the backup archive as 'backup_label'. There can be many
+ * non-exclusive backups active at the same time, and they don't conflict
+ * with exclusive backups either.
+ *
+ * Every successfully started non-exclusive backup must be stopped by calling
+ * do_pg_stop_backup() or do_pg_abort_backup().
+ */
 XLogRecPtr
-do_pg_start_backup(const char *backupidstr, bool fast)
+do_pg_start_backup(const char *backupidstr, bool fast, char **labelfile)
 {
+	bool		exclusive = (labelfile == NULL);
 	XLogRecPtr	checkpointloc;
 	XLogRecPtr	startpoint;
 	pg_time_t	stamp_time;
@@ -8332,6 +8353,7 @@ do_pg_start_backup(const char *backupidstr, bool fast)
 	uint32		_logSeg;
 	struct stat stat_buf;
 	FILE	   *fp;
+	StringInfoData labelfbuf;
 	if (!superuser() && !is_authenticated_user_replication_role())
 		ereport(ERROR,
@@ -8350,6 +8372,12 @@ do_pg_start_backup(const char *backupidstr, bool fast)
 				 errmsg("WAL level not sufficient for making an online backup"),
 				 errhint("wal_level must be set to \"archive\" or \"hot_standby\" at server start.")));
+	if (strlen(backupidstr) > MAXPGPATH)
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				 errmsg("backup label too long (max %d bytes)",
+						MAXPGPATH)));
+
 	/*
 	 * Mark backup active in shared memory.
We must do full-page WAL writes
 	 * during an on-line backup even if not doing so at other times, because
@@ -8368,15 +8396,19 @@ do_pg_start_backup(const char *backupidstr, bool fast)
 	 * ensure adequate interlocking against XLogInsert().
 	 */
 	LWLockAcquire(WALInsertLock, LW_EXCLUSIVE);
-	if (XLogCtl->Insert.forcePageWrites)
+	if (exclusive)
 	{
-		LWLockRelease(WALInsertLock);
-		ereport(ERROR,
-				(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
-				 errmsg("a backup is already in progress"),
-				 errhint("Run pg_stop_backup() and try again.")));
+		if (XLogCtl->Insert.exclusiveBackup)
+		{
+			LWLockRelease(WALInsertLock);
+			ereport(ERROR,
+					(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+					 errmsg("a backup is already in progress"),
+					 errhint("Run pg_stop_backup() and try again.")));
+		}
+		XLogCtl->Insert.exclusiveBackup = true;
 	}
-	XLogCtl->Insert.forcePageWrites = true;
+	XLogCtl->Insert.forcePageWrites++;
 	LWLockRelease(WALInsertLock);
 	/*
@@ -8393,7 +8425,7 @@
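The key concurrency change in the hunk above is turning forcePageWrites from a bool into a counter, with a separate flag for the single allowed exclusive backup. As a standalone illustration of that bookkeeping (the BackupState struct and function names here are illustrative only, not the actual shared-memory layout or locking):

```c
#include <stdbool.h>

/* Illustrative stand-in for the relevant XLogCtlInsert fields. */
typedef struct
{
	int		forcePageWrites;	/* > 0 while any backup is in progress */
	bool	exclusiveBackup;	/* an exclusive pg_start_backup() is active */
} BackupState;

/* Returns false if an exclusive backup is already running. */
static bool
start_backup(BackupState *s, bool exclusive)
{
	if (exclusive)
	{
		if (s->exclusiveBackup)
			return false;		/* "a backup is already in progress" */
		s->exclusiveBackup = true;
	}
	s->forcePageWrites++;		/* each active backup holds one reference */
	return true;
}

static void
stop_backup(BackupState *s, bool exclusive)
{
	if (exclusive)
		s->exclusiveBackup = false;
	s->forcePageWrites--;		/* full-page writes stay forced until zero */
}
```

Any number of non-exclusive (streaming) backups can overlap, and full-page writes remain forced as long as the counter is non-zero; only a second exclusive backup is rejected.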
Re: [HACKERS] reviewers needed!
On Tue, Jan 11, 2011 at 9:17 PM, Robert Haas robertmh...@gmail.com wrote: [ abject plea for reviewers ] So far I have 6 people who have volunteered to be round-robin reviewers, and 7 people who are listed as reviewers on the CF site already. That leaves 45 patches without a reviewer, plus whatever comes in in the next day or so. This is not going to work unless a lot more people pitch in. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] reviewers needed!
So far I have 6 people who have volunteered to be round-robin reviewers, and 7 people who are listed as reviewers on the CF site already. That leaves 45 patches without a reviewer, plus whatever comes in in the next day or so. This is not going to work unless a lot more people pitch in. I'll be joining as an RRR on the 24th. I'm too booked before then. -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com
Re: [HACKERS] kill -KILL: What happens?
On Thu, Jan 13, 2011 at 02:21:44PM -0500, Tom Lane wrote: David Fetter da...@fetter.org writes: I get that we can't prevent all pilot error, but I was hoping we could bullet-proof this a little more, especially in light of a certain extremely popular server OS's OOM killer's default behavior. Yes, I get that that behavior is crazy, and stupid, and that people should shut it off, but it *is* our problem if we let the postmaster start (or continue) when it's set that way. Packagers who are paying attention have fixed that ;-) Are we privileging packaged over unpackaged? Some distro over others? ;) Cheers, David. -- David Fetter da...@fetter.org http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
[HACKERS] auto-sizing wal_buffers
On Thu, Jan 6, 2011 at 11:37 PM, Greg Smith g...@2ndquadrant.com wrote: If it defaulted to 3% of shared_buffers, min 64K max 16MB for the auto setting, it would for the most part become an autotuned parameter. That would make it 0.75 to 1MB at the standard anemic Linux default kernel parameters. Maybe more than some would like, but dropping shared_buffers from 24MB to 23MB to keep this from being ridiculously undersized is probably a win. That percentage would reach 16MB by the time shared_buffers was increased to 533MB, which also seems about right to me. On a really bad setup (brief pause to flip off Apple) with only 4MB to work with total, you'd end up with wal_buffers between 64 and 128K, so very close to the status quo. Code that up, and we could probably even remove the parameter as a tunable altogether. Very few would see a downside relative to any sensible configuration under the current situation, and many people would notice better automagic performance with one less parameter to tweak. Given the recent investigations about the serious downsides of tiny wal_buffers values on new Linux kernels when using open_datasync, a touch more aggression about this setting seems particularly appropriate to consider now. That's been swapped out as the default, but it's still possible people will switch to it. Would anyone like to argue vigorously for or against the above proposal? I'll start: I think this is a good idea. I don't have a strong opinion on whether the exact details Greg proposes above are precisely optimal, but I think they're in the right ballpark. Furthermore, we already have other things that are tuned in somewhat similar ways (e.g. the size of the fsync request queue defaults to the number of shared buffers) so there's precedent for it. It's one less parameter that you have to set to make things just work.
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
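For concreteness, the rule being proposed — 3% of shared_buffers, clamped between 64kB and 16MB — can be sketched as a small helper (hypothetical code, not actual PostgreSQL source; sizes are in bytes):

```c
/* Hypothetical sketch of Greg Smith's autotune rule:
 * wal_buffers = 3% of shared_buffers, clamped to [64kB, 16MB]. */
static long
autotune_wal_buffers(long shared_buffers)
{
	long	wal_buffers = shared_buffers / 100 * 3;

	if (wal_buffers < 64L * 1024)
		wal_buffers = 64L * 1024;			/* floor: 64kB */
	if (wal_buffers > 16L * 1024 * 1024)
		wal_buffers = 16L * 1024 * 1024;	/* ceiling: 16MB */
	return wal_buffers;
}
```

With 24MB of shared_buffers this yields roughly 0.72MB, matching the "0.75 to 1MB" figure quoted above, and the 16MB ceiling is reached at about 533MB of shared_buffers.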
Re: [HACKERS] auto-sizing wal_buffers
On Thu, Jan 13, 2011 at 23:19, Robert Haas robertmh...@gmail.com wrote: Would anyone like to argue vigorously for or against the above proposal? I'll start: I think this is a good idea. [...] It's one less parameter that you have to set to make things just work. +1, I like the idea. Would it still be there to override if necessary?
-- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [HACKERS] auto-sizing wal_buffers
On Thu, Jan 13, 2011 at 5:29 PM, Magnus Hagander mag...@hagander.net wrote: +1, I like the idea. Would it still be there to override if necessary? Depends what people want to do. We could make the default 0kB, and define that to mean auto-tune, or we could remove the parameter altogether. I think I was envisioning the latter, but if people are hesitant to do that we could do the former instead. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] kill -KILL: What happens?
On Jan13, 2011, at 21:42 , Tom Lane wrote: Aidan Van Dyk ai...@highrise.ca writes: If postmaster has a few fds to spare, what about having it open a pipe to every child it spawns. It never has to read/write to it, but postmaster closing will signal the client's fd. The client just has to pop the fd into whatever normal poll/select event handling it uses to notice when the parent's pipe is closed. Hmm. Or more generally: there's one FIFO. The postmaster holds both sides open. Backends hold the write side open. (They can close the read side, but that would just be to free up a FD.) Background children close the write side. Now a background process can use EOF on the read side of the FIFO to tell it that postmaster and all backends have exited. You still don't get a signal, but at least the condition you're testing for is the one we actually want and not an approximation. I was thinking along a similar line, and put together a small test case to prove that this actually works. The attached test program simulates the interactions of a parent process (think postmaster), some utility processes (think walwriter, bgwriter, ...) and some backends. It uses two pairs of fds created with pipe(), called LifeSignParent and LifeSignParentBackends. The writing end of the former is held open only in the parent process, while the writing end of the latter is held open in the parent process and all regular backend processes. Backend processes use select() to monitor the reading end of the LifeSignParent fd pair. Since nothing is ever written to the writing end, the fd becomes readable only when the parent exits, because that is how select() signals EOF. Once that happens the backend exits. The utility processes do the same, but monitor the reading end of LifeSignParentBackends, and thus exit only after the parent and all regular backends have died. Since the lifesign checking uses select(), any place that already uses select can easily check for vanishing life signs.
CHECK_FOR_INTERRUPTS could simply check the life sign once every few seconds. If we want an absolutely reliable signal instead of checking in CHECK_FOR_INTERRUPTS, every backend would need to launch a monitor subprocess which monitors the life sign, and exits once it vanishes. The backend would then get a SIGCHLD once the postmaster dies. Seems like overkill, though. The whole thing won't work on Windows, since even if it's got a pipe() or socketpair() call, with EXEC_BACKEND there's no way of transferring these fds to the child processes. AFAIK, however, Windows has other means with which such life signs can be implemented. For example, I seem to remember that WaitForMultipleObjects() can be used to wait for process-related events. But Windows really isn't my area of expertise... I have tested this on the latest Ubuntu LTS release (10.04.1) as well as Mac OS X 10.6.6, and it seems to work correctly on both systems. I'd be happy to hear from anyone who has access to other systems on whether this works or not. The expected output is
Launched utility 5095
Launched backend 5097
Launched utility 5096
Launched backend 5099
Launched backend 5098
Utility 5095 detected live parent or backend
Backend 5097 detected live parent
Utility 5096 detected live parent or backend
Backend 5099 detected live parent
Backend 5098 detected live parent
Parent exiting
Backend 5097 exiting after parent died
Backend 5098 exiting after parent died
Backend 5099 exiting after parent died
Utility 5096 exiting after parent and backends died
Utility 5095 exiting after parent and backends died
Everything after "Parent exiting" might be interleaved with a shell prompt, of course. best regards, Florian Pflug liveness.c Description: Binary data
Re: [HACKERS] auto-sizing wal_buffers
Depends what people want to do. We could make the default 0kB, and define that to mean auto-tune, or we could remove the parameter altogether. I think I was envisioning the latter, but if people are hesitant to do that we could do the former instead. Unfortunately, we might still need a manual parameter for override because of the interaction between wal_buffers and synchronous_commit=off, since it sets the max size of the unflushed data buffer. Discuss? And the auto setting should be -1, not 0kB. We use -1 for "use default" for several other GUCs. Other than that, I think Greg's numbers are fine, and strongly support having one less thing to tune. -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com
Re: [HACKERS] auto-sizing wal_buffers
On Thu, Jan 13, 2011 at 6:02 PM, Josh Berkus j...@agliodbs.com wrote: Depends what people want to do. We could make the default 0kB, and define that to mean auto-tune, or we could remove the parameter altogether. I think I was envisioning the latter, but if people are hesitant to do that we could do the former instead. Unfortunately, we might still need a manual parameter for override because of the interaction between wal_buffers and synchronous_commit=off, since it sets the max size of the unflushed data buffer. Discuss? Do we have any evidence there's actually a problem in that case, or that a larger value of wal_buffers solves it? I mean, the background writer is going to start a background flush as quickly as it can... And the auto setting should be -1, not 0kB. We use -1 for "use default" for several other GUCs. No can do. Gotta have things in the same units. Other than that, I think Greg's numbers are fine, and strongly support having one less thing to tune. OK. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
[HACKERS] Do magic using pg_depend
https://github.com/gluefinance/pov/blob/master/doc/example_database.sql Please feel free to put any of this on the PostgreSQL wiki as suggested by David Fetter. This is an example of some functionality provided and used by pov (PostgreSQL Object version control system). Most of, if not all, this stuff has already been implemented in pg_dump, but since pov is a SQL-based system it was necessary to implement the same functionality using only sql/plpgsql/plperl. Author: Joel Jacobson, Glue Finance AB, Sweden, j...@gluefinance.com Datestamp: 2011-01-13 23:42 Europe/Stockholm License: MIT (http://www.opensource.org/licenses/mit-license.php) We will learn how to do a lot of PostgreSQL-magic only by using the nice system table pg_depend. Today we will: a) create nice directed graphs of all object dependencies, b) sort all objects in a truly sorted topological creatable order, c) show create/drop commands for most of the objects. Let the show begin! Installation:
$ git clone g...@github.com:gluefinance/pov.git
$ cd pov
$ sh install_example_database.sh
a) Generate directed graph in DOT format.
COPY (SELECT diagraph FROM pov.pg_depend_dot) TO '/tmp/example_database.dot';
Then use dot (http://www.graphviz.org/) to generate graphs in svg, png, or any other format.
dot -oexample_database.png -Tpng example_database.dot
dot -oexample_database.svg -Tsvg example_database.dot
Or view it in the SQL prompt:
test=# select * from pov.pg_depend_dot;
diagraph
-
digraph pg_depend {
"function plpgsql_call_handler() 1255.11599.0" -> "language plpgsql 2612.11602.0" [color=black label="n"]
"function plpgsql_inline_handler(internal) 1255.11600.0" -> "language plpgsql 2612.11602.0" [color=black label="n"]
"function plpgsql_validator(oid) 1255.11601.0" -> "language plpgsql 2612.11602.0" [color=black label="n"]
"function plperl_call_handler() 1255.23562.0" -> "language plperl 2612.23565.0" [color=black label="n"]
"function plperl_inline_handler(internal) 1255.23563.0" -> "language plperl 2612.23565.0" [color=black label="n"]
"function plperl_validator(oid) 1255.23564.0" -> "language plperl 2612.23565.0" [color=black label="n"]
"function f1(integer) 1255.23656.0" -> "view v4 1259.23688.0" [color=black label="n"]
"function f1(integer) 1255.23656.0" -> "constraint t3_id_check on table t3 2606.23673.0" [color=black label="n"]
"table t1 1259.23651.0" -> "table t1 column id 1259.23651.1" [color=yellow label="an"]
"table t1 column id 1259.23651.1" -> "view v1 1259.23676.0" [color=black label="n"]
"table t1 column id 1259.23651.1" -> "constraint t1_pkey on table t1 2606.23655.0" [color=blue label="a"]
"table t1 column id 1259.23651.1" -> "constraint t2_id_fkey on table t2 2606.23664.0" [color=black label="n"]
"sequence s1 1259.23657.0" -> "default for table t3 column id 2604.23672.0" [color=black label="n"]
"table t2 1259.23659.0" -> "table t2 column id 1259.23659.1" [color=yellow label="an"]
"table t2 column id 1259.23659.1" -> "view v2 1259.23680.0" [color=black label="n"]
"table t2 column id 1259.23659.1" -> "constraint t2_pkey on table t2 2606.23663.0" [color=blue label="a"]
"table t2 column id 1259.23659.1" -> "constraint t2_id_fkey on table t2 2606.23664.0" [color=blue label="a"]
"table t3 1259.23669.0" -> "table t3 column id 1259.23669.1" [color=yellow label="an"]
"table t3 column id 1259.23669.1" -> "default for table t3 column id 2604.23672.0" [color=blue label="a"]
"table t3 column id 1259.23669.1" -> "constraint t3_id_check on table t3 2606.23673.0" [color=yellow label="na"]
"table t3 column id 1259.23669.1" -> "constraint t3_pkey on table t3 2606.23675.0" [color=blue label="a"]
"view v1 1259.23676.0" -> "view v1 column id 1259.23676.1" [color=black label="n"]
"view v1 column id 1259.23676.1" -> "view v3 1259.23684.0" [color=black label="n"]
"view v2 1259.23680.0" -> "view v2 column id 1259.23680.1" [color=black label="n"]
"view v2 column id 1259.23680.1" -> "view v3 1259.23684.0" [color=black label="n"]
"view v3 1259.23684.0" -> "view v3 column id1 1259.23684.1" [color=black label="n"]
"view v3 1259.23684.0" -> "view v3 column id2 1259.23684.2" [color=black label="n"]
"view v3 column id1 1259.23684.1" -> "view v4 1259.23688.0" [color=black label="n"]
"view v3 column id2 1259.23684.2" -> "view v4 1259.23688.0" [color=black label="n"]
"constraint t1_pkey on table t1 2606.23655.0" -> "constraint t2_id_fkey on table t2 2606.23664.0" [color=black label="n"]
"schema public 2615.2200.0" -> "function f1(integer) 1255.23656.0" [color=black label="n"]
"schema public 2615.2200.0" -> "table t1 1259.23651.0" [color=black label="n"]
"schema public 2615.2200.0" -> "sequence s1 1259.23657.0" [color=black label="n"]
"schema public 2615.2200.0" -> "table t2 1259.23659.0" [color=black label="n"]
"schema public 2615.2200.0" -> "table t3 1259.23669.0" [color=black label="n"]
"schema public 2615.2200.0" -> "view v1
Re: [HACKERS] auto-sizing wal_buffers
Robert, Unfortunately, we might still need a manual parameter for override because of the interaction between wal_buffers and synchronous_commit=off, since it sets the max size of the unflushed data buffer. Discuss? Do we have any evidence there's actually a problem in that case, or that a larger value of wal_buffers solves it? I mean, the background writer is going to start a background flush as quickly as it can... I don't think anyone has done any testing. However, the setting is there and some users might be convinced that they need it. And the auto setting should be -1, not 0kB. We use -1 for "use default" for several other GUCs. No can do. Gotta have things in the same units. That's certainly not true with, for example, log_temp_files. -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com
Re: [HACKERS] auto-sizing wal_buffers
Robert Haas robertmh...@gmail.com wrote: Would anyone like to argue vigorously for or against the above proposal? Greg's numbers look reasonable to me, and there's nobody I'd trust more to come up with reasonable numbers for this. One less tunable is a good thing, especially since this is designed to scale from someone slapping it on his laptop for a first quick try, all the way up to industrial strength production environments. I guess a manual override doesn't bother me too much, but I am a bit dubious of its value, and there is value in keeping the GUC count down. -Kevin
Re: [HACKERS] kill -KILL: What happens?
On Thu, 2011-01-13 at 11:14 -0800, David Fetter wrote: I get that we can't prevent all pilot error, but I was hoping we could bullet-proof this a little more, especially in light of a certain extremely popular server OS's OOM killer's default behavior. That's a good point. I'm not sure how much action can reasonably be taken, however. Yes, I get that that behavior is crazy, and stupid, and that people should shut it off, but it *is* our problem if we let the postmaster start (or continue) when it's set that way. As an aside, linux has actually changed the heuristic: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=a63d83f427fbce97a6cea0db2e64b0eb8435cd10 Regards, Jeff Davis
Re: [HACKERS] kill -KILL: What happens?
On Thu, Jan 13, 2011 at 03:29:13PM -0800, Jeff Davis wrote: On Thu, 2011-01-13 at 11:14 -0800, David Fetter wrote: I get that we can't prevent all pilot error, but I was hoping we could bullet-proof this a little more, especially in light of a certain extremely popular server OS's OOM killer's default behavior. That's a good point. I'm not sure how much action can reasonably be taken, however. We may find out from Florian's experiments :) Yes, I get that that behavior is crazy, and stupid, and that people should shut it off, but it *is* our problem if we let the postmaster start (or continue) when it's set that way. As an aside, linux has actually changed the heuristic: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=a63d83f427fbce97a6cea0db2e64b0eb8435cd10 Great! In a decade or so, no more servers will be running with an earlier kernel ;) Cheers, David. -- David Fetter da...@fetter.org http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
Re: [HACKERS] Error code for terminating connection due to conflict with recovery
On Thu, Jan 13, 2011 at 2:13 AM, Tatsuo Ishii is...@postgresql.org wrote: Ok. Here is the patch for this. I use 40P02, instead of 40004. Please add this to the currently open CommitFest: https://commitfest.postgresql.org/action/commitfest_view/open Done. Comments are welcome. Unless there's objection, I will commit it this weekend. -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese: http://www.sraoss.co.jp
Re: [HACKERS] Error code for terminating connection due to conflict with recovery
Tatsuo Ishii is...@postgresql.org writes: Please add this to the currently open CommitFest: https://commitfest.postgresql.org/action/commitfest_view/open Done. Comments are welcome. Unless there's objection, I will commit it this weekend. If you're expecting anyone to actually *review* it during the CF, that's a bit premature. regards, tom lane
Re: [HACKERS] arrays as pl/perl input arguments [PATCH]
On Thu, Jan 13, 2011 at 2:06 AM, Martijn van Oosterhout klep...@svana.org wrote: I played with this a little and it is fairly easy to make a variable such that $a is the string representation and $a[0] the first value of the array. The problem is that you can't pass such a variable into a subroutine. I played with this too:

#!/usr/bin/perl -w
use strict;

package Pg::ArrayArg;

use overload '""' => \&as_s, '@{}' => \&as_a;

sub new {
    my $proto = shift;
    my $class = ref $proto || $proto;
    bless { string => shift, array => shift }, $class;
}

sub as_s { shift->{ 'string' }; }
sub as_a { shift->{ 'array' }; }

package main;

my $aa = Pg::ArrayArg->new( '{1,2}', [ 1, 2 ] );
printf "ref = %s\n", ref $aa;
print "string = $aa\n";
printf "string = %s\n", $aa;
printf "array index = (%s, %s)\n", $aa->[ 0 ], $aa->[ 1 ];
printf "array_ref = %s\n", scalar @$aa;
print "regexp test = ";
if ($aa =~ /^{(.*)}$/) {
    print "looks like array\n";
    printf "join of split = %s\n", join ';', split /,/, $1;
} else {
    print "doesn't look like array\n";
}

Suppose one of these compatibility objects is passed into legacy code as $_[0]. The problem is that 'ref $_[0]' will return 'Pg::ArrayArg' instead of what it used to, '' (empty string). Other than that, I think it performs as people would expect. You could even change 'as_s' to generate the string on the fly as requested instead of generating both representations at instantiation. Just my $0.02.
Re: [HACKERS] SSI patch version 8
Anssi Kääriäinen anssi.kaariai...@thl.fi wrote: I think I found a problem. This is using SSI v8. The table definition:
create table test_t (id integer, val1 text, val2 integer);
create index test_idx on test_t(id) where val2 = 1;
insert into test_t (select generate_series(0, 1), 'a', 2);
insert into test_t (select generate_series(0, 10), 'a', 1);
T1:
hot2= begin transaction isolation level serializable;
hot2= select * from test_t where val2 = 1;
hot2= update test_t set val2 = 2 where val2 = 1 and id = 10;
T2:
hot2= begin transaction isolation level serializable;
hot2= select * from test_t where val2 = 1;
hot2= update test_t set val2 = 2 where val2 = 1 and id = 9;
hot2= commit;
T1:
hot2= commit;
I hope you have no objection to having the code you wrote included in the test suite which is part of the patch. Well, if you do, I'll pull it back out and invent something similar... ;-) http://git.postgresql.org/gitweb?p=users/kgrittn/postgres.git;a=commitdiff;h=2502cccbdd5e5d44be469549b91fe49c0554ec3e -Kevin
Re: [HACKERS] arrays as pl/perl input arguments [PATCH]
On Jan 13, 2011, at 4:15 PM, Stephen J. Butler wrote: Suppose one of these compatibility objects is passed into legacy code as $_[0]. The problem is that 'ref $_[0]' will return 'Pg::ArrayArg' instead of what it used to, '' (empty string). Other than that, I think it performs as people would expect. Well, frankly, since up to this patch you *never* got an ARRAY reference argument, who would be calling `ref` on it anyway? You could even change 'as_s' to generate the string on the fly as requested instead of generating both representations at instantiation. Yep. Best, David
Re: [HACKERS] auto-sizing wal_buffers
Robert Haas robertmh...@gmail.com writes: On Thu, Jan 13, 2011 at 5:29 PM, Magnus Hagander mag...@hagander.net wrote: +1, I like the idea. Would it still be there to override if necessary? Depends what people want to do. We could make the default 0kB, and define that to mean auto-tune, or we could remove the parameter altogether. I think I was envisioning the latter, but if people are hesitant to do that we could do the former instead. I think we need to keep the override capability until the autotune algorithm has proven itself in the field for a couple of years. I agree with Josh that a negative value should be used to select the autotune method. regards, tom lane
Re: [HACKERS] Error code for terminating connection due to conflict with recovery
Tatsuo Ishii is...@postgresql.org writes: Please add this to the currently open CommitFest: https://commitfest.postgresql.org/action/commitfest_view/open Done. Comments are welcome. Unless there's objection, I will commit it this weekend. If you're expecting anyone to actually *review* it during the CF, that's a bit premature. No problem waiting longer. I will wait until the end of January for now. -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese: http://www.sraoss.co.jp
Re: [HACKERS] kill -KILL: What happens?
Robert Haas robertmh...@gmail.com writes: On Thu, Jan 13, 2011 at 3:37 PM, Tom Lane t...@sss.pgh.pa.us wrote: Killing active sessions when it's not absolutely necessary is not an asset. That's a highly arguable point and I certainly don't agree with it. Your examples appear to rely on the assumption that background processes exit instantly when the postmaster dies. Which they should not. regards, tom lane
Re: [HACKERS] kill -KILL: What happens?
On Jan14, 2011, at 01:32 , Tom Lane wrote:
> Robert Haas robertmh...@gmail.com writes:
>> On Thu, Jan 13, 2011 at 3:37 PM, Tom Lane t...@sss.pgh.pa.us wrote:
>>> Killing active sessions when it's not absolutely necessary is not an
>>> asset.
>> That's a highly arguable point and I certainly don't agree with it.
>
> Your examples appear to rely on the assumption that background processes
> exit instantly when the postmaster dies. Which they should not.

Even if they stay around, no new connections will be possible once the
postmaster is gone. So this really comes down to what somebody perceives
to be a bigger problem - new connections failing or existing connections
being terminated. I don't believe there's one right answer to that.

Assume postgres is driving a website, and the postmaster crashes shortly
after a pg_dump run started. You probably won't want your website to be
offline while pg_dump is finishing its backup. If, on the other hand, your
data warehousing database is running a multi-hour query, you might prefer
that query to finish, even at the price of not being able to accept new
connections.

So maybe there should be a GUC for this?

best regards,
Florian Pflug
Re: [HACKERS] kill -KILL: What happens?
Florian Pflug f...@phlo.org writes:
> I don't believe there's one right answer to that.

Right. Force-kill presumes there is only one right answer.

> Assume postgres is driving a website, and the postmaster crashes shortly
> after a pg_dump run started. You probably won't want your website to be
> offline while pg_dump is finishing its backup. If, on the other hand,
> your data warehousing database is running a multi-hour query, you might
> prefer that query to finish, even at the price of not being able to
> accept new connections. So maybe there should be a GUC for this?

No need (and rather inflexible anyway). If you don't want an orphaned
backend to continue, you send it SIGTERM.

			regards, tom lane
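Tom's SIGTERM suggestion can be exercised against any process; here is a
minimal shell sketch in which a background `sleep` stands in for an
orphaned backend (with a real backend you would take the PID from
pg_stat_activity or `ps` instead):

```shell
# Stand-in for an orphaned backend: a background process whose PID we know.
sleep 60 &
pid=$!

# Ask it to shut down, as you would an orphaned backend.
kill -TERM "$pid"

# Reap it; an exit status of 143 (128 + 15) means "terminated by SIGTERM".
wait "$pid"
status=$?
echo "exit status: $status"
```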
Re: [HACKERS] kill -KILL: What happens?
On Thu, Jan 13, 2011 at 7:32 PM, Tom Lane t...@sss.pgh.pa.us wrote:
> Robert Haas robertmh...@gmail.com writes:
>> On Thu, Jan 13, 2011 at 3:37 PM, Tom Lane t...@sss.pgh.pa.us wrote:
>>> Killing active sessions when it's not absolutely necessary is not an
>>> asset.
>> That's a highly arguable point and I certainly don't agree with it.
>
> Your examples appear to rely on the assumption that background processes
> exit instantly when the postmaster dies. Which they should not.

But they do.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] kill -KILL: What happens?
On Thu, Jan 13, 2011 at 8:10 PM, Tom Lane t...@sss.pgh.pa.us wrote:
> Florian Pflug f...@phlo.org writes:
>> I don't believe there's one right answer to that.
>
> Right. Force-kill presumes there is only one right answer.
>
>> Assume postgres is driving a website, and the postmaster crashes shortly
>> after a pg_dump run started. You probably won't want your website to be
>> offline while pg_dump is finishing its backup. If, on the other hand,
>> your data warehousing database is running a multi-hour query, you might
>> prefer that query to finish, even at the price of not being able to
>> accept new connections. So maybe there should be a GUC for this?
>
> No need (and rather inflexible anyway). If you don't want an orphaned
> backend to continue, you send it SIGTERM.

It is not easy to make this work in such a way that you can ensure a
clean, automatic restart of PostgreSQL after a postmaster death. Which is
what at least some people want.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] kill -KILL: What happens?
Robert Haas robertmh...@gmail.com writes:
> On Thu, Jan 13, 2011 at 8:10 PM, Tom Lane t...@sss.pgh.pa.us wrote:
>> Florian Pflug f...@phlo.org writes:
>>> So maybe there should be a GUC for this?
>> No need (and rather inflexible anyway). If you don't want an orphaned
>> backend to continue, you send it SIGTERM.
>
> It is not easy to make this work in such a way that you can ensure a
> clean, automatic restart of PostgreSQL after a postmaster death. Which
> is what at least some people want.

True. It strikes me also that the postmaster does provide some services
other than accepting new connections:

* ensuring that everybody gets killed if a backend crashes

* respawning autovac launcher and other processes that might exit
  harmlessly

* is there still any cross-backend signaling that goes through the
  postmaster? We got rid of the sinval case, but I don't recall if
  there's others.

While you could probably live without these in the scenario of "let my
honking big query finish before restarting", you would not want to do
without them in unattended operation.

			regards, tom lane
Re: [HACKERS] kill -KILL: What happens?
On Thu, Jan 13, 2011 at 8:28 PM, Tom Lane t...@sss.pgh.pa.us wrote:
> Robert Haas robertmh...@gmail.com writes:
>> On Thu, Jan 13, 2011 at 8:10 PM, Tom Lane t...@sss.pgh.pa.us wrote:
>>> Florian Pflug f...@phlo.org writes:
>>>> So maybe there should be a GUC for this?
>>> No need (and rather inflexible anyway). If you don't want an orphaned
>>> backend to continue, you send it SIGTERM.
>> It is not easy to make this work in such a way that you can ensure a
>> clean, automatic restart of PostgreSQL after a postmaster death. Which
>> is what at least some people want.
>
> True. It strikes me also that the postmaster does provide some services
> other than accepting new connections:
>
> * ensuring that everybody gets killed if a backend crashes
>
> * respawning autovac launcher and other processes that might exit
>   harmlessly
>
> * is there still any cross-backend signaling that goes through the
>   postmaster? We got rid of the sinval case, but I don't recall if
>   there's others.
>
> While you could probably live without these in the scenario of "let my
> honking big query finish before restarting", you would not want to do
> without them in unattended operation.

Yep. I'm pretty doubtful that you're going to want to do without them
even in that case, but you're surely not going to want to do without them
in unattended operation.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Fixing GIN for empty/null/full-scan cases
Robert Haas wrote:
> On Tue, Jan 4, 2011 at 4:49 PM, Tom Lane t...@sss.pgh.pa.us wrote:
>> Robert Haas robertmh...@gmail.com writes:
>>> On Tue, Jan 4, 2011 at 4:09 PM, Tom Lane t...@sss.pgh.pa.us wrote:
>>>> * Existing GIN indexes are upwards compatible so far as on-disk
>>>> storage goes, but they will of course be missing entries for empty,
>>>> null, or null-containing items. Users who want to do searches that
>>>> should find such items will need to reindex after updating to 9.1.
>>>
>>> This is the only part of this proposal that bothers me a little bit.
>>> It would be nice if the system could determine whether a GIN index is
>>> upgraded from 9.0 or earlier and thus doesn't contain these entries -
>>> and avoid trying to use the index for these sorts of queries in cases
>>> where it might return wrong answers.
>>
>> I don't think it's really worth the trouble. The GIN code has been
>> broken for these types of queries since day one, and yet we've had only
>> maybe half a dozen complaints about it. Moreover there's no practical
>> way to avoid trying to use the index, since in many cases the fact that
>> a query requires a full-index scan isn't determinable at plan time. The
>> best we could really do is throw an error at indexscan start, and that
>> doesn't seem all that helpful. But it probably wouldn't take much code
>> either, if you're satisfied with that answer. (I'm envisioning adding a
>> version ID to the GIN metapage and then checking that before proceeding
>> with a full-index scan.)
>
> I'd be satisfied with that answer. It at least makes it a lot more clear
> when you've got a problem. If this were a more common scenario, I'd
> probably advocate for a better solution, but the one you propose seems
> adequate given the frequency of the problem as you describe it.

What does pg_upgrade need to do about this for 9.1? Just tell people they
might get a GIN error someday?

--
Bruce Momjian br...@momjian.us  http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true.
+
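The reindexing step mentioned at the top of the thread is an ordinary
index rebuild; a minimal sketch, with hypothetical index and table names:

```sql
-- Rebuild a single pre-9.1 GIN index so that empty, null, and
-- null-containing items become searchable (name is hypothetical):
REINDEX INDEX my_gin_index;

-- Or rebuild all indexes on the table in one go:
REINDEX TABLE my_table;
```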