Re: [HACKERS] pg_upgrade code questions

2010-05-14 Thread Devrim GÜNDÜZ
On Thu, 2010-05-13 at 17:19 +0200, Magnus Hagander wrote:
 I say remove it. On all accounts.
 
 There's a fork of postgres for EDB AS, shouldn't there be a fork of
 pg_upgrade the same way, if it requires special code? The code in
 community postgresql certainly shouldn't have any EDB AS code in it.

Agreed.
-- 
Devrim GÜNDÜZ
PostgreSQL Danışmanı/Consultant, Red Hat Certified Engineer
PostgreSQL RPM Repository: http://yum.pgrpms.org
Community: devrim~PostgreSQL.org, devrim.gunduz~linux.org.tr
http://www.gunduz.org  Twitter: http://twitter.com/devrimgunduz




Re: [HACKERS] pg_upgrade code questions

2010-05-14 Thread Magnus Hagander
On Fri, May 14, 2010 at 5:34 AM, Bruce Momjian br...@momjian.us wrote:
 Takahiro Itagaki wrote:

 Bruce Momjian br...@momjian.us wrote:

 2. extern PGDLLIMPORT
pg_upgrade has its own definitions of
extern PGDLLIMPORT Oid binary_upgrade_next_xxx
  
The issue here is that you use PGDLLIMPORT where you are importing the
variable, not where it is defined.  For example, look at
'seq_page_cost'.  You can see PGDLLIMPORT used where it is imported with
'extern', but not where it is defined.
  
   Right.  Also, we are intentionally not exposing those variables in any
   backend .h file, because they are not meant for general use.  So the
   extern PGDLLIMPORT isn't going to be in the main backend and has to
   be in pg_upgrade.  This was discussed a while ago when we put in those
   variables, I believe.
 
  Yes, this was discussed.

 I wonder whether some compilers or linkers might hide unexported global
 variables from postgres.lib as if they were declared with the 'static'
 specifier. I'm especially worried about Windows and MSVC. So, if Windows
 testers can confirm that it works, there is nothing to worry about.

 Yes, none of the variables pg_upgrade is referencing are 'static', and
 Magnus tested MSVC and checked MinGW compiles.

Just to be clear, I only verified that it *built*, didn't have time to
check if it actually *worked*.


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Japanese translation breaks Solaris build

2010-05-14 Thread Zdenek Kotala
http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=comet_moth&dt=2010-05-13%2021:06:01


msgfmt -o po/ja.mo po/ja.po
WARNING: the string after closing  is ignored at line number 11.
Error, No space after directive at line number 2008.
ERROR: Exiting...
gmake[2]: *** [po/ja.mo] Error 2


The problem is that it contains a mix of DOS and Unix line endings.
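Stray CRs like these can be fixed mechanically before running msgfmt; a small sketch (the helper name is mine, not part of the PostgreSQL build):

```python
# Normalize stray carriage returns (DOS line endings) to plain LF,
# as one might do to a .po file before feeding it to a strict msgfmt.
def strip_crs(data: bytes) -> bytes:
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")

mixed = b'msgid "a"\r\nmsgstr "b"\n'   # mixed DOS/Unix endings
print(strip_crs(mixed))                # b'msgid "a"\nmsgstr "b"\n'
```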


Zdenek




Re: [HACKERS] List traffic

2010-05-14 Thread Yeb Havinga

Marc G. Fournier wrote:

On Thu, 13 May 2010, Alvaro Herrera wrote:


Excerpts from Yeb Havinga's message of Thu May 13 15:06:53 -0400 2010:


My $0.02 - I like the whole 'don't sort, search' approach (or whatever
they call it): just let the inbox fill up, google is fast enough. What
would be really interesting is to have some extra tags/headers added to
the emails (document classification with e.g. a self-organizing/Kohonen
map), so my local filters could make labels based on those, instead of
on perhaps badly spelled keywords in subjects or message bodies.


I missed this when I read it the first time .. all list email does 
have an X-Mailing-List header added so that you can label based on 
list itself ... is that what you mean, or are you thinking of 
something else entirely?

Something else: if automatic classification of articles were in place,
there would be a need for fewer mailing lists, depending on the quality
of the classification.


IMHO the problem of handling the big volume of the lists is not solved
by splitting them into more lists, since that does not decrease the
number of posts that are interesting from the subscriber's perspective.
It would only mean that posters are more likely to make mistakes, a
possible increase in crossposts or 'my question was not answered there
so now I try here' on the sender's part, and on the subscriber's side a
bigger chance to miss interesting articles. That my current mailing list
setup works for me supports this claim; I did not subscribe to fewer
lists, but managed to decrease the time spent 'handling' mail to an
amount that became workable.


Though I do not believe algorithmic article classification/ranking will
provide a 100% foolproof filter, it might help decrease the time spent
per article further. Take a look at how carrot2 clusters results for
the query 'postgresql prepared transactions site:postgresql.org' -
http://search.carrot2.org/stable/search?source=web&view=tree&skin=fancy-compact&query=postgresql+prepared+transactions+site%3Apostgresql.org&results=100&algorithm=lingo&EToolsDocumentSource.country=ALL&EToolsDocumentSource.language=ENGLISH&EToolsDocumentSource.safeSearch=false


I wonder if a clustering algorithm could tag articles with (multiple)
keywords, e.g. 'hackers', 'prepared transaction', 'dba', etc. I could
then build filters or rankings on those, e.g. hackers AND optimizer: +10.
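As a toy illustration of that kind of tag-based ranking (rules and weights are invented, not any real mailing-list software):

```python
# A tag fires when all of its keywords occur in the text; the total
# score could then drive a client-side filter or ranking.
def tag_score(text, rules):
    """rules: tag -> (keywords, weight); returns (score, tags that fired)."""
    text = text.lower()
    score, tags = 0, []
    for tag, (keywords, weight) in rules.items():
        if all(k in text for k in keywords):
            score += weight
            tags.append(tag)
    return score, sorted(tags)

rules = {
    "hackers+optimizer": (("hackers", "optimizer"), 10),
    "prepared-transactions": (("prepared", "transaction"), 5),
}
print(tag_score("pgsql-hackers: optimizer and prepared transactions", rules))
# -> (15, ['hackers+optimizer', 'prepared-transactions'])
```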


regards,
Yeb Havinga




Re: [HACKERS] Row-level Locks & SERIALIZABLE transactions, postgres vs. Oracle

2010-05-14 Thread Nicolas Barbier
2010/5/14 Greg Stark gsst...@mit.edu:

 On Thu, May 13, 2010 at 10:25 PM, Florian Pflug f...@phlo.org wrote:

 C1: BEGIN
 C1: SELECT * FROM t WHERE id = 1 FOR UPDATE
 C2: BEGIN
 C2: SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
 C2: SELECT * FROM t -- Take snapshot before C1 commits
 C1: COMMIT
 C2: DELETE FROM t WHERE id = 1
 C2: COMMIT

 Can you give an actual realistic example -- ie, not doing a select for
 update and then never updating the row or with an explanation of what
 the programmer is attempting to accomplish with such an unusual
 sequence? The rest of the post talks about FKs but I don't see any
 here...

The link with FKs is as follows:

* The example does not use a real FK, because the whole purpose is to
do the same as FKs while not using the FK machinery.
* The example uses only one table, because that is enough to
illustrate the problem (see next items).
* C1 locks a row, supposedly because it wants to create a reference to
it in a non-mentioned table, and wants to prevent the row from being
deleted under it.
* C2 deletes that row (supposedly after it verified that there are no
references to it; it would indeed not be able to see the reference
that C1 created/would create), and C1 fails to detect that.
* C2 also fails to detect the problem, because the lock that C1 held
is being released after C1 commits, and C2 can happily go on deleting
the row.
* The end result is that the hypothetical reference is created,
although the referent is gone.
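The sequence above can be mimicked with a toy model of snapshot isolation (illustrative only: it ignores locking and aborts, which is precisely why the anomaly appears):

```python
# Globally committed rows; each transaction sees a frozen snapshot of them.
committed = {"parent": {1}, "child": set()}

class Txn:
    def __init__(self):
        self.snapshot = {t: set(rows) for t, rows in committed.items()}
        self.writes = []
    def commit(self):
        for table, op, row in self.writes:
            (committed[table].add if op == "ins" else committed[table].discard)(row)

c1 = Txn()
c1.writes.append(("child", "ins", 1))   # C1 creates a reference to parent 1
c2 = Txn()                              # C2 snapshots *before* C1 commits
c1.commit()                             # C1's row lock is released here
assert 1 not in c2.snapshot["child"]    # C2's check sees no reference ...
c2.writes.append(("parent", "del", 1))  # ... so it deletes the parent
c2.commit()
print(1 in committed["child"], 1 in committed["parent"])  # True False
```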

Nicolas



Re: [HACKERS] Row-level Locks & SERIALIZABLE transactions, postgres vs. Oracle

2010-05-14 Thread Florian Pflug

On May 14, 2010, at 2:37 , Greg Stark wrote:

 On Thu, May 13, 2010 at 10:25 PM, Florian Pflug f...@phlo.org wrote:
 C1: BEGIN
 C1: SELECT * FROM t WHERE id = 1 FOR UPDATE
 C2: BEGIN
 C2: SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
 C2: SELECT * FROM t -- Take snapshot before C1 commits
 C1: COMMIT
 C2: DELETE FROM t WHERE id = 1
 C2: COMMIT
 
 
 Can you give an actual realistic example -- ie, not doing a select for
 update and then never updating the row or with an explanation of what
 the programmer is attempting to accomplish with such an unusual
 sequence? The rest of the post talks about FKs but I don't see any
 here...

The table t is supposed to represent the parent table of an FK constraint. The
SELECT FOR UPDATE is done upon an INSERT into the child table to protect the
parent row against concurrent deletion. I've used FOR UPDATE instead of FOR
SHARE because I also tested this against Oracle, and Oracle does not support
FOR SHARE.

Here's a full example of a pair of FK triggers in PL/pgSQL that work correctly
in READ COMMITTED mode but fail to enforce the constraint in SERIALIZABLE mode,
as the following sequence of commands shows. With my proposal, the DELETE would
again raise a serialization error and hence keep the constraint satisfied.

C1: BEGIN
C1: INSERT INTO child (parent_id) VALUES (1) -- Locks the parent row FOR UPDATE
C2: BEGIN
C2: SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
C2: SELECT TRUE -- Take snapshot *before* C1 commits
C1: COMMIT
C2: DELETE FROM parent WHERE parent_id = 1 -- Succeeds
C2: COMMIT

--
CREATE TABLE parent (parent_id SERIAL NOT NULL PRIMARY KEY);
CREATE TABLE child (child_id SERIAL NOT NULL PRIMARY KEY, parent_id INTEGER NOT NULL);

CREATE FUNCTION ri_parent() RETURNS TRIGGER AS $body$
BEGIN
  PERFORM TRUE FROM child WHERE parent_id = OLD.parent_id;
  IF FOUND THEN
    RAISE SQLSTATE '23503' USING MESSAGE = 'Parent ' || OLD.parent_id || ' still referenced during ' || TG_OP;
  END IF;
  RETURN NULL;
END;
$body$ LANGUAGE PLPGSQL VOLATILE;
CREATE TRIGGER ri_parent AFTER UPDATE OR DELETE ON parent FOR EACH ROW EXECUTE PROCEDURE ri_parent();

CREATE FUNCTION ri_child() RETURNS TRIGGER AS $body$
BEGIN
  PERFORM TRUE FROM parent WHERE parent_id = NEW.parent_id FOR UPDATE OF parent;
  IF NOT FOUND THEN
    RAISE SQLSTATE '23503' USING MESSAGE = 'Parent ' || NEW.parent_id || ' does not exist during ' || TG_OP;
  END IF;
  RETURN NULL;
END;
$body$ LANGUAGE PLPGSQL VOLATILE;
CREATE TRIGGER ri_child AFTER INSERT OR UPDATE ON child FOR EACH ROW EXECUTE PROCEDURE ri_child();
--

best regards,

Florian Pflug




Re: [HACKERS] nvarchar notation accepted?

2010-05-14 Thread Florian Pflug
On May 14, 2010, at 5:56 , Jaime Casanova wrote:
 On Thu, May 13, 2010 at 10:13 PM, Takahiro Itagaki
 itagaki.takah...@oss.ntt.co.jp wrote:
 
 Jaime Casanova ja...@2ndquadrant.com wrote:
 
 i migrate a ms sql server database to postgres and was trying some
 queries from the application to find if everything works right...
 when i was looking to those queries i found some that has a notation
 for nvarchar (ej: campo = N'sometext')
 
 Do you have documentation for N'...' literal in SQLServer?
 Does it mean a unicode literal? What is the difference from the U& literal?
 http://developer.postgresql.org/pgdocs/postgres/sql-syntax-lexical.html
 
 
 nope, the only thing i found is about NVARCHAR:
 http://msdn.microsoft.com/en-us/library/ms186939.aspx but it has no
 examples of the N'' notation, although you can find examples of its
 use here: http://msdn.microsoft.com/en-us/library/dd776381.aspx#BasicSyntax

Without using the N-prefixed versions of CHAR, VARCHAR, and string literals, MS
SQL Server refuses to process characters other than those in the database's
character set; it will replace all such characters with '?'.

Note that this is not an encoding issue - it will even do so with protocol
versions (everything >= 7.0, I think) that use UTF-16 on the wire, where those
characters could be transmitted just fine.
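A minimal illustration of the behavior described above, in T-SQL (variable names and the sample character are mine; the '?' substitution depends on the database's character set):

```sql
-- Without the N prefix the literal goes through the database's 8-bit
-- character set; with it, the value stays Unicode (UTF-16) end to end.
DECLARE @plain VARCHAR(10)  = 'Ω';   -- outside the DB charset: may become '?'
DECLARE @wide  NVARCHAR(10) = N'Ω';  -- N-prefixed: kept intact
SELECT @plain AS plain, @wide AS wide;
```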

best regards,
Florian Pflug




Re: [HACKERS] Japanese translation breaks Solaris build

2010-05-14 Thread Takahiro Itagaki

Zdenek Kotala zdenek.kot...@sun.com wrote:

 http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=comet_moth&dt=2010-05-13%2021:06:01
 The problem is that it contains a mix of DOS and Unix line endings.

I removed two CRs in ja.po.

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center





Re: [HACKERS] Row-level Locks & SERIALIZABLE transactions, postgres vs. Oracle

2010-05-14 Thread Kevin Grittner
[slight rearrangement]
 
Florian Pflug  wrote:
 
 I'm very exited about the work you're doing
 
Always nice to hear.  :-)
 
 I view my proposal as pretty orthogonal to that work.
 
 My proposal allows for simple FK-like constraints to be
 implemented at user-level that are correct for all isolation
 levels.
 
OK, I can see the attraction in that.
 
 True serializable transaction are much more powerful than what I
 proposed, but at a much higher price too, due to the necessity of
 SIREAD locks.
 
I think that SIREAD locks will generally be cheaper than SELECT FOR
UPDATE, since the former don't require any disk I/O and the latter
do.  I only have one benchmark so far (more on the way), but it
attempts to isolate the cost of acquiring the SIREAD locks by using
a read-only load against a fully cached database.  Benchmarks so far
show the new version of the SERIALIZABLE level as supporting 1.8%
fewer TPS than REPEATABLE READ (the existing snapshot isolation
level) in that environment.  That will probably disappear into the
noise for any load involving disk I/O.
 
Now *rollbacks*, particularly those due to false positives, might
become a more serious issue in some pessimal loads, but I'm still
working on developing meaningful benchmarks for that.
 
I guess what I'm suggesting is that unless you have a very small
database with a very large number of connections in a high
contention workload, or you can't require SERIALIZABLE transaction
isolation level, SSI might actually perform better than what you're
proposing.  Of course, that's all conjecture until there are
benchmarks; but I'd be very interested in getting any and all
alternative solutions like this worked into a benchmark -- where I
can pull out the FOR UPDATE and FOR SHARE clauses, any redundant
updates or denormalizations added just for concurrency issues, and
all explicit locking -- and compare that under SERIALIZABLE to the
original performance.
 
-Kevin




Re: [HACKERS] Synchronous replication patch built on SR

2010-05-14 Thread Fujii Masao
2010/4/29 Boszormenyi Zoltan z...@cybertec.at:
 attached is a patch that does $SUBJECT, we are submitting it for 9.1.
 I have updated it to today's CVS after the wal_level GUC went in.

I'm planning to create the synchronous replication patch for 9.0, too.
My design is outlined in the wiki. Let's work together to do the design
of it.
http://wiki.postgresql.org/wiki/Streaming_Replication#Synchronization_capability

The log-shipping replication has some synchronization levels as follows.
Which are you going to work on?

The transaction commit on the master
#1 doesn't wait for replication (already supported in 9.0)
#2 waits for WAL to be received by the standby
#3 waits for WAL to be received and flushed by the standby
#4 waits for WAL to be received, flushed and replayed by the standby
..etc?

I'm planning to add #2 and #3 in 9.1. #4 is useful but is outside
the scope of my development for at least 9.1. In #4, a read-only query
can easily block recovery via a lock conflict and make the
transaction commit on the master get stuck. This problem is difficult
to address within 9.1, I think. But the design and implementation
of #2 and #3 need to be easily extensible to #4.

 How does it work?

 First, the walreceiver and the walsender are now able to communicate
 in a duplex way on the same connection, so while COPY OUT is
 in progress from the primary server, the standby server is able to
 issue PQputCopyData() to pass the transaction IDs that were seen
 with XLOG_XACT_COMMIT or XLOG_XACT_PREPARE
 signatures. I did this by adding a new protocol message type, with letter
 'x', that's only acknowledged by the walsender process. The regular
 backend was intentionally unchanged, so an SQL client gets a protocol
 error. A new libpq call, PQsetDuplexCopy(), sends this
 new message before sending START_REPLICATION. The primary
 makes a note of it in the walsender process' entry.

 I had to move the TransactionIdLatest(xid, nchildren, children) call
 that computes latestXid earlier in RecordTransactionCommit(), so
 it's in the critical section now, just before the
 XLogInsert(RM_XACT_ID, XLOG_XACT_COMMIT, rdata)
 call. Otherwise, there was a race condition between the primary
 and the standby server, where the standby server might have seen
 the XLOG_XACT_COMMIT record for some XIDs before the
 transaction in the primary server marked itself waiting for this XID,
 resulting in stuck transactions.

You seem to have chosen #4 as synchronization level. Right?

In your design, the transaction commit on the master waits for its XID
to be read from the XLOG_XACT_COMMIT record and replied by the standby.
Right? This design seems not to be extensible to #2 and #3 since
walreceiver cannot read XID from the XLOG_XACT_COMMIT record. How about
using LSN instead of XID? That is, the transaction commit waits until
the standby has reached its LSN. An LSN is easier for the walreceiver
and the startup process to use, I think.

What if the synchronous standby starts up from the very old backup?
The transaction on the master needs to wait until a large amount of
outstanding WAL has been applied? I think that synchronous replication
should start with *asynchronous* replication, and should switch to the
sync level after the gap between the servers has become small enough.
What's your opinion?

 I have added 3 new options, two GUCs in postgresql.conf and one
 setting in recovery.conf. These options are:

 1. min_sync_replication_clients = N

 where N is the number of reports for a given transaction before it's
 released as committed synchronously. 0 means completely asynchronous;
 the value is capped at max_wal_senders. Anything
 in between 0 and max_wal_senders means different levels of partially
 synchronous replication.

 2. strict_sync_replication = boolean

 where the expected number of synchronous reports from standby
 servers is further limited to the actual number of connected synchronous
 standby servers if the value of this GUC is false. This means that if
 no standby servers are connected yet then the replication is asynchronous
 and transactions are allowed to finish without waiting for synchronous
 reports. If the value of this GUC is true, then transactions wait until
 enough synchronous standbys connect and report back.
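For concreteness, a hypothetical postgresql.conf fragment using the two proposed GUCs might look like this (names taken from the patch description above; the values are mine):

```
# wait for one standby to report back before a commit is released
min_sync_replication_clients = 1
# don't block commits while no synchronous standby is connected yet
strict_sync_replication = off
```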

Why are these options necessary?

Can these options cover more than three synchronization levels?

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: [HACKERS] How to know killed by pg_terminate_backend

2010-05-14 Thread Robert Haas
On Thu, May 13, 2010 at 8:20 PM, Tatsuo Ishii is...@postgresql.org wrote:
  Maybe we could make PostgreSQL a little bit smarter so that it returns
  a different code than 57P01 when killed by pg_terminate_backend().

 Seems reasonable. Does the victim backend currently know why it has been
 killed?

 I don't think so.

 One idea is that the postmaster sets a flag in the shared memory area
 indicating that it received SIGTERM, before forwarding the signal to
 the backends.

 Backends check the flag, and if it's not set, they know that the signal
 was sent by pg_terminate_backend(), not the postmaster.

Or it could also be sent by some other user process, like the user
running kill from the shell.
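The shared-flag idea can be sketched as follows (a toy model, not PostgreSQL code; note that, per the caveat above, the unset-flag case cannot distinguish pg_terminate_backend() from an external kill):

```python
# Stand-in for a flag in shared memory; the postmaster would set it
# before SIGTERMing its backends.
shmem = {"postmaster_sent_sigterm": False}

def classify_sigterm():
    if shmem["postmaster_sent_sigterm"]:
        return "shutdown requested by postmaster"
    # pg_terminate_backend() and a shell 'kill' still look identical here
    return "terminated by pg_terminate_backend() or external kill"

print(classify_sigterm())                # flag not set yet
shmem["postmaster_sent_sigterm"] = True  # postmaster sets the flag first
print(classify_sigterm())
```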

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] Row-level Locks & SERIALIZABLE transactions, postgres vs. Oracle

2010-05-14 Thread Florian Pflug
On May 14, 2010, at 12:56 , Kevin Grittner wrote:
 True serializable transaction are much more powerful than what I
 proposed, but at a much higher price too, due to the necessity of
 SIREAD locks.
 
 I think that SIREAD locks will generally be cheaper than SELECT FOR
 UPDATE, since the former don't require any disk I/O and the latter
 do.  I only have one benchmark so far (more on the way), but it
 attempts to isolate the cost of acquiring the SIREAD locks by using
 a read-only load against a fully cached database.  Benchmarks so far
 show the new version of the SERIALIZABLE level as supporting 1.8%
 fewer TPS than REPEATABLE READ (the existing snapshot isolation
 level) in that environment.  That will probably disappear into the
 noise for any load involving disk I/O.

I can see how a single SIREAD lock can potentially be cheaper than a FOR SHARE
or FOR UPDATE lock. But the number of SIREAD locks would exceed the number of
FOR SHARE / FOR UPDATE locks by a few orders of magnitude, I'd think - at least
if you ran every transaction under true serializable isolation.

I don't quite understand how SIREAD locks work if they don't involve any disk 
IO, since shared memory isn't resizable. But I guess I'll find out once you 
post the patch ;-)

 I guess what I'm suggesting is that unless you have a very small
 database with a very large number of connections in a high
 contention workload, or you can't require SERIALIZABLE transaction
 isolation level, SSI might actually perform better than what you're
 proposing.

That is entirely possible. However, unless your patch completely removes
support for snapshot isolation (what is currently called SERIALIZABLE), my
proposal still eliminates the situation where user-level constraints are
correct in READ COMMITTED and (true) SERIALIZABLE isolation but not in
snapshot isolation.

Btw, the only users of FOR SHARE locks inside postgres proper are the RI
triggers, and those do that special crosscheck when called within a
SERIALIZABLE transaction. I take this as evidence that the current behavior
might not be all that useful with serializable transactions...

best regards,
Florian Pflug





Re: [HACKERS] Japanese translation breaks Solaris build

2010-05-14 Thread Zdenek Kotala
On Fri, 14 May 2010 at 19:38 +0900, Takahiro Itagaki wrote:
 Zdenek Kotala zdenek.kot...@sun.com wrote:
 
  http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=comet_moth&dt=2010-05-13%2021:06:01
  The problem is that it contains a mix of DOS and Unix line endings.
 
 I removed two CRs in ja.po.

Thanks. Gothic moth is green again:

http://www.pgbuildfarm.org/cgi-bin/show_history.pl?nm=gothic_moth&br=HEAD

The rest of the Solaris machines will recover during the night.

Zdenek




Re: [HACKERS] Synchronous replication patch built on SR

2010-05-14 Thread Boszormenyi Zoltan
Fujii Masao wrote:
 2010/4/29 Boszormenyi Zoltan z...@cybertec.at:
   
 attached is a patch that does $SUBJECT, we are submitting it for 9.1.
 I have updated it to today's CVS after the wal_level GUC went in.
 

 I'm planning to create the synchronous replication patch for 9.0, too.
 My design is outlined in the wiki. Let's work together to do the design
 of it.
 http://wiki.postgresql.org/wiki/Streaming_Replication#Synchronization_capability

 The log-shipping replication has some synchronization levels as follows.
 Which are you going to work on?

 The transaction commit on the master
 #1 doesn't wait for replication (already supported in 9.0)
 #2 waits for WAL to be received by the standby
 #3 waits for WAL to be received and flushed by the standby
 #4 waits for WAL to be received, flushed and replayed by the standby
 ..etc?

 I'm planning to add #2 and #3 into 9.1. #4 is useful but is outside
 the scope of my development for at least 9.1. In #4, read-only query
 can easily block recovery by the lock conflict and make the
 transaction commit on the master get stuck. This problem is difficult
 to address within 9.1, I think. But the design and implementation
 of #2 and #3 need to be easily extensible to #4.

   
 How does it work?

 First, the walreceiver and the walsender are now able to communicate
 in a duplex way on the same connection, so while COPY OUT is
 in progress from the primary server, the standby server is able to
 issue PQputCopyData() to pass the transaction IDs that were seen
 with XLOG_XACT_COMMIT or XLOG_XACT_PREPARE
 signatures. I did this by adding a new protocol message type, with letter
 'x', that's only acknowledged by the walsender process. The regular
 backend was intentionally unchanged, so an SQL client gets a protocol
 error. A new libpq call, PQsetDuplexCopy(), sends this
 new message before sending START_REPLICATION. The primary
 makes a note of it in the walsender process' entry.

 I had to move the TransactionIdLatest(xid, nchildren, children) call
 that computes latestXid earlier in RecordTransactionCommit(), so
 it's in the critical section now, just before the
 XLogInsert(RM_XACT_ID, XLOG_XACT_COMMIT, rdata)
 call. Otherwise, there was a race condition between the primary
 and the standby server, where the standby server might have seen
 the XLOG_XACT_COMMIT record for some XIDs before the
 transaction in the primary server marked itself waiting for this XID,
 resulting in stuck transactions.
 

 You seem to have chosen #4 as synchronization level. Right?
   

Yes.

 In your design, the transaction commit on the master waits for its XID
 to be read from the XLOG_XACT_COMMIT record and replied by the standby.
 Right? This design seems not to be extensible to #2 and #3 since
 walreceiver cannot read XID from the XLOG_XACT_COMMIT record.

Yes, this was my problem, too. I would have had to
implement a custom interpreter in the walreceiver to
process the WAL records and extract the XIDs.

But at least the supporting details, i.e. not opening another
connection but instead being able to do duplex COPY operations
in a server-acknowledged way, are acceptable, no? :-)

  How about
 using LSN instead of XID? That is, the transaction commit waits until
 the standby has reached its LSN. An LSN is easier for the walreceiver
 and the startup process to use, I think.
   

Indeed, using the LSN seems more appropriate for
the walreceiver, but how would you extract the information
that a certain LSN means a COMMITted transaction? Or
could we release a locked transaction once the master receives
an LSN greater than or equal to the transaction's own LSN?

Sending back all the LSNs in the case of long transactions would
increase the network traffic compared to sending back only the
XIDs, but by how much is not clear to me. What I am more
worried about is contention on the ProcArrayLock.
XIDs are rarer than LSNs, no?

 What if the synchronous standby starts up from the very old backup?
 The transaction on the master needs to wait until a large amount of
 outstanding WAL has been applied? I think that synchronous replication
 should start with *asynchronous* replication, and should switch to the
 sync level after the gap between the servers has become small enough.
 What's your opinion?
   

It's certainly one option, which I think is partly addressed
by the strict_sync_replication knob below.
If strict_sync_replication = off, then the master doesn't make
its transactions wait for the synchronous reports, and the client(s)
can work through their WAL. IIRC, the walreceiver connects
to the master only very late in the recovery process, no?

It would be nicer if it could be made automatic. I simply thought
that there may be situations where the strict behaviour is
desired. I was thinking about the transactions executed on the
master between the standby's startup and the walreceiver's
connection. Someone may want to ensure the synchronous behaviour
for every xact, no

Re: [HACKERS] Row-level Locks & SERIALIZABLE transactions, postgres vs. Oracle

2010-05-14 Thread Kevin Grittner
Florian Pflug f...@phlo.org wrote:
 On May 14, 2010, at 12:56 , Kevin Grittner wrote:
 
 I think that SIREAD locks will generally be cheaper than SELECT
 FOR UPDATE, since the former don't require any disk I/O and the
 latter do.
 
 I can see how a single SIREAD lock can potentially be cheaper than
 a FOR SHARE or FOR UPDATE lock. But the number of SIREAD locks
 would exceed the number of FOR SHARE / FOR UPDATE locks by a few
 orders of magnitude, I'd think - at least if you ran every
 transaction under true serializable isolation.
 
 I don't quite understand how SIREAD locks work if they don't
 involve any disk IO, since shared memory isn't resizable.
 
We use a well-worn technique employed by many (most?) database products
-- granularity promotion.  This is one of the things which could
cause enough false positives under some loads to cause your
technique to perform better than SSI for those loads.
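Granularity promotion can be sketched like this (a toy model with an invented threshold, not the actual SSI patch):

```python
# When per-row SIREAD locks on one page pass a threshold, collapse them
# into a single page-level lock so the lock table stays bounded in
# shared memory - at the cost of possible false positives.
THRESHOLD = 3

def acquire_siread(locks, page, row):
    if ("page", page) in locks:              # already promoted: covered
        return
    locks.add(("row", page, row))
    rows = [l for l in locks if l[0] == "row" and l[1] == page]
    if len(rows) > THRESHOLD:                # promote rows -> one page lock
        locks.difference_update(rows)
        locks.add(("page", page))

locks = set()
for r in range(5):
    acquire_siread(locks, page=1, row=r)
print(locks)  # {('page', 1)}
```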
 
 unless your patch completely removes support for snapshot
 isolation (what is currently called SERIALIZABLE)
 
Both SERIALIZABLE and REPEATABLE READ currently map to snapshot
isolation.  We're leaving REPEATABLE READ alone.
 
 my proposal still eliminates the situation where user-level
 constraints are correct in READ COMMITTED and (true) SERIALIZABLE
 isolation but not in snapshot isolation.
 
Agreed.  If someone wants to enforce user-level constraints using
SSI, they will somehow need to ensure that less strict isolation
levels are never used to modify data.  Your approach lifts that
burden.
 
By the way, if you can make this behave in a similar way to Oracle,
especially if the syntax is compatible, I'm sure it will help
promote PostgreSQL adoption.  At PostgreSQL Conference U.S. East
2010, I talked briefly with a couple guys from an Oracle shop who
were looking at converting to PostgreSQL, and were very concerned
about not having what you describe.  The techniques required to
ensure integrity in PostgreSQL were not, to put it mildly, appealing
to them.  I suspect that they would be satisfied with *either* SSI
or the change you describe.
 
-Kevin



Re: [HACKERS] List traffic

2010-05-14 Thread Greg Sabino Mullane



 ... is there a reason why, other than the fact that we don't do so now, that
 we can't just put in a restriction against cross-posting altogether?

Because that would be shooting ourselves in the foot. Cross-posting 
is often desirable. If we had a clearer distinction of list topics, I 
might support such a move, but we don't, so I can't.

 ... and, for those that have been here awhile, who should know better, 
 why isn't there any self-management of this sort of stuff in the first 
 place?

What would you have us do?

-- 
Greg Sabino Mullane g...@turnstep.com
End Point Corporation http://www.endpoint.com/
PGP Key: 0x14964AC8 201005141005
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8





Re: [HACKERS] List traffic

2010-05-14 Thread Greg Sabino Mullane



 There is no reason why advocacy can't happen on general. Theoretically
 www could be on hackers (although I do see the point of a separate
 list).

I don't feel as strongly about -advocacy being removed, but we certainly 
can fold in -sql and -admin. Would anyone argue against rolling those 
two (sql and admin) into -general as a first step?

- -- 
Greg Sabino Mullane g...@turnstep.com
End Point Corporation http://www.endpoint.com/
PGP Key: 0x14964AC8 201005141009
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8





Re: [HACKERS] Row-level Locks SERIALIZABLE transactions, postgres vs. Oracle

2010-05-14 Thread Florian Pflug
On May 14, 2010, at 15:54 , Kevin Grittner wrote:
 Florian Pflug f...@phlo.org wrote:
 On May 14, 2010, at 12:56 , Kevin Grittner wrote:
 unless your patch completely removes support for snapshot
 isolation (what is currently called SERIALIZABLE)
 
 Both SERIALIZABLE and REPEATABLE READ currently map to snapshot
 isolation.  We're leaving REPEATABLE READ alone.

Ah, yeah, that makes a lot of sense. I kinda had forgotten about REPEATABLE 
READ...

 my proposal still eliminates the situation that user-level
 constraints are correct in READ COMMITTED and (true) SERIALIZABLE
 isolation but not in snapshot isolation.
 
 Agreed.  If someone wants to enforce user-level constraints using
 SSI, they will somehow need to ensure that less strict isolation
 levels are never used to modify data.  Your approach lifts that
 burden.
 
 By the way, if you can make this behave in a similar way to Oracle,
 especially if the syntax is compatible, I'm sure it will help
 promote PostgreSQL adoption.  At PostgreSQL Conference U.S. East
 2010, I talked briefly with a couple guys from an Oracle shop who
 were looking at converting to PostgreSQL, and were very concerned
 about not having what you describe.  The techniques required to
 ensure integrity in PostgreSQL were not, to put it mildly, appealing
 to them.  I suspect that they would be satisfied with *either* SSI
 or the change you describe.

My proposal would make SELECT ... FOR UPDATE behave like Oracle does with 
regard to serialization conflicts. SELECT ... FOR SHARE doesn't seem to exist 
on Oracle at all - at least I couldn't find a reference to it in the docs.

The syntax isn't 100% compatible because Oracle seems to expect a list of 
columns after the FOR UPDATE clause, while postgres expects a list of tables.

I must admit that I wasn't able to find an explicit reference to Oracle's 
behavior in their docs, so I had to resort to experiments. They do have 
examples showing how to do FK-like constraints with triggers, and those don't 
contain any warning whatsoever about problems in SERIALIZABLE mode, though. But 
still, if there is word on this from Oracle somewhere, I'd love to hear about 
it.

best regards,
Florian Pflug




Re: [HACKERS] List traffic

2010-05-14 Thread Kevin Grittner
Greg Sabino Mullane g...@turnstep.com wrote:
 
 Would anyone argue against rolling those two (sql and admin) into
 -general as a first step?
 
At the risk of repeating myself, I won't be able to keep up with the
traffic of the combined list; so rather than read 100% of the
messages from a smaller set, I'll need to pick and choose based on
subject line or some such.  I get the impression that other people,
who read different subsets of the lists, will be forced to a similar
change.  That may result in either some posts slipping through the
cracks or in increasing the burden of responding to the posts for
those brave few who wade through them all.
 
Personally, I'm not convinced that merging the current lists will solve
more problems than it will create.
 
-Kevin



Re: [HACKERS] Row-level Locks SERIALIZABLE transactions, postgres vs. Oracle

2010-05-14 Thread Kevin Grittner
Florian Pflug f...@phlo.org wrote:
 
 I must admit that I wasn't able to find an explicit reference to
 Oracle's behavior in their docs, so I had to resort to
 experiments. They do have examples showing how to do FK-like
 constraints with triggers, and those don't contain any warning
 whatsoever about problems in SERIALIZABLE mode, though.  But
 still, if there is word on this from Oracle somewhere, I'd love to
 hear about it.
 
I suspect that in trying to emulate Oracle on this, you may run into
an issue which posed challenges for the SSI implementation which
didn't come up in the Cahill prototype implementations: Oracle, and
all other MVCC databases I've read about outside of PostgreSQL, use
an update in place with a rollback log technique.  Access to any
version of a given row or index entry goes through a single
location, with possible backtracking through the log after that,
which simplifies management of certain concurrency issues.  Do they
perhaps use an in-RAM lock table, pointing to the base location of
the row for these SELECT FOR UPDATE locks?  (Just guessing; I've
never used Oracle, myself.)
 
-Kevin



Re: [HACKERS] List traffic

2010-05-14 Thread Marc G. Fournier

On Fri, 14 May 2010, Greg Sabino Mullane wrote:






... is there a reason why, other than the fact that we don't do so now, that we
can't just put in a restriction against cross posting altogether?


Because that would be shooting ourselves in the foot. Cross-posting
is often desirable. If we had a clearer distinction of list topics, I
might support such a move, but we don't, so I can't.


But, it's the cross-posting, IMHO, that reduces the distinction ...


... and, for those that have been here awhile, who should know better,
why isn't there any self-management of this sort of stuff in the first
place?


What would you have us do?


Redirect users ... if a user sends a query performance related question to 
-general, respond back with -general as the CC, To as -performance and a 
Reply-To header of -performance ... that way those on -general know that 
it's been redirected, but *hopefully* users replying will honor the 
-performance redirect ...



Marc G. FournierHub.Org Hosting Solutions S.A.
scra...@hub.org http://www.hub.org

Yahoo:yscrappySkype: hub.orgICQ:7615664MSN:scra...@hub.org



Re: [HACKERS] List traffic

2010-05-14 Thread Marc G. Fournier

On Fri, 14 May 2010, Greg Sabino Mullane wrote:






There is no reason why advocacy can't happen on general. Theoretically
www could be on hackers (although I do see the point of a separate
list).


I don't feel as strongly about -advocacy being removed, but we certainly
can fold in -sql and -admin. Would anyone argue against rolling those
two (sql and admin) into -general as a first step?


Question ... we have, right now:

-sql : how to write a query
-performance : how to improve performance of my queries
-admin : how to admin the server
-novice : I'm a new user
-odbc : how to use ...
-php : php related interface questions
-interfaces : more general than -odbc

why not close down -general so that ppl *have* to better pick where to 
post their question ...




Marc G. FournierHub.Org Hosting Solutions S.A.
scra...@hub.org http://www.hub.org

Yahoo:yscrappySkype: hub.orgICQ:7615664MSN:scra...@hub.org



Re: [HACKERS] List traffic

2010-05-14 Thread Marc G. Fournier

On Fri, 14 May 2010, Kevin Grittner wrote:


Greg Sabino Mullane g...@turnstep.com wrote:


Would anyone argue against rolling those two (sql and admin) into
-general as a first step?


At the risk of repeating myself, I won't be able to keep up with the
traffic of the combined list; so rather than read 100% of the
messages from a smaller set, I'll need to pick and choose based on
subject line or some such.  I get the impression that other people,
who read different subsets of the lists, will be forced to a similar
change.  That may result in either some posts slipping through the
cracks or in increasing the burden of responding to the posts for
those brave few who wade through them all.


That's what I find with the freebsd-questions list ... there is so much 
noise in there that I tend to avoid posting to it for fear that my email 
will just get skip'd over ...


I am definitely against *merging* lists ... getting rid of the 'meta list' 
makes more sense so as to force ppl to *use* the smaller lists than to 
merge smaller lists and *increase* the noise on one of them ...



Marc G. FournierHub.Org Hosting Solutions S.A.
scra...@hub.org http://www.hub.org

Yahoo:yscrappySkype: hub.orgICQ:7615664MSN:scra...@hub.org



Re: [HACKERS] List traffic

2010-05-14 Thread Kevin Grittner
Marc G. Fournier scra...@hub.org wrote:
 
 -sql : how to write a query
 -performance : how to improve performance of my queries
 -admin : how to admin the server
 -novice : I'm a new user
 -odbc : how to use ...
 -php : php related interface questions
 -interfaces : more general than -odbc
 
 why not close down -general so that ppl *have* to better pick
 where to post their question ...
 
That's a change I could support.  I even think the descriptions are
pretty close to what should show.
 
In trying to think what might be missing, I wonder whether we could
decrease inappropriate traffic on the -bugs list if we had a
feature request list, for end users not prepared to discuss things
at the level appropriate for -hackers, but who think that PostgreSQL
should support some feature they don't see.
 
-Kevin



Re: [HACKERS] List traffic

2010-05-14 Thread Tom Lane
Marc G. Fournier scra...@hub.org writes:
 why not close down -general so that ppl *have* to better pick where to 
 post their question ...

I can't imagine that there's not going to need to be a catchall list
for problems that don't fit into any of the subcategories.

More generally, we already have most of the lists that you suggest, and
we already know that people frequently don't find the most appropriate
list for postings.  I don't think getting rid of -general would help
that in the least.  The way to cut down on misposted traffic is to make
the set of categories smaller and simpler, not to redouble our efforts
to persuade people to use the same or even more categories.

BTW, as far as no crossposting goes: usually when I find myself doing
that, it's to redirect a thread that started on -bugs or -general into
-hackers.  I don't see the need for that going away.

regards, tom lane



Re: [HACKERS] HS/SR Assert server crash

2010-05-14 Thread Bruce Momjian
Bruce Momjian wrote:
 I was able to easily crash the standby server today just by starting it
 and connecting to it via psql.  The master was idle.  The failure was:
 
   LOG:  streaming replication successfully connected to primary
   TRAP: FailedAssertion(!(((xmax) >= ((TransactionId) 3))), File: procarray.c, Line: 1211)
   LOG:  server process (PID 12761) was terminated by signal 6: Abort trap
   LOG:  terminating any other active server processes
 
 My master postgresql.conf was:
 
   wal_level = hot_standby # minimal, archive, or 
 hot_standby
   archive_mode = on   # allows archiving to be done
   archive_command = 'cp -i %p /u/pg/archive/%f </dev/null'  # command to use to archive a logfile segment
   max_wal_senders = 1 # max number of walsender processes
 
 My slave postgresql.conf was:
 
   port = 5433 # (change requires restart)
   wal_level = hot_standby # minimal, archive, or 
 hot_standby
   archive_mode = off  # allows archiving to be done
   archive_command = 'cp -i %p /u/pg/archive/%f </dev/null'  # command to use to archive a logfile segment
   hot_standby = on# allows queries during recovery
   max_wal_senders = 1 # max number of walsender processes
 
 and my slave recovery.conf was:
 
   restore_command = 'cp /u/pg/archive/%f %p'  # e.g. 'cp 
 /mnt/server/archivedir/%f %p'
   standby_mode = 'on'
   primary_conninfo = 'host=localhost port=5432'   # e.g. 
 'host=localhost port=5432'
 
 Let me know what additional information I can supply.

I saw Simon's commit fixing this bug.  Another good reason we didn't
bundle 9.0 beta2 yesterday.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com



Re: [HACKERS] JSON manipulation functions

2010-05-14 Thread Bruce Momjian
Joseph Adams wrote:
 == array/object conversion ==
 
 The json_object function converts a tuple to a JSON object.  If there
 are duplicate column names, there will be duplicate keys in the
 resulting JSON object.
 
 json_object([content [AS name] [, ...]]) returns json
 
 Likewise, the json_array function converts a tuple to a JSON array.
 Column names are ignored.
 
 json_array([content [AS name] [, ...]]) returns json

Do you see any problems with the fact that JSON arrays can use mixed
data types, e.g.:

[ 1, 2, 'hi', false]

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com



Re: [HACKERS] List traffic

2010-05-14 Thread Marc G. Fournier

On Fri, 14 May 2010, Yeb Havinga wrote:


Marc G. Fournier wrote:

On Thu, 13 May 2010, Alvaro Herrera wrote:


Excerpts from Yeb Havinga's message of jue may 13 15:06:53 -0400 2010:


My $0.02 - I like the whole 'don't sort, search' (or how did they call
it?) just let the inbox fill up, google is fast enough. What would be
really interesting is to have some extra 'tags/headers' added to the
emails (document classification with e.g. self organizing map/kohonen),
so my local filters could make labels based on that, instead of perhaps
badly spelled keywords in subjects or message body.


I missed this when I read it the first time .. all list email does have an 
X-Mailing-List header added so that you can label based on list itself ... 
is that what you mean, or are you thinking of something else entirely?
Something else: if automatic classification of articles was in place, there 
would be need of fewer mailing lists, depending on the quality of the 
classification.


You've mentioned this several times, but what *is* autoclassification 
of articles?  Or is this something you do on the gmail side of things?




Marc G. FournierHub.Org Hosting Solutions S.A.
scra...@hub.org http://www.hub.org

Yahoo:yscrappySkype: hub.orgICQ:7615664MSN:scra...@hub.org



[HACKERS] [PATCH] Add SIGCHLD catch to psql

2010-05-14 Thread Stephen Frost
Greetings,

  Toying around with FETCH_COUNT today, I discovered that it didn't do
  the #1 thing I really wanted to use it for- query large tables without
  having to worry about LIMIT to see the first couple hundred records.
  The reason is simple- psql ignores $PAGER exiting, which means that it
  will happily continue pulling down the entire large table long after
  you've stopped caring, which means you still have to wait forever.

  The attached, admittedly quick hack, fixes this by having psql catch
  SIGCHLD's using handle_sigint.  I've tested this and it doesn't
  appear to obviously break other cases where we have children (\!, for
  example), since we're not going to be running a database query when
  we're doing those, and if we are, and the child dies, we probably want
  to *stop* anyway, similar to the $PAGER issue.

  Another approach that I considered was fixing various things to deal
  cleanly with write's failing to $PAGER (I presume the writes *were*
  failing, since less was in a defunct state, but I didn't actually
  test).  This solution was simpler, faster to code and check, and alot
  less invasive (or so it seemed to me at the time).

  Anyway, this makes FETCH_COUNT alot more useful, and, in my view, the
  current behaviour of completely ignoring $PAGER exiting is a bug.
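The failure mode is easy to reproduce outside psql. Below is a minimal POSIX Python sketch (names invented, only an analogy for the C patch) showing the idea: a SIGCHLD handler notices when a stand-in for $PAGER exits, so the producer can stop streaming rows instead of writing to a dead process:

```python
import signal
import subprocess
import sys

# Flag set by the handler when a child (the "pager") terminates
pager_exited = False

def on_sigchld(signum, frame):
    # Analogous to psql's handle_sigint: note the death so we stop fetching
    global pager_exited
    pager_exited = True

signal.signal(signal.SIGCHLD, on_sigchld)

# Stand-in for $PAGER: a child that reads one line and quits (like 'q' in less)
pager = subprocess.Popen([sys.executable, "-c", "input()"],
                         stdin=subprocess.PIPE)
pager.stdin.write(b"first page of a huge result set\n")
pager.stdin.flush()
pager.wait()  # child exits here, delivering SIGCHLD to us

# Without the handler we would keep pulling and writing rows pointlessly
assert pager_exited
```

This only sketches the detection half; the actual patch reuses psql's existing cancel path so the in-progress query is also cancelled.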

Thanks,

Stephen
diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c
index f605c97..dcab436 100644
*** a/src/bin/psql/common.c
--- b/src/bin/psql/common.c
*** NoticeProcessor(void *arg, const char *m
*** 188,194 
  /*
   * Code to support query cancellation
   *
!  * Before we start a query, we enable the SIGINT signal catcher to send a
   * cancel request to the backend. Note that sending the cancel directly from
   * the signal handler is safe because PQcancel() is written to make it
   * so. We use write() to report to stderr because it's better to use simple
--- 188,194 
  /*
   * Code to support query cancellation
   *
!  * Before we start a query, we enable SIGINT and SIGCHLD signals to send a
   * cancel request to the backend. Note that sending the cancel directly from
   * the signal handler is safe because PQcancel() is written to make it
   * so. We use write() to report to stderr because it's better to use simple
*** NoticeProcessor(void *arg, const char *m
*** 208,213 
--- 208,218 
   * catcher to longjmp through sigint_interrupt_jmp.  We assume readline and
   * fgets are coded to handle possible interruption.  (XXX currently this does
   * not work on win32, so control-C is less useful there)
+  *
+  * SIGCHLD is also caught and handled the same to deal with cases where a user's
+  * PAGER or other child process exits.  Otherwise, we would just keep sending
+  * data to a dead/zombied process.  This won't typically matter except when
+  * FETCH_COUNT is used.
   */
  volatile bool sigint_interrupt_enabled = false;
  
*** void
*** 259,264 
--- 264,272 
  setup_cancel_handler(void)
  {
  	pqsignal(SIGINT, handle_sigint);
+ 
+ 	/* Also send SIGCHLD signals, to catch cases where the user exits PAGER */
+ 	pqsignal(SIGCHLD, handle_sigint);
  }
  #else			/* WIN32 */
  




Re: [HACKERS] JSON manipulation functions

2010-05-14 Thread Pavel Stehule
2010/5/14 Bruce Momjian br...@momjian.us:
 Joseph Adams wrote:
 == array/object conversion ==

 The json_object function converts a tuple to a JSON object.  If there
 are duplicate column names, there will be duplicate keys in the
 resulting JSON object.

 json_object([content [AS name] [, ...]]) returns json

 Likewise, the json_array function converts a tuple to a JSON array.
 Column names are ignored.

 json_array([content [AS name] [, ...]]) returns json

 Do you see any problems with the fact that JSON arrays can use mixed
 data types, e.g.:

        [ 1, 2, 'hi', false]

it should not be a problem

regards
Pavel
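Indeed, heterogeneous arrays are a core part of JSON and round-trip cleanly through any conforming parser; a quick check with Python's standard json module (used here only as a stand-in for the proposed type):

```python
import json

# A JSON array may freely mix numbers, strings and booleans
value = json.loads('[1, 2, "hi", false]')
assert value == [1, 2, "hi", False]

# Serializing and reparsing preserves the heterogeneous array exactly
assert json.loads(json.dumps(value)) == value
```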

 --
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com





Re: [HACKERS] List traffic

2010-05-14 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote:
 
 I can't imagine that there's not going to need to be a catchall
 list for problems that don't fit into any of the subcategories.
 
 More generally, we already have most of the lists that you
 suggest, and we already know that people frequently don't find the
 most appropriate list for postings.  I don't think getting rid of
 -general would help that in the least.  The way to cut down on
 misposted traffic is to make the set of categories smaller and
 simpler, not to redouble our efforts to persuade people to use the
 same or even more categories.
 
Well, redoubling our current efforts to direct people to more
specific lists would accomplish nothing, since doubling zero leaves
you with zero.  The description of -general includes:
 
| General discussion area for users. Apart from compile, acceptance
| test, and bug problems, most new users will probably only be
| interested in this mailing list
 
Given that, the fact that -admin, -novice, -sql, and -performance
collectively get as many posts as -general suggests that people are,
in fact, making some effort to find a list which seems a good fit. 
Perhaps if the description of -general was changed to suggest it
*was* a catch-all for posts which don't fit the other lists, things
would improve.
 
-Kevin



Re: [HACKERS] List traffic

2010-05-14 Thread Josh Berkus



There is no reason why advocacy can't happen on general. Theoretically
www could be on hackers (although I do see the point of a separate
list).


First off, this is absolutely the wrong list to be discussing management 
of PostgreSQL lists.  That belongs on pgsql-www.  And, I'll point out 
that this completely pointless discussion on the wrong list has been 1/6 
of the traffic on -hackers for the last two days.  Also, I really don't 
see what problem people think they're addressing with these bimonthly 
calls for list consolidation.  It seems like a solution in search of a 
problem.


So it's an exercise in ironic wankitude.   Can we please stop it now?

Second, regarding advocacy: no, absolutely not.  -advocacy is a working 
list and not a virtual water cooler.


--
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com



Re: [HACKERS] List traffic

2010-05-14 Thread Yeb Havinga

Marc G. Fournier wrote:

On Fri, 14 May 2010, Yeb Havinga wrote:


Marc G. Fournier wrote:

On Thu, 13 May 2010, Alvaro Herrera wrote:


Excerpts from Yeb Havinga's message of jue may 13 15:06:53 -0400 2010:

My $0.02 - I like the whole 'don't sort, search' (or how did they 
call

it?) just let the inbox fill up, google is fast enough. What would be
really interesting is to have some extra 'tags/headers' added to the
emails (document classification with e.g. self organizing 
map/kohonen),
so my local filters could make labels based on that, instead of 
perhaps

badly spelled keywords in subjects or message body.


I missed this when I read it the first time .. all list email does 
have an X-Mailing-List header added so that you can label based on 
list itself ... is that what you mean, or are you thinking of 
something else entirely?
Something else: if automatic classification of articles was in place, 
there would be need of fewer mailing lists, depending on the quality 
of the classification.


You've mentioned this several times, but what *is* autoclassification 
of articles?  Or is this something you do on the gmail side of things?
I meant classification in the sense of automated as opposed to manual 
classification by author or subscriber, in the general sense, not linked 
to any mail client or server. Example: junk mail detection by a mail client.


After sending my previous mail this morning I looked a bit more into 
(the faq of) carrot2, which links to LingPipe as a solution for people 
that like pre-defined classes. LingPipe in fact has a tutorial where 
they classify a dataset of newsgroups articles, see e.g. 
http://alias-i.com/lingpipe/demos/tutorial/classify/read-me.html. I 
suppose it would be interesting to see what could be done with the pg 
archives. If the archive database itself is publicly available, or 
could be made available, I'd be willing to put some time into this 
(solely on the basis that I'm interested in the outcome, not that I 
expect it to be used by the pg project, though that'd of course be 
cool if it turned out that way in the end)


regards,
Yeb Havinga




Re: [HACKERS] JSON manipulation functions

2010-05-14 Thread Robert Haas
On Thu, May 13, 2010 at 9:47 PM, Joseph Adams
joeyadams3.14...@gmail.com wrote:
 The following function returns the type of any JSON value.

 json_type as enum ('null', 'string', 'number', 'boolean', 'object', 'array')
 json_type(json) returns json_type

Seems reasonable.

 Would it be a bad idea to give an enum and a function the same name
 (which appears to be allowed by PostgreSQL) ?  If so, json_type(json)
 could be json_typeof(json) or something instead.

No, I think that's a fine idea.

 I thought about having predicates like IS STRING and IS NUMBER,
 similar to the IS DOCUMENT predicate used for XML.  However, a major
 problem with that approach is that it could lead to confusion
 involving IS NULL.  By my understanding, the JSON datatype will just
 be a specialization of TEXT (it just validates the input).  Like TEXT,
 a JSON string can be 'null'.  'null'::JSON is not NULL.  Bear in mind
 that json_to_*('null') is NULL, though.

Even aside from the possible semantic confusion, I don't think that we
should make any changes to our core grammar (gram.y) to support JSON.
It's not really necessary and it's better not to add extra stuff to
the grammar unless we really need it.
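The 'null' subtlety above can be illustrated with any JSON implementation; here is a Python analogy (Python's None standing in for SQL NULL, names otherwise generic):

```python
import json

# The four-character JSON text 'null' is itself a perfectly good,
# non-null value; only *parsing* it yields the language-level null.
# This mirrors the proposal: 'null'::json is not NULL, while
# json_to_*('null') would return NULL.
text = "null"
assert text is not None           # the unparsed JSON text is not null
assert json.loads(text) is None   # json_to_*('null') -> NULL analogue
```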

 I also thought about having a series of json_is_* functions.  I don't
 think it's a bad idea, but I think json_type is a better solution.

I agree.

 == text/number/boolean conversion ==

 These functions each convert a non-compound JSON value to its
 respective return type.  Run-time type checking is performed; a
 conversion will throw an error if the input JSON is not the correct
 type.  If the JSON value is 'null', then the return value will be
 NULL.

 json_to_text(json) returns text
 json_to_number(json) returns numeric
 json_to_bool(json) returns boolean

Can '3' be converted to a number, or only if it's written without the quotes?

 These functions convert values to JSON.  Passing NULL to any of the
 functions below will return 'null':

 text_to_json(text) returns json
 number_to_json(numeric) returns json
 bool_to_json(boolean) returns json

 There could be generic value_to_json(any), but not a
 json_to_value(json) function.  See
 http://archives.postgresql.org/pgsql-hackers/2010-04/msg00321.php for
 more details.

Seems OK.

 Conversion to/from number or boolean can also be achieved with
 casting.  Note well that 'string'::JSON::TEXT is 'string', not the
 string's actual value.  json_to_text is needed for this conversion.
 For this reason, casting JSON might seem like something to recommend
 against.  However, IMHO, casting numbers and booleans to/from JSON is
 fine and dandy; the paragraphs below give some weight to this.

 I originally considered making json_to_number and number_to_json work
 with TEXT instead of NUMERIC.  However, as Tom Lane pointed out in the
 above link, Forcing people to insert explicit coercions from text
 isn't going to be particularly convenient to use..  Nevertheless,
 NUMERIC introduces a problem.  For instance, if you say:

 SELECT '-1e-38'::NUMERIC;

 This conversion knocks out the scientific notation and produces a
 41-character string.  I seriously doubt that all outside applications
 will handle 41-character numbers correctly.

Maybe not, but I don't think it's your problem to fix it if they
don't.  If people want to have fine-grained control over the JSON that
gets generated, they can always generate the value as text and cast it
to JSON.
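The expansion Joseph describes is easy to reproduce; a small sketch with Python's decimal module mimicking numeric's plain (non-scientific) output notation:

```python
from decimal import Decimal

# Fixed-point formatting expands the tiny exponent into a long string,
# just as PostgreSQL's numeric output does for '-1e-38'::NUMERIC
plain = format(Decimal("-1e-38"), "f")
print(plain)  # -0.00000000000000000000000000000000000001
assert len(plain) == 41
```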

 Perhaps there should be individual functions for specific data types,
 or maybe just a handful for particular cases.  There might be
 json_to_int, json_to_float, and json_to_numeric.  In any case,
 converting to/from number types can be achieved quite easily with
 casting.

Personally I'd go with just json_to_numeric for starters.  We can
always add the others if and when it's clear that they are useful.

 == array/object conversion ==

 The json_object function converts a tuple to a JSON object.  If there
 are duplicate column names, there will be duplicate keys in the
 resulting JSON object.

 json_object([content [AS name] [, ...]]) returns json

Seems good.

 Likewise, the json_array function converts a tuple to a JSON array.
 Column names are ignored.

 json_array([content [AS name] [, ...]]) returns json

I think this is pointless and should be omitted.

 The json_agg function reduces a set of JSON values to a single array
 containing those values.

 aggregate json_agg(json) returns json

Very useful, I like that.

 json_object and json_agg can be used together to convert an entire
 result set to one JSON array:

 SELECT json_agg(json_object(*)) FROM tablename;

Spiffy.
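For readers unfamiliar with the intended output shape, here is a Python approximation of what json_agg(json_object(*)) would produce (column names and rows invented for illustration):

```python
import json

# Hypothetical result set: column names plus row tuples
columns = ["id", "name"]
rows = [(1, "ant"), (2, "bee")]

# json_object(*) ~ one JSON object per row; json_agg ~ one array of them
aggregated = json.dumps([dict(zip(columns, row)) for row in rows])
print(aggregated)  # [{"id": 1, "name": "ant"}, {"id": 2, "name": "bee"}]
```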

 json_keys gets the keys of a JSON object as a set.

 json_keys(json) returns setof text

I would tend to make this return text[] rather than SETOF text.

 json_values gets the values of a JSON object or the items of a JSON
 array as a set.

 json_values(json) returns setof json

Similarly I would make this return json[].

 Note that all JSON slicing and splicing operations 

Re: [HACKERS] List traffic

2010-05-14 Thread Marc G. Fournier

On Fri, 14 May 2010, Kevin Grittner wrote:


Well, redoubling our current efforts to direct people to more
specific lists would accomplish nothing, since doubling zero leaves
you with zero.  The description of -general includes:


Agreed ...

Given that, the fact that -admin, -novice, -sql, and -performance 
collectively get as many posts as -general suggests that people are, in 
fact, making some effort to find a list which seems a good fit. Perhaps 
if the description of -general was changed to suggest it *was* a 
catch-all for posts which don't fit the other lists, things would 
improve.


Can you offer improved / stronger wording on this ... ?

Marc G. FournierHub.Org Hosting Solutions S.A.
scra...@hub.org http://www.hub.org

Yahoo:yscrappySkype: hub.orgICQ:7615664MSN:scra...@hub.org



Re: [HACKERS] JSON manipulation functions

2010-05-14 Thread Mike Rylander
On Fri, May 14, 2010 at 1:15 PM, Robert Haas robertmh...@gmail.com wrote:
 On Thu, May 13, 2010 at 9:47 PM, Joseph Adams
 joeyadams3.14...@gmail.com wrote:

[snip]

 == array/object conversion ==

 The json_object function converts a tuple to a JSON object.  If there
 are duplicate column names, there will be duplicate keys in the
 resulting JSON object.

 json_object([content [AS name] [, ...]]) returns json

 Seems good.

 Likewise, the json_array function converts a tuple to a JSON array.
 Column names are ignored.

 json_array([content [AS name] [, ...]]) returns json

 I think this is pointless and should be omitted.


(I'm going to avoid the use of the term object here to reduce confusion.)

I disagree with the assertion that it's pointless, and I have a
specific use-case in mind for this function.  I have a system that
uses JSON arrays on the wire to encapsulate data, and both ends
understand the positional semantics of the elements.  Using JSON
arrays instead of JSON objects reduces the transfer size by 40-80%,
depending on how full the rows (or class instances) are and the data
types of the elements, simply by removing the redundant object keys.
This function would be extremely useful to me when creating or
persisting raw class instances of these sorts.
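Mike's size argument is easy to reproduce. A Python sketch (with an invented schema, not his actual data) comparing the two encodings:

```python
import json

# Compare the wire size of rows encoded as JSON objects (repeated keys)
# versus JSON arrays (positional, keys agreed on out of band).
# Schema and values are invented for illustration.
columns = ["id", "title", "author", "pubdate"]
rows = [[n, "t", "a", "2010-05-14"] for n in range(100)]

as_objects = json.dumps([dict(zip(columns, r)) for r in rows])
as_arrays = json.dumps(rows)

savings = 1 - len(as_arrays) / len(as_objects)
print(f"arrays are {savings:.0%} smaller")  # keys dominate when values are short
```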

-- 
Mike Rylander
 | VP, Research and Design
 | Equinox Software, Inc. / The Evergreen Experts
 | phone:  1-877-OPEN-ILS (673-6457)
 | email:  mi...@esilibrary.com
 | web:  http://www.esilibrary.com



[HACKERS] recovery consistent != hot standby

2010-05-14 Thread Robert Haas
While looking through postmaster.c and xlog.c I discovered that we're
being a little bit loose about our use of terminology.  Maybe this was
right when committed (I think, at that point, Hot Standby was always
on) but it's not right any more.  It appears that we only enter the
PM_RECOVERY_CONSISTENT state when Hot Standby is enabled; otherwise, we
remain in PM_RECOVERY even after reaching consistency.  I think, then,
that the state and the signal which triggers it are misnamed.  For
the avoidance of confusion, I'd like to propose that we rename as
follows:

PM_RECOVERY_CONSISTENT - PM_HOT_STANDBY
PMSIGNAL_RECOVERY_CONSISTENT - PMSIGNAL_BEGIN_HOT_STANDBY

Objections?  Better ideas?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



[HACKERS] fillfactor gets set to zero for toast tables

2010-05-14 Thread Tom Lane
I've been able to reproduce the problem described here:
http://archives.postgresql.org/pgsql-bugs/2010-05/msg00100.php
Do this:

create table foo(f1 text);
alter table foo set (toast.autovacuum_enabled = false);
insert into foo values(repeat('xyzzy',10));
vacuum verbose foo;

Notice that the vacuum output shows there's only one toast tuple per
toast table page.

The problem is that if any reloption is set for the toast table,
we build a StdRdOptions struct in which fillfactor is zero, and
then all the code that actually uses fillfactor honors that.
And the reason fillfactor gets left as zero is that there is no
entry for it in the tables for toast-table reloptions.  Clearly,
this wasn't thought through carefully enough.
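The failure mode can be loosely modeled outside C (field names simplified; this is a sketch of the logic Tom describes, not the actual reloptions.c code):

```python
# Toy model: when any reloption is set, a zero-initialized StdRdOptions-like
# struct is built, and a field with no entry in the toast reloptions table
# keeps its zero rather than picking up the intended default.
toast_relopt_defaults = {"autovacuum_enabled": True}   # no fillfactor entry

def build_toast_options(user_options):
    opts = {"fillfactor": 0, "autovacuum_enabled": False}  # struct starts zeroed
    opts.update(toast_relopt_defaults)                     # fillfactor untouched
    opts.update(user_options)
    return opts

opts = build_toast_options({"autovacuum_enabled": False})
print(opts["fillfactor"])  # 0 -- hence one toast tuple per page in the repro
```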

I'm inclined to think that we should remove the notion of there
being separate sets of rdoptions for heaps and toast tables ---
any case in which someone tries to omit a StdRdOption for toast
tables is going to fail the same way, unless we are fortunate
enough that zero is a desirable default.  What seems more rational
is to provide toast.fillfactor but give it min/max/default = 100.

Thoughts?

regards, tom lane



Re: [HACKERS] recovery getting interrupted is not so unusual as it used to be

2010-05-14 Thread Robert Haas
On Thu, May 13, 2010 at 1:28 AM, Fujii Masao masao.fu...@gmail.com wrote:
 On Thu, May 13, 2010 at 12:10 PM, Robert Haas robertmh...@gmail.com wrote:
 Hmm, it seems this is my night to rediscover the wisdom of your
 previous proposals.  I think that state would only be appropriate when
 we shutdown after reaching consistency, not any shutdown during
 recovery.  Do you agree?

 No. When shutdown happens before reaching consistency, the database might
 be inconsistent, but that doesn't mean that any data is corrupted.
 We can get a consistent (not corrupted) database by applying the WAL records
 to an inconsistent one.

 HINT:  If this has occurred more than once some data might be
 corrupted and you might need to choose an earlier recovery target.

 I think that the hint message indicates the data corruption which prevents
 recovery from completing, rather than the inconsistency of the database.

Hmm, OK, I think that makes sense.  Would you care to propose a patch?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] Generalized Inverted Generalized Search Tree

2010-05-14 Thread Robert Haas
On Fri, May 14, 2010 at 12:03 AM, Takahiro Itagaki
itagaki.takah...@oss.ntt.co.jp wrote:
 We can index multiple scalar values per row with GIN access method,
 and also can index single vector value per row with GiST AM.

 Is it worth having a new AM to index multiple vector values per row?
 It will be an AM for the missing feature in below:

                 | scalar | vector |
 ----------------+--------+--------+
  single per row | btree  | gist   |
  multi per row  | gin    | *HERE* |

 We can call the new AM gigist. Or, there might be another idea
 to support expression indexes for SRF functions, like
  =# CREATE TABLE tbl (c circle[]);
  =# CREATE INDEX ON tbl USING gist (unnest(c));

 Comments and ideas welcome.

I'm not sure I agree with your characterization of the purpose of
GIST.  I think of GIST as a mechanism to support index lookups on
types that are not totally ordered.  Still, I guess I understand the
point - I think what you are trying to do is optimize queries of the
form, e.g.:

SELECT * FROM tbl WHERE [some circle in c overlaps a given box]

I don't personally want to do that :-) but I can easily imagine that
someone else might.  As for adding a new AM, we could certainly do it
that way, but we should at least consider the possibility of trying to
modify the existing GIST code to handle this case in addition to the
things it already does.  If that's not possible, then we should think
about how to minimize code duplication.  I fear that if we're not
careful we might end up having to fix a lot of bugs in two places;
also, any code that is specific to gigist will certainly get less
real-world use than gist itself, so sharing code will help keep our
bug counts down.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] fillfactor gets set to zero for toast tables

2010-05-14 Thread Alvaro Herrera
Excerpts from Tom Lane's message of Fri May 14 14:19:30 -0400 2010:

 The problem is that if any reloption is set for the toast table,
 we build a StdRdOptions struct in which fillfactor is zero, and
 then all the code that actually uses fillfactor honors that.
 And the reason fillfactor gets left as zero is that there is no
 entry for it in the tables for toast-table reloptions.  Clearly,
 this wasn't thought through carefully enough.

Ouch :-( We messed with this stuff after the initial commit because I
didn't get it right the first time either.  I'm surprised that we didn't
find this problem right away.

 I'm inclined to think that we should remove the notion of there
 being separate sets of rdoptions for heaps and toast tables ---
 any case in which someone tries to omit a StdRdOption for toast
 tables is going to fail the same way, unless we are fortunate
 enough that zero is a desirable default.

It doesn't seem like having the distinction has bought us anything.
However, if we do that, we can't add a separate entry to intRelOpts to
fix min/max/default to 100, so I think we should document that options
for toast must match those for heaps.

 What seems more rational is to provide toast.fillfactor but give it
 min/max/default = 100.

Adding an entry to intRelOpts should be enough to fix this bug, correct?


[HACKERS] Generating Lots of PKs with nextval(): A Feature Proposal

2010-05-14 Thread Peter Crabtree
Recently, in preparation for migrating an application to postgres, I
got to this part of the manual (which is *excellent* so far, by the
way):

http://www.postgresql.org/docs/8.4/interactive/functions-sequence.html

A quick check with the folks on #postgresql confirmed my
understanding, which was that the locking semantics of setval() and
nextval() make this unsafe:

SELECT setval('my_seq', nextval('my_seq') + 500);

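The hazard is easiest to see by scripting the interleaving by hand. A toy Python counter (not PostgreSQL's sequence code) stands in for the sequence:

```python
# Two sessions each run: SELECT setval('my_seq', nextval('my_seq') + 500);
# Because nextval and setval are separate operations with no lock held
# between them, the calls can interleave and the reserved ranges overlap.
class Seq:
    def __init__(self):
        self.last = 0
    def nextval(self):
        self.last += 1
        return self.last
    def setval(self, v):
        self.last = v

seq = Seq()
a_start = seq.nextval()    # session A: plans to use a_start .. a_start + 500
b_start = seq.nextval()    # session B interleaves here
seq.setval(a_start + 500)  # session A: claims 1 .. 501
seq.setval(b_start + 500)  # session B: claims 2 .. 502 -- overlaps A!

a_keys = set(range(a_start, a_start + 501))
b_keys = set(range(b_start, b_start + 501))
print(len(a_keys & b_keys))  # 500 keys handed to both sessions
```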
Now, I was reminded that I could simply do this:

SELECT nextval('my_seq') FROM generate_series(1, 500);

But of course then I would have no guarantee that I would get a
contiguous block of ids, which means if I'm using this to do a mass
insert of records which refer to each others' ids (example: storing a
directed, linear graph), I either have to do a correlated update on
the client side, after transferring the keys (consider the cost of
doing this for a few million records - 4 MB in keys per million
records, for, in extreme cases, 12 MB of data to be inserted -- 33%
overhead in the worst case, presuming symmetric bandwidth), or I have
to insert into a temporary table, then have the db backend do the
update, then insert from there to the real table. Both are imperfect
options in terms of performance and complexity.

Thus, before I start work on it, I propose an extension to the current
nextval():

SELECT nextval('my_seq', 500);

This would increment the my_seq sequence by its interval * 500, and
return the first valid key. This both makes client code that needs a
bunch of PKs simpler to implement, and saves in performance, since the
client can just replace all its PKs (presuming they're currently a
contiguous block from 1 to n) with my_starting_pk + current_pk, so
this:

 pk | next_node
----+-----------
  0 | 1
  1 | 2
  2 | 0

can be easily updated like this:

SELECT nextval('my_seq', (SELECT count(*) FROM my_table));
UPDATE my_table SET pk = currval('my_seq') + pk, next_node =
currval('my_seq') + next_node;

to something like this:

   pk   | next_node
--------+-----------
 521650 |    521651
 521651 |    521652
 521652 |    521650

This is a net gain of performance and ease of implementation in many
cases where a large number of ids from a sequence are needed -- with a
small added benefit of the keys being guaranteed to be contiguous.
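The proposed call reduces to one atomic reservation. A toy model (the real change would live in sequence.c; this Python only illustrates the intended semantics):

```python
import threading

# nextval(seq, n): atomically advance the sequence by n increments and
# return the first key of the reserved block, guaranteeing contiguity.
class Seq:
    def __init__(self, increment=1):
        self.last = 0
        self.increment = increment
        self.lock = threading.Lock()

    def nextval(self, n=1):
        with self.lock:                       # the whole reservation is atomic
            first = self.last + self.increment
            self.last += self.increment * n
            return first

seq = Seq()
start = seq.nextval(500)   # reserve keys 1 .. 500 in one call
again = seq.nextval()      # the next caller starts at 501
print(start, again)        # 1 501
```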

I don't see any technical problems with this; postgres already can
pre-allocate more than one key, but the number is semi-static (the
CACHE parameter to CREATE SEQUENCE). This might break existing user
code if they've defined a nextval(regclass, integer), but I don't see
any way to

Finally, I've checked sequence.c -- this looks pretty straightforward
to implement, but I figured checking with this list was wise before
starting work. Apologies if I've been overly wordy.

Peter



Re: [HACKERS] fillfactor gets set to zero for toast tables

2010-05-14 Thread Tom Lane
Alvaro Herrera alvhe...@commandprompt.com writes:
 Excerpts from Tom Lane's message of Fri May 14 14:19:30 -0400 2010:
 What seems more rational is to provide toast.fillfactor but give it
 min/max/default = 100.

 Adding an entry to intRelOpts should be enough to fix this bug, correct?

I think so, but haven't tested.  The docs would need some correction
perhaps.

BTW, I notice that the code allows people to fool with
autovacuum_analyze_threshold and related parameters for toast tables,
which seems rather pointless since we never analyze toast tables.
So the fillfactor isn't really the only instance of the issue.

Maybe a better solution is to have some kind of notion of a default-only
entry, which is sufficient to insert the default into the struct but
isn't accepted as a user-settable item.

regards, tom lane



Re: [HACKERS] Generating Lots of PKs with nextval(): A Feature Proposal

2010-05-14 Thread Kenneth Marshall
Hi Peter,

All you need to do is define your own sequence with an
increment of 500. Look at:

http://www.postgresql.org/docs/8.4/static/sql-createsequence.html
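A toy model of that approach (a simplification of PostgreSQL sequence behavior, not server code): with INCREMENT BY 500, each nextval() returns the start of a 500-wide block and the client derives the rest of the keys itself, though every caller then pays the 500-wide stride rather than choosing a block size per call.

```python
# Toy model of: CREATE SEQUENCE my_seq INCREMENT BY 500;
# The first nextval() returns the start value; each later call jumps by 500,
# so every call implicitly reserves the 499 keys that follow its result.
class Seq:
    def __init__(self, increment=1, start=1):
        self.increment = increment
        self.start = start
        self.last = None

    def nextval(self):
        if self.last is None:
            self.last = self.start      # first call returns the start value
        else:
            self.last += self.increment
        return self.last

seq = Seq(increment=500)
block_start = seq.nextval()     # 1: caller may use keys 1 .. 500
next_block = seq.nextval()      # 501
print(block_start, next_block)  # 1 501
```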

Regards,
Ken

On Fri, May 14, 2010 at 02:56:18PM -0400, Peter Crabtree wrote:
 Recently, in preparation for migrating an application to postgres, I
 got to this part of the manual (which is *excellent* so far, by the
 way):
 
 http://www.postgresql.org/docs/8.4/interactive/functions-sequence.html
 
 A quick check with the folks on #postgresql confirmed my
 understanding, which was that the locking semantics of setval() and
 nextval() make this unsafe:
 
 SELECT setval('my_seq', nextval('my_seq') + 500);
 
 Now, I was reminded that I could simply do this:
 
 SELECT nextval('my_seq') FROM generate_series(1, 500);
 
 But of course then I would have no guarantee that I would get a
 contiguous block of ids, which means if I'm using this to do a mass
 insert of records which refer to each others' ids (example: storing a
 directed, linear graph), I either have to do a correlated update on
 the client side, after transferring the keys (consider the cost of
 doing this for a few million records - 4 MB in keys per million
 records, for, in extreme cases, 12 MB of data to be inserted -- 33%
 overhead in the worst case, presuming symmetric bandwidth), or I have
 to insert into a temporary table, then have the db backend do the
 update, then insert from there to the real table. Both are imperfect
 options in terms of performance and complexity.
 
 Thus, before I start work on it, I propose an extension to the current
 nextval():
 
 SELECT nextval('my_seq', 500);
 
 This would increment the my_seq sequence by its interval * 500, and
 return the first valid key. This both makes client code that needs a
 bunch of PKs simpler to implement, and saves in performance, since the
 client can just replace all its PKs (presuming they're currently a
 contiguous block from 1 to n) with my_starting_pk + current_pk, so
 this:
 
  pk | next_node
 ----+-----------
   0 | 1
   1 | 2
   2 | 0
 
 can be easily updated like this:
 
 SELECT nextval('my_seq', (SELECT count(*) FROM my_table));
 UPDATE my_table SET pk = currval('my_seq') + pk, next_node =
 currval('my_seq') + next_node;
 
 to something like this:
 
    pk   | next_node
 --------+-----------
  521650 |    521651
  521651 |    521652
  521652 |    521650
 
 This is a net gain of performance and ease of implementation in many
 cases where a large number of ids from a sequence are needed -- with a
 small added benefit of the keys being guaranteed to be contiguous.
 
 I don't see any technical problems with this; postgres already can
 pre-allocate more than one key, but the number is semi-static (the
 CACHE parameter to CREATE SEQUENCE). This might break existing user
 code if they've defined a nextval(regclass, integer), but I don't see
 any way to
 
 Finally, I've checked sequence.c -- this looks pretty straightforward
 to implement, but I figured checking with this list was wise before
 starting work. Apologies if I've been overly wordy.
 
 Peter
 
 



Re: [HACKERS] JSON manipulation functions

2010-05-14 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Thu, May 13, 2010 at 9:47 PM, Joseph Adams
 joeyadams3.14...@gmail.com wrote:
 [] retrieves a value of a JSON array/object by (one-based) index.  In
 other words, value[n] is equivalent to selecting the nth row of
 json_values(value) (provided value is of type JSON).  Examples:
 
 SELECT ('[1,2,3,4]'::JSON)[3]
 -- Result is '3'::JSON
 
 SELECT ('{"a": 1, "b": 2, "c": 3, "d": 4}'::JSON)[3]
 -- Result is '3'::JSON

 I think some kind of array deference and object deference mechanism is
 absolutely, positively 100% required.  I don't know whether the
 particular syntax you've proposed here is best or whether we should
 pick another syntax or just use function notation, but I think we
 definitely need *something*.

Trying to use array notation on something that isn't a SQL array type
is guaranteed to be a mess.  I strongly recommend that you not attempt
that.  Just define a function for it.
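One possible shape for such a function, modeled here in Python over parsed JSON (the name json_get and its one-based indexing are hypothetical, following Joseph's examples):

```python
import json

# json_get(text, n): one-based positional access into a JSON array, or into
# the values of a JSON object, mirroring the proposed ('[1,2,3,4]'::JSON)[3]
# behavior without overloading SQL array subscript syntax.
def json_get(text, n):
    value = json.loads(text)
    items = list(value.values()) if isinstance(value, dict) else value
    return json.dumps(items[n - 1])

print(json_get('[1,2,3,4]', 3))                          # 3
print(json_get('{"a": 1, "b": 2, "c": 3, "d": 4}', 3))   # 3
```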

regards, tom lane



Re: [HACKERS] List traffic

2010-05-14 Thread Marc G. Fournier

On Fri, 14 May 2010, Josh Berkus wrote:

First off, this is absolutely the wrong list to be discussing management of 
PostgreSQL lists.  That belongs on pgsql-www.


Actually, this is as good a list as any ... -www is for WWW related 
issues, not mailing lists ... it would be as inappropriate there as it 
would be on sysadmins, which also doesn't cover mailing lists ...


Second, regarding advocacy: no, absolutely not.  -advocacy is a working list 
and not a virtual water cooler.


BTW, and even I totally forgot about it, but we do have a virtual water 
cooler already: pgsql-chat ... 224 subscribers currently, just nobody uses 
it ...


In fact, I just removed / changed to BCC -hackers so that all further 
discussions on this part of the thread will be on -chat ...



Marc G. Fournier           Hub.Org Hosting Solutions S.A.
scra...@hub.org            http://www.hub.org

Yahoo: yscrappy   Skype: hub.org   ICQ: 7615664   MSN: scra...@hub.org



Re: [HACKERS] List traffic

2010-05-14 Thread Selena Deckelmann
On Fri, May 14, 2010 at 9:51 AM, Josh Berkus j...@agliodbs.com wrote:

 Second, regarding advocacy: no, absolutely not.  -advocacy is a working list
 and not a virtual water cooler.

+1. I would find it very difficult to manage having -advocacy thrown
into -general.

If folks think that information isn't getting wide enough
distribution, that's one thing. But it is an important working group,
even if there's not a ton of traffic all the time on it.

-selena

-- 
http://chesnok.com/daily - me



Re: [HACKERS] Synchronous replication patch built on SR

2010-05-14 Thread Robert Haas
On Fri, May 14, 2010 at 9:33 AM, Boszormenyi Zoltan z...@cybertec.at wrote:
 If min_sync_replication_clients == 0, then the replication is async.
 If min_sync_replication_clients == max_wal_senders then the
 replication is fully synchronous.
 If 0 < min_sync_replication_clients < max_wal_senders then
 the replication is partially synchronous, i.e. the master can wait
 only for, say, 50% of the clients to report back before it's considered
 synchronous and the relevant transactions get released from the wait.
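The quoted rule boils down to a quorum predicate. A minimal sketch (GUC names follow Zoltan's proposal; the function is an illustration, not committed behavior):

```python
# A commit is releasable once at least min_sync_replication_clients of the
# standbys have acknowledged the WAL; 0 means async, max_wal_senders means
# fully synchronous, anything in between is a partial quorum.
def commit_is_releasable(acks, min_sync_replication_clients):
    return acks >= min_sync_replication_clients

max_wal_senders = 4
print(commit_is_releasable(0, 0))                # async: True immediately
print(commit_is_releasable(2, 2))                # 2-of-4 quorum met: True
print(commit_is_releasable(3, max_wal_senders))  # fully sync, still waiting: False
```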

That's an interesting design and in some ways pretty elegant, but it
rules out some things that people might easily want to do - for
example, synchronous replication to the other server in the same data
center that acts as a backup for the master; and asynchronous
replication to a reporting server located off-site.

One of the things that I think we will probably need/want to change
eventually is the fact that the master has no real knowledge of who
the replication slaves are.  That might be something we want to change
in order to be able to support more configurability.  Inventing syntax
out of whole cloth and leaving semantics to the imagination of the
reader:

CREATE REPLICATION SLAVE reporting_server (mode asynchronous, xid_feedback on);
CREATE REPLICATION SLAVE failover_server (mode synchronous,
xid_feedback off, break_synchrep_timeout 30);

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] List traffic

2010-05-14 Thread Bruce Momjian
Kevin Grittner wrote:
 Tom Lane t...@sss.pgh.pa.us wrote:
  
  I can't imagine that there's not going to need to be a catchall
  list for problems that don't fit into any of the subcategories.
  
  More generally, we already have most of the lists that you
  suggest, and we already know that people frequently don't find the
  most appropriate list for postings.  I don't think getting rid of
  -general would help that in the least.  The way to cut down on
  misposted traffic is to make the set of categories smaller and
  simpler, not to redouble our efforts to persuade people to use the
  same or even more categories.
  
 Well, redoubling our current efforts to direct people to more
 specific lists would accomplish nothing, since doubling zero leaves
 you with zero.  The description of -general includes:
  
 | General discussion area for users. Apart from compile, acceptance
 | test, and bug problems, most new users will probably only be
 | interested in this mailing list
  
 Given that, the fact that -admin, -novice, -sql, and -performance
 collectively get as many posts as -general suggests that people are,
 in fact, making some effort to find a list which seems a good fit. 
 Perhaps if the description of -general was changed to suggest it
 *was* a catch-all for posts which don't fit the other lists, things
 would improve.

FYI, I usually email new people privately that cross-posting a question
can cause the question to be ignored.  They usually respond positively
and avoid it in the future.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com



Re: [HACKERS] List traffic

2010-05-14 Thread Marc G. Fournier

On Fri, 14 May 2010, Bruce Momjian wrote:

FYI, I usually email new people privately that cross-posting a question 
can cause the question to be ignored.  They usually respond positively 
and avoid it in the future.


We all have our own methods ... for instance, I just CC'd this to -chat 
with a -BCC to -hackers so that follow ups will go over there (since Josh 
is right, this thread doesn't belong on -hackers) ...



Marc G. Fournier           Hub.Org Hosting Solutions S.A.
scra...@hub.org            http://www.hub.org

Yahoo: yscrappy   Skype: hub.org   ICQ: 7615664   MSN: scra...@hub.org



Re: [HACKERS] [PATCH] Add SIGCHLD catch to psql

2010-05-14 Thread Bruce Momjian
Stephen Frost wrote:
-- Start of PGP signed section.
 Greetings,
 
   Toying around with FETCH_COUNT today, I discovered that it didn't do
   the #1 thing I really wanted to use it for- query large tables without
   having to worry about LIMIT to see the first couple hundred records.
   The reason is simple- psql ignores $PAGER exiting, which means that it
   will happily continue pulling down the entire large table long after
   you've stopped caring, which means you still have to wait forever.
 
   The attached, admittedly quick hack, fixes this by having psql catch
   SIGCHLD's using handle_sigint.  I've tested this and it doesn't
   appear to obviously break other cases where we have children (\!, for
   example), since we're not going to be running a database query when
   we're doing those, and if we are, and the child dies, we probably want
   to *stop* anyway, similar to the $PAGER issue.
 
   Another approach that I considered was fixing various things to deal
   cleanly with write's failing to $PAGER (I presume the writes *were*
   failing, since less was in a defunct state, but I didn't actually
   test).  This solution was simpler, faster to code and check, and a lot
   less invasive (or so it seemed to me at the time).
 
   Anyway, this makes FETCH_COUNT a lot more useful, and, in my view, the
   current behaviour of completely ignoring $PAGER exiting is a bug.
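The mechanism Stephen describes -- noticing the pager's death via SIGCHLD instead of ignoring it -- can be sketched in Python rather than psql's C (a POSIX-only toy, with the child process standing in for $PAGER):

```python
import os
import signal
import time

# Install a SIGCHLD handler so the parent learns immediately when its
# child (standing in for $PAGER) exits, rather than writing to it forever.
pager_died = False

def handle_sigchld(signum, frame):
    global pager_died
    pager_died = True

signal.signal(signal.SIGCHLD, handle_sigchld)

pid = os.fork()
if pid == 0:
    os._exit(0)           # the "pager" quits right away

os.waitpid(pid, 0)        # reap the child; SIGCHLD fires on its exit
for _ in range(100):      # give the handler a moment to run
    if pager_died:
        break
    time.sleep(0.01)
print(pager_died)
```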

Please add this to the next commit-fest:

https://commitfest.postgresql.org/action/commitfest_view/inprogress

Thanks.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com



Re: [HACKERS] List traffic

2010-05-14 Thread Greg Smith

Tom Lane wrote:

I can see the need for small tightly-focused special lists.


How about a list devoted to discussions about reorganizing the lists?  
It would get plenty of traffic, and then I could decline to subscribe to 
it and have that many fewer messages to read.


There is only one viable solution to reduce list traffic:  ban forever 
everyone who top-posts or doesn't trim what they quote.  Maybe some 
other old-school Usenet rules too--can we ban those with incorrectly 
formatted signatures and finally add proper bozo tagging?  Praise Kibo.


Seriously though, I file admin/general/performance into one user 
oriented folder, hackers/committers into a second, and all the non-code 
ones (advocacy, www, docs) into a third.  I don't think there's any way 
to restructure those lists that will improve life for people who try to 
keep up with most of them.  I was traveling yesterday and busy today, 
and now I'm 350 messages behind.  No amount of rejiggering the lists 
will change the fact that there's just that much activity happening 
around PostgreSQL.  You can move the messages around, but the same 
number are going to show up, and people who want to keep up with 
everything will have to cope with that.  The best you can do is get 
better support in your mail program for wiping out whole threads at 
once, once you've realized you're not interested in them.


The only real argument to keep some more targeted lists is for the 
benefit of the people who subscribe to them, not we the faithful, so 
that they can have something that isn't a firehose of messages to sort 
through.  Is it helpful to novices that they can subscribe to a list 
when they won't be overwhelmed by traffic, and can ask questions without 
being too concerned about being harassed for being newbies?  Probably.  
Are there enough people interested in performance topics alone to 
justify a list targeted just to them?  Certainly; I was only on that 
list for a long time before joining any of the others.  Are the 
marketing oriented people subscribed only to advocacy and maybe announce 
happy to avoid the rest of the lists?  You bet.


Folding, say, performance or admin into general, one idea that pops up 
sometimes, doesn't help those people--now they can only get the 
firehose--and it doesn't help me, either.  If you can keep up with 
general, whether or not the other lists are also included in that or not 
doesn't really matter.  Ditto for hackers and the things you might try 
and split out of it.  It's just going to end up with more cross posting, 
and the only thing I hate more than a mailbox full of messages is 
discovering a chunk of them are dupes because of that.


I might like to see, for example, a user mailing list devoted strictly 
to replication/clustering work with PostgreSQL.  That's another topic I 
think that people are going to want to ask about more in the near future 
without getting overwhelmed.  But, again, that's for their benefit.  
I'll have to subscribe to that, too, and in reality it will probably 
increase the amount of messages I read, because people will ask stuff 
there that's already been covered on other lists, and vice-versa.


--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us




Re: [HACKERS] List traffic

2010-05-14 Thread Greg Stark
On Fri, May 14, 2010 at 4:39 AM, Greg Smith g...@2ndquadrant.com wrote:
 Is it
 helpful to novices that they can subscribe to a list when they won't be
 overwhelmed by traffic, and can ask questions without being too concerned
 about being harassed for being newbies?  Probably.

Only if they aren't hoping to get answers... What percentage of the
hackers and experts who trawl -general for questions to answer are
subscribed to -novices?

-general isn't subscriber-only posts is it?

 Are there enough people
 interested in performance topics alone to justify a list targeted just to
 them?  Certainly; I was only on that list for a long time before joining any
 of the others.

If they're interested in performance topics and they're not subscribed
to -general then they're missing *most* of what they're interested in
which doesn't take place on -performance. And most of what's on
-performance ends up being non-performance related questions anyways.

I think what I'm getting at is that we shouldn't have any lists for
traffic which could reasonably happen on -general. If it's a usage
question about postgres then it belongs in the same place regardless
of what feature or aspect of the usage it is -- otherwise it'll always
be some random subset of the relevant messages.

This won't actually cut down on list traffic for me and Simon but it
would help get people answers since they'll be posting to the same
place as everyone else.

 Are the marketing oriented people subscribed only to
 advocacy and maybe announce happy to avoid the rest of the lists?  You bet.

Well yeah. This is an actual discernible distinction. As evidence
I'll note that there is no advocacy traffic on -general or other
mailing lists.



-- 
greg



Re: [HACKERS] List traffic

2010-05-14 Thread Marc G. Fournier

On Fri, 14 May 2010, Greg Stark wrote:


On Fri, May 14, 2010 at 4:39 AM, Greg Smith g...@2ndquadrant.com wrote:

Is it
helpful to novices that they can subscribe to a list when they won't be
overwhelmed by traffic, and can ask questions without being too concerned
about being harassed for being newbies?  Probably.


Only if they aren't hoping to get answers... What percentage of the
hackers and experts who trawl -general for questions to answer are
subscribed to -novices?

-general isn't subscriber-only posts is it?


All our lists are, yes ... *but* ... the 'subscriber list' is cross list, 
in that if you are subscribed to one, you can post to all ...



If they're interested in performance topics and they're not subscribed
to -general then they're missing *most* of what they're interested in
which doesn't take place on -performance. And most of what's on
-performance ends up being non-performance related questions anyways.


And IMHO, that is as much a fault of the 'old timers' on the lists as the 
newbies ... if nobody redirects / loosely enforces the mandates of the 
various lists, newbies aren't going to learn to post to more 
appropriate ones ...


Personally, my experience with large lists is that if there is a smaller, 
more focused list, I'll post there first, to avoid being lost in the noise 
... and, I will re-post to a more general list *if* and only if I'm unable 
to get an answer from where I posted my original ...



Marc G. Fournier        Hub.Org Hosting Solutions S.A.
scra...@hub.org http://www.hub.org

Yahoo: yscrappy  Skype: hub.org  ICQ: 7615664  MSN: scra...@hub.org



Re: [HACKERS] List traffic

2010-05-14 Thread Robert Haas
On Thu, May 13, 2010 at 11:39 PM, Greg Smith g...@2ndquadrant.com wrote:
 The only real argument to keep some more targeted lists is for the benefit
 of the people who subscribe to them, not we the faithful, so that they can
 have something that isn't a firehose of messages to sort through.  Is it
 helpful to novices that they can subscribe to a list when they won't be
 overwhelmed by traffic, and can ask questions without being too concerned
 about being harassed for being newbies?  Probably.  Are there enough people
 interesting in performance topics alone to justify a list targeted just to
 them?  Certainly; I was only on that list for a long time before joining any
 of the others.  Are the marketing oriented people subscribed only to
 advocacy and maybe announce happy to avoid the rest of the lists?  You bet.

 Folding, say, performance or admin into general, one idea that pops up
 sometimes, doesn't help those people--now they can only get the
 firehose--and it doesn't help me, either.  If you can keep up with general,
 whether or not the other lists are also included in that or not doesn't
 really matter.  Ditto for hackers and the things you might try and split out
 of it.  It's just going to end up with more cross posting, and the only
 thing I hate more than a mailbox full of messages is discovering a chunk of
 them are dupes because of that.

+1.

 I might like to see, for example, a user mailing list devoted strictly to
 replication/clustering work with PostgreSQL.  That's another topic I think
 that people are going to want to ask about more in the near future without
 getting overwhelmed.  But, again, that's for their benefit.  I'll have to
 subscribe to that, too, and in reality it will probably increase the amount
 of messages I read, because people will ask stuff there that's already been
 covered on other lists, and vice-versa.

Yep.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] Row-level Locks SERIALIZABLE transactions, postgres vs. Oracle

2010-05-14 Thread Robert Haas
On Thu, May 13, 2010 at 5:39 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Florian Pflug f...@phlo.org writes:
 All in all, I believe that SHARE and UPDATE row-level locks should be
 changed to cause concurrent UPDATEs to fail with a serialization
 error.

 I don't see an argument for doing that for FOR SHARE locks, and it
 already happens for FOR UPDATE (at least if the row actually gets
 updated).  AFAICS this proposal mainly breaks things, in pursuit of
 an unnecessary and probably-impossible-anyway goal of making FK locking
 work with only user-level snapshots.

After giving this considerable thought and testing the behavior at
some length, I think the OP has it right.  One thing I sometimes need
to do is denormalize a copy of a field, e.g.

CREATE TABLE parent (id serial, mode integer not null, primary key (id));
CREATE TABLE child (id serial, parent_id integer not null references
parent (id), parent_mode integer not null);

The way I have typically implemented this in the past is:

1. Add a trigger to the parent table so that, whenever the mode column
gets updated, we do an update on the parent_mode of all children.
2. Add a trigger to the child table so that, when a new child is
inserted, it initializes parent_mode from its parent.  I do a SELECT
FOR UPDATE on the parent so the parent can't change under me; though
FOR SHARE ought to be enough also, since we're just trying to lock out
concurrent updates.
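A sketch of the two triggers described above (the function and trigger names, and the plpgsql bodies, are mine; the message only describes them in prose):

```sql
-- Step 1: push the parent's new mode down to all of its children.
CREATE OR REPLACE FUNCTION sync_child_mode() RETURNS trigger AS $$
BEGIN
    UPDATE child SET parent_mode = NEW.mode WHERE parent_id = NEW.id;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER sync_child_mode AFTER UPDATE ON parent
    FOR EACH ROW EXECUTE PROCEDURE sync_child_mode();

-- Step 2: copy the parent's mode at insert time, locking the parent
-- row so it cannot change underneath us.
CREATE OR REPLACE FUNCTION init_parent_mode() RETURNS trigger AS $$
BEGIN
    SELECT mode INTO NEW.parent_mode
        FROM parent WHERE id = NEW.parent_id FOR UPDATE;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER init_parent_mode BEFORE INSERT ON child
    FOR EACH ROW EXECUTE PROCEDURE init_parent_mode();
```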

Suppose T1 updates the parent's mode while T2 adds a new child; then
both commit.  In read committed mode, this seems to work OK regardless
of the order of T1 and T2.  If T1 grabs the lock first, then T2 sees
the updated version of the row after T1 commits.  If T2 grabs the lock
first, then the update on the parent blocks until the child commits.
Subsequently, when the trigger fires, it apparently uses an up-to-date
snapshot, so the new child is updated also.  In serializable mode,
things are not so good.  If T1 grabs the lock first, the child waits
to see whether it commits or aborts.  On commit, it complains that it
can't serialize and aborts, which is reasonable - transaction aborts
are the price you pay for serializability.  If T2 grabs the lock
first, the update on the parent blocks as before, but now the update
is done with the old snapshot and ignores the new child, so the new
child now has a value for parent_mode that doesn't match the parent's
actual mode.  That is, you get the wrong answer due to a serialization
anomaly that didn't exist at the read committed level.

Increasing the transaction isolation level is supposed to *eliminate*
serialization anomalies, not create them.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] Generating Lots of PKs with nextval(): A Feature Proposal

2010-05-14 Thread hubert depesz lubaczewski
On Fri, May 14, 2010 at 02:07:27PM -0500, Kenneth Marshall wrote:
 Hi Peter,
 
 All you need to do is define your own sequence with an
 increment of 500. Look at:
 
 http://www.postgresql.org/docs/8.4/static/sql-createsequence.html

This is often not enough. For example - I want standard increment of 1,
but right now I'm importing 1 objects, and it would be simpler for
me to get 1 ids. Preferably in one block.

This is not achievable now. I know I can 'alter sequence set increment
by' - but this will also affect concurrent sessions. which might not be
a problem, but it's a side effect that I don't want.

+1 for original proposition, would love to get it.

depesz



Re: [HACKERS] List traffic

2010-05-14 Thread Kevin Grittner
Greg Stark gsst...@mit.edu wrote: 
 
 If they're interested in performance topics and they're not
 subscribed to -general then they're missing *most* of what they're
 interested in which doesn't take place on -performance.
 
Well, I for one can't currently suck the end of the fire hose which
is -general, and would be less able to do so should other lists be
folded into it.  So I lurk on -bugs, -performance, -admin, and
others -- not to glean information so much as to attempt to respond
in areas where I feel I might be able to be helpful and, with a bit
of luck, take some of the burden off of those who do the most to
help people on these lists.  Combining lists will only make it
harder for me to attempt to assist in this way.
 
 And most of what's on -performance ends up being non-performance
 related questions anyways.
 
I don't believe you.  Scanning this:
 
http://archives.postgresql.org/pgsql-performance/2010-05/index.php
 
I see a few non-performance questions, but they're clearly a small
fraction of the traffic.
 
 I think what I'm getting at is that we shouldn't have any lists
 for traffic which could reasonably happen on -general.
 
I think that's exactly backwards -- we shouldn't have any traffic on
-general for issues which could reasonably happen in another list. 
You can always configure your email to combine lists into a common
folder upon receipt.
 
-Kevin



Re: [HACKERS] Parameter oddness; was HS/SR Assert server crash

2010-05-14 Thread Bruce Momjian
bruce wrote:
  and my slave recovery.conf was:
  
  restore_command = 'cp /u/pg/archive/%f %p'  # e.g. 'cp 
  /mnt/server/archivedir/%f %p'
  standby_mode = 'on'
  primary_conninfo = 'host=localhost port=5432'   # e.g. 
  'host=localhost port=5432'
  
  Let me know what additional information I can supply.
 
 I saw Simon's commit fixing this bug.  Another good reason we didn't
 bundle 9.0 beta2 yesterday.

I can now confirm that Simon's patch fixes the bug.  (Unfortunately I
had a web cast yesterday about this feature and the bug prevented me
from completing the demonstration.)

FYI, my presentation shows all the steps necessary to setup HS/SR:

http://momjian.us/main/presentations/technical.html#hot_streaming

One odd thing is we have two parameters that mention hot_standby --- on
the master we have to do in postgresql.conf:

wal_level = hot_standby

and on the slave we do in postgresql.conf:

hot_standby = on

That is a little confusing.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com



Re: [HACKERS] Parameter oddness; was HS/SR Assert server crash

2010-05-14 Thread Kevin Grittner
Bruce Momjian br...@momjian.us wrote:
 
 One odd thing is we have two parameters that mention hot_standby
 --- on the master we have to do in postgresql.conf:
 
   wal_level = hot_standby
 
 and on the slave we do in postgresql.conf:
 
   hot_standby = on
 
 That is a little confusing.
 
Why?  I read that as saying that the master is writing sufficient
data into the WAL for it to be usable for hot standby purposes, and
that the slave machine is going to be used as a hot standby. 
You'd better do the former if you're going to do the latter, but
there were good reasons not to try to infer one setting from the
other.
 
-Kevin



Re: [HACKERS] recovery consistent != hot standby

2010-05-14 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 PM_RECOVERY_CONSISTENT - PM_HOT_STANDBY
 PMSIGNAL_RECOVERY_CONSISTENT - PMSIGNAL_BEGIN_HOT_STANDBY

+1.  From the point of view of the postmaster, whether the state
transition happens immediately upon reaching consistency, or at a
later time, or perhaps even earlier (if we could make that work)
is not relevant.  What's relevant is that it's allowed to let in
hot-standby backends.  So the current naming overspecifies the
meaning of the state and the transition event.

regards, tom lane



Re: [HACKERS] Generating Lots of PKs with nextval(): A Feature Proposal

2010-05-14 Thread Tom Lane
Peter Crabtree peter.crabt...@gmail.com writes:
 Now, I was reminded that I could simply do this:

 SELECT nextval('my_seq') FROM generate_series(1, 500);

 But of course then I would have no guarantee that I would get a
 contiguous block of ids,

The existing cache behavior will already handle that for you,
I believe.  I don't really see a need for new features here.
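The cache behavior Tom refers to: each backend that touches a sequence with a CACHE setting preallocates that many values at once, so the values it hands out are contiguous within that session (at the cost of holes if the session exits without using them all). A sketch:

```sql
CREATE SEQUENCE my_seq CACHE 500;

-- The first nextval() in this session reserves 500 consecutive values
-- for this backend; a concurrent session's first nextval() reserves
-- the next block of 500.
SELECT nextval('my_seq') FROM generate_series(1, 500);
```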

regards, tom lane



Re: [HACKERS] Generating Lots of PKs with nextval(): A Feature Proposal

2010-05-14 Thread Robert Haas
On Fri, May 14, 2010 at 5:04 PM, hubert depesz lubaczewski
dep...@depesz.com wrote:
 On Fri, May 14, 2010 at 02:07:27PM -0500, Kenneth Marshall wrote:
 Hi Peter,

 All you need to do is define your own sequence with an
 increment of 500. Look at:

 http://www.postgresql.org/docs/8.4/static/sql-createsequence.html

 This is often not enough. For example - I want standard increment of 1,
 but right now I'm importing 1 objects, and it would be simpler for
 me to get 1 ids. Preferably in one block.

 This is not achievable now. I know I can 'alter sequence set increment
 by' - but this will also affect concurrent sessions. which might not be
 a problem, but it's a side effect that I don't want.

 +1 for original proposition, would love to get it.

If we do this, I'm inclined to think that the extra argument to
nextval() should be treated as overriding the base increment rather
than specifying a multiplier for it.  Other than that nitpick, it
sounds like a reasonable thing to allow.
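The distinction between the two semantics, using the proposed (hypothetical) two-argument form of nextval(), which no released version accepts:

```sql
-- On a sequence created with the default INCREMENT BY 1:
SELECT nextval('my_seq', 500);
-- "override" semantics: advance by 500, reserving a contiguous block
--   of 500 ids for this caller
-- "multiplier" semantics: advance by 500 * increment -- the same here,
--   but different whenever the base increment is not 1
```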

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] max_standby_delay considered harmful

2010-05-14 Thread Robert Haas
On Thu, May 13, 2010 at 1:12 PM, Josh Berkus j...@agliodbs.com wrote:
 On 5/12/10 8:07 PM, Robert Haas wrote:
 I think that would be a good thing to check (it'll confirm whether
 this is the same bug), but I'm not convinced we should actually fix it
 that way.  Prior to 8.4, we handled a smart shutdown during recovery
 at the conclusion of recovery, just prior to entering normal running.
 I'm wondering if we shouldn't revert to that behavior in both 8.4 and
 HEAD.

 This would be OK as long as we document it well.  We patched the
 shutdown the way we did specifically because Fujii thought it would be
 an easy fix; if it's complicated, we should revert it and document the
 issue for DBAs.

I don't understand this comment.

 Oh, and to confirm: the same issue exists, and has always existed, with
 Warm Standby.

That's what I was thinking, but I hadn't gotten around to testing it.
Thanks for the confirmation.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] max_standby_delay considered harmful

2010-05-14 Thread Josh Berkus

 This would be OK as long as we document it well.  We patched the
 shutdown the way we did specifically because Fujii thought it would be
 an easy fix; if it's complicated, we should revert it and document the
 issue for DBAs.
 
 I don't understand this comment.

In other words, I'm saying that it's not critical that we troubleshoot
this for 9.0.  Reverting Fujii's patch, if it's not working, is an option.

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com



Re: [HACKERS] Generating Lots of PKs with nextval(): A Feature Proposal

2010-05-14 Thread Peter Crabtree
On Fri, May 14, 2010 at 5:27 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Peter Crabtree peter.crabt...@gmail.com writes:
 Now, I was reminded that I could simply do this:

 SELECT nextval('my_seq') FROM generate_series(1, 500);

 But of course then I would have no guarantee that I would get a
 contiguous block of ids,

 The existing cache behavior will already handle that for you,
 I believe.  I don't really see a need for new features here.

I don't see how that works for this case, because the cache setting
is static, and also shared between sessions. So if I have 10 records
one time, and 100 records the next, and 587 the third time, what
should my CACHE be set to for that sequence?

And if I do ALTER SEQUENCE SET CACHE each time, I have either killed
concurrency (because I'm locking other sessions out of using that
sequence until I'm finished with it), or I have a race condition (if
someone else issues an ALTER SEQUENCE before I call nextval()). The
same problem exists with using ALTER SEQUENCE SET INCREMENT BY.

Peter



Re: [HACKERS] Generating Lots of PKs with nextval(): A Feature Proposal

2010-05-14 Thread Peter Crabtree
On Fri, May 14, 2010 at 5:29 PM, Robert Haas robertmh...@gmail.com wrote:
 On Fri, May 14, 2010 at 5:04 PM, hubert depesz lubaczewski
 dep...@depesz.com wrote:
 On Fri, May 14, 2010 at 02:07:27PM -0500, Kenneth Marshall wrote:
 Hi Peter,

 All you need to do is define your own sequence with an
 increment of 500. Look at:

 http://www.postgresql.org/docs/8.4/static/sql-createsequence.html

 This is often not enough. For example - I want standard increment of 1,
 but right now I'm importing 1 objects, and it would be simpler for
 me to get 1 ids. Preferably in one block.

 This is not achievable now. I know I can 'alter sequence set increment
 by' - but this will also affect concurrent sessions. which might not be
 a problem, but it's a side effect that I don't want.

 +1 for original proposition, would love to get it.

 If we do this, I'm inclined to think that the extra argument to
 nextval() should be treated as overriding the base increment rather
 than specifying a multiplier for it.  Other than that nitpick, it
 sounds like a reasonable thing to allow.


After giving it some thought, that sounds better. You gain some
functionality that way (temporarily overriding the interval) and lose
none.

Peter



Re: [HACKERS] Generating Lots of PKs with nextval(): A Feature Proposal

2010-05-14 Thread Tom Lane
Peter Crabtree peter.crabt...@gmail.com writes:
 On Fri, May 14, 2010 at 5:29 PM, Robert Haas robertmh...@gmail.com wrote:
 If we do this, I'm inclined to think that the extra argument to
 nextval() should be treated as overriding the base increment rather
 than specifying a multiplier for it.  Other than that nitpick, it
 sounds like a reasonable thing to allow.

 After giving it some thought, that sounds better. You gain some
 functionality that way (temporarily overriding the interval) and lose
 none.

Well, what you lose is the previous assurance that values of nextval()
were always multiples of the increment.  I could see that breaking
applications that are using non-unity increments.
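A concrete illustration of the concern (the two-argument nextval() is the proposed, hypothetical form):

```sql
CREATE SEQUENCE tens INCREMENT BY 10 START WITH 10;
SELECT nextval('tens');      -- 10
SELECT nextval('tens');      -- 20
SELECT nextval('tens', 3);   -- hypothetical call: under "override"
                             -- semantics this could yield 23, breaking
                             -- an application that assumes every value
                             -- is a multiple of 10
```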

regards, tom lane



Re: [HACKERS] max_standby_delay considered harmful

2010-05-14 Thread Robert Haas
On Fri, May 14, 2010 at 5:51 PM, Josh Berkus j...@agliodbs.com wrote:

 This would be OK as long as we document it well.  We patched the
 shutdown the way we did specifically because Fujii thought it would be
 an easy fix; if it's complicated, we should revert it and document the
 issue for DBAs.

 I don't understand this comment.

 In other words, I'm saying that it's not critical that we troubleshoot
 this for 9.0.  Reverting Fujii's patch, if it's not working, is an option.

There is no patch which we could revert to fix this, by Fujii Masao or
anyone else.  The patch he proposed has not been committed.  I am
still studying the problem to try to figure out where to go with it.
We could decide to punt the whole thing for 9.1, but I'd like to
understand what the options are before we make that decision.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] Generating Lots of PKs with nextval(): A Feature Proposal

2010-05-14 Thread Robert Haas
On Fri, May 14, 2010 at 6:11 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Peter Crabtree peter.crabt...@gmail.com writes:
 On Fri, May 14, 2010 at 5:29 PM, Robert Haas robertmh...@gmail.com wrote:
 If we do this, I'm inclined to think that the extra argument to
 nextval() should be treated as overriding the base increment rather
 than specifying a multiplier for it.  Other than that nitpick, it
 sounds like a reasonable thing to allow.

 After giving it some thought, that sounds better. You gain some
 functionality that way (temporarily overriding the interval) and lose
 none.

 Well, what you lose is the previous assurance that values of nextval()
 were always multiples of the increment.  I could see that breaking
 applications that are using non-unity increments.

Err, right.  But those applications presumably will also not be using
this new behavior.  There are no versions of PG that have an extra
argument to nextval but still guarantee that the values of nextval()
are multiples of the increment.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] Parameter oddness; was HS/SR Assert server crash

2010-05-14 Thread Bruce Momjian
Kevin Grittner wrote:
 Bruce Momjian br...@momjian.us wrote:
  
  One odd thing is we have two parameters that mention hot_standby
  --- on the master we have to do in postgresql.conf:
  
  wal_level = hot_standby
  
  and on the slave we do in postgresql.conf:
  
  hot_standby = on
  
  That is a little confusing.
  
 Why?  I read that as saying that the master is writing sufficient
 data into the WAL for it to be usable for hot standby purposes, and
 that the slave machine is going to be used as a hot standby. 
 You'd better do the former if you're going to do the latter, but
 there were good reasons not to try to infer one setting from the
 other.

My point was that the string 'hot_standby' is used both as a parameter
name and as a parameter value.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com



[HACKERS] underscore split to alias

2010-05-14 Thread Erik Rijkers
I am not sure this is a bug, but I was surprised by the following behaviour
in HEAD and 8.4.4 (instances built today, 2010.05.14):

Invalid (?) values like 123_456 are split before the underscore and
interpreted as 123 AS _456:


$ psql -p 6591 -d testdb -c "select 123_456, current_setting('server_version')"
 _456 | current_setting
--+-
  123 | 9.0beta1
(1 row)


$ psql -p 6584 -d testdb -c "select 123_456, current_setting('server_version')"
 _456 | current_setting
--+-
  123 | 8.4.4
(1 row)


Older versions (8.3 thru 7.4) all give an error:


$ psql -p 6583 -d testdb -c "select 123_456, current_setting('server_version')"
ERROR:  syntax error at or near "_456"
LINE 1: select 123_456, current_setting('server_version')
  ^

The error given by the older servers seems justified.  I assume
this is a bug and not a deliberate change?


thanks,


Erik Rijkers







Re: [HACKERS] underscore split to alias

2010-05-14 Thread Tom Lane
Erik Rijkers e...@xs4all.nl writes:
 I am not sure this is a bug, but I was surprised by the following behaviour
 in HEAD and 8.4.4 (instances built today, 2010.05.14):

 Invalid (?) values like 123_456 are split before the underscore and 
 interpreted as
 123 as 456:

All versions of postgres will parse 123_456 as an integer (123)
immediately followed by an identifier (_456).  In the particular context
that this is all of a top-level SELECT item, the SQL spec requires that
we parse this as an integer and a column alias (with an implied AS).
We failed to do that before 8.4, but now honor the spec requirement
that AS can be omitted.  Personally I think that's one of the stupider,
more error-prone aspects of the spec's syntax, but nonetheless it's
required by spec 
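The implicit-AS behavior Tom describes is easy to see with an identifier that is unambiguously an alias (8.4 and later):

```sql
SELECT 123 _456;   -- parsed as: SELECT 123 AS _456
SELECT 1 one;      -- one-column result; the column is named "one"
```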

regards, tom lane



Re: [HACKERS] JSON manipulation functions

2010-05-14 Thread Joseph Adams
On Fri, May 14, 2010 at 11:33 AM, Bruce Momjian br...@momjian.us wrote:
 Joseph Adams wrote:
 == array/object conversion ==

 The json_object function converts a tuple to a JSON object.  If there
 are duplicate column names, there will be duplicate keys in the
 resulting JSON object.

 json_object([content [AS name] [, ...]]) returns json

 Likewise, the json_array function converts a tuple to a JSON array.
 Column names are ignored.

 json_array([content [AS name] [, ...]]) returns json

 Do you see any problems with the fact that JSON arrays can use mixed
 data types, e.g.:

        [ 1, 2, 'hi', false]

I suppose the json_object and json_array functions would determine
which JSON types to employ by looking at the types of arguments given
(TEXT values would become strings, INT/FLOAT/NUMERIC/etc. values would
become numbers, TRUE/FALSE would become true/false, NULLs would just
be null, and JSON values would just be inserted as themselves).  Note
that json_array('"Hello"'::TEXT) would yield '["\"Hello\""]'::JSON,
while json_array('"Hello"'::JSON) would yield '["Hello"]'.

Going the other way around, values pulled out of JSON objects and
arrays would just be of type JSON.  This (revised) function signature
says it all:

json_values(JSON) returns JSON[]

In short, I don't believe mixed data types in arrays will be a
problem.  json_to_* and *_to_json functions would be used for
individual conversions.

On Fri, May 14, 2010 at 1:15 PM, Robert Haas robertmh...@gmail.com wrote:
 json_keys gets the keys of a JSON object as a set.

 json_keys(json) returns setof text

 I would tend to make this return text[] rather than SETOF text.

 json_values gets the values of a JSON object or the items of a JSON
 array as a set.

 json_values(json) returns setof json

 Similarly I would make this return json[].

Agreed.  For those who want sets, the unnest() function can be used.

 -> retrieves an item of a JSON object by key.
[snip]
 [] retrieves a value of a JSON array/object by (one-based) index.
[snip]

 I think some kind of array dereference and object dereference mechanism is
 absolutely, positively 100% required.  I don't know whether the
 particular syntax you've proposed here is best or whether we should
 pick another syntax or just use function notation, but I think we
 definitely need *something*.

If the dereferencing operations aren't available, one could work
around it by using json_keys/json_values.  Of course, it would be a
really clunky solution, and implementing -> will probably be easy
compared to implementing those functions.

 I also think we need a function called something like json_length()
 which returns the length of a list or the number of keys in an object.

Definitely.

By the way, I'm considering making it so JSON arrays will be treated
like objects when it comes to -> and the json_keys function.  Thus,
json_keys('[1,4,9,16,25]') would yield '{1,2,3,4,5}', and
('[1,4,9,16,25]'::JSON) -> 3 would yield the third item.  This would
obviate the need for an array-only subscript function/operator.

In general, I prefer zero-based counting, but because PostgreSQL's
array indexing is one-based, one-based array keys would be better for
the sake of consistency.  Note that if there was a function like this
in the future:

-- Access a JSON object like you would in JavaScript
json_path('{"squares": [1,4,9,16,25]}', '.squares[2]')

There could be confusion, as JavaScript uses zero-based indexing.



Re: [HACKERS] JSON manipulation functions

2010-05-14 Thread Robert Haas
On Fri, May 14, 2010 at 10:35 PM, Joseph Adams
joeyadams3.14...@gmail.com wrote:
 By the way, I'm considering making it so JSON arrays will be treated
 like objects when it comes to -> and the json_keys function.  Thus,
 json_keys('[1,4,9,16,25]') would yield '{1,2,3,4,5}', and
 ('[1,4,9,16,25]'::JSON) -> 3 would yield the third item.  This would
 obviate the need for an array-only subscript function/operator.

 In general, I prefer zero-based counting, but because PostgreSQL's
 array indexing is one-based, one-based array keys would be better for
 the sake of consistency.  Note that if there was a function like this
 in the future:

 -- Access a JSON object like you would in JavaScript
 json_path('{"squares": [1,4,9,16,25]}', '.squares[2]')

 There could be confusion, as JavaScript uses zero-based indexing.

I think you should take Tom's suggestion and use functional notation
rather than operator notation.  And then I think you should use
0-based counting to match JS.  But I'm game to be outvoted if others
disagree.  Basically, I think you're right: it will be confusing to
have two different notations, and we're certainly going to want a JS
equivalent of XPath at some point.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] List traffic

2010-05-14 Thread Marc G. Fournier


[moved to -chat]

On Fri, 14 May 2010, Kevin Grittner wrote:

I think that's exactly backwards -- we shouldn't have any traffic on 
-general for issues which could reasonably happen in another list. You 
can always configure your email to combine lists into a common folder 
upon receipt.


*Exactly* ... the thought that we should increase the volume on any one of 
the lists seems counter-productive, but, I guess if the @postgresql.org 
mailing lists are the only ones that someone participates in, maybe they 
have enough time to keep up on *all* of the email ... ?





Marc G. Fournier        Hub.Org Hosting Solutions S.A.
scra...@hub.org http://www.hub.org

Yahoo: yscrappy  Skype: hub.org  ICQ: 7615664  MSN: scra...@hub.org



Re: [HACKERS] JSON manipulation functions

2010-05-14 Thread Pavel Stehule
2010/5/14 Tom Lane t...@sss.pgh.pa.us:
 Robert Haas robertmh...@gmail.com writes:
 On Thu, May 13, 2010 at 9:47 PM, Joseph Adams
 joeyadams3.14...@gmail.com wrote:
 [] retrieves a value of a JSON array/object by (one-based) index.  In
 other words, value[n] is equivalent to selecting the nth row of
 json_values(value) (provided value is of type JSON).  Examples:

 SELECT ('[1,2,3,4]'::JSON)[3]
 -- Result is '3'::JSON

 SELECT ('{"a": 1, "b": 2, "c": 3, "d": 4}'::JSON)[3]
 -- Result is '3'::JSON

  I think some kind of array dereference and object dereference mechanism is
 absolutely, positively 100% required.  I don't know whether the
 particular syntax you've proposed here is best or whether we should
 pick another syntax or just use function notation, but I think we
 definitely need *something*.

 Trying to use array notation on something that isn't a SQL array type
 is guaranteed to be a mess.  I strongly recommend that you not attempt
 that.  Just define a function for it.

I agree. Maybe you can implement a cast to the hstore datatype.

Regards

Pavel Stehule


                        regards, tom lane





[HACKERS] predefined macros for various BSD-based systems?

2010-05-14 Thread Tom Lane
The recently added contrib/pg_upgrade code contains this bit:

/*
 * scandir() is originally from BSD 4.3, which had the third argument as
 * non-const. Linux and other C libraries have updated it to use a const.
 * 
http://unix.derkeiler.com/Mailing-Lists/FreeBSD/questions/2005-12/msg00214.html
 *
 * Here we try to guess which libc's need const, and which don't. The net
 * goal here is to try to suppress a compiler warning due to a prototype
 * mismatch of const usage. Ideally we would do this via autoconf, but
 * autoconf doesn't have a suitable builtin test and it seems overkill
 * to add one just to avoid a warning.
 */
#elif defined(freebsd) || defined(bsdi) || defined(__darwin__) || defined(openbsd)
	/* no const */
	return scandir(dirname, namelist, (int (*) (struct dirent *)) selector, NULL);
#else
	/* use const */
	return scandir(dirname, namelist, selector, NULL);

This drew my attention a couple days ago because it was picking the
wrong alternative on OS X, which was because the as-committed coding
was defined(darwin), which is not how that symbol is spelled.  I fixed
that, but I am now thinking that the other three checks are equally
tin-eared.  In particular, I see that buildfarm members ermine (FreeBSD)
and spoonbill (OpenBSD) are reporting warnings here, which proves that
those two platforms don't predefine freebsd or openbsd respectively.
Does anyone know if they define __freebsd__ or __freebsd etc?

I'm not even too sure what bsdi is, but I'm suspicious of that branch
too.  A search of our code finds

contrib/pg_upgrade/file.c: 248: #elif defined(freebsd) || defined(bsdi) || defined(__darwin__) || defined(openbsd)
src/backend/utils/misc/ps_status.c: 67: #elif (defined(BSD) || defined(__bsdi__) || defined(__hurd__)) && !defined(__darwin__)
src/include/port.h: 355: #if defined(bsdi) || defined(netbsd)
src/port/fseeko.c: 20: #if defined(__bsdi__) || defined(__NetBSD__)
src/port/fseeko.c: 24: #ifdef bsdi
src/port/fseeko.c: 47: #ifdef bsdi
src/port/fseeko.c: 55: #ifdef bsdi
src/port/fseeko.c: 66: #ifdef bsdi
src/port/fseeko.c: 76: #ifdef bsdi
src/port/fseeko.c: 87: #ifdef bsdi

which leaves one with not a lot of warm fuzzies that we know how to
spell the symbol for either bsdi or netbsd.  (Oh, and shouldn't
this pg_upgrade check be looking for netbsd too?)

In the darwin case we aren't really making any assumptions,
because we actually define __darwin__ in port/darwin.h.

I suppose that at least some of the *BSD herd really do predefine some
of the symbols being attributed to them here, but I would like to see
something authoritative about which and what.

regards, tom lane



Re: [HACKERS] predefined macros for various BSD-based systems?

2010-05-14 Thread Robert Haas
On Sat, May 15, 2010 at 12:15 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 I'm not even too sure what bsdi is, but I'm suspicious of that branch
 too.  A search of our code finds

It's a commercial distribution of BSD.  I remember it being pretty
nice when I used it 10+ years ago, but it sounds like it's dead now.
Too bad.

http://en.wikipedia.org/wiki/Berkeley_Software_Design

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] predefined macros for various BSD-based systems?

2010-05-14 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Sat, May 15, 2010 at 12:15 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 I'm not even too sure what bsdi is, but I'm suspicious of that branch
 too.  A search of our code finds

 It's a commercial distribution of BSD.  I remember it being pretty
 nice when I used it 10+ years ago, but it sounds like it's dead now.

Um ... so do you remember which symbol they predefined?  It's pretty
lame that we can't even spell it consistently in one source file.
(Or, if BSDI is so dead that no one remembers, shouldn't we rip out
those #if branches?)

regards, tom lane



Re: [HACKERS] predefined macros for various BSD-based systems?

2010-05-14 Thread Robert Haas
On Sat, May 15, 2010 at 12:37 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Sat, May 15, 2010 at 12:15 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 I'm not even too sure what bsdi is, but I'm suspicious of that branch
 too.  A search of our code finds

 It's a commercial distribution of BSD.  I remember it being pretty
 nice when I used it 10+ years ago, but it sounds like it's dead now.

 Um ... so do you remember which symbol they predefined?  It's pretty
 lame that we can't even spell it consistently in one source file.
 (Or, if BSDI is so dead that no one remembers, shouldn't we rip out
 those #if branches?)

No clue.  I agree.  (Yes.)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] List traffic

2010-05-14 Thread Jaime Casanova
On Fri, May 14, 2010 at 8:39 AM, Marc G. Fournier scra...@hub.org wrote:

 And IMHO, that is as much a fault of the 'old timers' on the lists as the
 newbies ... if nobody redirects / loosely enforces the mandates of the
 various lists, newbies aren't going to learn to post to more appropriate
 ones ...


oh! yeah! that's easy... you say: "hey, maybe that list is better for
your question"... and suddenly you're a piece of crap that should never
answer a mail

most people are not prepared to understand the concept of more than
one list per project... what i personally do in the spanish list is to
read and (when i can) answer questions that have few or no
answers first, then those that Alvaro has not commented on yet, and last,
if i have time, the other ones; then i read the subjects of the
threads in the other lists, and if something pops up i read the thread and
mark everything else as read

-- 
Jaime Casanova www.2ndQuadrant.com
Soporte y capacitación de PostgreSQL



Re: [HACKERS] Synchronous replication patch built on SR

2010-05-14 Thread Boszormenyi Zoltan
Robert Haas wrote:
 On Fri, May 14, 2010 at 9:33 AM, Boszormenyi Zoltan z...@cybertec.at wrote:
 If min_sync_replication_clients == 0, then the replication is async.
 If min_sync_replication_clients == max_wal_senders then the
 replication is fully synchronous.
 If 0 < min_sync_replication_clients < max_wal_senders then
 the replication is partially synchronous, i.e. the master can wait
 for only, say, 50% of the clients to report back before it's considered
 synchronous and the relevant transactions get released from the wait.
 

 That's an interesting design and in some ways pretty elegant, but it
 rules out some things that people might easily want to do - for
 example, synchronous replication to the other server in the same data
 center that acts as a backup for the master; and asynchronous
 replication to a reporting server located off-site.
   

No, it doesn't. :-) You didn't take into account the third knob
usable in recovery.conf:
synchronous_slave = on/off
The off-site reporting server can be an asynchronous standby,
while the on-site backup server can be synchronous. The only thing
you need to take into account is that min_sync_replication_clients
shouldn't ever exceed your actual number of synchronous standbys.
The setup these three knobs provide is pretty flexible I think.
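
As a sketch of the setup described above (the knob names are the patch's; the values are only illustrative):

```
# postgresql.conf on the master
max_wal_senders = 2
min_sync_replication_clients = 1    # partially synchronous: wait for 1 of 2

# recovery.conf on the on-site backup standby
synchronous_slave = on

# recovery.conf on the off-site reporting standby
synchronous_slave = off
```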

 One of the things that I think we will probably need/want to change
 eventually is the fact that the master has no real knowledge of who
 the replication slaves are.

The changes I made in my patch partly change that:
the server still doesn't know who the standbys are,
but there's a call that returns the number of connected
_synchronous_ standbys.

   That might be something we want to change
 in order to be able to support more configurability.  Inventing syntax
 out of whole cloth and leaving semantics to the imagination of the
 reader:

 CREATE REPLICATION SLAVE reporting_server (mode asynchronous, xid_feedback on);
 CREATE REPLICATION SLAVE failover_server (mode synchronous,
  xid_feedback off, break_synchrep_timeout 30);

   


-- 
Bible has answers for everything. Proof:
"But let your communication be, Yea, yea; Nay, nay: for whatsoever is more
than these cometh of evil." (Matthew 5:37) - basics of digital technology.
"May your kingdom come" - superficial description of plate tectonics

--
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
http://www.postgresql.at/




Re: [HACKERS] Japanies translation breaks solaris build

2010-05-14 Thread Peter Eisentraut
On Fri, 2010-05-14 at 15:04 +0200, Zdenek Kotala wrote:
 Takahiro Itagaki wrote on Fri, 14 May 2010 at 19:38 +0900:
  Zdenek Kotala zdenek.kot...@sun.com wrote:
  
   http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=comet_moth&dt=2010-05-13%2021:06:01
   The problem is that it contains mix of DOS/Unix end of lines.
  
  I removed two CRs in ja.po.
 
 Thanks. Gothic moth is green again
 
 http://www.pgbuildfarm.org/cgi-bin/show_history.pl?nm=gothic_moth&br=HEAD
 
 The rest of the Solaris machines will recover during the night.

I have added a check to the admin scripts to prevent this in the future.




Re: [HACKERS] predefined macros for various BSD-based systems?

2010-05-14 Thread Peter Eisentraut
On Sat, 2010-05-15 at 00:15 -0400, Tom Lane wrote:
 I suppose that at least some of the *BSD herd really do predefine some
 of the symbols being attributed to them here, but I would like to see
 something authoritative about which and what.

http://www.freebsd.org/doc/en_US.ISO8859-1/books/porters-handbook/porting-versions.html




Re: [HACKERS] predefined macros for various BSD-based systems?

2010-05-14 Thread Peter Eisentraut
On Sat, 2010-05-15 at 00:23 -0400, Robert Haas wrote:
 It's a commercial distribution of BSD.  I remember it being pretty
 nice when I used it 10+ years ago, but it sounds like it's dead now.

BSDI is the company that produced BSD/OS, which was Bruce's main
development environment at some point, which is why it has left
excruciating traces all over the PostgreSQL source.




Re: [HACKERS] Japanies translation breaks solaris build

2010-05-14 Thread Tom Lane
Peter Eisentraut pete...@gmx.net writes:
 On Fri, 2010-05-14 at 15:04 +0200, Zdenek Kotala wrote:
 The problem is that it contains mix of DOS/Unix end of lines.

 I have added a check to the admin scripts to prevent this in the future.

I wonder if we shouldn't be trying to prevent this at the CVS-checkin
level.  We've had problems with people committing DOS-ish newlines
many times before.

regards, tom lane
