Re: [HACKERS] pg_upgrade code questions
On Thu, 2010-05-13 at 17:19 +0200, Magnus Hagander wrote:
> I say remove it. On all accounts. There's a fork of postgres for EDB AS; shouldn't there be a fork of pg_upgrade the same way, if it requires special code? The code in community PostgreSQL certainly shouldn't have any EDB AS code in it.

Agreed.

--
Devrim GÜNDÜZ
PostgreSQL Danışmanı/Consultant, Red Hat Certified Engineer
PostgreSQL RPM Repository: http://yum.pgrpms.org
Community: devrim~PostgreSQL.org, devrim.gunduz~linux.org.tr
http://www.gunduz.org Twitter: http://twitter.com/devrimgunduz
Re: [HACKERS] pg_upgrade code questions
On Fri, May 14, 2010 at 5:34 AM, Bruce Momjian br...@momjian.us wrote:

> Takahiro Itagaki wrote:
>> Bruce Momjian br...@momjian.us wrote:
>>> 2. extern PGDLLIMPORT
>> pg_upgrade has its own definitions of extern PGDLLIMPORT Oid binary_upgrade_next_xxx
>
> The issue here is that you use PGDLLIMPORT where you are importing the variable, not where it is defined. For example, look at 'seq_page_cost'. You can see PGDLLIMPORT used where it is imported with 'extern', but not where it is defined.
>
> Right. Also, we are intentionally not exposing those variables in any backend .h file, because they are not meant for general use. So the extern PGDLLIMPORT isn't going to be in the main backend and has to be in pg_upgrade. This was discussed awhile ago when we put in those variables, I believe.
>
> Yes, this was discussed. I wonder whether some compilers or linkers might hide unexported global variables from postgres.lib as if they were declared with 'static' specifiers. I'm especially worried about Windows and MSVC. So, if Windows testers can see it works, there was nothing to worry about.
>
> Yes, none of the variables pg_upgrade is referencing are 'static', and Magnus tested MSVC and checked MinGW compiles.

Just to be clear, I only verified that it *built*, didn't have time to check if it actually *worked*.

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Japanese translation breaks Solaris build
http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=comet_moth&dt=2010-05-13%2021:06:01

msgfmt -o po/ja.mo po/ja.po
WARNING: the string after closing is ignored at line number 11.
Error, No space after directive at line number 2008.
ERROR: Exiting...
gmake[2]: *** [po/ja.mo] Error 2

The problem is that it contains a mix of DOS/Unix line endings.

Zdenek
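A mixed-endings problem of this kind can be reproduced and fixed from a POSIX shell roughly like this (ja.po here is a tiny stand-in file created for the demonstration, not the real translation):

```shell
# create a file with one CRLF line and one LF line (mixed endings)
printf 'msgid "a"\r\nmsgstr "b"\n' > ja.po

# list files still containing carriage returns
grep -l "$(printf '\r')" ja.po || echo "clean"

# strip the CRs
tr -d '\r' < ja.po > ja.po.tmp && mv ja.po.tmp ja.po
grep -l "$(printf '\r')" ja.po || echo "clean"   # now prints "clean"
```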
Re: [HACKERS] List traffic
Marc G. Fournier wrote:
> On Thu, 13 May 2010, Alvaro Herrera wrote:
>> Excerpts from Yeb Havinga's message of Thu May 13 15:06:53 -0400 2010:
>>> My $0.02 - I like the whole 'don't sort, search' (or how did they call it?) - just let the inbox fill up, google is fast enough. What would be really interesting is to have some extra 'tags/headers' added to the emails (document classification with e.g. self-organizing map/Kohonen), so my local filters could make labels based on that, instead of perhaps badly spelled keywords in subjects or message body.
>
> I missed this when I read it the first time .. all list email does have an X-Mailing-List header added so that you can label based on the list itself ... is that what you mean, or are you thinking of something else entirely?

Something else: if automatic classification of articles were in place, there would be a need for fewer mailing lists, depending on the quality of the classification. IMHO the problem of handling the big volume of the lists is not solved by splitting into more, since it does not decrease the number of posts that are interesting from the subscribers' perspective. It would only mean that posters are more likely to make mistakes, a possible increase in crossposts or 'my question was not answered there so now I try here' on the sender's part, and on the subscriber's side a bigger chance to miss interesting articles. That my current mailing list setup works for me supports this claim; I did not subscribe to fewer lists, but managed to decrease the time spent on 'handling' to an amount that became workable. Though I do not believe algorithmic article classification/ranking provides a 100% foolproof filter, it might help decrease the time spent per article further.
Take a look at how carrot2 clusters results from the query "postgresql prepared transactions site:postgresql.org" -
http://search.carrot2.org/stable/search?source=web&view=tree&skin=fancy-compact&query=postgresql+prepared+transactions+site%3Apostgresql.org&results=100&algorithm=lingo&EToolsDocumentSource.country=ALL&EToolsDocumentSource.language=ENGLISH&EToolsDocumentSource.safeSearch=false

I wonder if a cluster algorithm could tag articles with (multiple) keywords, e.g. 'hackers', 'prepared transaction', 'dba', etc. I could then make filters or rankings such as: hackers AND optimizer -> +10.

regards,
Yeb Havinga
Re: [HACKERS] Row-level Locks & SERIALIZABLE transactions, postgres vs. Oracle
2010/5/14 Greg Stark gsst...@mit.edu:
> On Thu, May 13, 2010 at 10:25 PM, Florian Pflug f...@phlo.org wrote:
>> C1: BEGIN
>> C1: SELECT * FROM t WHERE id = 1 FOR UPDATE
>> C2: BEGIN
>> C2: SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
>> C2: SELECT * FROM t -- Take snapshot before C1 commits
>> C1: COMMIT
>> C2: DELETE FROM t WHERE id = 1
>> C2: COMMIT
>
> Can you give an actual realistic example -- ie, not doing a select for update and then never updating the row -- or with an explanation of what the programmer is attempting to accomplish with such an unusual sequence? The rest of the post talks about FKs but I don't see any here...

The link with FKs is as follows:

* The example does not use a real FK, because the whole purpose is to do the same as FKs while not using the FK machinery.
* The example uses only one table, because that is enough to illustrate the problem (see next items).
* C1 locks a row, supposedly because it wants to create a reference to it in a non-mentioned table, and wants to prevent the row from being deleted under it.
* C2 deletes that row (supposedly after it verified that there are no references to it; it would indeed not be able to see the reference that C1 created/would create), and C1 fails to detect that.
* C2 also fails to detect the problem, because the lock that C1 held is released after C1 commits, and C2 can happily go on deleting the row.
* The end result is that the hypothetical reference is created, although the referent is gone.

Nicolas
Re: [HACKERS] Row-level Locks & SERIALIZABLE transactions, postgres vs. Oracle
On May 14, 2010, at 2:37 , Greg Stark wrote:
> On Thu, May 13, 2010 at 10:25 PM, Florian Pflug f...@phlo.org wrote:
>> C1: BEGIN
>> C1: SELECT * FROM t WHERE id = 1 FOR UPDATE
>> C2: BEGIN
>> C2: SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
>> C2: SELECT * FROM t -- Take snapshot before C1 commits
>> C1: COMMIT
>> C2: DELETE FROM t WHERE id = 1
>> C2: COMMIT
>
> Can you give an actual realistic example -- ie, not doing a select for update and then never updating the row -- or with an explanation of what the programmer is attempting to accomplish with such an unusual sequence? The rest of the post talks about FKs but I don't see any here...

The table t is supposed to represent the parent table of a FK constraint. The SELECT FOR UPDATE is done upon an INSERT into the child table to protect the parent row against concurrent deletion. I used FOR UPDATE instead of FOR SHARE because I also tested this against Oracle, and Oracle does not support FOR SHARE.

Here's a full example of a pair of FK triggers in PL/pgSQL that work correctly in READ COMMITTED mode but fail to enforce the constraint in SERIALIZABLE mode, as the following sequence of commands shows. With my proposal, the DELETE would again raise a serialization error and hence keep the constraint satisfied.
C1: BEGIN
C1: INSERT INTO child (parent_id) VALUES (1) -- Locks the parent row FOR UPDATE
C2: BEGIN
C2: SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
C2: SELECT TRUE -- Take snapshot *before* C1 commits
C1: COMMIT
C2: DELETE FROM parent WHERE parent_id = 1 -- Succeeds
C2: COMMIT

--
CREATE TABLE parent (parent_id SERIAL NOT NULL PRIMARY KEY);

CREATE TABLE child (
    child_id SERIAL NOT NULL PRIMARY KEY,
    parent_id INTEGER NOT NULL
);

CREATE FUNCTION ri_parent() RETURNS TRIGGER AS $body$
BEGIN
    PERFORM TRUE FROM child WHERE parent_id = OLD.parent_id;
    IF FOUND THEN
        RAISE SQLSTATE '23503' USING MESSAGE =
            'Parent ' || OLD.parent_id || ' still referenced during ' || TG_OP;
    END IF;
    RETURN NULL;
END;
$body$ LANGUAGE PLPGSQL VOLATILE;

CREATE TRIGGER ri_parent AFTER UPDATE OR DELETE ON parent
    FOR EACH ROW EXECUTE PROCEDURE ri_parent();

CREATE FUNCTION ri_child() RETURNS TRIGGER AS $body$
BEGIN
    PERFORM TRUE FROM parent WHERE parent_id = NEW.parent_id FOR UPDATE OF parent;
    IF NOT FOUND THEN
        RAISE SQLSTATE '23503' USING MESSAGE =
            'Parent ' || NEW.parent_id || ' does not exist during ' || TG_OP;
    END IF;
    RETURN NULL;
END;
$body$ LANGUAGE PLPGSQL VOLATILE;

CREATE TRIGGER ri_child AFTER INSERT OR UPDATE ON child
    FOR EACH ROW EXECUTE PROCEDURE ri_child();
--

best regards,
Florian Pflug
Re: [HACKERS] nvarchar notation accepted?
On May 14, 2010, at 5:56 , Jaime Casanova wrote:
> On Thu, May 13, 2010 at 10:13 PM, Takahiro Itagaki itagaki.takah...@oss.ntt.co.jp wrote:
>> Jaime Casanova ja...@2ndquadrant.com wrote:
>>> i migrate a ms sql server database to postgres and was trying some queries from the application to find if everything works right... when i was looking to those queries i found some that has a notation for nvarchar (e.g.: campo = N'sometext')
>>
>> Do you have documentation for the N'...' literal in SQLServer? Does it mean a unicode literal? What is the difference from the U& literal? http://developer.postgresql.org/pgdocs/postgres/sql-syntax-lexical.html
>
> nope, the only thing i found is about NVARCHAR: http://msdn.microsoft.com/en-us/library/ms186939.aspx but it has no examples about the N'' notation, although you can find examples of its use here: http://msdn.microsoft.com/en-us/library/dd776381.aspx#BasicSyntax

Without using the N-prefixed versions of CHAR, VARCHAR and string literals, MS SQL Server refuses to process characters other than those in the database's character set. It will replace all those characters with '?'. Note that this is not an encoding issue - it will even do so with protocol versions (everything >= 7.0, I think) that use UTF-16 on-wire, where those characters can be transmitted just fine.

best regards,
Florian Pflug
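Florian's point -- silent replacement with '?' rather than an encoding error -- can be illustrated with a quick analogy in Python (an analogy only; this does not talk to SQL Server, it just encodes text to a narrow single-byte charset the way a non-N column would store it):

```python
# Characters outside the "database character set" are silently replaced
# with '?', mirroring what SQL Server does to non-N'...' literals and
# CHAR/VARCHAR data; cp1252 plays the role of the narrow charset here.
s = "Grüße 漢字"
narrow = s.encode("cp1252", errors="replace")
print(narrow.decode("cp1252"))   # Grüße ??  (the CJK characters are lost)
```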
Re: [HACKERS] Japanese translation breaks Solaris build
Zdenek Kotala zdenek.kot...@sun.com wrote:
> http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=comet_moth&dt=2010-05-13%2021:06:01
> The problem is that it contains a mix of DOS/Unix line endings.

I removed two CRs in ja.po.

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center
Re: [HACKERS] Row-level Locks & SERIALIZABLE transactions, postgres vs. Oracle
[slight rearrangement]

Florian Pflug wrote:
> I'm very excited about the work you're doing

Always nice to hear. :-)

> I view my proposal as pretty orthogonal to that work. My proposal allows for simple FK-like constraints to be implemented at user level that are correct for all isolation levels.

OK, I can see the attraction in that.

> True serializable transactions are much more powerful than what I proposed, but at a much higher price too, due to the necessity of SIREAD locks.

I think that SIREAD locks will generally be cheaper than SELECT FOR UPDATE, since the former don't require any disk I/O and the latter do. I only have one benchmark so far (more on the way), but it attempts to isolate the cost of acquiring the SIREAD locks by using a read-only load against a fully cached database. Benchmarks so far show the new version of the SERIALIZABLE level as supporting 1.8% fewer TPS than REPEATABLE READ (the existing snapshot isolation level) in that environment. That will probably disappear into the noise for any load involving disk I/O. Now *rollbacks*, particularly those due to false positives, might become a more serious issue in some pessimal loads, but I'm still working on developing meaningful benchmarks for that.

I guess what I'm suggesting is that unless you have a very small database with a very large number of connections in a high-contention workload, or you can't require the SERIALIZABLE transaction isolation level, SSI might actually perform better than what you're proposing. Of course, that's all conjecture until there are benchmarks; but I'd be very interested in getting any and all alternative solutions like this worked into a benchmark -- where I can pull out the FOR UPDATE and FOR SHARE clauses, any redundant updates or denormalizations added just for concurrency issues, and all explicit locking -- and compare that under SERIALIZABLE to the original performance.
-Kevin
Re: [HACKERS] Synchronous replication patch built on SR
2010/4/29 Boszormenyi Zoltan z...@cybertec.at:
> attached is a patch that does $SUBJECT, we are submitting it for 9.1. I have updated it to today's CVS after the wal_level GUC went in.

I'm planning to create the synchronous replication patch for 9.0, too. My design is outlined in the wiki. Let's work together on the design of it.
http://wiki.postgresql.org/wiki/Streaming_Replication#Synchronization_capability

The log-shipping replication has some synchronization levels, as follows. Which are you going to work on?

The transaction commit on the master
#1 doesn't wait for replication (already supported in 9.0)
#2 waits for WAL to be received by the standby
#3 waits for WAL to be received and flushed by the standby
#4 waits for WAL to be received, flushed and replayed by the standby
...etc?

I'm planning to add #2 and #3 in 9.1. #4 is useful, but is outside the scope of my development for at least 9.1. In #4, a read-only query can easily block recovery via a lock conflict and make the transaction commit on the master get stuck. This problem is difficult to address within 9.1, I think. But the design and implementation of #2 and #3 need to be easily extensible to #4.

> How does it work? First, the walreceiver and the walsender are now able to communicate in a duplex way on the same connection, so while COPY OUT is in progress from the primary server, the standby server is able to issue PQputCopyData() to pass the transaction IDs that were seen with XLOG_XACT_COMMIT or XLOG_XACT_PREPARE signatures. I did this by adding a new protocol message type, with letter 'x', that's only acknowledged by the walsender process. The regular backend was intentionally unchanged so an SQL client gets a protocol error. A new libpq call called PQsetDuplexCopy() sends this new message before sending START_REPLICATION. The primary makes a note of it in the walsender process' entry.
> I had to move the TransactionIdLatest(xid, nchildren, children) call that computes latestXid earlier in RecordTransactionCommit(), so it's in the critical section now, just before the XLogInsert(RM_XACT_ID, XLOG_XACT_COMMIT, rdata) call. Otherwise, there was a race condition between the primary and the standby server, where the standby server might have seen the XLOG_XACT_COMMIT record for some XIDs before the transaction in the primary server marked itself waiting for this XID, resulting in stuck transactions.

You seem to have chosen #4 as the synchronization level. Right?

In your design, the transaction commit on the master waits for its XID to be read from the XLOG_XACT_COMMIT record and replied by the standby. Right? This design seems not to be extensible to #2 and #3, since the walreceiver cannot read the XID from the XLOG_XACT_COMMIT record. How about using the LSN instead of the XID? That is, the transaction commit waits until the standby has reached its LSN. The LSN is easier to use for the walreceiver and startup process, I think.

What if the synchronous standby starts up from a very old backup? Does the transaction on the master need to wait until a large amount of outstanding WAL has been applied? I think that synchronous replication should start with *asynchronous* replication, and should switch to the sync level after the gap between the servers has become small enough. What's your opinion?

> I have added 3 new options, two GUCs in postgresql.conf and one setting in recovery.conf. These options are:
>
> 1. min_sync_replication_clients = N, where N is the number of reports for a given transaction before it's released as committed synchronously. 0 means completely asynchronous; the value is capped by max_wal_senders. Anything between 0 and max_wal_senders means different levels of partially synchronous replication.
>
> 2.
> strict_sync_replication = boolean, where the expected number of synchronous reports from standby servers is further limited to the actual number of connected synchronous standby servers if the value of this GUC is false. This means that if no standby servers are connected yet, then the replication is asynchronous and transactions are allowed to finish without waiting for synchronous reports. If the value of this GUC is true, then transactions wait until enough synchronous standbys connect and report back.

Why are these options necessary? Can these options cover more than three synchronization levels?

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
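Taken together, the knobs described in the patch would presumably look something like this in postgresql.conf (the names are the patch author's proposals under discussion, not released GUCs):

```
max_wal_senders = 3                # upper bound for the setting below
min_sync_replication_clients = 1   # 0 = fully asynchronous; capped by max_wal_senders
strict_sync_replication = off      # off: run async while no sync standby is connected
```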
Re: [HACKERS] How to know killed by pg_terminate_backend
On Thu, May 13, 2010 at 8:20 PM, Tatsuo Ishii is...@postgresql.org wrote:

> Maybe we could make PostgreSQL a little bit smarter so that it returns a different code than 57P01 when killed by pg_terminate_backend().

Seems reasonable. Does the victim backend currently know why it has been killed?

> I don't think so. One idea is that the postmaster sets a flag in the shared memory area indicating it received SIGTERM before forwarding the signal to backends. The backend checks the flag, and if it's not set, it knows that the signal has been sent by pg_terminate_backend(), not the postmaster.

Or it could also be sent by some other user process, like the user running kill from the shell.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
Re: [HACKERS] Row-level Locks & SERIALIZABLE transactions, postgres vs. Oracle
On May 14, 2010, at 12:56 , Kevin Grittner wrote:
>> True serializable transactions are much more powerful than what I proposed, but at a much higher price too, due to the necessity of SIREAD locks.
>
> I think that SIREAD locks will generally be cheaper than SELECT FOR UPDATE, since the former don't require any disk I/O and the latter do. I only have one benchmark so far (more on the way), but it attempts to isolate the cost of acquiring the SIREAD locks by using a read-only load against a fully cached database. Benchmarks so far show the new version of the SERIALIZABLE level as supporting 1.8% fewer TPS than REPEATABLE READ (the existing snapshot isolation level) in that environment. That will probably disappear into the noise for any load involving disk I/O.

I can see how a single SIREAD lock can potentially be cheaper than a FOR SHARE or FOR UPDATE lock. But the number of SIREAD locks would exceed the number of FOR SHARE / FOR UPDATE locks by a few orders of magnitude, I'd think - at least if you ran every transaction under true serializable isolation. I don't quite understand how SIREAD locks work if they don't involve any disk IO, since shared memory isn't resizable. But I guess I'll find out once you post the patch ;-)

> I guess what I'm suggesting is that unless you have a very small database with a very large number of connections in a high-contention workload, or you can't require the SERIALIZABLE transaction isolation level, SSI might actually perform better than what you're proposing.

That is entirely possible. However, unless your patch completely removes support for snapshot isolation (what is currently called SERIALIZABLE), my proposal still eliminates the situation where user-level constraints are correct in READ COMMITTED and (true) SERIALIZABLE isolation but not in snapshot isolation.

Btw, the only users of FOR SHARE locks inside postgres proper are the RI triggers, and those do that special crosscheck when called within SERIALIZABLE transactions.
I do take this as evidence that the current behavior might not be all that useful with serializable transactions...

best regards,
Florian Pflug
Re: [HACKERS] Japanese translation breaks Solaris build
Takahiro Itagaki wrote on Fri, 14 May 2010 at 19:38 +0900:
> Zdenek Kotala zdenek.kot...@sun.com wrote:
>> http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=comet_moth&dt=2010-05-13%2021:06:01
>> The problem is that it contains a mix of DOS/Unix line endings.
>
> I removed two CRs in ja.po.

Thanks. Gothic moth is green again:
http://www.pgbuildfarm.org/cgi-bin/show_history.pl?nm=gothic_moth&br=HEAD

The rest of the Solaris machines will recover during the night.

Zdenek
Re: [HACKERS] Synchronous replication patch built on SR
Fujii Masao wrote:
> 2010/4/29 Boszormenyi Zoltan z...@cybertec.at:
>> attached is a patch that does $SUBJECT, we are submitting it for 9.1. I have updated it to today's CVS after the wal_level GUC went in.
>
> I'm planning to create the synchronous replication patch for 9.0, too. My design is outlined in the wiki. Let's work together on the design of it.
> http://wiki.postgresql.org/wiki/Streaming_Replication#Synchronization_capability
>
> The log-shipping replication has some synchronization levels, as follows. Which are you going to work on?
>
> The transaction commit on the master
> #1 doesn't wait for replication (already supported in 9.0)
> #2 waits for WAL to be received by the standby
> #3 waits for WAL to be received and flushed by the standby
> #4 waits for WAL to be received, flushed and replayed by the standby
> ...etc?
>
> I'm planning to add #2 and #3 in 9.1. #4 is useful, but is outside the scope of my development for at least 9.1. In #4, a read-only query can easily block recovery via a lock conflict and make the transaction commit on the master get stuck. This problem is difficult to address within 9.1, I think. But the design and implementation of #2 and #3 need to be easily extensible to #4.
>
>> How does it work? First, the walreceiver and the walsender are now able to communicate in a duplex way on the same connection, so while COPY OUT is in progress from the primary server, the standby server is able to issue PQputCopyData() to pass the transaction IDs that were seen with XLOG_XACT_COMMIT or XLOG_XACT_PREPARE signatures. I did this by adding a new protocol message type, with letter 'x', that's only acknowledged by the walsender process. The regular backend was intentionally unchanged so an SQL client gets a protocol error. A new libpq call called PQsetDuplexCopy() sends this new message before sending START_REPLICATION. The primary makes a note of it in the walsender process' entry.
>> I had to move the TransactionIdLatest(xid, nchildren, children) call that computes latestXid earlier in RecordTransactionCommit(), so it's in the critical section now, just before the XLogInsert(RM_XACT_ID, XLOG_XACT_COMMIT, rdata) call. Otherwise, there was a race condition between the primary and the standby server, where the standby server might have seen the XLOG_XACT_COMMIT record for some XIDs before the transaction in the primary server marked itself waiting for this XID, resulting in stuck transactions.
>
> You seem to have chosen #4 as the synchronization level. Right?

Yes.

> In your design, the transaction commit on the master waits for its XID to be read from the XLOG_XACT_COMMIT record and replied by the standby. Right? This design seems not to be extensible to #2 and #3, since the walreceiver cannot read the XID from the XLOG_XACT_COMMIT record.

Yes, this was my problem, too. I would have had to implement a custom interpreter in the walreceiver to process the WAL records and extract the XIDs. But at least the supporting details, i.e. not opening another connection, instead being able to do duplex COPY operations in a server-acknowledged way, are acceptable, no? :-)

> How about using the LSN instead of the XID? That is, the transaction commit waits until the standby has reached its LSN. The LSN is easier to use for the walreceiver and startup process, I think.

Indeed, using the LSN seems to be more appropriate for the walreceiver, but how would you extract the information that a certain LSN means a COMMITted transaction? Or we could release a locked transaction in case the master receives an LSN greater than or equal to the transaction's own LSN? Sending back all the LSNs in case of long transactions would increase the network traffic compared to sending back only the XIDs, but the amount is not clear to me. What I am more worried about is the contention on the ProcArrayLock. XIDs are rarer than LSNs, no?

> What if the synchronous standby starts up from a very old backup?
> The transaction on the master needs to wait until a large amount of outstanding WAL has been applied? I think that synchronous replication should start with *asynchronous* replication, and should switch to the sync level after the gap between the servers has become small enough. What's your opinion?

It's certainly one option, which I think is partly addressed by the strict_sync_replication knob below. If strict_sync_replication = off, then the master doesn't make its transactions wait for the synchronous reports, and the client(s) can work through their WALs. IIRC, the walreceiver connects to the master only very late in the recovery process, no? It would be nicer if it could be made automatic. I simply thought that there may be situations where the strict behaviour may be desired. I was thinking about the transactions executed on the master between the standby startup and the walreceiver connection. Someone may want to ensure the synchronous behaviour for every xact, no
Re: [HACKERS] Row-level Locks & SERIALIZABLE transactions, postgres vs. Oracle
Florian Pflug f...@phlo.org wrote:
> On May 14, 2010, at 12:56 , Kevin Grittner wrote:
>> I think that SIREAD locks will generally be cheaper than SELECT FOR UPDATE, since the former don't require any disk I/O and the latter do.
>
> I can see how a single SIREAD lock can potentially be cheaper than a FOR SHARE or FOR UPDATE lock. But the number of SIREAD locks would exceed the number of FOR SHARE / FOR UPDATE locks by a few orders of magnitude, I'd think - at least if you ran every transaction under true serializable isolation. I don't quite understand how SIREAD locks work if they don't involve any disk IO, since shared memory isn't resizable.

We use a well-worn technique used by many (most?) database products -- granularity promotion. This is one of the things which could cause enough false positives under some loads to cause your technique to perform better than SSI for those loads.

> unless your patch completely removes support for snapshot isolation (what is currently called SERIALIZABLE)

Both SERIALIZABLE and REPEATABLE READ currently map to snapshot isolation. We're leaving REPEATABLE READ alone.

> my proposal still eliminates the situation where user-level constraints are correct in READ COMMITTED and (true) SERIALIZABLE isolation but not in snapshot isolation.

Agreed. If someone wants to enforce user-level constraints using SSI, they will somehow need to ensure that less strict isolation levels are never used to modify data. Your approach lifts that burden.

By the way, if you can make this behave in a similar way to Oracle, especially if the syntax is compatible, I'm sure it will help promote PostgreSQL adoption. At PostgreSQL Conference U.S. East 2010, I talked briefly with a couple of guys from an Oracle shop who were looking at converting to PostgreSQL, and were very concerned about not having what you describe. The techniques required to ensure integrity in PostgreSQL were not, to put it mildly, appealing to them.
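The granularity promotion Kevin mentions can be sketched in a few lines of Python (an invented toy structure, not the actual SSI patch): once the row-level SIREAD locks on a page pass a threshold, they collapse into a single page-level lock, which bounds shared-memory use at the cost of extra false-positive conflicts.

```python
PROMOTE_THRESHOLD = 3

locks = {}  # page -> set of locked rows, or "PAGE" for a whole-page lock

def siread_lock(page, row):
    held = locks.setdefault(page, set())
    if held == "PAGE":
        return                    # already covered by the coarser lock
    held.add(row)
    if len(held) >= PROMOTE_THRESHOLD:
        locks[page] = "PAGE"      # promote: fewer entries, more conflicts

for r in range(5):
    siread_lock(page=7, row=r)
print(locks)   # {7: 'PAGE'}
```

After promotion, any access to the page conflicts, even to rows never actually read -- which is exactly the false-positive source Kevin alludes to.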
I suspect that they would be satisfied with *either* SSI or the change you describe.

-Kevin
Re: [HACKERS] List traffic
> ... is there a reason why, other than the fact that we don't do it now, that we can't just put in a restriction against cross-posting altogether?

Because that would be shooting ourselves in the foot. Cross-posting is often desirable. If we had a clearer distinction of list topics, I might support such a move, but we don't, so I can't.

> ... and, for those that have been here awhile, who should know better, why isn't there any self-management of this sort of stuff in the first place?

What would you have us do?

--
Greg Sabino Mullane g...@turnstep.com
End Point Corporation http://www.endpoint.com/
PGP Key: 0x14964AC8 201005141005
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
Re: [HACKERS] List traffic
There is no reason why advocacy can't happen on general. Theoretically www could be on hackers (although I do see the point of a separate list). I don't feel as strongly about -advocacy being removed, but we certainly can fold in -sql and -admin. Would anyone argue against rolling those two (sql and admin) into -general as a first step?

--
Greg Sabino Mullane g...@turnstep.com
End Point Corporation http://www.endpoint.com/
PGP Key: 0x14964AC8 201005141009
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
Re: [HACKERS] Row-level Locks & SERIALIZABLE transactions, postgres vs. Oracle
On May 14, 2010, at 15:54 , Kevin Grittner wrote:
> Florian Pflug f...@phlo.org wrote:
>> unless your patch completely removes support for snapshot isolation (what is currently called SERIALIZABLE)
>
> Both SERIALIZABLE and REPEATABLE READ currently map to snapshot isolation. We're leaving REPEATABLE READ alone.

Ah, yeah, that makes a lot of sense. I kinda had forgotten about REPEATABLE READ...

>> my proposal still eliminates the situation where user-level constraints are correct in READ COMMITTED and (true) SERIALIZABLE isolation but not in snapshot isolation.
>
> Agreed. If someone wants to enforce user-level constraints using SSI, they will somehow need to ensure that less strict isolation levels are never used to modify data. Your approach lifts that burden.
>
> By the way, if you can make this behave in a similar way to Oracle, especially if the syntax is compatible, I'm sure it will help promote PostgreSQL adoption. At PostgreSQL Conference U.S. East 2010, I talked briefly with a couple of guys from an Oracle shop who were looking at converting to PostgreSQL, and were very concerned about not having what you describe. The techniques required to ensure integrity in PostgreSQL were not, to put it mildly, appealing to them. I suspect that they would be satisfied with *either* SSI or the change you describe.

My proposal would make SELECT ... FOR UPDATE behave like Oracle does with regard to serialization conflicts. SELECT ... FOR SHARE doesn't seem to exist in Oracle at all - at least I couldn't find a reference to it in the docs. The syntax isn't 100% compatible, because Oracle seems to expect a list of columns after the FOR UPDATE clause, while postgres expects a list of tables. I must admit that I wasn't able to find an explicit reference to Oracle's behavior in their docs, so I had to resort to experiments.
They do have examples showing how to do FK-like constraints with triggers, and those don't contain any warning whatsoever about problems in SERIALIZABLE mode, though. But still, if there is word on this from Oracle somewhere, I'd love to hear about it. best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] List traffic
Greg Sabino Mullane g...@turnstep.com wrote: Would anyone argue against rolling those two (sql and admin) into -general as a first step? At the risk of repeating myself, I won't be able to keep up with the traffic of the combined list; so rather than read 100% of the messages from a smaller set, I'll need to pick and choose based on subject line or some such. I get the impression that other people, who read different subsets of the lists, will be forced to a similar change. That may result in either some posts slipping through the cracks or in increasing the burden of responding to the posts for those brave few who wade through them all. Personally, I'm not convinced that merging the current lists will solve more problems than it will create. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Row-level Locks SERIALIZABLE transactions, postgres vs. Oracle
Florian Pflug f...@phlo.org wrote: I must admit that I wasn't able to find an explicit reference to Oracle's behavior in their docs, so I had to resort to experiments. They do have examples showing how to do FK-like constraints with triggers, and those don't contain any warning whatsoever about problems in SERIALIZABLE mode, though. But still, if there is word on this from Oracle somewhere, I'd love to hear about it. I suspect that in trying to emulate Oracle on this, you may run into an issue which posed challenges for the SSI implementation which didn't come up in the Cahill prototype implementations: Oracle, and all other MVCC databases I've read about outside of PostgreSQL, use an update in place with a rollback log technique. Access to any version of a given row or index entry goes through a single location, with possible backtracking through the log after that, which simplifies management of certain concurrency issues. Do they perhaps use an in-RAM lock table, pointing to the base location of the row for these SELECT FOR UPDATE locks? (Just guessing; I've never used Oracle, myself.) -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] List traffic
On Fri, 14 May 2010, Greg Sabino Mullane wrote: ... is there a reason why, other than the fact that we don't do it now, that we can't just put in a restriction against cross posting altogether? Because that would be shooting ourselves in the foot. Cross-posting is often desirable. If we had a clearer distinction of list topics, I might support such a move, but we don't, so I can't. But, it's the cross-posting, IMHO, that reduces the distinction ... ... and, for those that have been here awhile, who should know better, why isn't there any self-management of this sort of stuff in the first place? What would you have us do? Redirect users ... if a user sends a query performance related question to -general, respond back with -general as the CC, To as -performance and a Reply-To header of -performance ... that way those on -general know that it's been redirected, but *hopefully* users replying will honor the -performance redirect ... Marc G. Fournier  Hub.Org Hosting Solutions S.A. scra...@hub.org  http://www.hub.org Yahoo: yscrappy  Skype: hub.org  ICQ: 7615664  MSN: scra...@hub.org -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] List traffic
On Fri, 14 May 2010, Greg Sabino Mullane wrote: There is no reason why advocacy can't happen on general. Theoretically www could be on hackers (although I do see the point of a separate list). I don't feel as strongly about -advocacy being removed, but we certainly can fold in -sql and -admin. Would anyone argue against rolling those two (sql and admin) into -general as a first step? Question ... we have, right now:

 -sql : how to write a query
 -performance : how to improve performance of my queries
 -admin : how to admin the server
 -novice : I'm a new user
 -odbc : how to use ...
 -php : php related interface questions
 -interfaces : more general than -odbc

why not close down -general so that ppl *have* to better pick where to post their question ... Marc G. Fournier  Hub.Org Hosting Solutions S.A. scra...@hub.org  http://www.hub.org Yahoo: yscrappy  Skype: hub.org  ICQ: 7615664  MSN: scra...@hub.org -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] List traffic
On Fri, 14 May 2010, Kevin Grittner wrote: Greg Sabino Mullane g...@turnstep.com wrote: Would anyone argue against rolling those two (sql and admin) into -general as a first step? At the risk of repeating myself, I won't be able to keep up with the traffic of the combined list; so rather than read 100% of the messages from a smaller set, I'll need to pick and choose based on subject line or some such. I get the impression that other people, who read different subsets of the lists, will be forced to a similar change. That may result in either some posts slipping through the cracks or in increasing the burden of responding to the posts for those brave few who wade through them all. That's what I find with the freebsd-questions list ... there is so much noise in there that I tend to avoid posting to it for fear that my email will just get skip'd over ... I am definitely against *merging* lists ... getting rid of the 'meta list' makes more sense, so as to force ppl to *use* the smaller lists, than to merge smaller lists and *increase* the noise on one of them ... Marc G. Fournier  Hub.Org Hosting Solutions S.A. scra...@hub.org  http://www.hub.org Yahoo: yscrappy  Skype: hub.org  ICQ: 7615664  MSN: scra...@hub.org -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] List traffic
Marc G. Fournier scra...@hub.org wrote: -sql : how to write a query -performance : how to improve performance of my queries -admin : how to admin the server -novice : I'm a new user -odbc : how to use ... -php : php related interface questions -interfaces : more general then -odbc why not close down -general so that ppl *have* to use better pick where to post their question ... That's a change I could support. I even think the descriptions are pretty close to what should show. In trying to think what might be missing, I wonder whether we could decrease inappropriate traffic on the -bugs list if we had a feature request list, for end users not prepared to discuss things at the level appropriate for -hackers, but who think that PostgreSQL should support some feature they don't see. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] List traffic
Marc G. Fournier scra...@hub.org writes: why not close down -general so that ppl *have* to use better pick where to post their question ... I can't imagine that there's not going to need to be a catchall list for problems that don't fit into any of the subcategories. More generally, we already have most of the lists that you suggest, and we already know that people frequently don't find the most appropriate list for postings. I don't think getting rid of -general would help that in the least. The way to cut down on misposted traffic is to make the set of categories smaller and simpler, not to redouble our efforts to persuade people to use the same or even more categories. BTW, as far as no crossposting goes: usually when I find myself doing that, it's to redirect a thread that started on -bugs or -general into -hackers. I don't see the need for that going away. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] HS/SR Assert server crash
Bruce Momjian wrote: I was able to easily crash the standby server today just by starting it and connecting to it via psql. The master was idle. The failure was:

 LOG: streaming replication successfully connected to primary
 TRAP: FailedAssertion(!(((xmax) >= ((TransactionId) 3))), File: procarray.c, Line: 1211)
 LOG: server process (PID 12761) was terminated by signal 6: Abort trap
 LOG: terminating any other active server processes

My master postgresql.conf was:

 wal_level = hot_standby   # minimal, archive, or hot_standby
 archive_mode = on         # allows archiving to be done
 archive_command = 'cp -i %p /u/pg/archive/%f /dev/null '   # command to use to archive a logfile segment
 max_wal_senders = 1       # max number of walsender processes

My slave postgresql.conf was:

 port = 5433               # (change requires restart)
 wal_level = hot_standby   # minimal, archive, or hot_standby
 archive_mode = off        # allows archiving to be done
 archive_command = 'cp -i %p /u/pg/archive/%f /dev/null '   # command to use to archive a logfile segment
 hot_standby = on          # allows queries during recovery
 max_wal_senders = 1       # max number of walsender processes

and my slave recovery.conf was:

 restore_command = 'cp /u/pg/archive/%f %p'      # e.g. 'cp /mnt/server/archivedir/%f %p'
 standby_mode = 'on'
 primary_conninfo = 'host=localhost port=5432'   # e.g. 'host=localhost port=5432'

Let me know what additional information I can supply. I saw Simon's commit fixing this bug. Another good reason we didn't bundle 9.0 beta2 yesterday. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] JSON manipulation functions
Joseph Adams wrote: == array/object conversion == The json_object function converts a tuple to a JSON object. If there are duplicate column names, there will be duplicate keys in the resulting JSON object. json_object([content [AS name] [, ...]]) returns json Likewise, the json_array function converts a tuple to a JSON array. Column names are ignored. json_array([content [AS name] [, ...]]) returns json Do you see any problems with the fact that JSON arrays can use mixed data types, e.g.: [1, 2, "hi", false] -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] List traffic
On Fri, 14 May 2010, Yeb Havinga wrote: Marc G. Fournier wrote: On Thu, 13 May 2010, Alvaro Herrera wrote: Excerpts from Yeb Havinga's message of jue may 13 15:06:53 -0400 2010: My $0.02 - I like the whole 'don't sort, search' (or how did they call it?) just let the inbox fill up, google is fast enough. What would be really interesting is to have some extra 'tags/headers' added to the emails (document classification with e.g. self organizing map/kohonen), so my local filters could make labels based on that, instead of perhaps badly spelled keywords in subjects or message body. I missed this when I read it the first time .. all list email does have an X-Mailing-List header added so that you can label based on the list itself ... is that what you mean, or are you thinking of something else entirely? Something else: if automatic classification of articles was in place, there would be need of fewer mailing lists, depending on the quality of the classification. You've mentioned this several times, but what *is* autoclassification of articles? or is this something you do on the gmail side of things? Marc G. Fournier  Hub.Org Hosting Solutions S.A. scra...@hub.org  http://www.hub.org Yahoo: yscrappy  Skype: hub.org  ICQ: 7615664  MSN: scra...@hub.org -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] [PATCH] Add SIGCHLD catch to psql
Greetings, Toying around with FETCH_COUNT today, I discovered that it didn't do the #1 thing I really wanted to use it for- query large tables without having to worry about LIMIT to see the first couple hundred records. The reason is simple- psql ignores $PAGER exiting, which means that it will happily continue pulling down the entire large table long after you've stopped caring, which means you still have to wait forever. The attached, admittedly quick hack, fixes this by having psql catch SIGCHLD's using handle_sigint. I've tested this and it doesn't appear to obviously break other cases where we have children (\!, for example), since we're not going to be running a database query when we're doing those, and if we are, and the child dies, we probably want to *stop* anyway, similar to the $PAGER issue. Another approach that I considered was fixing various things to deal cleanly with writes failing to $PAGER (I presume the writes *were* failing, since less was in a defunct state, but I didn't actually test). This solution was simpler, faster to code and check, and a lot less invasive (or so it seemed to me at the time). Anyway, this makes FETCH_COUNT a lot more useful, and, in my view, the current behaviour of completely ignoring $PAGER exiting is a bug. Thanks, Stephen

diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c
index f605c97..dcab436 100644
*** a/src/bin/psql/common.c
--- b/src/bin/psql/common.c
*** NoticeProcessor(void *arg, const char *m
*** 188,194 ****
  /*
   * Code to support query cancellation
   *
!  * Before we start a query, we enable the SIGINT signal catcher to send a
   * cancel request to the backend. Note that sending the cancel directly from
   * the signal handler is safe because PQcancel() is written to make it
   * so. We use write() to report to stderr because it's better to use simple
--- 188,194 ----
  /*
   * Code to support query cancellation
   *
!  * Before we start a query, we enable SIGINT and SIGCHLD signals to send a
   * cancel request to the backend. Note that sending the cancel directly from
   * the signal handler is safe because PQcancel() is written to make it
   * so. We use write() to report to stderr because it's better to use simple
*** NoticeProcessor(void *arg, const char *m
*** 208,213 ****
--- 208,218 ----
   * catcher to longjmp through sigint_interrupt_jmp. We assume readline and
   * fgets are coded to handle possible interruption. (XXX currently this does
   * not work on win32, so control-C is less useful there)
+  *
+  * SIGCHLD is also caught and handled the same to deal with cases where a
+  * user's PAGER or other child process exits. Otherwise, we would just keep
+  * sending data to a dead/zombied process. This won't typically matter except
+  * when FETCH_COUNT is used.
   */
  volatile bool sigint_interrupt_enabled = false;
*** void
*** 259,264 ****
--- 264,272 ----
  setup_cancel_handler(void)
  {
  	pqsignal(SIGINT, handle_sigint);
+ 
+ 	/* Also send SIGCHLD signals, to catch cases where the user exits PAGER */
+ 	pqsignal(SIGCHLD, handle_sigint);
  }
  #else /* WIN32 */
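The failure mode Stephen describes (writes to a defunct $PAGER) can be reproduced outside psql. Below is a small illustrative sketch in Python, not the psql C code: 'head -n 1' stands in for a pager the user quits after one screenful, and the BrokenPipeError shows that the writes do start failing once the pager is gone - which psql was silently ignoring.

```python
import subprocess

# 'head -n 1' quits after the first line, like a user quitting $PAGER.
pager = subprocess.Popen(["head", "-n", "1"], stdin=subprocess.PIPE,
                         stdout=subprocess.DEVNULL)

pager.stdin.write(b"row 1\n")
pager.stdin.flush()
pager.wait()  # the "pager" has now exited

# Once the reader is gone, further writes fail with EPIPE. A program that
# ignores this (as psql did) keeps fetching and writing rows pointlessly.
try:
    pager.stdin.write(b"row 2\n")
    pager.stdin.flush()
    pager_gone = False
except BrokenPipeError:
    pager_gone = True

try:
    pager.stdin.close()
except BrokenPipeError:
    pass

print(pager_gone)
```

Catching SIGCHLD, as the patch does, lets psql notice the dead child even between writes, instead of waiting for the next EPIPE.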
Re: [HACKERS] JSON manipulation functions
2010/5/14 Bruce Momjian br...@momjian.us: Joseph Adams wrote: == array/object conversion == The json_object function converts a tuple to a JSON object. If there are duplicate column names, there will be duplicate keys in the resulting JSON object. json_object([content [AS name] [, ...]]) returns json Likewise, the json_array function converts a tuple to a JSON array. Column names are ignored. json_array([content [AS name] [, ...]]) returns json Do you see any problems with the fact that JSON arrays can use mixed data types, e.g.: [1, 2, "hi", false] It shouldn't be a problem. regards Pavel -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
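A quick illustration of why mixed-type arrays shouldn't be a problem: heterogeneous arrays are part of the JSON grammar itself, so any conforming parser accepts them, and a Postgres json type that merely validates its text input can store them unchanged. A sketch using Python's stdlib json module as the example parser:

```python
import json

# A JSON array may freely mix member types; a json type that just
# validates text (as proposed) round-trips this untouched.
doc = '[1, 2, "hi", false, null]'
values = json.loads(doc)

# Each element keeps its own type after parsing.
types = [type(v).__name__ for v in values]
print(types)

# Re-serializing reproduces the original document.
print(json.dumps(values))
```

Type checking only becomes interesting at extraction time (json_to_number etc.), which is exactly where the proposal puts the run-time checks.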
Re: [HACKERS] List traffic
Tom Lane t...@sss.pgh.pa.us wrote: I can't imagine that there's not going to need to be a catchall list for problems that don't fit into any of the subcategories. More generally, we already have most of the lists that you suggest, and we already know that people frequently don't find the most appropriate list for postings. I don't think getting rid of -general would help that in the least. The way to cut down on misposted traffic is to make the set of categories smaller and simpler, not to redouble our efforts to persuade people to use the same or even more categories. Well, redoubling our current efforts to direct people to more specific lists would accomplish nothing, since doubling zero leaves you with zero. The description of -general includes: | General discussion area for users. Apart from compile, acceptance | test, and bug problems, most new users will probably only be | interested in this mailing list Given that, the fact that -admin, -novice, -sql, and -performance collectively get as many posts as -general suggests that people are, in fact, making some effort to find a list which seems a good fit. Perhaps if the description of -general was changed to suggest it *was* a catch-all for posts which don't fit the other lists, things would improve. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] List traffic
There is no reason why advocacy can't happen on general. Theoretically www could be on hackers (although I do see the point of a separate list). First off, this is absolutely the wrong list to be discussing management of PostgreSQL lists. That belongs on pgsql-www. And, I'll point out, that this completely pointless discussion on the wrong list has been 1/6 of the traffic on -hackers for the last two days. Also, I really don't see what problem people think they're addressing with these bimonthly calls for list consolidation. It seems like a solution in search of a problem. So it's an exercise in ironic wankitude. Can we please stop it now? Second, regarding advocacy: no, absolutely not. -advocacy is a working list and not a virtual water cooler. -- -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] List traffic
Marc G. Fournier wrote: On Fri, 14 May 2010, Yeb Havinga wrote: Marc G. Fournier wrote: On Thu, 13 May 2010, Alvaro Herrera wrote: Excerpts from Yeb Havinga's message of jue may 13 15:06:53 -0400 2010: My $0.02 - I like the whole 'don't sort, search' (or how did they call it?) just let the inbox fill up, google is fast enough. What would be really interesting is to have some extra 'tags/headers' added to the emails (document classification with e.g. self organizing map/kohonen), so my local filters could make labels based on that, instead of perhaps badly spelled keywords in subjects or message body. I missed this when I read it the first time .. all list email does have an X-Mailing-List header added so that you can label based on the list itself ... is that what you mean, or are you thinking of something else entirely? Something else: if automatic classification of articles was in place, there would be need of fewer mailing lists, depending on the quality of the classification. You've mentioned this several times, but what *is* autoclassification of articles? or is this something you do on the gmail side of things? I meant classification in the sense of automated, as opposed to manual classification by author or subscriber - in the general sense, not linked to any mail client or server. Example: junk mail detection by a mail client. After sending my previous mail this morning I looked a bit more into (the FAQ of) carrot2, which links to LingPipe as a solution for people that like pre-defined classes. LingPipe in fact has a tutorial where they classify a dataset of newsgroup articles, see e.g. http://alias-i.com/lingpipe/demos/tutorial/classify/read-me.html. I suppose it would be interesting to see what could be done with the pg archives. If the archive database itself is publicly available, or could be made available, I'd be willing to put some time into this (solely on the basis that I'm interested in the outcome, not that I'm pushing for it to be used by the pg project, though that'd of course be cool if it turned out that way in the end) regards, Yeb Havinga -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] JSON manipulation functions
On Thu, May 13, 2010 at 9:47 PM, Joseph Adams joeyadams3.14...@gmail.com wrote: The following function returns the type of any JSON value. json_type as enum ('null', 'string', 'number', 'boolean', 'object', 'array') json_type(json) returns json_type Seems reasonable. Would it be a bad idea to give an enum and a function the same name (which appears to be allowed by PostgreSQL) ? If so, json_type(json) could be json_typeof(json) or something instead. No, I think that's a fine idea. I thought about having predicates like IS STRING and IS NUMBER, similar to the IS DOCUMENT predicate used for XML. However, a major problem with that approach is that it could lead to confusion involving IS NULL. By my understanding, the JSON datatype will just be a specialization of TEXT (it just validates the input). Like TEXT, a JSON string can be 'null'. 'null'::JSON is not NULL. Bear in mind that json_to_*('null') is NULL, though. Even aside from the possible semantic confusion, I don't think that we should make any changes to our core grammar (gram.y) to support JSON. It's not really necessary and it's better not to add extra stuff to the grammar unless we really need it. I also thought about having a series of json_is_* functions. I don't think it's a bad idea, but I think json_type is a better solution. I agree. == text/number/boolean conversion == These functions each convert a non-compound JSON value to its respective return type. Run-time type checking is performed; a conversion will throw an error if the input JSON is not the correct type. If the JSON value is 'null', then the return value will be NULL. json_to_text(json) returns text json_to_number(json) returns numeric json_to_bool(json) returns boolean Can '3' be converted to a number, or only if it's written without the quotes? These functions convert values to JSON. 
Passing NULL to any of the functions below will return 'null': text_to_json(text) returns json number_to_json(numeric) returns json bool_to_json(boolean) returns json There could be generic value_to_json(any), but not a json_to_value(json) function. See http://archives.postgresql.org/pgsql-hackers/2010-04/msg00321.php for more details. Seems OK. Conversion to/from number or boolean can also be achieved with casting. Note well that 'string'::JSON::TEXT is 'string', not the string's actual value. json_to_text is needed for this conversion. For this reason, casting JSON might seem like something to recommend against. However, IMHO, casting numbers and booleans to/from JSON is fine and dandy; the paragraphs below give some weight to this. I originally considered making json_to_number and number_to_json work with TEXT instead of NUMERIC. However, as Tom Lane pointed out in the above link, Forcing people to insert explicit coercions from text isn't going to be particularly convenient to use.. Nevertheless, NUMERIC introduces a problem. For instance, if you say: SELECT '-1e-38'::NUMERIC; This conversion knocks out the scientific notation and produces a 41-character string. I seriously doubt that all outside applications will handle 41-character numbers correctly. Maybe not, but I don't think it's your problem to fix it if they don't. If people want to have fine-grained control over the JSON that gets generated, they can always generate the value as text and cast it to JSON. Perhaps there should be individual functions for specific data types, or maybe just a handful for particular cases. There might be json_to_int, json_to_float, and json_to_numeric. In any case, converting to/from number types can be achieved quite easily with casting. Personally I'd go with just json_to_numeric for starters. We can always add the others if and when it's clear that they are useful. == array/object conversion == The json_object function converts a tuple to a JSON object. 
If there are duplicate column names, there will be duplicate keys in the resulting JSON object. json_object([content [AS name] [, ...]]) returns json Seems good. Likewise, the json_array function converts a tuple to a JSON array. Column names are ignored. json_array([content [AS name] [, ...]]) returns json I think this is pointless and should be omitted. The json_agg function reduces a set of JSON values to a single array containing those values. aggregate json_agg(json) returns json Very useful, I like that. json_object and json_agg can be used together to convert an entire result set to one JSON array: SELECT json_agg(json_object(*)) FROM tablename; Spiffy. json_keys gets the keys of a JSON object as a set. json_keys(json) returns setof text I would tend to make this return text[] rather than SETOF text. json_values gets the values of a JSON object or the items of a JSON array as a set. json_values(json) returns setof json Similarly I would make this return json[]. Note that all JSON slicing and splicing operations
Re: [HACKERS] List traffic
On Fri, 14 May 2010, Kevin Grittner wrote: Well, redoubling our current efforts to direct people to more specific lists would accomplish nothing, since doubling zero leaves you with zero. The description of -general includes: Agreed ... Given that, the fact that -admin, -novice, -sql, and -performance collectively get as many posts as -general suggests that people are, in fact, making some effort to find a list which seems a good fit. Perhaps if the description of -general was changed to suggest it *was* a catch-all for posts which don't fit the other lists, things would improve. Can you offer improved / stronger wording on this ... ? Marc G. Fournier  Hub.Org Hosting Solutions S.A. scra...@hub.org  http://www.hub.org Yahoo: yscrappy  Skype: hub.org  ICQ: 7615664  MSN: scra...@hub.org -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] JSON manipulation functions
On Fri, May 14, 2010 at 1:15 PM, Robert Haas robertmh...@gmail.com wrote: On Thu, May 13, 2010 at 9:47 PM, Joseph Adams joeyadams3.14...@gmail.com wrote: [snip] == array/object conversion == The json_object function converts a tuple to a JSON object. If there are duplicate column names, there will be duplicate keys in the resulting JSON object. json_object([content [AS name] [, ...]]) returns json Seems good. Likewise, the json_array function converts a tuple to a JSON array. Column names are ignored. json_array([content [AS name] [, ...]]) returns json I think this is pointless and should be omitted. (I'm going to avoid the use of the term object here to reduce confusion.) I disagree with the assertion that it's pointless, and I have a specific use-case in mind for this function. I have a system that uses JSON arrays on the wire to encapsulate data, and both ends understand the positional semantics of the elements. Using JSON arrays instead of JSON objects reduces the transfer size by 40-80%, depending on how full the rows (or class instances) are and the data types of the elements, simply by removing the redundant object keys. This function would be extremely useful to me when creating or persisting raw class instances of these sorts. -- Mike Rylander | VP, Research and Design | Equinox Software, Inc. / The Evergreen Experts | phone: 1-877-OPEN-ILS (673-6457) | email: mi...@esilibrary.com | web: http://www.esilibrary.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
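Mike's size argument is easy to see with a toy row. The sketch below (Python, with a made-up example record - the field names are illustrative, not from the thread) compares the object and array encodings of the same tuple; the exact savings depend on how long the keys are relative to the values, as he notes.

```python
import json

# Hypothetical row: both ends agree on the positional meaning of the
# elements, so in the array encoding the keys are pure overhead.
row = {"id": 42, "title": "Concerning Hobbits", "author": "Tolkien",
       "checked_out": False, "due_date": None}

as_object = json.dumps(row)                # json_object-style encoding
as_array = json.dumps(list(row.values()))  # json_array-style encoding

savings = 1 - len(as_array) / len(as_object)
print(len(as_object), len(as_array))
print(f"array form is {savings:.0%} smaller")
```

This is the case for keeping json_array: it is the schema-on-both-ends wire format, with json_object as the self-describing one.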
[HACKERS] recovery consistent != hot standby
While looking through postmaster.c and xlog.c I discovered that we're being a little bit loose about our use of terminology. Maybe this was right when committed (I think, at that point, Hot Standby was always on) but it's not right any more. It appears that we only enter the PM_RECOVERY_CONSISTENT when Hot Standby is enabled; otherwise, we remain in PM_RECOVERY even after reaching consistency. I think, then, that the state, and the signal which triggers it are misnamed. For the avoidance of confusion, I'd like to propose that we rename as follows: PM_RECOVERY_CONSISTENT - PM_HOT_STANDBY PMSIGNAL_RECOVERY_CONSISTENT - PMSIGNAL_BEGIN_HOT_STANDBY Objections? Better ideas? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] fillfactor gets set to zero for toast tables
I've been able to reproduce the problem described here: http://archives.postgresql.org/pgsql-bugs/2010-05/msg00100.php Do this:

 create table foo(f1 text);
 alter table foo set (toast.autovacuum_enabled = false);
 insert into foo values(repeat('xyzzy',10));
 vacuum verbose foo;

Notice that the vacuum output shows there's only one toast tuple per toast table page. The problem is that if any reloption is set for the toast table, we build a StdRdOptions struct in which fillfactor is zero, and then all the code that actually uses fillfactor honors that. And the reason fillfactor gets left as zero is that there is no entry for it in the tables for toast-table reloptions. Clearly, this wasn't thought through carefully enough. I'm inclined to think that we should remove the notion of there being separate sets of rdoptions for heaps and toast tables --- any case in which someone tries to omit a StdRdOption for toast tables is going to fail the same way, unless we are fortunate enough that zero is a desirable default. What seems more rational is to provide toast.fillfactor but give it min/max/default = 100. Thoughts? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
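The arithmetic behind the one-tuple-per-page symptom: the backend computes the free space to reserve on each page as BLCKSZ * (100 - fillfactor) / 100 (the RelationGetTargetPageFreeSpace macro), so a fillfactor that silently defaults to zero asks every page to stay entirely free. A sketch of that computation (Python mirror of the C macro, 8192-byte default page size assumed):

```python
BLCKSZ = 8192  # default Postgres page size

def target_free_space(fillfactor: int) -> int:
    # Mirrors RelationGetTargetPageFreeSpace: bytes a page should keep
    # free when inserting new tuples.
    return BLCKSZ * (100 - fillfactor) // 100

print(target_free_space(100))  # 0: pack pages completely (heap default)
print(target_free_space(10))   # 7372: keep pages ~90% empty
print(target_free_space(0))    # 8192: whole page "reserved", so each
                               # insert wants a fresh page -> one toast
                               # tuple per page
```

This also shows why min/max/default = 100 is the sane setting for toast.fillfactor: toast chunks are never updated in place, so there is nothing to reserve free space for.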
Re: [HACKERS] recovery getting interrupted is not so unusual as it used to be
On Thu, May 13, 2010 at 1:28 AM, Fujii Masao masao.fu...@gmail.com wrote: On Thu, May 13, 2010 at 12:10 PM, Robert Haas robertmh...@gmail.com wrote: Hmm, it seems this is my night to rediscover the wisdom of your previous proposals. I think that state would only be appropriate when we shutdown after reaching consistency, not any shutdown during recovery. Do you agree? No. When shutdown happens before reaching consistency, the database might be inconsistent, but which doesn't mean that some data might be corrupted. We can get consistent (not corrupted) database by applying the WAL records to inconsistent one. HINT: If this has occurred more than once some data might be corrupted and you might need to choose an earlier recovery target. I think that the hint message indicates the data corruption which prevents recovery from completing, rather than the inconsistency of the database. Hmm, OK, I think that makes sense. Would you care to propose a patch? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Generalized Inverted Generalized Search Tree
On Fri, May 14, 2010 at 12:03 AM, Takahiro Itagaki itagaki.takah...@oss.ntt.co.jp wrote: We can index multiple scalar values per row with GIN access method, and also can index single vector value per row with GiST AM. Is it worth having a new AM to index multiple vector values per row? It will be an AM for the missing feature below:

                 | scalar | vector
 ----------------+--------+--------
  single per row | btree  | gist
  multi per row  | gin    | *HERE*

We can call the new AM gigist. Or, there might be another idea to support expression indexes for SRF functions, like

=# CREATE TABLE tbl (c circle[]);
=# CREATE INDEX ON tbl USING gist (unnest(c));

Comments and ideas welcome. I'm not sure I agree with your characterization of the purpose of GIST. I think of GIST as a mechanism to support index lookups on types that are not totally ordered. Still, I guess I understand the point - I think what you are trying to do is optimize queries of the form, e.g.: SELECT * FROM tbl WHERE [some circle in c overlaps a given box] I don't personally want to do that :-) but I can easily imagine that someone else might. As for adding a new AM, we could certainly do it that way, but we should at least consider the possibility of trying to modify the existing GIST code to handle this case in addition to the things it already does. If that's not possible, then we should think about how to minimize code duplication. I fear that if we're not careful we might end up having to fix a lot of bugs in two places; also, any code that is specific to gigist will certainly get less real-world use than gist itself, so sharing code will help keep our bug counts down. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] fillfactor gets set to zero for toast tables
Excerpts from Tom Lane's message of vie may 14 14:19:30 -0400 2010: The problem is that if any reloption is set for the toast table, we build a StdRdOptions struct in which fillfactor is zero, and then all the code that actually uses fillfactor honors that. And the reason fillfactor gets left as zero is that there is no entry for it in the tables for toast-table reloptions. Clearly, this wasn't thought through carefully enough. Ouch :-( We messed with this stuff after the initial commit because I didn't get it right the first time either. I'm surprised that we didn't find this problem right away. I'm inclined to think that we should remove the notion of there being separate sets of rdoptions for heaps and toast tables --- any case in which someone tries to omit a StdRdOption for toast tables is going to fail the same way, unless we are fortunate enough that zero is a desirable default. It doesn't seem like having the distinction has bought us anything. However, if we do that, we can't add a separate entry to intRelOpts to fix min/max/default to 100, so I think we should document that options for toast must match those for heaps. What seems more rational is to provide toast.fillfactor but give it min/max/default = 100. Adding an entry to intRelOpts should be enough to fix this bug, correct? -- -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Generating Lots of PKs with nextval(): A Feature Proposal
Recently, in preparation for migrating an application to postgres, I got to this part of the manual (which is *excellent* so far, by the way): http://www.postgresql.org/docs/8.4/interactive/functions-sequence.html A quick check with the folks on #postgresql confirmed my understanding, which was that the locking semantics of setval() and nextval() make this unsafe: SELECT setval('my_seq', nextval('my_seq') + 500); Now, I was reminded that I could simply do this: SELECT nextval('my_seq') FROM generate_series(1, 500); But of course then I would have no guarantee that I would get a contiguous block of ids, which means if I'm using this to do a mass insert of records which refer to each others' ids (example: storing a directed, linear graph), I either have to do a correlated update on the client side, after transferring the keys (consider the cost of doing this for a few million records - 4 MB in keys per million records, for, in extreme cases, 12 MB of data to be inserted -- 33% overhead in the worst case, presuming symmetric bandwidth), or I have to insert into a temporary table, then have the db backend do the update, then insert from there to the real table. Both are imperfect options in terms of performance and complexity. Thus, before I start work on it, I propose an extension to the current nextval(): SELECT nextval('my_seq', 500); This would increment the my_seq sequence by its interval * 500, and return the first valid key. 
This both makes client code that needs a bunch of PKs simpler to implement, and saves in performance, since the client can just replace all its PKs (presuming they're currently a contiguous block from 1 to n) with my_starting_pk + current_pk, so this:

 pk | next_node
----+-----------
  0 |         1
  1 |         2
  2 |         0

can be easily updated like this:

SELECT nextval('my_seq', (SELECT count(*) FROM my_table));
UPDATE my_table SET pk = currval('my_seq') + pk, next_node = currval('my_seq') + next_node;

to something like this:

   pk   | next_node
--------+-----------
 521650 |    521651
 521651 |    521652
 521652 |    521650

This is a net gain of performance and ease of implementation in many cases where a large number of ids from a sequence are needed -- with a small added benefit of the keys being guaranteed to be contiguous. I don't see any technical problems with this; postgres already can pre-allocate more than one key, but the number is semi-static (the CACHE parameter to CREATE SEQUENCE). This might break existing user code if they've defined a nextval(regclass, integer), but I don't see any way to avoid that. Finally, I've checked sequence.c -- this looks pretty straightforward to implement, but I figured checking with this list was wise before starting work. Apologies if I've been overly wordy. Peter -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
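The proposed two-argument nextval() boils down to atomically reserving a contiguous block of n values and returning the first one. A minimal Python model of those semantics (the class and its locking are illustrative only, not PostgreSQL's sequence machinery):

```python
import threading

class ToySequence:
    """Models the proposed nextval('seq', n): atomically advance the
    sequence by increment * n and return the first reserved value,
    so the caller owns a contiguous block."""
    def __init__(self, start=1, increment=1):
        self._next = start
        self._inc = increment
        self._lock = threading.Lock()

    def nextval(self, n=1):
        with self._lock:  # stands in for the sequence's internal lock
            first = self._next
            self._next += self._inc * n
            return first

seq = ToySequence()
block_start = seq.nextval(500)   # reserves 1..500 as one contiguous block
after_block = seq.nextval()      # a later caller starts at 501
```

Because the reservation happens under one lock acquisition, no concurrent caller can receive a value inside the block, which is exactly the contiguity guarantee that `setval('my_seq', nextval('my_seq') + 500)` cannot provide.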
Re: [HACKERS] fillfactor gets set to zero for toast tables
Alvaro Herrera alvhe...@commandprompt.com writes: Excerpts from Tom Lane's message of vie may 14 14:19:30 -0400 2010: What seems more rational is to provide toast.fillfactor but give it min/max/default = 100. Adding an entry to intRelOpts should be enough to fix this bug, correct? I think so, but haven't tested. The docs would need some correction perhaps. BTW, I notice that the code allows people to fool with autovacuum_analyze_threshold and related parameters for toast tables, which seems rather pointless since we never analyze toast tables. So the fillfactor isn't really the only instance of the issue. Maybe a better solution is to have some kind of notion of a default-only entry, which is sufficient to insert the default into the struct but isn't accepted as a user-settable item. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Generating Lots of PKs with nextval(): A Feature Proposal
Hi Peter, All you need to do is define your own sequence with an increment of 500. Look at: http://www.postgresql.org/docs/8.4/static/sql-createsequence.html Regards, Ken On Fri, May 14, 2010 at 02:56:18PM -0400, Peter Crabtree wrote: Thus, before I start work on it, I propose an extension to the current nextval(): SELECT nextval('my_seq', 500); This would increment the my_seq sequence by its interval * 500, and return the first valid key. [remainder of the proposal, quoted in full above, trimmed] -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] JSON manipulation functions
Robert Haas robertmh...@gmail.com writes: On Thu, May 13, 2010 at 9:47 PM, Joseph Adams joeyadams3.14...@gmail.com wrote: [] retrieves a value of a JSON array/object by (one-based) index. In other words, value[n] is equivalent to selecting the nth row of json_values(value) (provided value is of type JSON). Examples: SELECT ('[1,2,3,4]'::JSON)[3] -- Result is '3'::JSON SELECT ('{"a": 1, "b": 2, "c": 3, "d": 4}'::JSON)[3] -- Result is '3'::JSON I think some kind of array dereference and object dereference mechanism is absolutely, positively 100% required. I don't know whether the particular syntax you've proposed here is best or whether we should pick another syntax or just use function notation, but I think we definitely need *something*. Trying to use array notation on something that isn't a SQL array type is guaranteed to be a mess. I strongly recommend that you not attempt that. Just define a function for it. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
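The indexing behavior in Joseph's examples — one-based positional selection over json_values(), so that objects index by position just like arrays — can be modeled briefly in Python. This is an illustration of the proposed semantics only, not code from the patch:

```python
import json

def json_index(value, n):
    # One-based positional lookup. For objects, index over the values
    # in document order, mirroring "the nth row of json_values(value)".
    data = json.loads(value)
    items = list(data.values()) if isinstance(data, dict) else list(data)
    return json.dumps(items[n - 1])

a = json_index('[1,2,3,4]', 3)                          # selects the 3rd element
b = json_index('{"a": 1, "b": 2, "c": 3, "d": 4}', 3)   # same result for an object
```

Both calls return the JSON text '3', matching the two SELECT examples above; this is also why Tom suggests a function rather than overloading SQL array subscript notation.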
Re: [HACKERS] List traffic
On Fri, 14 May 2010, Josh Berkus wrote: First off, this is absolutely the wrong list to be discussing management of PostgreSQL lists. That belongs on pgsql-www. Actually, this is as good a list as any ... -www is for WWW related issues, not mailing list issues ... it would be as inappropriate there as it would be on sysadmins, which also doesn't cover mailing lists ... Second, regarding advocacy: no, absolutely not. -advocacy is a working list and not a virtual water cooler. BTW, and even I totally forgot about it, but we do have a virtual water cooler already: pgsql-chat ... 224 subscribers currently, just nobody uses it ... In fact, I just removed / changed to BCC -hackers so that all further discussions on this part of the thread will be on -chat ... Marc G. FournierHub.Org Hosting Solutions S.A. scra...@hub.org http://www.hub.org Yahoo:yscrappySkype: hub.orgICQ:7615664MSN:scra...@hub.org -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] List traffic
On Fri, May 14, 2010 at 9:51 AM, Josh Berkus j...@agliodbs.com wrote: Second, regarding advocacy: no, absolutely not. -advocacy is a working list and not a virtual water cooler. +1. I would find it very difficult to manage having -advocacy thrown into -general. If folks think that information isn't getting wide enough distribution, that's one thing. But it is an important working group, even if there's not a ton of traffic all the time on it. -selena -- http://chesnok.com/daily - me -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Synchronous replication patch built on SR
On Fri, May 14, 2010 at 9:33 AM, Boszormenyi Zoltan z...@cybertec.at wrote: If min_sync_replication_clients == 0, then the replication is async. If min_sync_replication_clients == max_wal_senders then the replication is fully synchronous. If 0 < min_sync_replication_clients < max_wal_senders then the replication is partially synchronous, i.e. the master can wait only for say, 50% of the clients to report back before it's considered synchronous and the relevant transactions get released from the wait. That's an interesting design and in some ways pretty elegant, but it rules out some things that people might easily want to do - for example, synchronous replication to the other server in the same data center that acts as a backup for the master; and asynchronous replication to a reporting server located off-site. One of the things that I think we will probably need/want to change eventually is the fact that the master has no real knowledge of who the replication slaves are. That might be something we want to change in order to be able to support more configurability. Inventing syntax out of whole cloth and leaving semantics to the imagination of the reader:

CREATE REPLICATION SLAVE reporting_server (mode asynchronous, xid_feedback on);
CREATE REPLICATION SLAVE failover_server (mode synchronous, xid_feedback off, break_synchrep_timeout 30);

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
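Zoltan's three regimes can be captured in a small classifier. This is a hypothetical helper for illustration; the parameter names follow the thread's proposed GUCs, but the function is not part of any patch:

```python
def replication_mode(min_sync_replication_clients, max_wal_senders):
    # Classify the behavior described in the thread: how many standbys
    # must acknowledge before a commit is released from its wait.
    if not 0 <= min_sync_replication_clients <= max_wal_senders:
        raise ValueError("min_sync_replication_clients out of range")
    if min_sync_replication_clients == 0:
        return "asynchronous"
    if min_sync_replication_clients == max_wal_senders:
        return "fully synchronous"
    return "partially synchronous (quorum)"

m1 = replication_mode(0, 4)   # no commit ever waits on a standby
m2 = replication_mode(4, 4)   # every standby must ack each commit
m3 = replication_mode(2, 4)   # any 2 of 4 acks release the commit
```

The quorum case is the one Robert's objection targets: a single count cannot express "synchronous to this particular standby, asynchronous to that one", which is what his sketched per-slave CREATE REPLICATION SLAVE syntax would allow.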
Re: [HACKERS] List traffic
Kevin Grittner wrote: Tom Lane t...@sss.pgh.pa.us wrote: I can't imagine that there's not going to need to be a catchall list for problems that don't fit into any of the subcategories. More generally, we already have most of the lists that you suggest, and we already know that people frequently don't find the most appropriate list for postings. I don't think getting rid of -general would help that in the least. The way to cut down on misposted traffic is to make the set of categories smaller and simpler, not to redouble our efforts to persuade people to use the same or even more categories. Well, redoubling our current efforts to direct people to more specific lists would accomplish nothing, since doubling zero leaves you with zero. The description of -general includes:

| General discussion area for users. Apart from compile, acceptance
| test, and bug problems, most new users will probably only be
| interested in this mailing list

Given that, the fact that -admin, -novice, -sql, and -performance collectively get as many posts as -general suggests that people are, in fact, making some effort to find a list which seems a good fit. Perhaps if the description of -general was changed to suggest it *was* a catch-all for posts which don't fit the other lists, things would improve. FYI, I usually email new people privately that cross-posting a question can cause the question to be ignored. They usually respond positively and avoid it in the future. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] List traffic
On Fri, 14 May 2010, Bruce Momjian wrote: FYI, I usually email new people privately that cross-posting a question can cause the question to be ignored. They usually respond positively and avoid it in the future. We all have our own methods ... for instance, I just CC'd this to -chat with a -BCC to -hackers so that follow ups will go over there (since Josh is right, this thread doesn't belong on -hackers) ... Marc G. FournierHub.Org Hosting Solutions S.A. scra...@hub.org http://www.hub.org Yahoo:yscrappySkype: hub.orgICQ:7615664MSN:scra...@hub.org -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Add SIGCHLD catch to psql
Stephen Frost wrote: Greetings, Toying around with FETCH_COUNT today, I discovered that it didn't do the #1 thing I really wanted to use it for- query large tables without having to worry about LIMIT to see the first couple hundred records. The reason is simple- psql ignores $PAGER exiting, which means that it will happily continue pulling down the entire large table long after you've stopped caring, which means you still have to wait forever. The attached, admittedly quick hack, fixes this by having psql catch SIGCHLD's using handle_sigint. I've tested this and it doesn't appear to obviously break other cases where we have children (\!, for example), since we're not going to be running a database query when we're doing those, and if we are, and the child dies, we probably want to *stop* anyway, similar to the $PAGER issue. Another approach that I considered was fixing various things to deal cleanly with writes failing to $PAGER (I presume the writes *were* failing, since less was in a defunct state, but I didn't actually test). This solution was simpler, faster to code and check, and a lot less invasive (or so it seemed to me at the time). Anyway, this makes FETCH_COUNT a lot more useful, and, in my view, the current behaviour of completely ignoring $PAGER exiting is a bug. Please add this to the next commit-fest: https://commitfest.postgresql.org/action/commitfest_view/inprogress Thanks. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
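The patch's idea — treating a dying child like a user cancel by routing SIGCHLD into the existing SIGINT path — can be sketched in Python. This is a model of the mechanism only, not psql's C code; the flag-based handler and the `true` stand-in for $PAGER are illustrative assumptions:

```python
import signal
import subprocess
import time

cancel_requested = False

def on_child_exit(signum, frame):
    # When any child (e.g. $PAGER) exits, flag the fetch loop to stop,
    # analogous to pointing SIGCHLD at psql's handle_sigint so the
    # in-progress FETCH_COUNT batches are abandoned.
    global cancel_requested
    cancel_requested = True

signal.signal(signal.SIGCHLD, on_child_exit)

pager = subprocess.Popen(["true"])  # stand-in for a $PAGER that exits early
pager.wait()
time.sleep(0.1)  # allow the pending SIGCHLD to be delivered and handled
# A FETCH_COUNT-style loop would check cancel_requested between batches.
```

The appeal of this approach, as the mail notes, is that it needs no changes to the many places that write to the pager pipe; the trade-off is that any child's exit, not just the pager's, triggers the cancel path.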
Re: [HACKERS] List traffic
Tom Lane wrote: I can see the need for small tightly-focused special lists. How about a list devoted to discussions about reorganizing the lists? It would get plenty of traffic, and then I could not subscribe to that and have that many less messages to read. There is only one viable solution to reduce list traffic: ban forever everyone who top-posts or doesn't trim what they quote. Maybe some other old-school Usenet rules too--can we ban those with incorrectly formatted signatures and finally add proper bozo tagging? Praise Kibo. Seriously though, I file admin/general/performance into one user oriented folder, hackers/committers into a second, and all the non-code ones (advocacy, www, docs) into a third. I don't think there's any way to restructure those lists that will improve life for people who try to keep up with most of them. I was traveling yesterday and busy today, and now I'm 350 messages behind. No amount of rejiggering the lists will change the fact that there's just that much activity happening around PostgreSQL. You can move the messages around, but the same number are going to show up, and people who want to keep up with everything will have to cope with that. The best you can do is get better support in your mail program for wiping out whole threads at once, once you've realized you're not interested in them. The only real argument to keep some more targeted lists is for the benefit of the people who subscribe to them, not we the faithful, so that they can have something that isn't a firehose of messages to sort through. Is it helpful to novices that they can subscribe to a list when they won't be overwhelmed by traffic, and can ask questions without being too concerned about being harassed for being newbies? Probably. Are there enough people interested in performance topics alone to justify a list targeted just to them? Certainly; I was only on that list for a long time before joining any of the others. 
Are the marketing oriented people subscribed only to advocacy and maybe announce happy to avoid the rest of the lists? You bet. Folding, say, performance or admin into general, one idea that pops up sometimes, doesn't help those people--now they can only get the firehose--and it doesn't help me, either. If you can keep up with general, whether or not the other lists are also included in that or not doesn't really matter. Ditto for hackers and the things you might try and split out of it. It's just going to end up with more cross posting, and the only thing I hate more than a mailbox full of messages is discovering a chunk of them are dupes because of that. I might like to see, for example, a user mailing list devoted strictly to replication/clustering work with PostgreSQL. That's another topic I think that people are going to want to ask about more in the near future without getting overwhelmed. But, again, that's for their benefit. I'll have to subscribe to that, too, and in reality it will probably increase the amount of messages I read, because people will ask stuff there that's already been covered on other lists, and vice-versa. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] List traffic
On Fri, May 14, 2010 at 4:39 AM, Greg Smith g...@2ndquadrant.com wrote: Is it helpful to novices that they can subscribe to a list when they won't be overwhelmed by traffic, and can ask questions without being too concerned about being harassed for being newbies? Probably. Only if they aren't hoping to get answers... What percentage of the hackers and experts who trawl -general for questions to answer are subscribed to -novices? -general isn't subscriber-only posts, is it? Are there enough people interested in performance topics alone to justify a list targeted just to them? Certainly; I was only on that list for a long time before joining any of the others. If they're interested in performance topics and they're not subscribed to -general then they're missing *most* of what they're interested in which doesn't take place on -performance. And most of what's on -performance ends up being non-performance related questions anyways. I think what I'm getting at is that we shouldn't have any lists for traffic which could reasonably happen on -general. If it's a usage question about postgres then it belongs in the same place regardless of what feature or aspect of the usage it is -- otherwise it'll always be some random subset of the relevant messages. This won't actually cut down on list traffic for me and Simon but it would help get people answers since they'll be posting to the same place as everyone else. Are the marketing oriented people subscribed only to advocacy and maybe announce happy to avoid the rest of the lists? You bet. Well yeah. This is an actual discernible distinction. As evidence I'll note that there is no advocacy traffic on -general or other mailing lists. -- greg -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] List traffic
On Fri, 14 May 2010, Greg Stark wrote: On Fri, May 14, 2010 at 4:39 AM, Greg Smith g...@2ndquadrant.com wrote: Is it helpful to novices that they can subscribe to a list when they won't be overwhelmed by traffic, and can ask questions without being too concerned about being harassed for being newbies? Probably. Only if they aren't hoping to get answers... What percentage of the hackers and experts who trawl -general for questions to answer are subscribed to -novices? -general isn't subscriber-only posts is it? All our lists are, yes ... *but* ... the 'subscriber list' is cross list, in that if you are subscribed to one, you can post to all ... If they're interested in performance topics and they're not subscribed to -general then they're missing *most* of what they're interested in which doesn't take place on -performance. And most of what's on -performance ends up being non-performance related questions anyways. And IMHO, that is as much a fault of the 'old timers' on the lists as the newbies ... if nobody redirects / loosely enforces the mandates of the various lists, newbies aren't going to learn to post to more appropriate ones ... Personally, my experience with large lists is that if there is a smaller, more focused list, I'll post there first, to avoid being lost in the noise ... and, I will re-post to a more general list *if* and only if I'm unable to get an answer from where I posted my original ... Marc G. FournierHub.Org Hosting Solutions S.A. scra...@hub.org http://www.hub.org Yahoo:yscrappySkype: hub.orgICQ:7615664MSN:scra...@hub.org -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] List traffic
On Thu, May 13, 2010 at 11:39 PM, Greg Smith g...@2ndquadrant.com wrote: The only real argument to keep some more targeted lists is for the benefit of the people who subscribe to them, not we the faithful, so that they can have something that isn't a firehose of messages to sort through. Is it helpful to novices that they can subscribe to a list when they won't be overwhelmed by traffic, and can ask questions without being too concerned about being harassed for being newbies? Probably. Are there enough people interesting in performance topics alone to justify a list targeted just to them? Certainly; I was only on that list for a long time before joining any of the others. Are the marketing oriented people subscribed only to advocacy and maybe announce happy to avoid the rest of the lists? You bet. Folding, say, performance or admin into general, one idea that pops up sometimes, doesn't help those people--now they can only get the firehose--and it doesn't help me, either. If you can keep up with general, whether or not the other lists are also included in that or not doesn't really matter. Ditto for hackers and the things you might try and split out of it. It's just going to end up with more cross posting, and the only thing I hate more than a mailbox full of messages is discovering a chunk of them are dupes because of that. +1. I might like to see, for example, a user mailing list devoted strictly to replication/clustering work with PostgreSQL. That's another topic I think that people are going to want to ask about more in the near future without getting overwhelmed. But, again, that's for their benefit. I'll have to subscribe to that, too, and in reality it will probably increase the amount of messages I read, because people will ask stuff there that's already been covered on other lists, and vice-versa. Yep. 
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Row-level Locks SERIALIZABLE transactions, postgres vs. Oracle
On Thu, May 13, 2010 at 5:39 PM, Tom Lane t...@sss.pgh.pa.us wrote: Florian Pflug f...@phlo.org writes: All in all, I believe that SHARE and UPDATE row-level locks should be changed to cause concurrent UPDATEs to fail with a serialization error. I don't see an argument for doing that for FOR SHARE locks, and it already happens for FOR UPDATE (at least if the row actually gets updated). AFAICS this proposal mainly breaks things, in pursuit of an unnecessary and probably-impossible-anyway goal of making FK locking work with only user-level snapshots. After giving this considerable thought and testing the behavior at some length, I think the OP has it right. One thing I sometimes need to do is denormalize a copy of a field, e.g.:

CREATE TABLE parent (id serial, mode integer not null, primary key (id));
CREATE TABLE child (id serial, parent_id integer not null references parent (id), parent_mode integer not null);

The way I have typically implemented this in the past is: 1. Add a trigger to the parent table so that, whenever the mode column gets updated, we do an update on the parent_mode of all children. 2. Add a trigger to the child table so that, when a new child is inserted, it initializes parent_mode from its parent. I do SELECT with FOR UPDATE on the parent so the parent can't change under me; though FOR SHARE ought to be enough also since we're just trying to lock out concurrent updates. Suppose T1 updates the parent's mode while T2 adds a new child; then both commit. In read committed mode, this seems to work OK regardless of the order of T1 and T2. If T1 grabs the lock first, then T2 sees the updated version of the row after T1 commits. If T2 grabs the lock first, then the update on the parent blocks until the child commits. Subsequently, when the trigger fires, it apparently uses an up-to-date snapshot, so the new child is updated also. In serializable mode, things are not so good. If T1 grabs the lock first, the child waits to see whether it commits or aborts. 
On commit, it complains that it can't serialize and aborts, which is reasonable - transaction aborts are the price you pay for serializability. If T2 grabs the lock first, the update on the parent blocks as before, but now the update is done with the old snapshot and ignores the new child, so the new child now has a value for parent_mode that doesn't match the parent's actual mode. That is, you get the wrong answer due to a serialization anomaly that didn't exist at the read committed level. Increasing the transaction isolation level is supposed to *eliminate* serialization anomalies, not create them. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Generating Lots of PKs with nextval(): A Feature Proposal
On Fri, May 14, 2010 at 02:07:27PM -0500, Kenneth Marshall wrote: Hi Peter, All you need to do is define your own sequence with an increment of 500. Look at: http://www.postgresql.org/docs/8.4/static/sql-createsequence.html This is often not enough. For example - I want standard increment of 1, but right now I'm importing 10000 objects, and it would be simpler for me to get 10000 ids. Preferably in one block. This is not achievable now. I know I can 'alter sequence set increment by' - but this will also affect concurrent sessions, which might not be a problem, but it's a side effect that I don't want. +1 for original proposition, would love to get it. depesz -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] List traffic
Greg Stark gsst...@mit.edu wrote: If they're interested in performance topics and they're not subscribed to -general then they're missing *most* of what they're interested in, which doesn't take place on -performance. Well, I for one can't currently suck the end of the fire hose which is -general, and would be less able to do so should other lists be folded into it. So I lurk on -bugs, -performance, -admin, and others -- not to glean information so much as to attempt to respond in areas where I feel I might be able to be helpful and, with a bit of luck, take some of the burden off of those who do the most to help people on these lists. Combining lists will only make it harder for me to attempt to assist in this way. And most of what's on -performance ends up being non-performance-related questions anyways. I don't believe you. Scanning this: http://archives.postgresql.org/pgsql-performance/2010-05/index.php I see a few non-performance questions, but they're clearly a small fraction of the traffic. I think what I'm getting at is that we shouldn't have any lists for traffic which could reasonably happen on -general. I think that's exactly backwards -- we shouldn't have any traffic on -general for issues which could reasonably happen in another list. You can always configure your email to combine lists into a common folder upon receipt. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Parameter oddness; was HS/SR Assert server crash
bruce wrote: and my slave recovery.conf was: restore_command = 'cp /u/pg/archive/%f %p' # e.g. 'cp /mnt/server/archivedir/%f %p' standby_mode = 'on' primary_conninfo = 'host=localhost port=5432' # e.g. 'host=localhost port=5432' Let me know what additional information I can supply. I saw Simon's commit fixing this bug. Another good reason we didn't bundle 9.0 beta2 yesterday. I can now confirm that Simon's patch fixes the bug. (Unfortunately I had a web cast yesterday about this feature and the bug prevented me from completing the demonstration.) FYI, my presentation shows all the steps necessary to set up HS/SR: http://momjian.us/main/presentations/technical.html#hot_streaming One odd thing is we have two parameters that mention hot_standby --- on the master we have to do in postgresql.conf: wal_level = hot_standby and on the slave we do in postgresql.conf: hot_standby = on That is a little confusing. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Parameter oddness; was HS/SR Assert server crash
Bruce Momjian br...@momjian.us wrote: One odd thing is we have two parameters that mention hot_standby --- on the master we have to do in postgresql.conf: wal_level = hot_standby and on the slave we do in postgresql.conf: hot_standby = on That is a little confusing. Why? I read that as saying that the master is writing sufficient data into the WAL for it to be usable for hot standby purposes, and that the slave machine is going to be used as a hot standby. You'd better do the former if you're going to do the latter, but there were good reasons not to try to infer one setting from the other. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
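Collected from this exchange, the two settings live in the postgresql.conf of opposite ends of the replication pair (a minimal sketch):

```
# master's postgresql.conf: write enough WAL detail for hot standby use
wal_level = hot_standby

# standby's postgresql.conf: accept read-only queries during recovery
hot_standby = on
```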
Re: [HACKERS] recovery consistent != hot standby
Robert Haas robertmh...@gmail.com writes: PM_RECOVERY_CONSISTENT - PM_HOT_STANDBY PMSIGNAL_RECOVERY_CONSISTENT - PMSIGNAL_BEGIN_HOT_STANDBY +1. From the point of view of the postmaster, whether the state transition happens immediately upon reaching consistency, or at a later time, or perhaps even earlier (if we could make that work) is not relevant. What's relevant is that it's allowed to let in hot-standby backends. So the current naming overspecifies the meaning of the state and the transition event. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Generating Lots of PKs with nextval(): A Feature Proposal
Peter Crabtree peter.crabt...@gmail.com writes: Now, I was reminded that I could simply do this: SELECT nextval('my_seq') FROM generate_series(1, 500); But of course then I would have no guarantee that I would get a contiguous block of ids, The existing cache behavior will already handle that for you, I believe. I don't really see a need for new features here. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Generating Lots of PKs with nextval(): A Feature Proposal
On Fri, May 14, 2010 at 5:04 PM, hubert depesz lubaczewski dep...@depesz.com wrote: On Fri, May 14, 2010 at 02:07:27PM -0500, Kenneth Marshall wrote: Hi Peter, All you need to do is define your own sequence with an increment of 500. Look at: http://www.postgresql.org/docs/8.4/static/sql-createsequence.html This is often not enough. For example - I want standard increment of 1, but right now I'm importing 1 objects, and it would be simpler for me to get 1 ids. Preferably in one block. This is not achievable now. I know I can 'alter sequence set increment by' - but this will also affect concurrent sessions, which might not be a problem, but it's a side effect that I don't want. +1 for original proposition, would love to get it. If we do this, I'm inclined to think that the extra argument to nextval() should be treated as overriding the base increment rather than specifying a multiplier for it. Other than that nitpick, it sounds like a reasonable thing to allow. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] max_standby_delay considered harmful
On Thu, May 13, 2010 at 1:12 PM, Josh Berkus j...@agliodbs.com wrote: On 5/12/10 8:07 PM, Robert Haas wrote: I think that would be a good thing to check (it'll confirm whether this is the same bug), but I'm not convinced we should actually fix it that way. Prior to 8.4, we handled a smart shutdown during recovery at the conclusion of recovery, just prior to entering normal running. I'm wondering if we shouldn't revert to that behavior in both 8.4 and HEAD. This would be OK as long as we document it well. We patched the shutdown the way we did specifically because Fujii thought it would be an easy fix; if it's complicated, we should revert it and document the issue for DBAs. I don't understand this comment. Oh, and to confirm: the same issue exists, and has always existed, with Warm Standby. That's what I was thinking, but I hadn't gotten around to testing it. Thanks for the confirmation. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] max_standby_delay considered harmful
This would be OK as long as we document it well. We patched the shutdown the way we did specifically because Fujii thought it would be an easy fix; if it's complicated, we should revert it and document the issue for DBAs. I don't understand this comment. In other words, I'm saying that it's not critical that we troubleshoot this for 9.0. Reverting Fujii's patch, if it's not working, is an option. -- -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Generating Lots of PKs with nextval(): A Feature Proposal
On Fri, May 14, 2010 at 5:27 PM, Tom Lane t...@sss.pgh.pa.us wrote: Peter Crabtree peter.crabt...@gmail.com writes: Now, I was reminded that I could simply do this: SELECT nextval('my_seq') FROM generate_series(1, 500); But of course then I would have no guarantee that I would get a contiguous block of ids, The existing cache behavior will already handle that for you, I believe. I don't really see a need for new features here. I don't see how that works for this case, because the cache setting is static, and also shared between sessions. So if I have 10 records one time, and 100 records the next, and 587 the third time, what should my CACHE be set to for that sequence? And if I do ALTER SEQUENCE SET CACHE each time, I have either killed concurrency (because I'm locking other sessions out of using that sequence until I'm finished with it), or I have a race condition (if someone else issues an ALTER SEQUENCE before I call nextval()). The same problem exists with using ALTER SEQUENCE SET INCREMENT BY. Peter -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Generating Lots of PKs with nextval(): A Feature Proposal
On Fri, May 14, 2010 at 5:29 PM, Robert Haas robertmh...@gmail.com wrote: On Fri, May 14, 2010 at 5:04 PM, hubert depesz lubaczewski dep...@depesz.com wrote: On Fri, May 14, 2010 at 02:07:27PM -0500, Kenneth Marshall wrote: Hi Peter, All you need to do is define your own sequence with an increment of 500. Look at: http://www.postgresql.org/docs/8.4/static/sql-createsequence.html This is often not enough. For example - I want standard increment of 1, but right now I'm importing 1 objects, and it would be simpler for me to get 1 ids. Preferably in one block. This is not achievable now. I know I can 'alter sequence set increment by' - but this will also affect concurrent sessions, which might not be a problem, but it's a side effect that I don't want. +1 for original proposition, would love to get it. If we do this, I'm inclined to think that the extra argument to nextval() should be treated as overriding the base increment rather than specifying a multiplier for it. Other than that nitpick, it sounds like a reasonable thing to allow. After giving it some thought, that sounds better. You gain some functionality that way (temporarily overriding the interval) and lose none. Peter -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Generating Lots of PKs with nextval(): A Feature Proposal
Peter Crabtree peter.crabt...@gmail.com writes: On Fri, May 14, 2010 at 5:29 PM, Robert Haas robertmh...@gmail.com wrote: If we do this, I'm inclined to think that the extra argument to nextval() should be treated as overriding the base increment rather than specifying a multiplier for it. Other than that nitpick, it sounds like a reasonable thing to allow. After giving it some thought, that sounds better. You gain some functionality that way (temporarily overriding the interval) and lose none. Well, what you lose is the previous assurance that values of nextval() were always multiples of the increment. I could see that breaking applications that are using non-unity increments. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
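Tom's concern, concretely; note that the two-argument nextval() call below is the hypothetical syntax under discussion in this thread, not anything that exists:

```sql
CREATE SEQUENCE ticket_seq START WITH 5 INCREMENT BY 5;
SELECT nextval('ticket_seq');  -- 5
SELECT nextval('ticket_seq');  -- 10: today, always a multiple of 5

-- Proposed per-call override (hypothetical): would return 17,
-- breaking any application that relies on multiples of 5.
-- SELECT nextval('ticket_seq', 7);
```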
Re: [HACKERS] max_standby_delay considered harmful
On Fri, May 14, 2010 at 5:51 PM, Josh Berkus j...@agliodbs.com wrote: This would be OK as long as we document it well. We patched the shutdown the way we did specifically because Fujii thought it would be an easy fix; if it's complicated, we should revert it and document the issue for DBAs. I don't understand this comment. In other words, I'm saying that it's not critical that we troubleshoot this for 9.0. Reverting Fujii's patch, if it's not working, is an option. There is no patch which we could revert to fix this, by Fujii Masao or anyone else. The patch he proposed has not been committed. I am still studying the problem to try to figure out where to go with it. We could decide to punt the whole thing for 9.1, but I'd like to understand what the options are before we make that decision. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Generating Lots of PKs with nextval(): A Feature Proposal
On Fri, May 14, 2010 at 6:11 PM, Tom Lane t...@sss.pgh.pa.us wrote: Peter Crabtree peter.crabt...@gmail.com writes: On Fri, May 14, 2010 at 5:29 PM, Robert Haas robertmh...@gmail.com wrote: If we do this, I'm inclined to think that the extra argument to nextval() should be treated as overriding the base increment rather than specifying a multiplier for it. Other than that nitpick, it sounds like a reasonable thing to allow. After giving it some thought, that sounds better. You gain some functionality that way (temporarily overriding the interval) and lose none. Well, what you lose is the previous assurance that values of nextval() were always multiples of the increment. I could see that breaking applications that are using non-unity increments. Err, right. But those applications presumably will also not be using this new behavior. There are no versions of PG that have an extra argument to nextval but still guarantee that the values of nextval() are multiples of the increment. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Parameter oddness; was HS/SR Assert server crash
Kevin Grittner wrote: Bruce Momjian br...@momjian.us wrote: One odd thing is we have two parameters that mention hot_standby --- on the master we have to do in postgresql.conf: wal_level = hot_standby and on the slave we do in postgresql.conf: hot_standby = on That is a little confusing. Why? I read that as saying that the master is writing sufficient data into the WAL for it to be usable for hot standby purposes, and that the slave machine is going to be used as a hot standby. You'd better do the former if you're going to do the latter, but there were good reasons not to try to infer one setting from the other. My point was that the string 'hot_standby' is both a variable and a setting. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] underscore split to alias
I am not sure this is a bug, but I was surprised by the following behaviour in HEAD and 8.4.4 (instances built today, 2010.05.14): Invalid (?) values like 123_456 are split before the underscore and interpreted as 123 AS _456:

$ psql -p 6591 -d testdb -c "select 123_456, current_setting('server_version')"
 _456 | current_setting
------+-----------------
  123 | 9.0beta1
(1 row)

$ psql -p 6584 -d testdb -c "select 123_456, current_setting('server_version')"
 _456 | current_setting
------+-----------------
  123 | 8.4.4
(1 row)

Older versions (8.3 thru 7.4) all give an error:

$ psql -p 6583 -d testdb -c "select 123_456, current_setting('server_version')"
ERROR:  syntax error at or near "_456"
LINE 1: select 123_456, current_setting('server_version')
                  ^

The error given by the older servers seems justified. I assume this is a bug and not a deliberate change? thanks, Erik Rijkers -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] underscore split to alias
Erik Rijkers e...@xs4all.nl writes: I am not sure this is a bug, but I was surprised by the following behaviour in HEAD and 8.4.4 (instances built today, 2010.05.14): Invalid (?) values like 123_456 are split before the underscore and interpreted as 123 as 456: All versions of postgres will parse 123_456 as an integer (123) immediately followed by an identifier (_456). In the particular context that this is all of a top-level SELECT item, the SQL spec requires that we parse this as an integer and a column alias (with an implied AS). We failed to do that before 8.4, but now honor the spec requirement that AS can be omitted. Personally I think that's one of the stupider, more error-prone aspects of the spec's syntax, but nonetheless it's required by spec. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
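In other words, since 8.4 the statement Erik ran is parsed the same as the explicit-alias form:

```sql
SELECT 123_456;
-- is, per the SQL spec's implied-AS rule, equivalent to:
SELECT 123 AS _456;  -- one integer column (123), aliased "_456"
```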
Re: [HACKERS] JSON manipulation functions
On Fri, May 14, 2010 at 11:33 AM, Bruce Momjian br...@momjian.us wrote: Joseph Adams wrote: == array/object conversion == The json_object function converts a tuple to a JSON object. If there are duplicate column names, there will be duplicate keys in the resulting JSON object. json_object([content [AS name] [, ...]]) returns json Likewise, the json_array function converts a tuple to a JSON array. Column names are ignored. json_array([content [AS name] [, ...]]) returns json Do you see any problems with the fact that JSON arrays can use mixed data types, e.g.: [ 1, 2, 'hi', false] I suppose the json_object and json_array functions would determine which JSON types to employ by looking at the types of arguments given (TEXT values would become strings, INT/FLOAT/NUMERIC/etc. values would become numbers, TRUE/FALSE would become true/false, NULLs would just be null, and JSON values would just be inserted as themselves). Note that json_array('Hello'::TEXT) would yield '["Hello"]'::JSON, while json_array('Hello'::JSON) would yield '[Hello]'. Going the other way around, values pulled out of JSON objects and arrays would just be of type JSON. This (revised) function signature says it all: json_values(JSON) returns JSON[] In short, I don't believe mixed data types in arrays will be a problem. json_to_* and *_to_json functions would be used for individual conversions. On Fri, May 14, 2010 at 1:15 PM, Robert Haas robertmh...@gmail.com wrote: json_keys gets the keys of a JSON object as a set. json_keys(json) returns setof text I would tend to make this return text[] rather than SETOF text. json_values gets the values of a JSON object or the items of a JSON array as a set. json_values(json) returns setof json Similarly I would make this return json[]. Agreed. For those who want sets, the unnest() function can be used. -> retrieves an item of a JSON object by key. [snip] [] retrieves a value of a JSON array/object by (one-based) index. 
[snip] I think some kind of array dereference and object dereference mechanism is absolutely, positively 100% required. I don't know whether the particular syntax you've proposed here is best or whether we should pick another syntax or just use function notation, but I think we definitely need *something*. If the dereferencing operations aren't available, one could work around it by using json_keys/json_values. Of course, it would be a really clunky solution, and implementing -> will probably be easy compared to implementing those functions. I also think we need a function called something like json_length() which returns the length of a list or the number of keys in an object. Definitely. By the way, I'm considering making it so JSON arrays will be treated like objects when it comes to -> and the json_keys function. Thus, json_keys('[1,4,9,16,25]') would yield '{1,2,3,4,5}', and ('[1,4,9,16,25]'::JSON) -> 3 would yield the third item. This would obviate the need for an array-only subscript function/operator. In general, I prefer zero-based counting, but because PostgreSQL's array indexing is one-based, one-based array keys would be better for the sake of consistency. Note that if there was a function like this in the future: -- Access a JSON object like you would in JavaScript json_path('{"squares": [1,4,9,16,25]}', '.squares[2]') There could be confusion, as JavaScript uses zero-based indexing. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] JSON manipulation functions
On Fri, May 14, 2010 at 10:35 PM, Joseph Adams joeyadams3.14...@gmail.com wrote: By the way, I'm considering making it so JSON arrays will be treated like objects when it comes to -> and the json_keys function. Thus, json_keys('[1,4,9,16,25]') would yield '{1,2,3,4,5}', and ('[1,4,9,16,25]'::JSON) -> 3 would yield the third item. This would obviate the need for an array-only subscript function/operator. In general, I prefer zero-based counting, but because PostgreSQL's array indexing is one-based, one-based array keys would be better for the sake of consistency. Note that if there was a function like this in the future: -- Access a JSON object like you would in JavaScript json_path('{"squares": [1,4,9,16,25]}', '.squares[2]') There could be confusion, as JavaScript uses zero-based indexing. I think you should take Tom's suggestion and use functional notation rather than operator notation. And then I think you should use 0-based counting to match JS. But I'm game to be outvoted if others disagree. Basically, I think you're right: it will be confusing to have two different notations, and we're certainly going to want a JS equivalent of XPath at some point. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] List traffic
[moved to -chat] On Fri, 14 May 2010, Kevin Grittner wrote: I think that's exactly backwards -- we shouldn't have any traffic on -general for issues which could reasonably happen in another list. You can always configure your email to combine lists into a common folder upon receipt. *Exactly* ... the thought that we should increase the volume on any one of the lists seems counter-productive, but, I guess if the @postgresql.org mailing lists are the only ones that someone participates in, maybe they have enough time to keep up on *all* of the email ... ? Marc G. Fournier Hub.Org Hosting Solutions S.A. scra...@hub.org http://www.hub.org Yahoo: yscrappy Skype: hub.org ICQ: 7615664 MSN: scra...@hub.org -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] JSON manipulation functions
2010/5/14 Tom Lane t...@sss.pgh.pa.us: Robert Haas robertmh...@gmail.com writes: On Thu, May 13, 2010 at 9:47 PM, Joseph Adams joeyadams3.14...@gmail.com wrote: [] retrieves a value of a JSON array/object by (one-based) index. In other words, value[n] is equivalent to selecting the nth row of json_values(value) (provided value is of type JSON). Examples: SELECT ('[1,2,3,4]'::JSON)[3] -- Result is '3'::JSON SELECT ('{"a": 1, "b": 2, "c": 3, "d": 4}'::JSON)[3] -- Result is '3'::JSON I think some kind of array dereference and object dereference mechanism is absolutely, positively 100% required. I don't know whether the particular syntax you've proposed here is best or whether we should pick another syntax or just use function notation, but I think we definitely need *something*. Trying to use array notation on something that isn't a SQL array type is guaranteed to be a mess. I strongly recommend that you not attempt that. Just define a function for it. I agree. Maybe you can implement a cast to the hstore datatype. Regards Pavel Stehule regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] predefined macros for various BSD-based systems?
The recently added contrib/pg_upgrade code contains this bit:

/*
 * scandir() is originally from BSD 4.3, which had the third argument as
 * non-const. Linux and other C libraries have updated it to use a const.
 * http://unix.derkeiler.com/Mailing-Lists/FreeBSD/questions/2005-12/msg00214.html
 *
 * Here we try to guess which libc's need const, and which don't. The net
 * goal here is to try to suppress a compiler warning due to a prototype
 * mismatch of const usage. Ideally we would do this via autoconf, but
 * autoconf doesn't have a suitable builtin test and it seems overkill
 * to add one just to avoid a warning.
 */
#elif defined(freebsd) || defined(bsdi) || defined(__darwin__) || defined(openbsd)
	/* no const */
	return scandir(dirname, namelist, (int (*) (struct dirent *)) selector, NULL);
#else
	/* use const */
	return scandir(dirname, namelist, selector, NULL);

This drew my attention a couple days ago because it was picking the wrong alternative on OS X, which was because the as-committed coding was defined(darwin), which is not how that symbol is spelled. I fixed that, but I am now thinking that the other three checks are equally tin-eared. In particular, I see that buildfarm members ermine (FreeBSD) and spoonbill (OpenBSD) are reporting warnings here, which proves that those two platforms don't predefine freebsd or openbsd respectively. Does anyone know if they define __freebsd__ or __freebsd etc? I'm not even too sure what bsdi is, but I'm suspicious of that branch too. 
A search of our code finds

contrib/pg_upgrade/file.c:248: #elif defined(freebsd) || defined(bsdi) || defined(__darwin__) || defined(openbsd)
src/backend/utils/misc/ps_status.c:67: #elif (defined(BSD) || defined(__bsdi__) || defined(__hurd__)) && !defined(__darwin__)
src/include/port.h:355: #if defined(bsdi) || defined(netbsd)
src/port/fseeko.c:20: #if defined(__bsdi__) || defined(__NetBSD__)
src/port/fseeko.c:24: #ifdef bsdi
src/port/fseeko.c:47: #ifdef bsdi
src/port/fseeko.c:55: #ifdef bsdi
src/port/fseeko.c:66: #ifdef bsdi
src/port/fseeko.c:76: #ifdef bsdi
src/port/fseeko.c:87: #ifdef bsdi

which leaves one with not a lot of warm fuzzies that we know how to spell the symbol for either bsdi or netbsd. (Oh, and shouldn't this pg_upgrade check be looking for netbsd too?) In the darwin case we aren't really making any assumptions, because we actually define __darwin__ in port/darwin.h. I suppose that at least some of the *BSD herd really do predefine some of the symbols being attributed to them here, but I would like to see something authoritative about which and what. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] predefined macros for various BSD-based systems?
On Sat, May 15, 2010 at 12:15 AM, Tom Lane t...@sss.pgh.pa.us wrote: I'm not even too sure what bsdi is, but I'm suspicious of that branch too. A search of our code finds It's a commercial distribution of BSD. I remember it being pretty nice when I used it 10+ years ago, but it sounds like it's dead now. Too bad. http://en.wikipedia.org/wiki/Berkeley_Software_Design -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] predefined macros for various BSD-based systems?
Robert Haas robertmh...@gmail.com writes: On Sat, May 15, 2010 at 12:15 AM, Tom Lane t...@sss.pgh.pa.us wrote: I'm not even too sure what bsdi is, but I'm suspicious of that branch too. A search of our code finds It's a commercial distribution of BSD. I remember it being pretty nice when I used it 10+ years ago, but it sounds like it's dead now. Um ... so do you remember which symbol they predefined? It's pretty lame that we can't even spell it consistently in one source file. (Or, if BSDI is so dead that no one remembers, shouldn't we rip out those #if branches?) regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] predefined macros for various BSD-based systems?
On Sat, May 15, 2010 at 12:37 AM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Sat, May 15, 2010 at 12:15 AM, Tom Lane t...@sss.pgh.pa.us wrote: I'm not even too sure what bsdi is, but I'm suspicious of that branch too. A search of our code finds It's a commercial distribution of BSD. I remember it being pretty nice when I used it 10+ years ago, but it sounds like it's dead now. Um ... so do you remember which symbol they predefined? It's pretty lame that we can't even spell it consistently in one source file. (Or, if BSDI is so dead that no one remembers, shouldn't we rip out those #if branches?) No clue. I agree. (Yes.) -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] List traffic
On Fri, May 14, 2010 at 8:39 AM, Marc G. Fournier scra...@hub.org wrote: And IMHO, that is as much a fault of the 'old timers' on the lists as the newbies ... if nobody redirects / loosely enforces the mandates of the various lists, newbies aren't going to learn to post to more appropriate ones ... oh! yeah! that's easy... you say: hey, maybe that list is better for your question... and suddenly you're a piece of crap that should never answer a mail. most people are not prepared to understand the concept of more than one list per project... what i personally do in the spanish list is to read (and when i can) answer questions that have the fewest or no answers first, then those that Alvaro has not commented on yet, and last, if i have time, the other ones. then i read the subjects of the threads in the other lists; if something pops up i read the thread and mark everything else as read -- Jaime Casanova www.2ndQuadrant.com Soporte y capacitación de PostgreSQL -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Synchronous replication patch built on SR
Robert Haas wrote: On Fri, May 14, 2010 at 9:33 AM, Boszormenyi Zoltan z...@cybertec.at wrote: If min_sync_replication_clients == 0, then the replication is async. If min_sync_replication_clients == max_wal_senders then the replication is fully synchronous. If 0 < min_sync_replication_clients < max_wal_senders then the replication is partially synchronous, i.e. the master can wait only for say, 50% of the clients to report back before it's considered synchronous and the relevant transactions get released from the wait. That's an interesting design and in some ways pretty elegant, but it rules out some things that people might easily want to do - for example, synchronous replication to the other server in the same data center that acts as a backup for the master; and asynchronous replication to a reporting server located off-site. No, it doesn't. :-) You didn't take into account the third knob usable in recovery.conf: synchronous_slave = on/off The off-site reporting server can be an asynchronous standby, while the on-site backup server can be synchronous. The only thing you need to take into account is that min_sync_replication_clients shouldn't ever exceed your actual number of synchronous standbys. The setup these three knobs provide is pretty flexible, I think. One of the things that I think we will probably need/want to change eventually is the fact that the master has no real knowledge of who the replication slaves are. The changes I made in my patch partly change that; the server still doesn't know who the standbys are, but there's a call that returns the number of connected _synchronous_ standbys. That might be something we want to change in order to be able to support more configurability. 
Inventing syntax out of whole cloth and leaving the semantics to the imagination of the reader:
CREATE REPLICATION SLAVE reporting_server (mode asynchronous, xid_feedback on);
CREATE REPLICATION SLAVE failover_server (mode synchronous, xid_feedback off, break_synchrep_timeout 30);
-- Bible has answers for everything. Proof: But let your communication be, Yea, yea; Nay, nay: for whatsoever is more than these cometh of evil. (Matthew 5:37) - basics of digital technology. May your kingdom come - superficial description of plate tectonics -- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH http://www.postgresql.at/
Re: [HACKERS] Japanese translation breaks Solaris build
On Fri, 2010-05-14 at 15:04 +0200, Zdenek Kotala wrote: Takahiro Itagaki wrote on Fri, 14.05.2010 at 19:38 +0900: Zdenek Kotala zdenek.kot...@sun.com wrote: http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=comet_moth&dt=2010-05-13%2021:06:01 The problem is that it contains a mix of DOS/Unix line endings. I removed two CRs in ja.po. Thanks. Gothic moth is green again: http://www.pgbuildfarm.org/cgi-bin/show_history.pl?nm=gothic_moth&br=HEAD The rest of the Solaris machines will recover during the night. I have added a check to the admin scripts to prevent this in the future.
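The check Peter mentions is not shown in the thread, but the core of such a check is just detecting carriage returns in a buffer. A minimal sketch, assuming the file contents have been read into a NUL-terminated string (the function name is hypothetical):

```c
#include <stdbool.h>
#include <string.h>

/*
 * Return true if the buffer contains any carriage return ('\r'),
 * i.e. a DOS-style line ending.  A clean Unix-style .po file should
 * contain only '\n' newlines; a mix of the two is what broke the
 * Solaris builds above.
 */
bool
has_dos_line_endings(const char *buf)
{
    return strchr(buf, '\r') != NULL;
}
```

Running this over each translation file before commit would have caught the two stray CRs in ja.po.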
Re: [HACKERS] predefined macros for various BSD-based systems?
On Sat, 2010-05-15 at 00:15 -0400, Tom Lane wrote: I suppose that at least some of the *BSD herd really do predefine some of the symbols being attributed to them here, but I would like to see something authoritative about which and what. http://www.freebsd.org/doc/en_US.ISO8859-1/books/porters-handbook/porting-versions.html
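For illustration, the usual compile-time dispatch on the commonly cited BSD macros looks like the following. Which symbols a given compiler actually predefines is exactly what Tom asks to see verified, so treat the macro list here as an assumption rather than an authoritative inventory:

```c
#include <string.h>

/*
 * Report which BSD flavor, if any, the compiler claims to be
 * targeting, based on the predefined macros commonly attributed
 * to each system.  On a non-BSD platform none of these symbols
 * is defined and the fallback branch is taken.
 */
const char *
bsd_flavor(void)
{
#if defined(__FreeBSD__)
    return "FreeBSD";
#elif defined(__NetBSD__)
    return "NetBSD";
#elif defined(__OpenBSD__)
    return "OpenBSD";
#elif defined(__DragonFly__)
    return "DragonFly";
#else
    return "not a BSD";
#endif
}
```

The FreeBSD Porter's Handbook page linked above additionally documents __FreeBSD_version for finer-grained version checks on that platform.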
Re: [HACKERS] predefined macros for various BSD-based systems?
On Sat, 2010-05-15 at 00:23 -0400, Robert Haas wrote: It's a commercial distribution of BSD. I remember it being pretty nice when I used it 10+ years ago, but it sounds like it's dead now. BSDI is the company that produced BSD/OS, which was Bruce's main development environment at some point, which is why it has left excruciating traces all over the PostgreSQL source.
Re: [HACKERS] Japanese translation breaks Solaris build
Peter Eisentraut pete...@gmx.net writes: On Fri, 2010-05-14 at 15:04 +0200, Zdenek Kotala wrote: The problem is that it contains a mix of DOS/Unix line endings. I have added a check to the admin scripts to prevent this in the future. I wonder if we shouldn't be trying to prevent this at the CVS-checkin level. We've had problems with people committing DOS-ish newlines many times before. regards, tom lane