Re: [HACKERS] Need a mentor, and a project.
2009/12/16 decibel deci...@decibel.org:
> On Dec 11, 2009, at 8:44 PM, Tom Lane wrote:
>> Bruce Momjian br...@momjian.us writes:
>>> Ashish wrote:
>>>> I am thinking about starting with the following TODO item:
>>>> -- Have EXPLAIN ANALYZE issue NOTICE messages when the estimated and
>>>> actual row counts differ by a specified percentage.
>>> I even have a sample patch you can use as a start, attached.
>> IMO the hard part of the TODO item is to design a useful user
>> interface for highlighting specific EXPLAIN entries (and NOTICE
>> messages probably ain't it either). Getting the numbers is trivial.
> What about prefixing explain output with line numbers? NOTICEs (or
> whatever mechanism) could then reference the line numbers.

+1

--
Let's call it Postgres

EnterpriseDB http://www.enterprisedb.com
gurjeet[.sin...@enterprisedb.com
singh.gurj...@{ gmail | hotmail | indiatimes | yahoo }.com
Twitter: singh_gurjeet Skype: singh_gurjeet
Mail sent from my BlackLaptop device
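[Editor's note: an illustrative mock-up of decibel's suggestion. The line-number prefix and the NOTICE wording below are invented for illustration; no such output format exists in any patch.]

```
 1: Hash Join  (cost=1.09..2.29 rows=4 width=8)
        (actual time=0.11..0.52 rows=54 loops=1)
 2:   Hash Cond: (t1.id = t2.id)
 3:   ->  Seq Scan on t1  (cost=0.00..1.04 rows=4 width=4)
               (actual time=0.01..0.08 rows=54 loops=1)
 4:   ->  Hash  ...
NOTICE:  line 3: estimated 4 rows, got 54
```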
Re: [HACKERS] Hot Standby and prepared transactions
On Wed, 2009-12-16 at 09:10 +0200, Heikki Linnakangas wrote:
> Simon Riggs wrote:
>> Investigating how that could come about, it looks like there is some
>> fairly strange stuff going on here.
>> StandbyRecoverPreparedTransactions() is never called at all.
> I told you so:
> http://archives.postgresql.org/message-id/4b260b5e.3010...@enterprisedb.com

OK, I didn't understand that. Removed.

>> * Skip assert when doing SubTransSetParent in
>>   RecoverPreparedTransactions() when in Hot Standby mode because some
>>   subxids will have been marked already
> Seems reasonable. Let's still Assert that the parent is the same as
> last time.

Fix committed to git.

---

That fixes or explains all known issues, from me. Are there any other
things you know about that I haven't responded to? Do you think we have
addressed every issue, except deferred items? I will be looking to
commit to CVS later today; waiting on any objections.

--
Simon Riggs
www.2ndQuadrant.com

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot Standby and prepared transactions
> That fixes or explains all known issues, from me. Are there any other
> things you know about that I haven't responded to? Do you think we
> have addressed every issue, except deferred items? I will be looking
> to commit to CVS later today; waiting on any objections.

Is the following problem reported or fixed?

-----
1. configure with --enable-cassert option, then make, make install
2. initdb, enable WAL archiving
3. run the server
4. run pgbench -i, with scaling factor 10 or more
5. server dies with the following backtrace

(gdb) backtrace
#0  0x009e17a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#1  0x00a22815 in raise () from /lib/tls/libc.so.6
#2  0x00a24279 in abort () from /lib/tls/libc.so.6
#3  0x082dbf98 in ExceptionalCondition (conditionName=0x84201d4 "!(lock->nGranted == 1)", errorType=0x8308dd4 "FailedAssertion", fileName=0x8420fb2 "lock.c", lineNumber=2296) at assert.c:57
#4  0x08231127 in GetRunningTransactionLocks (nlocks=0x0) at lock.c:2296
#5  0x0822c110 in LogStandbySnapshot (oldestActiveXid=0x0, nextXid=0x0) at standby.c:578
#6  0x080cc13f in CreateCheckPoint (flags=32) at xlog.c:6826
#7  0x08204cf6 in BackgroundWriterMain () at bgwriter.c:490
#8  0x080ec291 in AuxiliaryProcessMain (argc=2, argv=0xbff25cc4) at bootstrap.c:413
#9  0x0820b0af in StartChildProcess (type=Variable type is not available.) at postmaster.c:4218
#10 0x0820c722 in reaper (postgres_signal_arg=17) at postmaster.c:2322
#11 signal handler called
#12 0x009e17a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#13 0x00abcbbd in ___newselect_nocancel () from /lib/tls/libc.so.6
#14 0x0820b2b8 in ServerLoop () at postmaster.c:1360
#15 0x0820d59e in PostmasterMain (argc=3, argv=0x8579860) at postmaster.c:1065
#16 0x081b78f8 in main (argc=3, argv=0x8579860) at main.c:188
-----

Also, is the problem reported in
http://archives.postgresql.org/pgsql-hackers/2009-12/msg01324.php
fixed or deferred?

regards,
--
Hiroyuki YAMADA
Kokolink Corporation
yam...@kokolink.net
Re: [HACKERS] Hot Standby and prepared transactions
On Wed, 2009-12-16 at 18:08 +0900, Hiroyuki Yamada wrote:
> > That fixes or explains all known issues, from me. Are there any
> > other things you know about that I haven't responded to? Do you
> > think we have addressed every issue, except deferred items? I will
> > be looking to commit to CVS later today; waiting on any objections.
>
> Is the following problem reported or fixed?

That is fixed, as of a couple of days ago. Thanks for your vigilance.

> Also, is the problem reported in
> http://archives.postgresql.org/pgsql-hackers/2009-12/msg01324.php
> fixed or deferred?

Wasn't aware of it. Am now. Will respond on that thread.

--
Simon Riggs
www.2ndQuadrant.com
Re: [HACKERS] Range types
Tom Lane t...@sss.pgh.pa.us writes:
>> foreach p2_member in unnest(p2) loop
>>   p1 := array(select period_except(p1_member, p2_member)
>>               from unnest(p1) p1_member);
>> end loop;
> But maybe it can be done in a single SQL command.

Yeah, as soon as you have LATERAL, I think. Without it there's no way
to compose SRFs in SQL, AFAIK.

Regards,
--
dim
Re: [HACKERS] Update on true serializable techniques in MVCC
Kevin Grittner wrote:
> Just to make those who care aware of it, here is Michael Cahill's
> doctoral thesis based on implementing Serializable Snapshot Isolation
> in InnoDB using a refined version of the techniques previously used in
> Berkeley DB (and previously discussed on this list):
> http://hdl.handle.net/2123/5353
> Seriously, this post is just for the benefit of those who may be
> interested in following these developments -- I don't have the
> inclination or energy for another round of debate on the topic just
> now. :-/

I understand that, and thank you for the information. Although it may
have seemed that I was out to shoot the idea down, I am interested in
the topic. I guess my way of understanding something is trying to find
holes in it...

I read into the text, and I was particularly interested in how he
solved the problem of phantom reads. Quote:

"The problem [of phantom reads] was identified in (Eswaran et al.,
1976), but the general purpose predicate locking solution suggested
there has not been widely adopted because of the difficulty in testing
mutual satisfiability of predicates. Instead, locking DBMS
implementations commonly use algorithms based on next-key locking. In
these, a range of key space is protected against concurrent insertion
or deletion by acquiring a shared lock on the next row in order, as a
scan is made to check whether rows match a predicate. The scan might be
through the data records or through an index. Inserts and deletes
follow the same protocol, obtaining an exclusive lock on the row after
the one being inserted or deleted. The result of this locking protocol
is that a range scan prevents concurrent inserts or deletes within the
range of the scan, and vice versa."

That sounds like it should actually work.

Yours,
Laurenz Albe
Re: [HACKERS] Streaming replication and non-blocking I/O
Fujii Masao wrote:
> On Tue, Dec 15, 2009 at 3:47 AM, Heikki Linnakangas
> heikki.linnakan...@enterprisedb.com wrote:
>>> Tom Lane wrote:
>>> The very, very large practical problem with this is that if you
>>> decide to change the behavior at any time, the only way to be sure
>>> that the WAL receiver is using the right libpq version is to perform
>>> a soname major version bump. The transformations done by libpq will
>>> essentially become part of its ABI, and not a very visible part at
>>> that.
>> Not having to change the libpq API would certainly be a big
>> advantage.
> Done; I replaced PQgetXLogData and PQputXLogRecPtr with PQgetCopyData
> and PQputCopyData.

Great!

The logical next step is to move the handling of TimelineID and system
identifier out of libpq as well. I'm thinking of refactoring the
protocol along these lines:

0. Begin by connecting to the master just like a normal backend does.
   We don't necessarily need the new ProtocolVersion code either,
   though it's probably still a good idea to reject connections to
   older server versions.

1. Get the system identifier of the master.
   Slave -> Master: Query message, with a query string like
   "GET_SYSTEM_IDENTIFIER"
   Master -> Slave: RowDescription, DataRow, CommandComplete, and
   ReadyForQuery messages. The system identifier is returned in the
   DataRow message. This is identical to what happens when a query is
   executed against a normal backend using the simple query protocol,
   so walreceiver can use PQexec() for this.

2. Another query exchange like above, for timeline ID. (Or these two
   steps can be joined into one query, to eliminate one round-trip.)

3. Request a backup history file, if needed:
   Slave -> Master: Query message, with a query string like
   "GET_BACKUP_HISTORY_FILE XXX" where XXX is XLogRecPtr or file name.
   Master -> Slave: RowDescription, DataRow, CommandComplete and
   ReadyForQuery messages as usual. The file contents are returned in
   the DataRow message.

4. Start replication:
   Slave -> Master: Query message, with a query string like
   "START REPLICATION: ...", where "..." is the RecPtr of the starting
   point.
   Master -> Slave: CopyOutResponse followed by a continuous stream of
   CopyData messages with WAL contents.

This minimizes the changes to the protocol and libpq, with a clear way
of extending by adding new commands. Similar to what you did a long
time ago, connecting as an actual backend at first and then switching
to walsender mode after running a few queries, but this would all be
handled in a separate loop in walsender instead of running as a
full-blown backend.

We'll still need small changes to libpq to allow sending messages back
to the server in COPY_IN mode (maybe add a new COPY_IN_OUT mode for
that).

Thoughts?

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
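[Editor's note: sketched from the receiving (slave) side, the proposed flow might look roughly like the following. This is non-runnable, C-flavoured pseudocode: GET_SYSTEM_IDENTIFIER and START REPLICATION are only the command strings proposed in the message above, not an existing server API, and append_to_wal() is a hypothetical helper.]

```
conn = PQconnectdb("host=master ...");

/* steps 1-2: plain queries via the simple query protocol */
res = PQexec(conn, "GET_SYSTEM_IDENTIFIER");
sysid = PQgetvalue(res, 0, 0);
if (strcmp(sysid, our_sysid) != 0)
    die("master and standby belong to different systems");

/* step 4: switch into COPY mode and consume WAL forever */
res = PQexec(conn, "START REPLICATION: ...");
/* server answers with CopyOutResponse; then streams CopyData */
for (;;)
{
    char *buf;
    int   len = PQgetCopyData(conn, &buf, /* async = */ 0);

    if (len < 0)
        break;                /* copy stream ended or connection lost */
    append_to_wal(buf, len);  /* hypothetical helper */
    PQfreemem(buf);
}
```

The point of the design is visible here: everything before the COPY switch is ordinary PQexec() traffic, so only the final streaming phase needs anything beyond the stock protocol.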
Re: [HACKERS] ECPG patch N+1, fix auto-prepare
> OK, here's another approach. output_statement()'s interface is kept
> as the original, and not this function decides which value it uses.

I still think this could be solved more easily.

> I also introduced static char *ecpg_statement_type_name[] for the
> textual names of the ECPGst_* symbols to keep the preprocessed code
> readable, and minimize the impact on the regression tests. So
> output_statement() always emits ECPGst_* symbols in the preprocessed
> code instead of ECPGst_normal/prepnormal and a numeric value for the
> other two cases. This way only 7 regression tests' source has changed
> instead of 45... There are fewer 1 -> ECPGst_execute and
> 2 -> ECPGst_exec_immediate changes than ECPGst_normal -> 0 changes
> would have been if I chose emitting the numeric value.
> Is it acceptable?

Yes, sure. I changed some small parts of your patch (see above) and
will commit in a few minutes. Just running regression tests.

Michael
--
Michael Meskes
Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
Michael at BorussiaFan dot De, Meskes at (Debian|Postgresql) dot Org
ICQ: 179140304, AIM/Yahoo/Skype: michaelmeskes, Jabber: mes...@jabber.org
VfL Borussia! Forca Barca! Go SF 49ers! Use: Debian GNU/Linux, PostgreSQL
Re: [HACKERS] Streaming replication and non-blocking I/O
I'm interested in abstracting out features of replication from libpq
too. It would be nice if we could implement different communication bus
modules. For example, if you have dozens of replicas you may want to
use something like Spread to distribute the records using multicast.

Sorry for top posting -- I haven't yet figured out how not to in this
client.

On 16 Dec 2009 09:54, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
> [Heikki's protocol refactoring proposal, quoted in full; see the
> previous message in this thread.]
Re: [HACKERS] An example of bugs for Hot Standby
On Tue, 2009-12-15 at 20:25 +0900, Hiroyuki Yamada wrote:
> Hot Standby node can freeze when startup process calls
> LockBufferForCleanup(). This bug can be reproduced by the following
> procedure.

Interesting. Looks like this can happen, which is a shame cos I just
removed the wait checking code after not ever having seen a wait.

Thanks for the report. Must-fix item for HS.

--
Simon Riggs
www.2ndQuadrant.com
Re: [HACKERS] Hot Standby and prepared transactions
On Wed, 2009-12-16 at 18:08 +0900, Hiroyuki Yamada wrote:
> > That fixes or explains all known issues, from me. Are there any
> > other things you know about that I haven't responded to? Do you
> > think we have addressed every issue, except deferred items? I will
> > be looking to commit to CVS later today; waiting on any objections.
> Is the following problem reported or fixed?

That is fixed, as of a couple of days ago. Thanks for your vigilance.

I tested a somewhat older patch (the RC patch in this mailing list).
Sorry for annoying you.

By the way, reading LogStandbySnapshot() and
GetRunningTransactionLocks() raised the following questions:

* There is a window between gathering lock information in
  GetRunningTransactionLocks() and writing WAL in
  LogAccessExclusiveLocks().
* In the current lock redo algorithm, locks are released when the
  transaction holding the lock is committed or aborted.

...then what happens if any transaction holding an ACCESS EXCLUSIVE
lock commits in the window?

Similarly,

* There is a window between writing the COMMIT WAL record in
  RecordTransactionCommit() and releasing locks in
  ResourceOwnerRelease().

...then what happens when GetRunningTransactionLocks() gathers ACCESS
EXCLUSIVE locks whose holder has already written the COMMIT WAL record?
Are there any chances of releasing locks which have no COMMIT WAL
record for releasing them?

regards,
--
Hiroyuki YAMADA
Kokolink Corporation
yam...@kokolink.net
Re: [HACKERS] Update on true serializable techniques in MVCC
2009/12/16 Albe Laurenz laurenz.a...@wien.gv.at:
> Quote: "The problem [of phantom reads] was identified in (Eswaran et
> al., 1976), but the general purpose predicate locking solution
> suggested there has not been widely adopted [...]"
> [thesis passage on next-key locking, quoted in full in the previous
> message]
> That sounds like it should actually work.

That boils down to 2PL, using a granularity that is somewhere between
table locks and single-row locks (note that the latter doesn't
correctly enforce serializability, hence something more coarse which
also locks not-yet-existing rows is needed).

Disadvantages:

1. Unstable latency: depending on whether indexes or table scans are
used (i.e., the plan), other transactions may be blocked for long
durations or not.

2. Unstable susceptibility to deadlocks: idem; it is possible that once
the planner starts to use another kind of scan, your transactions start
to deadlock.

It seems that the proposed SIREAD method fixes at least (1), because
there is no additional blocking involved. I am not sure whether the
serialization failures that it may cause are dependent on the plan
used. If so, that would be very similar to (2).

Nicolas
Re: [HACKERS] Update on true serializable techniques in MVCC
* Albe Laurenz:
> That sounds like it should actually work.

If you have got an index, yes. It seems to me that it would make
locking behavior dependent on your query plan, too.

BTW, PostgreSQL could raise a different error when a unique constraint
violation is detected which involves a row which is not visible in the
current snapshot. At least in my limited experience, that would allow
applications to recover more easily if small transactions fail (similar
to what you have to do on deadlock). Right now (well, at least with
8.3, haven't checked 8.4 yet), it's not possible to tell a unique
constraint violation caused by a phantom from an application bug. (We
currently fake this by retrying a fixed number of times and bailing out
if the error returned by PostgreSQL looks like a unique constraint
violation.)

--
Florian Weimer            fwei...@bfk.de
BFK edv-consulting GmbH   http://www.bfk.de/
Kriegsstraße 100          tel: +49-721-96201-1
D-76133 Karlsruhe         fax: +49-721-96201-99
Re: [HACKERS] ECPG patch N+1, fix auto-prepare
Michael Meskes írta:
>> OK, here's another approach. output_statement()'s interface is kept
>> as the original, and not this function decides which value it uses.
> I still think this could be solved more easily.

Thanks very much for committing it. But I don't understand your change.
My code was:

=================================================================
ecpg.addons:

ECPG: stmtDeleteStmt block
ECPG: stmtInsertStmt block
ECPG: stmtSelectStmt block
ECPG: stmtUpdateStmt block
    { output_statement($1, 1, auto_prepare ? ECPGst_prepnormal : ECPGst_normal); }
=================================================================
output.c:

static char *ecpg_statement_type_name[] = {
    "ECPGst_normal",
    "ECPGst_execute",
    "ECPGst_exec_immediate",
    "ECPGst_prepnormal"
};

void
output_statement(char *stmt, int whenever_mode, enum ECPG_statement_type st)
{
    fprintf(yyout, "{ ECPGdo(__LINE__, %d, %d, %s, %d, ",
            compat, force_indicator,
            connection ? connection : "NULL", questionmarks);
    if (st == ECPGst_normal || st == ECPGst_prepnormal)
    {
        fprintf(yyout, "%s, \"", ecpg_statement_type_name[st]);
        output_escaped_str(stmt, false);
        fputs("\", ", yyout);
    }
    else
        fprintf(yyout, "%s, %s, ", ecpg_statement_type_name[st], stmt);
=================================================================

So the ECPGst_normal vs. prepnormal is decided at the caller and
output_statement() is simplified a bit vs. the original. Your code is:

=================================================================
ecpg.addons:

ECPG: stmtDeleteStmt block
ECPG: stmtInsertStmt block
ECPG: stmtSelectStmt block
ECPG: stmtUpdateStmt block
    { output_statement($1, 1, ECPGst_prepnormal); }
=================================================================
output.c:

static char *ecpg_statement_type_name[] = {
    "ECPGst_normal",
    "ECPGst_execute",
    "ECPGst_exec_immediate",
    "ECPGst_prepnormal"
};

void
output_statement(char *stmt, int whenever_mode, enum ECPG_statement_type st)
{
    fprintf(yyout, "{ ECPGdo(__LINE__, %d, %d, %s, %d, ",
            compat, force_indicator,
            connection ? connection : "NULL", questionmarks);
    if (st == ECPGst_execute || st == ECPGst_exec_immediate)
    {
        fprintf(yyout, "%s, %s, ", ecpg_statement_type_name[st], stmt);
    }
    else
    {
        if (st == ECPGst_prepnormal && auto_prepare)
            fputs("ECPGst_prepnormal, \"", yyout);
        else
            fputs("ECPGst_normal, \"", yyout);
        output_escaped_str(stmt, false);
        fputs("\", ", yyout);
    }
=================================================================

Your code in ecpg.addons calls output_statement() unconditionally with
ECPGst_prepnormal and output_statement() decides what to do with the
auto_prepare global variable. Your code doesn't seem more readable than
mine, but does the same with the currently existing callers.

> [...]
>> Is it acceptable?
> Yes, sure. I changed some small parts of your patch (see above) and
> will commit in a few minutes. Just running regression tests.

Okay, I have to rebase my SQLDA and DESCRIBE patches again, since the
regression tests' results may have changed because of this patch. I
will post them soon. Again, thanks for committing it.

Best regards,
Zoltán Böszörményi

--
Bible has answers for everything. Proof:
"But let your communication be, Yea, yea; Nay, nay: for whatsoever is
more than these cometh of evil." (Matthew 5:37) - basics of digital
technology.
"May your kingdom come" - superficial description of plate tectonics

--
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
http://www.postgresql.at/
Re: [HACKERS] Hot Standby and prepared transactions
Hiroyuki Yamada wrote:
> By the way, reading LogStandbySnapshot() and
> GetRunningTransactionLocks() raised the following questions.
> * There is a window between gathering lock information in
>   GetRunningTransactionLocks() and writing WAL in
>   LogAccessExclusiveLocks().
> * In the current lock redo algorithm, locks are released when the
>   transaction holding the lock is committed or aborted.
> ...then what happens if any transaction holding an ACCESS EXCLUSIVE
> lock commits in the window?

Hmm, when replaying the locks record, we should be ignoring entries
belonging to already-committed transactions. But I don't actually see
that in the code. That needs to be fixed, I think.

> Similarly,
> * There is a window between writing the COMMIT WAL record in
>   RecordTransactionCommit() and releasing locks in
>   ResourceOwnerRelease().
> ...then what happens when GetRunningTransactionLocks() gathers ACCESS
> EXCLUSIVE locks whose holder has already written the COMMIT WAL
> record?

This is handled in the standby in the same manner as the above case.

> Are there any chances of releasing locks which have no COMMIT WAL
> record for releasing them?

Yes. In normal operation, there should eventually be a commit or abort
WAL record for every transaction. But as a safety valve, when a
running-xacts record is replayed, we mark as aborted all transactions
that are older than the oldest still-running XID according to the
running-xacts record, and release their locks. See
StandbyReleaseOldLocks().

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Update on true serializable techniques in MVCC
Nicolas Barbier wrote:
> That boils down to 2PL, using a granularity that is somewhere between
> table locks and single-row locks [...]

That's how I understood it too.

> Disadvantages:
> 1. Unstable latency: depending on whether indexes or table scans are
> used (i.e., the plan), other transactions may be blocked for long
> durations or not.
> 2. Unstable susceptibility to deadlocks: idem; it is possible that
> once the planner starts to use another kind of scan, your transactions
> start to deadlock.
> It seems that the proposed SIREAD method fixes at least (1), because
> there is no additional blocking involved. I am not sure whether the
> serialization failures that it may cause are dependent on the plan
> used. If so, that would be very similar to (2).

Well, I guess that you have to pay somehow for serializability -- there
will be more locks and more lock management.

I did not think of that, but it is really unpleasant if your
transactions suddenly start receiving serialization errors because the
plan has changed. And the thesis says that the tests did not reveal too
many false positives, so maybe it is not that bad.

But maybe that's a price worth paying for serializability?

Yours,
Laurenz Albe
Re: [HACKERS] ECPG patch N+1, fix auto-prepare
On Wed, Dec 16, 2009 at 11:54:41AM +0100, Boszormenyi Zoltan wrote:
> Your code in ecpg.addons calls output_statement() unconditionally with
> ECPGst_prepnormal and output_statement() decides what to do with the
> auto_prepare global variable. Your code doesn't seem more readable
> than mine, but does the same with the currently existing callers.

It had better do the same. :-)

Maybe finding it simpler this way just comes from me being used to the
way this part of the source code works. It's essentially the same, I
only moved that one auto_prepare test out of the parser.

Michael
Re: [HACKERS] Hot Standby and prepared transactions
On Wed, 2009-12-16 at 19:35 +0900, Hiroyuki Yamada wrote:
> Sorry for annoying you.

Not at all! Good to get fresh eyes on this.

--
Simon Riggs
www.2ndQuadrant.com
Re: [HACKERS] Hot Standby and prepared transactions
On Wed, 2009-12-16 at 19:35 +0900, Hiroyuki Yamada wrote:
> * There is a window between gathering lock information in
>   GetRunningTransactionLocks() and writing WAL in
>   LogAccessExclusiveLocks().
> * In the current lock redo algorithm, locks are released when the
>   transaction holding the lock is committed or aborted.
> ...then what happens if any transaction holding an ACCESS EXCLUSIVE
> lock commits in the window?

Yes, that was a problem in that code. Fixed in git. We were doing it
for prepared transactions but not for normal xacts. I will look again
at that code. Thanks very much for reading the code. Any more?!?

> Similarly,
> * There is a window between writing the COMMIT WAL record in
>   RecordTransactionCommit() and releasing locks in
>   ResourceOwnerRelease().
> ...then what happens when GetRunningTransactionLocks() gathers ACCESS
> EXCLUSIVE locks whose holder has already written the COMMIT WAL
> record?

Same issue, so fixed as well. Thanks to Heikki for making that the same
code path, so we have nothing to do for that.

> Are there any chances of releasing locks which have no COMMIT WAL
> record for releasing them?

No.

--
Simon Riggs
www.2ndQuadrant.com
Re: [HACKERS] Range types
On Tue, Dec 15, 2009 at 04:16:28PM +0100, Nicolas Barbier wrote:
> [...] whatever and same whatever as before + the character with the
> lowest value in lexicographical ordering. I don't think it is possible
> to get anything in between those two strings.

Yes, that was basically Andrew's argumentation. I was taken away too
much by the similarity between strings and (decimal, base N) fractions,
forgetting that the latter have lots of unwritten zeros at the end...

But hey, 'twas nice to learn that strings aren't as boring as I first
thought...

Regards
--
tomás
Re: [HACKERS] Range types
On Tue, Dec 15, 2009 at 11:49:19AM -0800, David Fetter wrote:
> On Tue, Dec 15, 2009 at 11:31:05AM -0800, Scott Bailey wrote:
> > Jeff Davis wrote:
> > > On Tue, 2009-12-15 at 10:19 -0500, Tom Lane wrote:
> Would it be OK if we handled float timestamp ranges as continuous and
> int64 timestamps discrete?
> That sounds like a recipe for disaster. Whatever timestamp ranges are,
> float and int64 should be treated the same way so as not to get
> surprises due to implementation details.

This alone would practically preclude discrete -- int and float would
behave quite differently (float's granules growing towards the edges
or having to choose a bigger granule for float than for int in the
first place).

[...]

> FWIW, I think it would be a good idea to treat timestamps as
> continuous in all cases.

This would come as a corollary from the above.

Regards
--
tomás
Re: [HACKERS] An example of bugs for Hot Standby
On Wed, 2009-12-16 at 10:33 +0000, Simon Riggs wrote: On Tue, 2009-12-15 at 20:25 +0900, Hiroyuki Yamada wrote: A Hot Standby node can freeze when the startup process calls LockBufferForCleanup(). This bug can be reproduced by the following procedure. Interesting. Looks like this can happen, which is a shame because I just removed the wait-checking code after not ever having seen a wait. Thanks for the report. Must-fix item for HS. So this deadlock can happen in two places: 1. When a relation lock waits behind an AccessExclusiveLock and then Startup runs LockBufferForCleanup() 2. When Startup is a pin count waiter and a lock acquire begins to wait on a relation lock So we must put direct deadlock detection in both places. We can't use the normal deadlock detector because in case (1) the backend might already have exceeded deadlock_timeout. Proposal: Make Startup wait on a well-known semaphore rather than on its proc-sem. This means we can skip the spinlock check in ProcSendSignal(). For (1), if Startup runs LockBufferForCleanup() and can't get the cleanup lock, it marks itself waiting. It then checks for any lock waiters. If there are > 0 lock waiters, it waits for up to max_standby_delay and then aborts all current lock waiters, none of whom would ever wake if we continued waiting. For (2), if a normal backend goes into a lock wait in HS, it checks whether Startup is waiting; if so, throw ERROR. This can happen immediately, because if Startup is already waiting then to wait for the lock would cause deadlock. -- Simon Riggs www.2ndQuadrant.com
Re: [HACKERS] Patch: Remove gcc dependency in definition of inline functions
On 12/16/09, Kurt Harriman harri...@acm.org wrote: [Please ignore the previous incomplete version of this reply, which I sent by mistake. Sorry for the list noise.] On 12/15/2009 2:09 PM, Marko Kreen wrote: Oh. Ok then. Force-inline seems a better fix, as we may want to use it for other reasons too (converting big macros). So it seems like a good moment to solve it for gcc too. Is ordinary inlining not sufficient? If deluxe inlining is something that might be wanted in the future, but isn't needed right now, perhaps we should wait until then. It allows the effects __forceinline has on MSVC to also be observed on GCC. It should not hurt anything, unless we go overboard with inline usage. But yeah, it can be omitted. I was just wondering about code structure, to make it possible to turn on forceinline for gcc too. Your worry ii) can be ignored; managing to compile on such compilers is already an overachievement. I think so too. With your opinion added to mine, do we constitute a consensus of the pg community? Someone might object that a sample of two individuals is insufficiently representative of the whole, but away with the pedants: let us not quibble over trifles. Note that we are not voting here; we are on a technical discussion list. Interested people can voice counter-arguments; if we can overturn them, we can proceed... The question is now what we should put into configure and what into headers. PostgreSQL seems mostly to follow the GNU tradition of using autoconf rather than thickets of #ifdefs to discover platform characteristics such as supported features, variant spellings of keywords and identifiers, and the like. This often makes it easier to support new platforms, assuming the autoconf mechanism is understood. Autoconf facilitates testing directly for the needed features: generally a better approach than hard-coding knowledge of the peculiarities of particular compilers, compiler release levels, instruction set architectures, etc.
For example, instead of writing #ifdefs to decide "if this is gcc, and the version is high enough, then __funcname__ should work", it's better to let autoconf actually try to compile a program using __funcname__. That way PostgreSQL can keep up with the evolution of improved and new compilers without continually updating an #ifdef-encoded knowledge base of compiler capabilities. Well, yeah, except on one platform we are mostly bypassing the autoconf logic and have a lot of duplicate logic. But I'm not opposed to putting it into configure, assuming the final keyword is still called inline. Simplest would be to have plain AC_C_INLINE in configure and then in a header (c.h?):

#ifdef _MSC_VER
#undef inline
#define inline __forceinline
#else
#ifdef __GNUC__
#undef inline
#define inline inline __attribute__((always_inline))
#endif
#endif

(Not compile tested :) This would force every inline. Why do we want that? The compiler is allowed to uninline a plain inline function if it feels like it. gcc uses a guesstimated instruction count (before inlining) to decide on that issue. ATM it's not a problem, but e.g. if we want to turn more complex macros into inlines (heap_getattr()), it may go over the limit and get uninlined. For gcc, I think the __attribute__ has to come after the function's parameter list, rather than before the return type. No. -- marko
Re: [HACKERS] Patch: Remove gcc dependency in definition of inline functions
On 12/16/09, Robert Haas robertmh...@gmail.com wrote: On Tue, Dec 15, 2009 at 10:34 PM, Kurt Harriman harri...@acm.org wrote: Your worry ii) can be ignored, managing to compile on such compilers is already overachievement. I think so too. With your opinion added to mine, do we constitute a consensus of the pg community? Someone might object that a sample of two individuals is insufficiently representative of the whole, but away with the pedants: let us not quibble over trifles. I haven't completely followed this thread, but I think there has been some discussion of making changes to inline that would cause regressions for people using old, crappy compilers, and I think we should avoid doing that unless there is some compelling benefit. I'm not sure what that benefit would be - I don't think cleaner code is enough. Seems you have not followed the thread... Hypothetical old, crappy compilers would still work, only AC_C_INLINE would turn static inline into plain static, so hypothetically they would get some warnings about unused functions. As this is all hypothetical, I don't see why that should stop us cleaning our code? -- marko -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] idea - new aggregates median, listagg
On Wed, Dec 16, 2009 at 1:36 AM, Pavel Stehule pavel.steh...@gmail.com wrote: b) function listagg (it is analogy of group_concat from MySQL) - it should simplify report generating and some other This is redundant, as it's equivalent to array_to_string(array_agg()). When I implemented it in orafce, almost all the code was parameter checking. The implementation is trivial, because the important part of the work was done with the array_agg support. It is a wrapper over StringInfo. If it's done as syntactic sugar over that, it's fine. If it's a separate implementation, it's a bad idea. Probably best as an optional module of some kind. I am not sure if implementation as syntactic sugar is best. It needs parser changes. I don't think this is important enough to change the parser for it. I don't see a concrete proposal for syntax here, but it seems like the standard function call syntax should be adequate. ...Robert
Re: [HACKERS] Patch: Remove gcc dependency in definition of inline functions
Marko Kreen mark...@gmail.com writes: On 12/16/09, Kurt Harriman harri...@acm.org wrote: For gcc, I think the __attribute__ has to come after the function's parameter list, rather than before the return type. No. [ squint... ] That's nowhere documented that I can find: all the examples in the gcc docs show __attribute__ after the parameters. It does seem to work, but should we rely on it? The bigger problem though is that not all versions of gcc understand always_inline: $ gcc -Wall check.c check.c:3: warning: `always_inline' attribute directive ignored which I think is sufficient reason to put an end to this sub-thread. We have no particular need for force-inline semantics anyway, as long as the compiler behaves reasonably for unreferenced inlines, which gcc always has. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Range types
Dimitri Fontaine dfonta...@hi-media.com writes: Tom Lane t...@sss.pgh.pa.us writes:

foreach p2_member in unnest(p2) loop
  p1 := array(select period_except(p1_member, p2_member)
              from unnest(p1) p1_member);
end loop;

But maybe it can be done in a single SQL command. Yeah, as soon as you have LATERAL, I think. Without it there's no way to compose SRFs in SQL, AFAIK. Hm, how would you do it with LATERAL? The problem is not so much composition as the need for a variable number of rounds of composition. regards, tom lane
Re: [HACKERS] Patch: Remove gcc dependency in definition of inline functions
On Wed, Dec 16, 2009 at 9:30 AM, Marko Kreen mark...@gmail.com wrote: On 12/16/09, Robert Haas robertmh...@gmail.com wrote: On Tue, Dec 15, 2009 at 10:34 PM, Kurt Harriman harri...@acm.org wrote: Your worry ii) can be ignored, managing to compile on such compilers is already overachievement. I think so too. With your opinion added to mine, do we constitute a consensus of the pg community? Someone might object that a sample of two individuals is insufficiently representative of the whole, but away with the pedants: let us not quibble over trifles. I haven't completely followed this thread, but I think there has been some discussion of making changes to inline that would cause regressions for people using old, crappy compilers, and I think we should avoid doing that unless there is some compelling benefit. I'm not sure what that benefit would be - I don't think cleaner code is enough. Seems you have not followed the thread... Hypothetical old, crappy compilers would still work, only AC_C_INLINE would turn static inline into plain static, so hypothetically they would get some warnings about unused functions. As this is all hypothetical, I don't see why that should stop us cleaning our code? I don't think that hypothetical is the right word. There is no question that such compilers exist. What seems to me to be relevant is whether anyone is still using them. Maybe with sufficient research you could demonstrate that there are no platforms we support where these are still in common use - e.g. the oldest version of AIX compiler that has this problem is version X, but PostgreSQL is only supported on versions Y and higher, where Y > X. But I don't see that any attempt has been made to do that research, or at least I haven't seen any attempt to go through our supported platforms and discuss the situation for each one. I'm not really sure there's enough benefit in this project to warrant that effort, but that's up to you to decide.
However, I am pretty confident that there is going to be opposition to degrading performance or causing lots of compiler warnings on older platforms that are still supported. In fact, such opposition has already been registered on this thread and you can find it here: http://archives.postgresql.org/pgsql-hackers/2009-12/msg00239.php Our standard for accepting patches is that (1) they have a demonstrated benefit and (2) they don't break anything. Leaving aside the question of whether this patch makes anything worse, it would really improve the case for this patch if someone could enumerate the supported platforms and compilers where it makes things better. ...Robert
Re: [HACKERS] Fast or immediate shutdown
On tis, 2009-12-15 at 17:19 +0000, Simon Riggs wrote: running with log_checkpoints = on pg_ctl -D foo -m fast stop log says LOG: received fast shutdown request LOG: aborting any active transactions LOG: shutting down LOG: restartpoint starting: shutdown immediate Some of us know that the immediate word refers to the restartpoint request, though that word causes conceptual conflict with the shutdown mode, which was fast, not immediate. Think we need to change the wording of this LOG: restartpoint starting: shutdown immediate so it is clearer what we mean What *do* we mean? And why are we logging it?
Re: [HACKERS] Patch: Remove gcc dependency in definition of inline functions
Marko Kreen mark...@gmail.com writes: Hypothetical old, crappy compilers would still work, only AC_C_INLINE would turn static inline into plain static, so hypothetically they would get some warnings about unused functions. As this is all hypothetical, I don't see why that should stop us cleaning our code? There's nothing hypothetical about it --- I still regularly check that the code builds on an old HP compiler that doesn't have inline. I remind you that the project policy is to not require any compiler features not found in C89. If you can exploit inline on more compilers than now, fine, but assuming that everything has got it is not OK. regards, tom lane
Re: [HACKERS] idea - new aggregates median, listagg
Pavel Stehule pavel.steh...@gmail.com wrote: 2009/12/15 David Fetter da...@fetter.org: http://www.simple-talk.com/sql/t-sql-programming/median-workbench/ In this article there are two medians - statistical and financial. I am for both. But only one can be named median. Well, since the statistical median requires that you specify whether you want the left or right median (in case there are an even number of values), you could either have two median functions, one of which would take an extra parameter for this, or you could have median, left_median, and right_median functions. -Kevin
Re: [HACKERS] Patch: Remove gcc dependency in definition of inline functions
On 12/16/09, Tom Lane t...@sss.pgh.pa.us wrote: Marko Kreen mark...@gmail.com writes: On 12/16/09, Kurt Harriman harri...@acm.org wrote: For gcc, I think the __attribute__ has to come after the function's parameter list, rather than before the return type. No. [ squint... ] That's nowhere documented that I can find: all the examples in the gcc docs show __attribute__ after the parameters. It does seem to work, but should we rely on it? Heh. At least in the 3.0, 3.1 and 4.2 docs there is:

__attribute__((noreturn)) void d0 (void),
    __attribute__((format(printf, 1, 2))) d1 (const char *, ...),
    d2 (void);

describing that an __attribute__ before the declarator list applies to all declarations, but an __attribute__ after a declarator applies only to that declaration. That sort of also explains why all the examples have __attribute__ afterwards. http://gcc.gnu.org/onlinedocs/gcc-3.1.1/gcc/Attribute-Syntax.html But I dare not pick out sentences from Function-Attributes.html that describe that... The bigger problem though is that not all versions of gcc understand always_inline: $ gcc -Wall check.c check.c:3: warning: `always_inline' attribute directive ignored Oh, another argument against logic in headers. which I think is sufficient reason to put an end to this sub-thread. We have no particular need for force-inline semantics anyway, as long as the compiler behaves reasonably for unreferenced inlines, which gcc always has. Ok. -- marko
Re: [HACKERS] Update on true serializable techniques in MVCC
On Wed, Dec 16, 2009 at 4:52 AM, Albe Laurenz laurenz.a...@wien.gv.at wrote: Kevin Grittner wrote: Just to make those who care aware of it, here is Michael Cahill's Doctoral Thesis based on implementing Serializable Snapshot Isolation in InnoDB using a refined version of the techniques previously used in the Berkley DB (and previously discussed on this list): http://hdl.handle.net/2123/5353 Seriously, this post is just for the benefit of those who may be interested in following these developments -- I don't have the inclination or energy for another round of debate on the topic just now. :-/ I understand that, and thank you for the information. Although it may have seemed that I was out to shoot the idea down, I am interested in the topic. I guess my way of understanding something is trying to find holes in it... I read into the text, and I was particularly interested how he solved the problem of phantom reads. Quote: The problem [of phantom reads] was identified in (Eswaran et al., 1976), but the general purpose predicate locking solution suggested there has not been widely adopted because of the difficulty in testing mutual satisfiability of predicates. Instead, locking DBMS implementations commonly use algorithms based on next-key locking. In these, a range of key space is protected against concurrent insertion or deletion by acquiring a shared lock on the next row in order, as a scan is made to check whether rows match a predicate. The scan might be through the data records or through an index. Inserts and deletes follow the same protocol, obtaining an exclusive lock on the row after the one being inserted or deleted. The result of this locking protocol is that a range scan prevents concurrent inserts or delete within the range of the scan, and vice versa. That sounds like it should actually work. Only if you can guarantee that the database will access the rows using some particular index. 
If it gets to the data some other way it might accidentally circumvent the lock. That's kind of a killer in terms of making this work for PostgreSQL. ...Robert
Re: [HACKERS] idea - new aggregates median, listagg
2009/12/15 Pavel Stehule pavel.steh...@gmail.com Hello I am looking at a new feature - the ORDER BY clause in aggregates - and I think we are able to effectively implement some non-standard but well-known aggregates. a) function median - it is a relatively frequent request - with usually slow implementation b) function listagg (it is an analogy of group_concat from MySQL) - it should simplify report generating and some other things What is your opinion? Would you like to see these functions in core? I'm probably missing the point here, but when I originally saw MySQL's group_concat function, I found it odd that it featured ordering functionality. Shouldn't the order be determined by the query itself? Otherwise it's almost as if it's separating the relationship between the result column and the resultset. Thom
Re: [HACKERS] Update on true serializable techniques in MVCC
Moin, On Wednesday 16 December 2009 16:24:42 Robert Haas wrote: Inserts and deletes follow the same protocol, obtaining an exclusive lock on the row after the one being inserted or deleted. The result of this locking protocol is that a range scan prevents concurrent inserts or deletes within the range of the scan, and vice versa. That sounds like it should actually work. Only if you can guarantee that the database will access the rows using some particular index. If it gets to the data some other way it might accidentally circumvent the lock. That's kind of a killer in terms of making this work for PostgreSQL. Isn't the whole topic only relevant for write access? There you have to access the index anyway. Andres
Re: [HACKERS] Update on true serializable techniques in MVCC
On Wed, Dec 16, 2009 at 10:29 AM, Andres Freund and...@anarazel.de wrote: On Wednesday 16 December 2009 16:24:42 Robert Haas wrote: Inserts and deletes follow the same protocol, obtaining an exclusive lock on the row after the one being inserted or deleted. The result of this locking protocol is that a range scan prevents concurrent inserts or delete within the range of the scan, and vice versa. That sounds like it should actually work. Only if you can guarantee that the database will access the rows using some particular index. If it gets to the data some other way it might accidentally circumvent the lock. That's kind of a killer in terms of making this work for PostgreSQL. Isnt the whole topic only relevant for writing access? There you have to access the index anyway. Yeah, I guess you have to insert the new tuple. I guess while you were at it you might check whether the next tuple is locked... ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Patch: Remove gcc dependency in definition of inline functions
On 12/16/09, Tom Lane t...@sss.pgh.pa.us wrote: Marko Kreen mark...@gmail.com writes: Hypothetical old, crappy compilers would still work, only AC_C_INLINE would turn static inline into plain static, so hypothetically they would get some warnings about unused functions. As this is all hypothetical, I don't see why that should stop us cleaning our code? There's nothing hypothetical about it --- I still regularly check that the code builds on an old HP compiler that doesn't have inline. Ok, good. Thus far the only argument was that historically they have existed, which did not sound like a good enough reason to worry about them. If somebody is actually testing and caring about such compilers, they need to be taken more seriously. I remind you that the project policy is to not require any compiler features not found in C89. If you can exploit inline on more compilers than now, fine, but assuming that everything has got it is not OK. Note - my advanced proposal (instead of duplicate macros, let 'static inline' functions fall back to being plain 'static') would still support non-inline compilers, but with a potentially small performance hit. So the plain-C89 compilers would be downgraded to second-class targets, not worth getting max performance out of them. Is this OK? -- marko
Re: [HACKERS] Update on true serializable techniques in MVCC
Albe Laurenz laurenz.a...@wien.gv.at wrote: Although it may have seemed that I was out to shoot the idea down, I am interested in the topic. I guess my way of understanding something is trying to find holes in it... No problem. That's how ideas are explored and improved. The brick wall was that there seemed to be overwhelming opposition to any new technique which causes serialization failures under conditions where the cause isn't blindingly obvious, regardless of the resulting improvements in data reliability. I was particularly interested in how he solved the problem of phantom reads. That sounds like it should actually work. Well, that's certainly not the novel part -- those techniques were described and proven theoretically decades ago and have been in use by many popular products for almost as long. It's that non-blocking SIREAD lock which is the fun part. ;-) I just wish I had time to read all the documents referenced in the footnotes. -Kevin
Re: [HACKERS] Update on true serializable techniques in MVCC
Nicolas Barbier nicolas.barb...@gmail.com wrote: I am not sure whether the serialization failures that it may cause are dependent on the plan used. They are. -Kevin
Re: [HACKERS] idea - new aggregates median, listagg
2009/12/16 Thom Brown thombr...@gmail.com: 2009/12/15 Pavel Stehule pavel.steh...@gmail.com Hello I am looking at a new feature - the ORDER BY clause in aggregates - and I think we are able to effectively implement some non-standard but well-known aggregates. a) function median - it is a relatively frequent request - with usually slow implementation b) function listagg (it is an analogy of group_concat from MySQL) - it should simplify report generating and some other things What is your opinion? Would you like to see these functions in core? I'm probably missing the point here, but when I originally saw MySQL's group_concat function, I found it odd that it featured ordering functionality. Shouldn't the order be determined by the query itself? Otherwise it's almost as if it's separating the relationship between the result column and the resultset. Aggregates such as group_concat or listagg are not typical SQL aggregates. With these aggregates we are able to do some reports on the SQL level without stored procedures. As far as I know, order is determined only for non-hash aggregates - and you cannot specify the method of aggregation, so the possibility to specify ORDER BY is important. But this feature isn't related to this proposal. It was committed yesterday - so you can look at the discussion about this feature. Regards Pavel Stehule Thom
Re: [HACKERS] Fast or immediate shutdown
On Wed, 2009-12-16 at 17:04 +0200, Peter Eisentraut wrote: On tis, 2009-12-15 at 17:19 +0000, Simon Riggs wrote: running with log_checkpoints = on pg_ctl -D foo -m fast stop log says LOG: received fast shutdown request LOG: aborting any active transactions LOG: shutting down LOG: restartpoint starting: shutdown immediate Some of us know that the immediate word refers to the restartpoint request, though that word causes conceptual conflict with the shutdown mode, which was fast, not immediate. Think we need to change the wording of this LOG: restartpoint starting: shutdown immediate so it is clearer what we mean What *do* we mean? And why are we logging it? The words after the colon refer to options sent to RequestCheckpoint, and it is logged because we asked for it by setting log_checkpoints = on. I suggest we say smoothed when the checkpoint option is !immediate. So we will remove the word immediate from the message. -- Simon Riggs www.2ndQuadrant.com
[HACKERS] PostgreSQL project policy compendium
On 12/16/2009 7:10 AM, Tom Lane wrote: the project policy is to not require any compiler features not found in C89. Is there somewhere a compendium of such policies which fledgling hackers should consult to avoid embarrassment? Regards, ... kurt
Re: [HACKERS] [ADMIN] recovery is stuck when children are not processing SIGQUIT from previous crash
Here is a set of patches to address this issue. The first one is a small refactoring of the signal-setting portability business. The second one fixes the SIGQUIT handler inadvertently unblocking SIGQUIT within itself. The third one installs an alarm so that if the ereport() call in quickdie() doesn't finish after 60 seconds, it skips it and finishes up. The precise logic of this could be debated, but it more or less appears to get the job done.

From d70d637ffe60dd3c73690a2640e84ad6e2cdbb50 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut pet...@postgresql.org
Date: Wed, 16 Dec 2009 14:41:10 +0200
Subject: [PATCH 1/3] If there is no sigdelset(), define it as a macro

Removes some duplicate code that recreated the identical workaround.
---
 src/backend/postmaster/bgwriter.c  | 4 ----
 src/backend/postmaster/walwriter.c | 4 ----
 src/backend/tcop/postgres.c        | 4 ----
 src/include/libpq/pqsignal.h       | 6 ++++--
 4 files changed, 4 insertions(+), 14 deletions(-)

diff --git a/src/backend/postmaster/bgwriter.c b/src/backend/postmaster/bgwriter.c
index a2c32a7..81c0d36 100644
--- a/src/backend/postmaster/bgwriter.c
+++ b/src/backend/postmaster/bgwriter.c
@@ -244,11 +244,7 @@ BackgroundWriterMain(void)
 	pqsignal(SIGWINCH, SIG_DFL);
 
 	/* We allow SIGQUIT (quickdie) at all times */
-#ifdef HAVE_SIGPROCMASK
 	sigdelset(&BlockSig, SIGQUIT);
-#else
-	BlockSig &= ~(sigmask(SIGQUIT));
-#endif
 
 	/*
 	 * Initialize so that first time-driven event happens at the correct time.
diff --git a/src/backend/postmaster/walwriter.c b/src/backend/postmaster/walwriter.c
index 90b8cfe..7d95696 100644
--- a/src/backend/postmaster/walwriter.c
+++ b/src/backend/postmaster/walwriter.c
@@ -126,11 +126,7 @@ WalWriterMain(void)
 	pqsignal(SIGWINCH, SIG_DFL);
 
 	/* We allow SIGQUIT (quickdie) at all times */
-#ifdef HAVE_SIGPROCMASK
 	sigdelset(&BlockSig, SIGQUIT);
-#else
-	BlockSig &= ~(sigmask(SIGQUIT));
-#endif
 
 	/*
 	 * Create a resource owner to keep track of our resources (not clear that
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index c985478..c7906bf 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -3292,11 +3292,7 @@ PostgresMain(int argc, char *argv[], const char *username)
 	if (IsUnderPostmaster)
 	{
 		/* We allow SIGQUIT (quickdie) at all times */
-#ifdef HAVE_SIGPROCMASK
 		sigdelset(&BlockSig, SIGQUIT);
-#else
-		BlockSig &= ~(sigmask(SIGQUIT));
-#endif
 	}
 
 	PG_SETMASK(&BlockSig);		/* block everything except SIGQUIT */
diff --git a/src/include/libpq/pqsignal.h b/src/include/libpq/pqsignal.h
index d8e791c..82414d6 100644
--- a/src/include/libpq/pqsignal.h
+++ b/src/include/libpq/pqsignal.h
@@ -26,7 +26,7 @@ extern sigset_t UnBlockSig,
 			StartupBlockSig;
 
 #define PG_SETMASK(mask)	sigprocmask(SIG_SETMASK, mask, NULL)
-#else
+#else							/* not HAVE_SIGPROCMASK */
 extern int	UnBlockSig,
 			BlockSig,
 			StartupBlockSig;
@@ -37,7 +37,9 @@ extern int	UnBlockSig,
 #define PG_SETMASK(mask)	pqsigsetmask(*((int*)(mask)))
 int			pqsigsetmask(int mask);
 #endif
-#endif
+
+#define sigdelset(set, signum)	(*(set) &= ~(sigmask(signum)))
+#endif							/* not HAVE_SIGPROCMASK */
 
 typedef void (*pqsigfunc) (int);
 
-- 
1.6.5

From 96a3bc5fe46e226ff179a729baa58efc16c0aaba Mon Sep 17 00:00:00 2001
From: Peter Eisentraut pet...@postgresql.org
Date: Wed, 16 Dec 2009 16:07:12 +0200
Subject: [PATCH 2/3] Don't unblock SIGQUIT in the SIGQUIT handler

---
 src/backend/tcop/postgres.c  | 1 +
 src/include/libpq/pqsignal.h | 1 +
 2 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index c7906bf..b2fb501 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -2538,6 +2538,7 @@ drop_unnamed_stmt(void)
 void
 quickdie(SIGNAL_ARGS)
 {
+	sigaddset(&BlockSig, SIGQUIT);	/* prevent nested calls */
 	PG_SETMASK(&BlockSig);
 
 	/*
diff --git a/src/include/libpq/pqsignal.h b/src/include/libpq/pqsignal.h
index 82414d6..4659ae2 100644
--- a/src/include/libpq/pqsignal.h
+++ b/src/include/libpq/pqsignal.h
@@ -38,6 +38,7 @@ extern int	UnBlockSig,
 int			pqsigsetmask(int mask);
 #endif
 
+#define sigaddset(set, signum)	(*(set) |= (sigmask(signum)))
 #define sigdelset(set, signum)	(*(set) &= ~(sigmask(signum)))
 #endif							/* not HAVE_SIGPROCMASK */
-- 
1.6.5

From bebb95abe7a55173cab0558da3373d6a3631465b Mon Sep 17 00:00:00 2001
From: Peter Eisentraut pet...@postgresql.org
Date: Wed, 16 Dec 2009 17:19:14 +0200
Subject: [PATCH 3/3] Time out the ereport() call in quickdie() after 60 seconds

---
 src/backend/tcop/postgres.c | 25 +
 1 files changed, 25 insertions(+), 0 deletions(-)

diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index b2fb501..ab6805a 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -191,6 +191,7 @@ static bool IsTransactionExitStmtList(List *parseTrees);
 static bool IsTransactionStmtList(List *parseTrees);
Re: [HACKERS] Patch: Remove gcc dependency in definition of inline functions
Marko Kreen mark...@gmail.com writes: So the plain-C89 compilers would be downgraded to second-class targets, not worth getting max performance out of them. Hm? Failing to inline is already a performance hit, which is why Kurt got interested in this in the first place. I think you're way overthinking this. Where we started was just a proposal to try to expand the set of inline-ing compilers beyond gcc only. I don't see why we need to do anything but that. The code is fine as-is except for the control #ifdefs. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PostgreSQL project policy compendium
On Wed, 2009-12-16 at 07:44 -0800, Kurt Harriman wrote: On 12/16/2009 7:10 AM, Tom Lane wrote: the project policy is to not require any compiler features not found in C89. Is there somewhere a compendium of such policies which fledgling hackers should consult to avoid embarrassment? The installation instructions say that you need a C89 compiler.
Re: [HACKERS] PostgreSQL project policy compendium
On Wed, Dec 16, 2009 at 10:44 AM, Kurt Harriman harri...@acm.org wrote: On 12/16/2009 7:10 AM, Tom Lane wrote: the project policy is to not require any compiler features not found in C89. Is there somewhere a compendium of such policies which fledgling hackers should consult to avoid embarrassment? Well the best way to avoid embarrassment is to just not be embarrassed. I don't think anyone will think ill of you for suggesting something that turns out not to be where the project wants to go, as long as you're not too dogmatic about it. This particular point is mentioned in the documentation: http://www.postgresql.org/docs/8.4/static/install-requirements.html The rest of chapter 15 is worth a read, too. I do think your basic point is well-taken, though. There are a lot of details about how we do things that I have had to learn through osmosis, and maybe if I can figure out what they all are, I'll try to write them down somewhere, if someone else hasn't done it already. ...Robert
Re: [HACKERS] [ADMIN] recovery is stuck when children are not processing SIGQUIT from previous crash
Peter Eisentraut wrote: Here is a set of patches to address this issue. The first one is a small refactoring of the signal setting portability business. This one looks like it should be applied immediately to get some buildfarm coverage (and alpha3) -- Alvaro Herrera    http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Re: [HACKERS] PostgreSQL project policy compendium
Robert Haas robertmh...@gmail.com writes: On Wed, Dec 16, 2009 at 10:44 AM, Kurt Harriman harri...@acm.org wrote: Is there somewhere a compendium of such policies which fledgling hackers should consult to avoid embarrassment? I do think your basic point is well-taken, though. There are a lot of details about how we do things that I have had to learn through osmosis, and maybe if I can figure out what they all are, I'll try to write them down somewhere, if someone else hasn't done it already. Yeah, this is something that is difficult for the senior hackers to undertake because we've forgotten what it was we had to learn the hard way ;-). Maybe some of the newer project members could collaborate on scribbling down such stuff (wiki page?). regards, tom lane
Re: [HACKERS] Patch: Remove gcc dependency in definition of inline functions
Tom Lane wrote: I think you're way overthinking this. Where we started was just a proposal to try to expand the set of inline-ing compilers beyond gcc only. I don't see why we need to do anything but that. The code is fine as-is except for the control #ifdefs. IIRC Kurt was also on about getting rid of some ugly macros that could instead be coded as inline functions (fastgetattr for example) -- Alvaro Herrera    http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] idea - new aggregates median, listagg
Thom Brown wrote: 2009/12/15 Pavel Stehule pavel.steh...@gmail.com Hello I am looking on new feature - ORDER clause in aggregate, and I think we are able to effectively implement some non-standard, but well known aggregates. a) function median - it is a relatively frequent request - with usually slow implementation b) function listagg (it is an analogy of group_concat from MySQL) - it should simplify report generating and some other things What is your opinion? Would you like to see these functions in core? I'm probably missing the point here, but when I originally saw MySQL's group_concat function, I found it odd that it featured ordering functionality. Shouldn't the order be determined by the query itself? Otherwise it's almost as if it's separating the relationship between the result column and the resultset. For xmlagg in particular, it is quite useful to be able to order the results. And sorting the query doesn't work for ordering the agg unless you do it in a subquery. Oracle has this functionality and it is quite handy. It would be nice to see listagg with the option to order as well. Scott
Re: [HACKERS] Patch: Remove gcc dependency in definition of inline functions
Marko Kreen mark...@gmail.com writes: On 12/15/09, Kurt Harriman harri...@acm.org wrote: Attached is a revised patch, offered for the 2010-01 commitfest. It's also available in my git repository in the submitted branch: http://git.postgresql.org/gitweb?p=users/harriman/share.git;a=shortlog;h=refs/heads/submitted -1. The PG_INLINE is ugly. FWIW, I think the patch is largely OK, except for the autoconf hackery which I'm not the best-qualified person to opine on. I would only suggest that the cleanest coding would be

#ifdef USE_INLINE
static inline foo(...) ...
#else
... non-inline definition of foo
#endif

ie, go ahead and rely on autoconf's definition (if any) of inline and add a policy symbol USE_INLINE to determine whether to use it. The proposed PG_INLINE coding conflates the symbol needed in the code with the policy choice. Another possibility would be to call the policy symbol HAVE_INLINE, but that (a) risks collision with a name defined by autoconf built-in macros, and (b) looks like it merely indicates whether the compiler *has* inline, not that we have made a choice about how to use it. regards, tom lane
Re: [HACKERS] Patch: Remove gcc dependency in definition of inline functions
On 12/16/09, Tom Lane t...@sss.pgh.pa.us wrote: Marko Kreen mark...@gmail.com writes: So the plain-C89 compilers would be downgraded to second-class targets, not worth getting max performance out of them. Hm? Failing to inline is already a performance hit, which is why Kurt got interested in this in the first place. I think you're way overthinking this. Where we started was just a proposal to try to expand the set of inline-ing compilers beyond gcc only. I don't see why we need to do anything but that. The code is fine as-is except for the control #ifdefs. My proposal is basically about allowing more widespread use of static inline. That is - static inline does not need to be paired with an equivalent macro. But if C89 compilers are still the project's primary target, then this cannot be allowed. -- marko
Re: [HACKERS] [ADMIN] recovery is stuck when children are not processing SIGQUIT from previous crash
Peter Eisentraut pete...@gmx.net writes: Here is a set of patches to address this issue. The first one is a small refactoring of the signal setting portability business. OK The second one fixes the SIGQUIT handler inadvertently unblocking SIGQUIT within itself. OK The third one installs an alarm so that if the ereport() call in quickdie() doesn't finish after 60 seconds, it skips it and finishes up. The precise logic of this could be debated, but it more or less appears to get the job done. I'm not too happy with depending on alarm(), which according to the pgbench sources is not portable to Windows. The postmaster does something equivalent using enable_sig_alarm(); can we use that? A different line of thought is that this still doesn't fix the problem because you're depending on a fair amount of libc infrastructure that might not be too reentrant (and the same objection could probably be mounted against enable_sig_alarm). It would be better for the postmaster to be in charge of enforcing the timeout, and have it issue SIGKILL against children that don't die fast enough to suit it. regards, tom lane
Re: [HACKERS] Patch: Remove gcc dependency in definition of inline functions
Alvaro Herrera alvhe...@commandprompt.com writes: Tom Lane wrote: I think you're way overthinking this. Where we started was just a proposal to try to expand the set of inline-ing compilers beyond gcc only. I don't see why we need to do anything but that. The code is fine as-is except for the control #ifdefs. IIRC Kurt was also on about getting rid of some ugly macros that could instead be coded as inline functions (fastgetattr for example) I'd just bounce that as useless activity. If they are macros now, and work, the only possible effects of changing them are negative. regards, tom lane
Re: [HACKERS] PostgreSQL project policy compendium
Kurt Harriman wrote: On 12/16/2009 7:10 AM, Tom Lane wrote: the project policy is to not require any compiler features not found in C89. Is there somewhere a compendium of such policies which fledgling hackers should consult to avoid embarrassment? The list of suggestions at http://wiki.postgresql.org/wiki/Developer_FAQ and http://wiki.postgresql.org/wiki/Submitting_a_Patch are what we've got now. There's a number of us who try to record any interesting discussion bits like this one into one of those, to shorten the future learning curve for others. -- Greg Smith    2ndQuadrant Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.com
Re: [HACKERS] Range types
On Sun, 2009-12-13 at 23:49 -0800, Scott Bailey wrote: So basically I have an anyrange pseudo type with the functions prev, next, last, etc defined. So instead of hard coding range types, we would allow the user to define their own range types. Basically if we are able to determine the previous and next values of the base types we'd be able to define a range type. I'm envisioning in a manner much like defining an enum type. After an off-list discussion with Scott, I think there may be a solution here that works for everyone if we don't try so hard to unify the implementation of discrete and continuous ranges. The API should be very similar, of course, but the implementation doesn't need to be. Continuous ranges absolutely require the following information: start, end, and inclusivity information. But discrete ranges can instead be stored by counting the number of granules from the start point. For instance, it could be stored as: start, num_granules. That has a lot of benefits for discrete ranges of time. First of all, it allows the algebra to work reasonably well for the days and months part of the interval, so we can allow a granule of 1 day/week/month/year for a timestamp range. For output of the range, we can then just multiply the granule by the number of granules, and add that to the start time; thus avoiding the incremental addition problem with date math. I think this works reasonably well for timestamp/date ranges -- let me know if there is a problem here (aside from timestamptz, which I address below). Secondly, in the case of a timestamp range, we can use 7 bytes for storing the number of granules rather than another full 8-byte timestamp, leaving one byte for flags to represent NULL boundaries, infinite boundaries, etc. For timestamps that would still mean that an interval could be 2000 years long with '1 microsecond' granularity. For dates, 3 bytes is sufficient for a date range 45000 years long with granules of '1 day'. 
That means that we can get back down to a 16 byte representation for timestamp ranges, or 8 byte representation for date ranges. There are a few details, like infinite ranges, but those can be pretty easily solved with flags as well. There's one problem, and that's for timestamptz ranges with intervals that include days and months. Timezone adjustments are just not well-defined for that kind of granule (nor would it be particularly useful even if it magically worked), so this would have to be blocked somehow. I think that's a special case, and we could provide the user with a nice error message telling the user to use a date or timestamp range instead. So, the idea is to default to a continuous range type, but if the user supplies a granule, prior and next functions, and other necessary details, then it becomes a discrete range type.

* continuous ranges can still have everything that everyone wants, including flags to indicate special values.
* discrete range granule is specified explicitly, so it's not an implementation detail
* discrete ranges can have a compact representation
* discrete ranges would still have room for flags to indicate special values

Comments? Regards, Jeff Davis
Re: [HACKERS] Range types
On Wed, Dec 16, 2009 at 12:31 PM, Jeff Davis pg...@j-davis.com wrote: There's one problem, and that's for timestamptz ranges with intervals that include days and months. Timezone adjustments are just not well-defined for that kind of granule (nor would it be particularly useful even if it magically worked), so this would have to be blocked somehow. I think that's a special case, and we could provide the user with a nice error message telling the user to use a date or timestamp range instead. This seems like a fairly special-purpose type. You'd be targeting it at people who are very concerned with storing large numbers of these (so they really care about space consumption) but for some reason don't need to mix days and months (actually, the current interval representation stores days, months, and seconds separately). I certainly think this might be useful to some people but it doesn't really sound like a general range type facility, since it seems to involve some hacks that are fairly datatype-specific. ...Robert
Re: [HACKERS] Range types
Jeff Davis pg...@j-davis.com writes: [ hacky special-case representation for discrete timestamp ranges ] I'm still not exactly clear on what the use-case is for discrete timestamp ranges, and I wonder how many people are going to be happy with a representation that can't handle a range that's open-ended on the left. So, the idea is to default to a continuous range type, but if the user supplies a granule, prior and next functions, and other necessary details, then it becomes a discrete range type. Huh? You're not going to be able to have a special case data representation for one or two data types at the same time as you have a function-based datatype-independent concept of a parameterized range type. Well, maybe you could have special code paths for just date and timestamp but it'd be horrid. More importantly, the notion of a representation granule is still 100% wishful thinking for any inexact-representation datatype, which is going to be a severe crimp in getting this accepted for timestamp, let alone defining it in a way that would allow users to try to apply it to floats. Float timestamps might not be the default case anymore but they are still supported. I think you should let go of the feeling that you have to shave bytes off the storage format. You're creating a whole lot of work for yourself and a whole lot of user-visible corner cases in return for what ultimately isn't much. regards, tom lane
Re: [HACKERS] Range types
Jeff Davis wrote: On Sun, 2009-12-13 at 23:49 -0800, Scott Bailey wrote: So basically I have an anyrange pseudo type with the functions prev, next, last, etc defined. [...] So, the idea is to default to a continuous range type, but if the user supplies a granule, prior and next functions, and other necessary details, then it becomes a discrete range type. [...] Comments? As I pointed out off-list, I think the granularity for timestamp range should be limited to hours and smaller. Anything larger is asking for trouble. And quite honestly if they wanted day granularity, they should use date range. Also, I think the granule should be the same type as returned when subtracting two subtypes. So the granule of a date range should be int, not interval. And if the user wanted something with month granularity, perhaps an enum range of 'MM' would be better.
Quite honestly the following 3 cases would probably meet 99% of need:

CREATE TYPE period AS RANGE(timestamptz(0), interval '1 s');
CREATE TYPE period AS RANGE(timestamptz(3), interval '1 ms');
CREATE TYPE period AS RANGE(timestamptz, interval '1 us');
Re: [HACKERS] Update on true serializable techniques in MVCC
Robert Haas wrote: On Wed, Dec 16, 2009 at 10:29 AM, Andres Freund and...@anarazel.de wrote: On Wednesday 16 December 2009 16:24:42 Robert Haas wrote: Inserts and deletes follow the same protocol, obtaining an exclusive lock on the row after the one being inserted or deleted. The result of this locking protocol is that a range scan prevents concurrent inserts or deletes within the range of the scan, and vice versa. That sounds like it should actually work. Only if you can guarantee that the database will access the rows using some particular index. If it gets to the data some other way it might accidentally circumvent the lock. That's kind of a killer in terms of making this work for PostgreSQL. Isn't the whole topic only relevant for writing access? There you have to access the index anyway. Yeah, I guess you have to insert the new tuple. I guess while you were at it you might check whether the next tuple is locked... So you'd have to disable HOT updates when true serializability was active? -- Alvaro Herrera    http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Re: [HACKERS] Range types
Tom Lane wrote: Jeff Davis pg...@j-davis.com writes: [ hacky special-case representation for discrete timestamp ranges ] I'm still not exactly clear on what the use-case is for discrete timestamp ranges, and I wonder how many people are going to be happy with a representation that can't handle a range that's open-ended on the left. They wouldn't. But the timestamp data would be the anchor, not necessarily the start point. As long as we allow ranges unbounded on both ends we'd be ok.
Re: [HACKERS] Update on true serializable techniques in MVCC
On Wed, Dec 16, 2009 at 1:14 PM, Alvaro Herrera alvhe...@commandprompt.com wrote: Robert Haas wrote: On Wed, Dec 16, 2009 at 10:29 AM, Andres Freund and...@anarazel.de wrote: On Wednesday 16 December 2009 16:24:42 Robert Haas wrote: Inserts and deletes follow the same protocol, obtaining an exclusive lock on the row after the one being inserted or deleted. The result of this locking protocol is that a range scan prevents concurrent inserts or deletes within the range of the scan, and vice versa. That sounds like it should actually work. Only if you can guarantee that the database will access the rows using some particular index. If it gets to the data some other way it might accidentally circumvent the lock. That's kind of a killer in terms of making this work for PostgreSQL. Isn't the whole topic only relevant for writing access? There you have to access the index anyway. Yeah, I guess you have to insert the new tuple. I guess while you were at it you might check whether the next tuple is locked... So you'd have to disable HOT updates when true serializability was active? I thought about that, but I don't think so. HOT only applies to updates, and predicate locking only applies to inserts. Unless I have my head in the sand? ...Robert
Re: [HACKERS] Update on true serializable techniques in MVCC
On Wed, Dec 16, 2009 at 1:25 PM, Robert Haas robertmh...@gmail.com wrote: On Wed, Dec 16, 2009 at 1:14 PM, Alvaro Herrera alvhe...@commandprompt.com wrote: Robert Haas wrote: On Wed, Dec 16, 2009 at 10:29 AM, Andres Freund and...@anarazel.de wrote: On Wednesday 16 December 2009 16:24:42 Robert Haas wrote: Inserts and deletes follow the same protocol, obtaining an exclusive lock on the row after the one being inserted or deleted. The result of this locking protocol is that a range scan prevents concurrent inserts or deletes within the range of the scan, and vice versa. That sounds like it should actually work. Only if you can guarantee that the database will access the rows using some particular index. If it gets to the data some other way it might accidentally circumvent the lock. That's kind of a killer in terms of making this work for PostgreSQL. Isn't the whole topic only relevant for writing access? There you have to access the index anyway. Yeah, I guess you have to insert the new tuple. I guess while you were at it you might check whether the next tuple is locked... So you'd have to disable HOT updates when true serializability was active? I thought about that, but I don't think so. HOT only applies to updates, and predicate locking only applies to inserts. Unless I have my head in the sand? Err, no, wait. Predicate locking can apply to updates, but since HOT updates never update an indexed column, I think we might still be OK? ...Robert
Re: [HACKERS] Update on true serializable techniques in MVCC
Robert Haas írta: On Wed, Dec 16, 2009 at 1:25 PM, Robert Haas robertmh...@gmail.com wrote: On Wed, Dec 16, 2009 at 1:14 PM, Alvaro Herrera alvhe...@commandprompt.com wrote: Robert Haas escribió: On Wed, Dec 16, 2009 at 10:29 AM, Andres Freund and...@anarazel.de wrote: On Wednesday 16 December 2009 16:24:42 Robert Haas wrote: Inserts and deletes follow the same protocol, obtaining an exclusive lock on the row after the one being inserted or deleted. The result of this locking protocol is that a range scan prevents concurrent inserts or delete within the range of the scan, and vice versa. That sounds like it should actually work. Only if you can guarantee that the database will access the rows using some particular index. If it gets to the data some other way it might accidentally circumvent the lock. That's kind of a killer in terms of making this work for PostgreSQL. Isnt the whole topic only relevant for writing access? There you have to access the index anyway. Yeah, I guess you have to insert the new tuple. I guess while you were at it you might check whether the next tuple is locked... So you'd have to disable HOT updates when true serializability was active? I thought about that, but I don't think so. HOT only applies to updates, and predicate locking only applies to inserts. Unless I have my head in the sand? Err, no, wait. Predicate locking can apply to updates, but since HOT updates never update an indexed column, I think we might still be OK? A predicate can include columns from an index plus others. Am I missing something? ...Robert -- Bible has answers for everything. Proof: But let your communication be, Yea, yea; Nay, nay: for whatsoever is more than these cometh of evil. (Matthew 5:37) - basics of digital technology. 
May your kingdom come - superficial description of plate tectonics -- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH http://www.postgresql.at/
Re: [HACKERS] Update on true serializable techniques in MVCC
Alvaro Herrera alvhe...@commandprompt.com wrote: So you'd have to disable HOT updates when true serializability was active? I wouldn't think so; but someone familiar with HOT logic could probably determine whether the unmodified algorithm could be used by reviewing the simplifying assumptions near the bottom of page 42, and the page about Generalizing to other database engines (section 4.8). I think those portions might stand on their own without reading the rest of the document. -Kevin
Re: [HACKERS] Range types
Ok, silly question here. But how do you determine the length of a continuous range? By definition length of [a, b) and (a, b] = b-a. But what about (a,b) and [a,b]? Are we saying that because they are continuous, the difference between values included in the range and those excluded are so infinitesimally small so as not to matter? Thus length (a,b) == length [a,b] == length [a,b)? And if that is the case, does the inclusiveness of the range really even matter? And can anyone point me to a reference for working with continuous ranges? Google just insists that I spelled contiguous wrong. Scott
Re: [HACKERS] Range types
Scott Bailey arta...@comcast.net writes: As I pointed out off-list, I think the granularity for timestamp range should be limited to hours and smaller. Anything larger is asking for trouble. And quite honestly if they wanted day granularity, they should use date range. I'm still not real clear on what the expected use-case is for this. You're evidently envisioning applications where the allowed form of an interval is constrained, but in the cases I can think of, the constraints are a lot more convoluted than what you're proposing. For example, if you're trying to do classroom scheduling, it might be useful to constrain the periods to start and end on hour boundaries --- but the next thing you'll want is to have it know that the next slot after 5pm Friday is 8am Monday. Except on holidays. And then there's the fact that my alma mater starts most hour-long classes on the half hour. I think that wiring such constraints into the low-level datatype is doomed to failure. What you'd be better off with is a function that finds the next period given a current period and some suitable representation of the off-limits intervals. The argument for having granularity wired into the datatype seems to boil down to just space savings. I don't find that compelling enough to justify code contortions and user-visible restrictions on functionality. regards, tom lane
Re: [HACKERS] XLogInsert
2009/12/15 Greg Smith g...@2ndquadrant.com Jaime Casanova wrote: So in this extreme case avg tps is just 6 transactions better Great job trying to find the spot where the code worked better. I'm not so sure I trust pgbench results where the TPS was so low though. Which leads us right back to exactly how Jeff measured his original results. As I said already, I think we need more insight into Jeff's performance report, a way to replicate that test, to look a bit at the latency as reported by the updated LWLock patch that Pierre submitted. Tweaking your test to give more useful results is a nice second opinion on top of that. But we're out of time for now, so this patch is getting returned with feedback. I encourage Jeff to resubmit the same patch or a better one with a little more data on performance measurements to our final 8.5 CommitFest in hopes we can confirm this is an improvement worth committing. Last week I worked on a FUSE-based filesystem, which I call BlackholeFS. It's similar to /dev/null, but for directories. Basically it simply returns success for all the writes, but doesn't do any writes on the files under it. Would moving the pg_xlog/ (and possibly table data too) to such a filesystem exercise this patch better? Best regards, -- Lets call it Postgres EnterpriseDB http://www.enterprisedb.com gurjeet[.sin...@enterprisedb.com singh.gurj...@{ gmail | hotmail | indiatimes | yahoo }.com Twitter: singh_gurjeet Skype: singh_gurjeet Mail sent from my BlackLaptop device
Re: [HACKERS] XLogInsert
On Wednesday 16 December 2009 20:07:07 Gurjeet Singh wrote: Last week I worked on a FUSE-based filesystem, which I call BlackholeFS. It's similar to /dev/null, but for directories. Basically it simply returns success for all the writes, but doesn't do any writes on the files under it. I doubt that it will be faster than a tmpfs - the additional context switches et al probably will hurt already. If you constrain checkpoint_segments to something sensible it shouldn't use too much memory. Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Range types
On Wed, 2009-12-16 at 12:42 -0500, Robert Haas wrote: On Wed, Dec 16, 2009 at 12:31 PM, Jeff Davis pg...@j-davis.com wrote: There's one problem, and that's for timestamptz ranges with intervals that include days and months. Timezone adjustments are just not well-defined for that kind of granule (nor would it be particularly useful even if it magically worked), so this would have to be blocked somehow. I think that's a special case, and we could provide the user with a nice error message telling the user to use a date or timestamp range instead. This seems like a fairly special-purpose type. You'd be targeting it at people who are very concerned with storing large numbers of these (so they really care about space consumption) but for some reason don't need to mix days and months (actually, the current interval representation stores days, months, and seconds separately). I certainly think this might be useful to some people but it doesn't really sounds like a general range type facility, since it seems to involve some hacks that are fairly datatype-specific. My statement should have read days or months. In other words, you can't have a timestamptz range with a granularity of '3 days'. But if that's your granularity, as Scott says, you should be using a date range, not a timestamptz range. Timestamptz ranges are only really useful when you have a granularity measured in seconds (or some fraction or multiple thereof). Otherwise, the timezone adjustment doesn't make any sense. So this isn't a case of limited functionality, just that we need to inform the user that a timestamptz range with granularity '1 day' or '1 month' makes no sense. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] PATCH: Add hstore_to_json()
I just realized that this was easy to do, and despite my complete lack of C skillz was able to throw this together in a couple of hours. It might be handy to some, though the possible downsides are: * No json_to_hstore(). * Leads to requests for hstore_to_yaml(), hstore_to_xml(), etc. * Andrew Gierth said “no” when I suggested it. But it's kind of handy, too. Thoughts? Best, David hstore_to_json.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] pinging Dano
Hi, to the main editor named Dano of the wiki page about Parallel Query Execution (http://wiki.postgresql.org/wiki/Parallel_Query_Execution): please speak up. Is there any code or patch available ATM? What discussion with Tom and Simon is that page referring to? Regards Markus Wanner -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Range types
On Wed, 2009-12-16 at 12:50 -0500, Tom Lane wrote: I'm still not exactly clear on what the use-case is for discrete timestamp ranges, and I wonder how many people are going to be happy with a representation that can't handle a range that's open-ended on the left. Huh? We're miscommunicating somewhere. Discrete ranges are values, and those values can be displayed a number of different ways. That's one of the biggest advantages. The very same discrete range can be displayed as open-open, closed-open, open-closed, or closed-closed. There are edge cases, like how infinity is never closed, and overflow conditions. But generally speaking, you have more options for presenting a discrete range than a continuous range. The range [5, 7) is equal to the set {5, 6} and equal to the ranges: (4,7), (4,6], [5,7), and [5,6]. One application can insert it as [5,7) and another can read it as (4,6]. That's the use case: the application's preferences don't have to match. It's OK to mix various representation preferences, because you can convert between them. The on disk format happens to hint at one particular canonical form, but doesn't enforce that on anyone. Huh? You're not going to be able to have a special case data representation for one or two data types at the same time as you have a function-based datatype-independent concept of a parameterized range type. Well, maybe you could have special code paths for just date and timestamp but it'd be horrid. They aren't supposed to be exactly the same API, I said that from the start. There are API differences between continuous and discrete ranges, and we shouldn't ignore them. One important differences is that (barring overflow conditions and special values) prior, first, last, and next are defined for all discrete range values, but not for all continuous range values. For instance, the discrete range [5,7) has prior=4, first=5, last=6, next=7. Whereas the continuous range [5,7) has prior=undef, first=5, last=undef, next=7. 
We could define one API, that treats discrete and continuous ranges differently. But you'll never be able to transform a continuous range to a different representation, while you can do so with a discrete range. More importantly, the notion of a representation granule is still 100% wishful thinking for any inexact-representation datatype, which is going to be a severe crimp in getting this accepted for timestamp, let alone defining it in a way that would allow users to try to apply it to floats. Float timestamps might not be the default case anymore but they are still supported. If the only objection is that a superuser can confuse the system by poorly defining a range type on a non-default build, I think that objection can be overcome. I think you should let go of the feeling that you have to shave bytes off the storage format. You're creating a whole lot of work for yourself and a whole lot of user-visible corner cases in return for what ultimately isn't much. This isn't just to shave bytes. It's also because I like the semantics of discrete ranges for many cases. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
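Jeff's point that one stored discrete range can be read back in any of the four inclusivity styles can be sketched in a few lines. This is a hypothetical illustration over integers with granule 1; the class and method names are assumptions for the example, not anything from PostgreSQL:

```python
# Hypothetical illustration of the discrete-range argument above (not
# PostgreSQL code): a discrete integer range stored canonically as
# closed-open [first, next) can be displayed in all four styles.

class DiscreteRange:
    """Discrete integer range, granule 1, canonical form [first, next)."""

    def __init__(self, first, next_):
        self.first = first   # smallest included value
        self.next = next_    # smallest value above the range

    @property
    def prior(self):         # largest value below the range
        return self.first - 1

    @property
    def last(self):          # largest included value
        return self.next - 1

    def display(self, style):
        """One stored value, four interchangeable representations."""
        return {
            "[)": (self.first, self.next),
            "(]": (self.prior, self.last),
            "()": (self.prior, self.next),
            "[]": (self.first, self.last),
        }[style]
```

So the range [5,7) from the email, i.e. the set {5, 6}, reads back as (4,6], (4,7), or [5,6] depending on the caller's preference, which is exactly the application-interoperability argument: one writer's [5,7) is another reader's (4,6], with no coordination needed.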
[HACKERS] PATCH: Spurious 22 in hstore.sgml
*** a/doc/src/sgml/hstore.sgml
--- b/doc/src/sgml/hstore.sgml
***************
*** 278,284 ****
  entryget typehstore/'s keys as a set/entry
  entryliteralskeys('a=&gt;1,b=&gt;2')/literal/entry
  entry
! 22programlisting
  a
  b
  /programlisting/entry
--- 278,284 ----
  entryget typehstore/'s keys as a set/entry
  entryliteralskeys('a=&gt;1,b=&gt;2')/literal/entry
  entry
! programlisting
  a
  b
  /programlisting/entry
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Patch: Remove gcc dependency in definition of inline functions
On ons, 2009-12-16 at 10:49 -0500, Tom Lane wrote: Marko Kreen mark...@gmail.com writes: So the plain-C89 compilers would be downgraded to second-class targets, not worth getting max performance out of them. Hm? Failing to inline is already a performance hit, which is why Kurt got interested in this in the first place. I think you're way overthinking this. Where we started was just a proposal to try to expand the set of inline-ing compilers beyond gcc only. I don't see why we need to do anything but that. The code is fine as-is except for the control #ifdefs. I think the ifdefs should just be HAVE_INLINE && !MSVC, right? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PATCH: Spurious 22 in hstore.sgml
On Wed, Dec 16, 2009 at 20:34, David E. Wheeler da...@kineticode.com wrote: *** a/doc/src/sgml/hstore.sgml --- b/doc/src/sgml/hstore.sgml Heh, interesting. That clearly shouldn't be there. Applied. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Range types
to...@tuxteam.de wrote: (and as Andrew Dunstan pointed out off-list: I was wrong with my bold assertion that one can squeeze infinitely many (arbitrary length) strings between two given. This is not always the case). Of course you can do that if you assume lexicographical order, or any other arbitrary order. The interesting point is whether there exists some ordering on which this does not happen. And in fact there is: order strings by length first, and then lexicographically. If you do this then you have next() and prev() for any given string. If you use ASCII only, you have a.next = b, and so on. There is the argument that some languages do not sort lexicographically, but this is also beside the point -- you only need to find *some* way to sort the characters in the alphabet. If you dictate that in your ordering á comes before à and both after a, and all of them before b, then you know that a.next = á and á.next = à and à.next = b. (Note that I have also dictated that there is no other character that sorts after a and before b, which is perfectly possible because the alphabet is fixed for any given language. You could use the complete list of characters coded in a given set of Unicode planes, or even all extant planes, to obtain the same result). Defining strings with this ordering means you can have some useful ranges like [a-z], but then you cannot meaningfully use ranges for things like [aardvark - zulu] --- note that in this particular example, the order is reversed, because zulu comes before aardvark, which is probably not what you want. zzz.next = In short, I think that while it is possible to define ranges of strings, it is not as useful as one would like. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
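The length-first, then-lexicographic (shortlex) ordering described in the email can be made concrete. This sketch assumes a plain lowercase ASCII alphabet; the alphabet and function name are illustrative choices for the example, not anything from PostgreSQL:

```python
# Illustrative sketch of the shortlex ordering discussed above: compare
# by length first, then lexicographically. Under it, every string over a
# fixed, finite alphabet has a well-defined successor.

ALPHABET = "abcdefghijklmnopqrstuvwxyz"

def next_string(s):
    """Successor of s under shortlex order over a fixed alphabet."""
    chars = list(s)
    i = len(chars) - 1
    while i >= 0:
        pos = ALPHABET.index(chars[i])
        if pos + 1 < len(ALPHABET):
            chars[i] = ALPHABET[pos + 1]  # bump the rightmost bumpable char
            return "".join(chars)
        chars[i] = ALPHABET[0]            # carry past 'z'
        i -= 1
    return ALPHABET[0] * (len(s) + 1)     # all carries: first string of next length
```

Every string has a unique successor here: next_string('a') is 'b', next_string('az') is 'ba', and next_string('zzz') is 'aaaa', the first string of the next length. The alphabet being finite and fixed is what makes this work, which is the point being made; the price, as noted, is that ranges like [aardvark, zulu] stop meaning what a dictionary user expects.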
[HACKERS] Does verbose Need to be Reserved?
Hey All, I was just getting a new version of pgTAP ready for release, and while testing it on HEAD, I got this error: + psql:pgtap.sql:5789: ERROR: syntax error at end of input + LINE 28: IF verbose THEN RETURN NEXT diag(tests[i] ||... + ^ I asked on IRC, and Andrew “RhodiumToad” Gierth pointed out that it became a reserved word at some point. I'm fine to rename my variable, but Andrew and I were wondering if it's really necessary for verbose to be reserved, since it's not in the spec. Thanks, David -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Range types
On Tue, Dec 15, 2009 at 04:29:26PM -0800, Jeff Davis wrote: On Tue, 2009-12-15 at 18:06 -0600, decibel wrote: Now that varlena's don't have an enormous fixed overhead, perhaps it's worth looking at using them. Obviously some operations would be slower, but for your stated examples of auditing and history, I suspect that you're not going to notice the overhead that much. For most varvarlena types, you only get stuck with the full alignment burden if you get unlucky. In this case, we're moving from 16 bytes to 17, which really means 24 bytes with alignment. Try creating two tables: create table foo(i int8, t1 timestamp, t2 timestamp); create table bar(i int8, c char, t1 timestamp, t2 timestamp); But a period type will take just one or two more bytes if you don't require alignment. Alignment on a varlena type seems silly anyway, since you'll be aligning the header byte rather than the content. In the implementation you may need to copy the content before processing to satisfy the alignment of the contained type, but that's just a SMOP. Have a nice day, -- Martijn van Oosterhout klep...@svana.org http://svana.org/kleptog/ Please line up in a tree and maintain the heap invariant while boarding. Thank you for flying nlogn airlines. signature.asc Description: Digital signature
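The 16-versus-24-byte point in Jeff's quoted text, and the foo/bar table comparison, can be checked with back-of-envelope arithmetic. The sizes, alignments, and pad-to-max trailing rule below are simplifying assumptions for illustration, not PostgreSQL's actual heap-tuple layout:

```python
# Back-of-envelope sketch of alignment cost (illustrative only, not
# PostgreSQL's real tuple format): one extra byte before 8-byte-aligned
# fields costs a full 8 bytes, which is why 16 -> 17 really means 24.

def align(offset, alignment):
    """Round offset up to the next multiple of alignment."""
    return -(-offset // alignment) * alignment

def row_width(fields):
    """fields is a list of (size, alignment) pairs; returns padded width."""
    offset = 0
    for size, alignment in fields:
        offset = align(offset, alignment) + size
    return align(offset, max(a for _, a in fields))

int8 = (8, 8)
timestamp = (8, 8)
char1 = (1, 1)

foo = row_width([int8, timestamp, timestamp])         # 24 bytes of data
bar = row_width([int8, char1, timestamp, timestamp])  # 32: the 1-byte char
# pushes both timestamps to the next 8-byte boundary, costing 8, not 1.
```

Under these assumptions the single char costs eight bytes per row, which is the same effect that inflates a 17-byte aligned period value to 24, and it is the overhead an unaligned varlena representation avoids.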
Re: [HACKERS] Does verbose Need to be Reserved?
David == David E Wheeler da...@kineticode.com writes: David Hey All, David I was just getting a new version of pgTAP ready for release, and while testing it on HEAD, I got this error: David + psql:pgtap.sql:5789: ERROR: syntax error at end of input David + LINE 28: IF verbose THEN RETURN NEXT diag(tests[i] ||... David +^ David I asked on IRC, and Andrew “RhodiumToad” Gierth pointed out David that it became a reserved word at some point. I'm fine to David rename my variable, but Andew and I were wondering if it's David really necessary for verbose to be reserved, since it's not David in the spec. Looking at it more closely, this is likely to be fallout from the plpgsql lexer/parser changes; it probably worked before only because plpgsql was doing its own thing rather than using the main lexer. VERBOSE has been reserved all along in order to distinguish 'vacuum verbose;' from 'vacuum tablename;' and so on; but it's still an interesting pitfall for plpgsql users. -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Range types
On Wed, Dec 16, 2009 at 10:57:19AM -0800, Scott Bailey wrote: Ok, silly question here. But how do you determine the length of a continuous range? By definition length of [a, b) and (a, b] = b-a. But what about (a,b) and [a,b]? Are we saying that because they are continuous, the difference between values included in the range and those excluded are so infinitesimally small so as not to matter? Thus length (a,b) == length [a,b] == length [a,b)? And if that is the case, does the inclusiveness of the range really even matter? Short answer: Yes Longer answer: You need to decide on your definition of length and what you usually use is the measure. And yes, the difference between the two is so called measure 0 and thus has no effect on the length. Note the measure has to be done considering the intervals as intervals on a real line. The integers by themselves have no measure (they are countable). So for the length of a set of integers you might consider the count of the set. http://planetmath.org/encyclopedia/ProofThatTheOuterLebesgueMeasureOfAnIntervalIsItsLength.html http://en.wikipedia.org/wiki/Outer_measure As for continuous, as you use it above is not a way I recognise. There are contiguous sets, but they are something else. Have a nice day, -- Martijn van Oosterhout klep...@svana.org http://svana.org/kleptog/ Please line up in a tree and maintain the heap invariant while boarding. Thank you for flying nlogn airlines. signature.asc Description: Digital signature
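The distinction Martijn draws, measure for intervals over the reals versus counting for integers, can be sketched as follows. The helper names are illustrative assumptions, not from any PostgreSQL code:

```python
# Sketch of the answer above: for intervals on the real line, endpoint
# inclusivity has measure zero, so "length" ignores it; for integers,
# you count members instead, and inclusivity changes the answer.

def length(a, b, lower_inc=True, upper_inc=False):
    """Lebesgue measure of a real interval: always b - a, no matter
    whether either endpoint is included."""
    return b - a

def count(a, b, lower_inc=True, upper_inc=False):
    """Number of integers in the range between a and b."""
    between = b - a - 1                  # integers strictly between a and b
    return between + lower_inc + upper_inc
```

So length(2, 5) is 3 for all four interval kinds, answering Scott's question in the affirmative, while count(5, 7, ...) is 1, 2, or 3 depending on which endpoints are included. That is why inclusivity is irrelevant to the length of a continuous range but very much matters for discrete semantics.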
Re: [HACKERS] Update on true serializable techniques in MVCC
On Wed, Dec 16, 2009 at 1:29 PM, Boszormenyi Zoltan z...@cybertec.at wrote: Robert Haas wrote: On Wed, Dec 16, 2009 at 1:25 PM, Robert Haas robertmh...@gmail.com wrote: On Wed, Dec 16, 2009 at 1:14 PM, Alvaro Herrera alvhe...@commandprompt.com wrote: Robert Haas wrote: On Wed, Dec 16, 2009 at 10:29 AM, Andres Freund and...@anarazel.de wrote: On Wednesday 16 December 2009 16:24:42 Robert Haas wrote: Inserts and deletes follow the same protocol, obtaining an exclusive lock on the row after the one being inserted or deleted. The result of this locking protocol is that a range scan prevents concurrent inserts or deletes within the range of the scan, and vice versa. That sounds like it should actually work. Only if you can guarantee that the database will access the rows using some particular index. If it gets to the data some other way it might accidentally circumvent the lock. That's kind of a killer in terms of making this work for PostgreSQL. Isn't the whole topic only relevant for writing access? There you have to access the index anyway. Yeah, I guess you have to insert the new tuple. I guess while you were at it you might check whether the next tuple is locked... So you'd have to disable HOT updates when true serializability was active? I thought about that, but I don't think so. HOT only applies to updates, and predicate locking only applies to inserts. Unless I have my head in the sand? Err, no, wait. Predicate locking can apply to updates, but since HOT updates never update an indexed column, I think we might still be OK? A predicate can include columns from an index plus others. Am I missing something? Hmm, interesting point. In that case you couldn't use the index to enforce predicate locking under MVCC without disabling HOT. But there will be other cases where that wouldn't help anyway - a predicate could also include unindexed columns exclusively.
For those, the traditional approach (not the one discussed in this paper) probably requires locking against any heap insert, or checking each new heap insert against the constraint, or... something. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Range types
Alvaro Herrera alvhe...@commandprompt.com writes: In short, I think that while it is possible to define ranges of strings, it is not as useful as one would like. Note it is not the *range* that is the problem, it is the assumption that there's a unique next string. There's no unique next in the reals or rationals either, but we have no problem considering intervals over those sets. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Update on true serializable techniques in MVCC
On Wed, Dec 16, 2009 at 1:30 PM, Kevin Grittner kevin.gritt...@wicourts.gov wrote: Alvaro Herrera alvhe...@commandprompt.com wrote: So you'd have to disable HOT updates when true serializability was active? I wouldn't think so; but someone familiar with HOT logic could probably determine whether the unmodified algorithm could be used by reviewing the simplifying assumptions near the bottom of page 42, and the page about Generalizing to other database engines (section 4.8). I think those portions might stand on their own without reading the rest of the document. This thread veered off into a discussion of the traditional technique, rather than the one in the paper. I think that's the part Alvaro was responding to. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Does verbose Need to be Reserved?
David E. Wheeler da...@kineticode.com writes: I asked on IRC, and Andrew RhodiumToad Gierth pointed out that it became a reserved word at some point. Some point would have been around the time VACUUM VERBOSE got invented, ie January 1997 according to the CVS logs. We can't unreserve it until you're ready to break that syntax (is VERBOSE a table name or a flag for a database-wide vacuum?). Don't hold your breath. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Range types
On Wed, 2009-12-16 at 13:59 -0500, Tom Lane wrote: The argument for having granularity wired into the datatype seems to boil down to just space savings. I don't find that compelling enough to justify code contortions and user-visible restrictions on functionality. The argument (at least from me) is that discrete ranges have better semantics. The counterargument was that the granularity of a timestamp is an implementation detail. So I countered by making it explicit. Space savings is not crucial, but it would be frustrating to needlessly waste space. I still have not seen an answer to the problem of changing the representation of a continuous range. If you have the continuous range [5, 10], you're pretty much stuck with that representation, even if the application is expecting things in the form [ ). From an application's standpoint, you probably want to get the information about a range as separate columns (as opposed to parsing a string). So an application expecting data in [) format might use a query like: select ..., first(myperiod), next(myperiod) from mytable; That gives the application complete information about the range. You can even make a view over a table like that to make it even more transparent to the application. It's not entirely unreasonable that many such applications exist; there are many presentations and tutorials that have been telling people to use a start and end column, and assume that the start is inclusive and the end is exclusive. If there is some other application that expects data in (] format, you just use the query: select ..., prior(myperiod), last(myperiod) from mytable; With discrete ranges, that all just works (barring overflow or special values). With continuous ranges, first() or next() might fail on some values that were produced by some other application. 
Really, for continuous ranges, you'd need to have a query like: select ..., start(myperiod), start_inclusive(myperiod), end(myperiod), end_inclusive(myperiod) from mytable; in order to have all of the information. And that means that the application needs a full implementation of a range type to understand the inclusivity and produce a correct result. And to further make the case for allowing user-defined discrete ranges, what about ip4r? That's a discrete range, and it's user-defined. And that probably means other useful discrete ranges will be invented, as well. If we want to say that there is no discrete TIMESTAMP range by default, and that the superuser has to define it, that's one thing. But if we say that the only acceptable base types for discrete ranges will be hard-wired, that's way too restrictive. If nothing else, it makes some system types special which we have not done very many other places. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Range types
Tom Lane wrote: Alvaro Herrera alvhe...@commandprompt.com writes: In short, I think that while it is possible to define ranges of strings, it is not as useful as one would like. Note it is not the *range* that is the problem, it is the assumption that there's a unique next string. There's no unique next in the reals or rationals either, but we have no problem considering intervals over those sets. Yeah, agreed. It's easy (I think) to define more useful ranges of strings if you don't insist on having next. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Range types
Martijn van Oosterhout klep...@svana.org writes: But a period type will take just one or two more bytes if you don't require alignment. Alignment on a varlena type seems silly anyway, since you'll be aligning the header byte rather than the content. You might still end up paying the alignment overhead after the field, of course. But avoiding embedding the alignment in the type itself seems worth doing. One idea that might be interesting to consider in this regard is force-packing varlena range values. Consider a text range('a', 'b'). The datums are likely to come in with 4-byte headers requiring alignment. If we have the smarts to force them to 1-byte header form inside the varlena range value, not only do we save bytes right there, but we don't have to expend cycles to copy them somewhere to re-align them before we can pass them to the datatype-specific functions. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Range types
Jeff Davis pg...@j-davis.com writes: On Wed, 2009-12-16 at 12:50 -0500, Tom Lane wrote: I'm still not exactly clear on what the use-case is for discrete timestamp ranges, and I wonder how many people are going to be happy with a representation that can't handle a range that's open-ended on the left. Huh? We're miscommunicating somewhere. Yeah, apparently. By open-ended I meant -infinity left bound, or null left bound if you prefer. Not sure if there's a better term. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Range types
Jeff Davis pg...@j-davis.com writes: On Wed, 2009-12-16 at 13:59 -0500, Tom Lane wrote: The argument for having granularity wired into the datatype seems to boil down to just space savings. I don't find that compelling enough to justify code contortions and user-visible restrictions on functionality. The argument (at least from me) is that discrete ranges have better semantics. The counterargument was that the granularity of a timestamp is an implementation detail. So I countered by making it explicit. Making it explicit doesn't fix the fact that you can't rely on the arithmetic to be exact. I still have not seen an answer to the problem of changing the representation of a continuous range. If you have the continuous range [5, 10], you're pretty much stuck with that representation, even if the application is expecting things in the form [ ). That is not our problem. It's the application's problem if it can't handle the concept. You might as well be complaining that type numeric is broken because it can represent values that will fail to fit into float8 when some application tries to force them into that form. And to further make the case for allowing user-defined discrete ranges, what about ip4r? What about it? I don't have a problem with the concept that next() is well defined for some datatypes. I just have a problem with the concept that it's well-defined for timestamps. It's not, and I don't believe that forcing it to have some definition is useful in the real world (which, as a rule, isn't going to fit the simplifying assumptions you have to make to make it even sort-of work). regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Range types
On Wed, Dec 16, 2009 at 03:57:44PM -0500, Tom Lane wrote: Jeff Davis pg...@j-davis.com writes: I still have not seen an answer to the problem of changing the representation of a continuous range. If you have the continuous range [5, 10], you're pretty much stuck with that representation, even if the application is expecting things in the form [ ). That is not our problem. It's the application's problem if it can't handle the concept. You might as well be complaining that type numeric is broken because it can represent values that will fail to fit into float8 when some application tries to force them into that form. However, it does seem reasonable to allow people to restrict, either by typmod or a check constraint, the kinds of values that can be stored in a particular column. Then an application can decide which way they want their intervals to work and have the database enforce it. (Intermediate values may become a different kind, just as long as the result being stored is the right kind.) Have a nice day, -- Martijn van Oosterhout klep...@svana.org http://svana.org/kleptog/ Please line up in a tree and maintain the heap invariant while boarding. Thank you for flying nlogn airlines. signature.asc Description: Digital signature
Re: [HACKERS] Update on true serializable techniques in MVCC
Robert, Please forgive a couple editorial inserts to your statement -- I hope it clarifies. If I've distorted your meaning, feel free to straighten me out. :-) Robert Haas robertmh...@gmail.com wrote: This thread veered off into a discussion of the traditional [predicate locking] technique, rather than the [serializable] one in the paper. I think that's the part Alvaro was responding to. If you're right about Alvaro's concern -- my rough understanding is that HOT creates a linked lists of tuples which are mutations of one another without altering any value which is part of an index. If that's anywhere near a correct understanding, I can't see how there would be a problem with using the described locking techniques with any tuple on the list. As an aside, the thesis mentions smart use of multiple locking granularities only under the future work section. I can't see an implementation being considered production quality without that, as without it there would be no way to constrain the space required to track the locks. But there is no shortage of literature on how to do that, so I view such discussions as more appropriate to low-level implementation discussions should we ever get serious about using the techniques which are the main thrust of Dr. Cahill's thesis. If anyone is interested in reviewing recent literature on these techniques, Dr. Cahill seemed to like (Hellerstein et al., 2007), to the point where I may well track it down when I'm done pondering the work which I referenced at the start of this thread. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Range types
On Wed, 2009-12-16 at 15:46 -0500, Tom Lane wrote:
> Huh? We're miscommunicating somewhere.

Yeah, apparently. By open-ended I meant -infinity left bound, or null
left bound if you prefer. Not sure if there's a better term. But my
proposal allowed both of those things with various flag settings
(because it has a flag byte in the latest proposal). I said so
explicitly.

Regards,
	Jeff Davis
Re: [HACKERS] Range types
Martijn van Oosterhout klep...@svana.org writes:
> However, it does seem reasonable to allow people to restrict, either by
> typmod or a check constraint, the kinds of values that can be stored in
> a particular column. Then an application can decide which way they want
> their intervals to work and have the database enforce it.

Sure --- the range datatype should absolutely provide inquiry functions
that let you determine all the properties of a range, so something like
CHECK (is_open_on_right(col)) would work for that.

I'm of the opinion that we must not usurp typmod for range behavior ---
the right thing is to pass that through to the contained type, just as
we do with arrays. (Note that a range over timestamp(0) would eliminate
at least some of the platform dependencies we've been arguing about. I'm
still quite dubious that "next timestamp" is anything except evidence
that you've misformulated your problem, though.)

			regards, tom lane
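The division of labor Tom describes -- inquiry functions on the range value, with the application's chosen convention enforced by a CHECK-style predicate -- can be sketched in miniature (these are invented names, not actual PostgreSQL functions):

```python
# Toy model of the inquiry-function idea: a range value carries open/closed
# flags, inquiry functions expose them, and a constraint like
# CHECK (is_open_on_right(col)) is just a predicate built from those
# functions. Illustration only, under assumed names.

from dataclasses import dataclass

@dataclass(frozen=True)
class Range:
    lower: float
    upper: float
    lower_closed: bool = True   # '[' vs '('
    upper_closed: bool = True   # ']' vs ')'

def is_open_on_right(r: Range) -> bool:
    """Inquiry function: does the range exclude its upper bound?"""
    return not r.upper_closed

def check_half_open(r: Range) -> bool:
    """The application's convention, enforced like a CHECK constraint:
    every stored range must have the form [lower, upper)."""
    return r.lower_closed and is_open_on_right(r)
```

The type itself stays general; only the column constraint pins down one representation.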
Re: [HACKERS] Range types
On Wed, 2009-12-16 at 15:57 -0500, Tom Lane wrote:
> Making it explicit doesn't fix the fact that you can't rely on the
> arithmetic to be exact.

Can't rely on what arithmetic to be exact? Int64 timestamps should
clearly work for granules of 1 second. If the administrator can choose a
timestamp format that can't accurately represent whole seconds, that's
not an argument that modeling based on whole seconds is worthless. We
can restrict float timestamps for ranges as a special case, if we're
that worried about people misusing them.

> > I still have not seen an answer to the problem of changing the
> > representation of a continuous range. If you have the continuous
> > range [5, 10], you're pretty much stuck with that representation,
> > even if the application is expecting things in the form [ ).
>
> That is not our problem. It's the application's problem if it can't
> handle the concept. You might as well be complaining that type numeric
> is broken because it can represent values that will fail to fit into
> float8 when some application tries to force them into that form.

...except that we support float8. So if applications like to work with
float8, we let them. You're arguing that we should not support discrete
time ranges (or even allow such a type to be defined by a superuser)
even though applications and users may happen to model time that way.
Who are we to argue that all of those people are so wrong that we won't
even support their type? Especially when they may have just finished a
couple of books on the subject[1][2] which told them to model it that
way?

> > And to further make the case for allowing user-defined discrete
> > ranges, what about ip4r?
>
> What about it? I don't have a problem with the concept that next() is
> well defined for some datatypes.

Ok, so we'll allow users to specify user-defined types for discrete
ranges? How should that be specified, and how will that differ from
earlier proposals? Earlier you proposed that we hard-wire the set of
types that are allowed to be used for discrete ranges:
http://archives.postgresql.org/pgsql-hackers/2009-12/msg01278.php

Regards,
	Jeff Davis

[1] Temporal Data and the Relational Model by C.J. Date, et al.
[2] Developing Time-Oriented Database Applications in SQL by Richard
Snodgrass.
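The discrete-range point above hinges on next() being well defined: for such a type, the closed range [5, 10] and the half-open range [5, 11) denote the same set, so either can be normalized to one canonical form. A toy sketch (integers with a granule of 1; an illustration of the idea, not a proposed implementation):

```python
# For a discrete type with a well-defined next(), closed and half-open
# bounds are interconvertible, so a canonical representation exists:
# [a, b] == [a, next(b)). Continuous types have no such conversion,
# which is the representation problem discussed above.

def int_next(v):
    return v + 1  # next() for a discrete type with granule 1

def canonical_half_open(lower, upper, upper_closed):
    """Return (lower, upper) normalized to [lower, upper) form."""
    if upper_closed:
        upper = int_next(upper)  # [a, b] == [a, next(b))
    return (lower, upper)
```

With a granule of one second over int64 timestamps, the same conversion is exact, which is why the float-timestamp build is the problem case.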
Re: [HACKERS] Range types
Tom Lane wrote:
> Martijn van Oosterhout klep...@svana.org writes:
> > However, it does seem reasonable to allow people to restrict, either
> > by typmod or a check constraint, the kinds of values that can be
> > stored in a particular column. Then an application can decide which
> > way they want their intervals to work and have the database enforce
> > it.
>
> Sure --- the range datatype should absolutely provide inquiry functions
> that let you determine all the properties of a range, so something like
> CHECK (is_open_on_right(col)) would work for that.
>
> I'm of the opinion that we must not usurp typmod for range behavior ---
> the right thing is to pass that through to the contained type, just as
> we do with arrays. (Note that a range over timestamp(0) would eliminate
> at least some of the platform dependencies we've been arguing about.
> I'm still quite dubious that "next timestamp" is anything except
> evidence that you've misformulated your problem, though.)

Well, our work is based on over 15 years of temporal research (not by
us) and numerous books from Snodgrass, Date and Celko, as well as
partial implementations in other databases. So it's not like we took a
blue pill this weekend and woke up with this hare-brained idea. I
understand your concern, but I think the objections are based more on
implementation details with float timestamps than on the concept.
[HACKERS] determine snapshot after obtaining locks for first statement
The Cahill thesis mentions an interesting optimization -- they defer
determination of the snapshot until after any locks required for the
first statement have been acquired. Where the first statement was, for
example, an UPDATE, this reduced re-reads or rollbacks in the face of
concurrent modifications.

Does PostgreSQL currently do this? If not, would it make sense?

-Kevin
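The benefit of the deferred snapshot can be shown with a toy visibility model (invented structures, not PostgreSQL internals): a writer commits while our UPDATE waits for its row lock, and whether we must re-read or roll back depends on whether our snapshot was taken before or after the lock was acquired.

```python
# Toy illustration of deferring the snapshot until after the first
# statement's locks are acquired. With an early snapshot, the newest
# committed version is invisible and the UPDATE must re-read or abort;
# with a deferred snapshot, it proceeds. Timestamps are invented.

committed_versions = []  # list of (commit_ts, value) for one row

def commit(ts, value):
    committed_versions.append((ts, value))

def visible_value(snapshot_ts):
    """The newest version committed at or before the snapshot."""
    vals = [v for ts, v in committed_versions if ts <= snapshot_ts]
    return vals[-1] if vals else None

def update_row(snapshot_ts):
    """Return 'ok' if the newest committed version is visible to this
    snapshot, else 'serialization_failure' (re-read or roll back)."""
    newest_ts = max(ts for ts, _ in committed_versions)
    return "ok" if newest_ts <= snapshot_ts else "serialization_failure"
```

In the scenario the thesis describes, the snapshot would be taken only once the row lock is held, so the concurrent commit is already visible and no retry is needed.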
Re: [HACKERS] Largeobject Access Controls and pg_migrator
Takahiro Itagaki wrote:
> KaiGai Kohei kai...@kaigai.gr.jp wrote:
> > Can SELECT lo_create(16385); help this situation?
>
> SELECT lo_create(loid) FROM (SELECT DISTINCT loid FROM pg_largeobject)
> AS t would work for pg_migrator. I'm not clear whether we also check
> that pg_largeobject has chunks with the same LOID on lo_create(). In
> regular operation, it shall never happen. I think the omission is a
> reasonable optimization.

Thanks, I have updated pg_migrator to use your suggested method.

-- 
Bruce Momjian br...@momjian.us http://momjian.us
EnterpriseDB http://enterprisedb.com
+ If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] Range types
On Wed, 2009-12-16 at 13:59 -0500, Tom Lane wrote:
> For example, if you're trying to do classroom scheduling, it might be
> useful to constrain the periods to start and end on hour boundaries ---
> but the next thing you'll want is to have it know that the next slot
> after 5pm Friday is 8am Monday. Except on holidays.

And then there's the fact that my alma mater starts most hour-long
classes on the half hour.

Data types are only a first-level constraint -- a domain of reasonable
values. The class isn't going to start on a fraction-of-a-minute
boundary, so it would be reasonable to reject those values early.

I never suggested that next() should be such a highly business-dependent
function as you suggest above (skipping holidays, etc.); it should just
return the next value in the domain (if it's discrete). Surely you
wouldn't suggest that the ipv4 data type's next() function should skip
over addresses that aren't in a valid subnet on your network. But you
seem to think those make useful discrete ranges.

Regards,
	Jeff Davis
Re: [HACKERS] Largeobject Access Controls (r2460)
On Thu, Dec 10, 2009 at 10:41 PM, Takahiro Itagaki
itagaki.takah...@oss.ntt.co.jp wrote:
> KaiGai Kohei kai...@ak.jp.nec.com wrote:
> > What's your opinion about: long desc: When turned on, privilege
> > checks on large objects perform with backward compatibility as 8.4.x
> > or earlier releases.
>
> I updated the description as you suggest.

Applied with minor editorialization, mainly around tab-completion
support in psql. The documentation in this patch needs work.

...Robert